Source: letsdatascience.com

OpenAI Introduces GeneBench-Pro for Computational Biology Reasoning

On June 30, 2026, OpenAI released GeneBench-Pro, a benchmark that measures how well AI agents reason about noisy biological datasets across 129 synthetic problems in genomics, quantitative biology, and translational medicine. OpenAI's strongest model, GPT-5.6 Sol, solved 28.7% of problems at the highest reasoning level, up from below 5% for GPT-5. Open-weight models such as GLM 5.2 lagged, which OpenAI frames as evidence that they are tuned more for coding than for scientific reasoning. Reviewers estimated that each problem would take a human expert 20 to 40 hours to solve.

Published Jun 30, 2026 · 1 min read

For teams building AI-for-science systems, the bottleneck is no longer recalling facts or running a fixed pipeline; it is the higher-order judgment of deciding which analysis a messy dataset can actually support. GeneBench-Pro, released by OpenAI on June 30, 2026, is built to measure exactly that. The benchmark presents an agent with 129 synthetic problems across genomics, quantitative biology, and translational medicine, each pairing a realistic and deliberately noisy dataset with a target estimand tied to a downstream decision. Because every problem is generated from a known causal structure, correctness is graded deterministically, sidestepping the rubric variability that weakens many long-horizon science benchmarks. OpenAI reports its strongest model, GPT-5.6 Sol, solves 28.7 percent of problems at the highest reasoning level and 31.5 percent with Pro mode, up sharply from below 5 percent for GPT-5 when the original GeneBench was built. OpenAI frames the gap to open-weight models such as GLM 5.2 as evidence that open systems are tuned more for coding than for broad scientific reasoning. Reviewers estimated each problem would take a human expert 20 to 40 hours.
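To make the idea of deterministic grading concrete, here is a minimal Python sketch. It is not OpenAI's grader or data generator, and nothing in it comes from GeneBench-Pro itself. The causal structure (a confounder `z` driving both an exposure `x` and an outcome `y`), the coefficients, the 5 percent tolerance, and every function name are assumptions chosen for illustration. The point is only that when data is simulated from a known structure, the target estimand has an exact value, so an answer can be checked with a plain numeric comparison rather than a rubric.

```python
import random

# Hypothetical ground truth: the causal effect of x on y, fixed by the generator.
TRUE_EFFECT = 2.0


def generate_dataset(n=5000, seed=0):
    """Simulate (z, x, y) rows from a known structure: z -> x, z -> y, x -> y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                              # confounder
        x = 1.5 * z + rng.gauss(0.0, 1.0)                    # exposure depends on z
        y = TRUE_EFFECT * x + 3.0 * z + rng.gauss(0.0, 1.0)  # noisy outcome
        rows.append((z, x, y))
    return rows


def _cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def naive_slope(rows):
    """Slope of y on x alone, ignoring the confounder."""
    z, x, y = zip(*rows)
    return _cov(x, y) / _cov(x, x)


def adjusted_slope(rows):
    """Coefficient on x from least squares of y on x and z (closed form)."""
    z, x, y = zip(*rows)
    sxx, szz, sxz = _cov(x, x), _cov(z, z), _cov(x, z)
    sxy, szy = _cov(x, y), _cov(z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


def grade(submitted, truth=TRUE_EFFECT, rel_tol=0.05):
    """Deterministic pass/fail: is the answer within rel_tol of the known truth?"""
    return abs(submitted - truth) <= rel_tol * abs(truth)


if __name__ == "__main__":
    rows = generate_dataset()
    for name, estimate in [("naive", naive_slope(rows)),
                           ("adjusted", adjusted_slope(rows))]:
        print(name, round(estimate, 3), "pass" if grade(estimate) else "fail")
```

In this toy setup the analysis that ignores the confounder should land far from the true effect and fail, while the adjusted analysis should pass. That mirrors, in miniature, the kind of judgment the benchmark is described as testing: deciding which analysis the data can actually support.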
