{"slug": "openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life", "title": "OpenAI Releases LifeSciBench, a 750-Task Benchmark Grading AI Models on Real Life-Science Research With Expert-Written Rubric", "summary": "OpenAI released LifeSciBench, a benchmark of 750 expert-authored tasks evaluating AI models on real life-science research. The best model, GPT-Rosalind, scored 36.1%, indicating significant room for improvement in reasoning and operational tasks.", "body_md": "OpenAI's LifeSciBench evaluates whether frontier AI can handle real life-science research across 750 expert-authored tasks, seven workflows, and seven biological domains. Built by 173 PhD scientists with 19,020 rubric criteria, it grades reasoning and decisions, not just recall. The best model, GPT-Rosalind, passes 36.1%, leaving large headroom on artifacts, exact outputs, and operational calls.", "url": "https://wpnews.pro/news/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life", "canonical_source": "https://www.marktechpost.com/2026/06/17/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life-science-research-with-expert-written-rubric/", "published_at": "2026-06-18 02:28:22+00:00", "updated_at": "2026-06-18 02:55:22.367240+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-research", "ai-products", "large-language-models"], "entities": ["OpenAI", "LifeSciBench", "GPT-Rosalind", "MarkTechPost"], "alternates": {"html": "https://wpnews.pro/news/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life", "markdown": 
"https://wpnews.pro/news/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life.md", "text": "https://wpnews.pro/news/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life.txt", "jsonld": "https://wpnews.pro/news/openai-releases-lifescibench-a-750-task-benchmark-grading-ai-models-on-real-life.jsonld"}}