Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes

wpnews.pro


[ARTICLE · art-17150] src=arxiv.org ↗ pub=2026-05-29T04:00Z topic=large-language-models verified=true sentiment=↑ positive


Frontier large language model (LLM) agents from Anthropic and OpenAI performed within the range of trained human biocurators at linking free-text phenotype descriptions to ontology terms, a task known as phenotype annotation. In a benchmark against a Gold Standard of Entity-Quality (EQ) annotations, all five agents fell within the range of inter-curator variability of three human experts; the best-performing agents approached but did not reach the best-performing human curator. Every agent also substantially outperformed the Semantic CharaParser NLP tool on all four metrics. The findings suggest that LLM agents can overcome the ontology curation bottleneck that has limited the scaling of cross-study integration of comparative morphological data.

Published May 29, 2026 · 1 min read

arXiv:2605.28965v1. Abstract: Linking free-text phenotype descriptions to ontology terms, typically referred to as phenotype annotation, is essential for the cross-study integration of comparative morphological data. This labor-intensive process has relied heavily on highly trained human experts, which makes it challenging to scale and thus a key bottleneck. Dahdul et al. (2018) established a Gold Standard (GS) of Entity-Quality (EQ) annotations across seven phylogenetic studies and used it to evaluate three human curators and the Semantic CharaParser NLP tool with ontology-based semantic similarity metrics; they reported that machine-human consistency was significantly lower than inter-curator (human-human) consistency. Here we revisit that benchmark with five frontier hosted LLMs from Anthropic and OpenAI, each operating as an "agentic curator" within a self-contained workspace that supplies the source publication PDF, the same annotation guide used by the original human curators, the four project ontologies (UBERON, PATO, BSPO, GO), and a validation script. Evaluated against the same Gold Standard, every agent fell within the range of inter-curator variability of the three trained human biocurators of the original study; the best-performing agents approached but did not reach the best-performing human curator. Agents substantially outperformed Semantic CharaParser on all four metrics.
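The abstract does not name the four ontology-based semantic similarity metrics, so the following is only an illustrative sketch of the general idea: two ontology terms can be compared by the overlap of their ancestor sets (a Jaccard similarity), so that a near-miss annotation scores higher than an unrelated one. The toy hierarchy, the term labels, and the function names `ancestors` and `jaccard_similarity` are assumptions made for this example; they are not taken from the paper, from UBERON, or from the benchmark's actual scoring code.

```python
from typing import Dict, List, Set

# Hypothetical toy is_a hierarchy (child -> parents).
# Labels are illustrative only and are NOT real UBERON content.
TOY_PARENTS: Dict[str, List[str]] = {
    "pectoral fin": ["fin", "paired appendage"],
    "pelvic fin": ["fin", "paired appendage"],
    "dorsal fin": ["fin"],
    "fin": ["appendage"],
    "paired appendage": ["appendage"],
    "appendage": ["anatomical structure"],
    "anatomical structure": [],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def jaccard_similarity(a: str, b: str, parents: Dict[str, List[str]]) -> float:
    """Jaccard overlap of the two terms' ancestor sets, in [0, 1]."""
    anc_a = ancestors(a, parents)
    anc_b = ancestors(b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


if __name__ == "__main__":
    # An exact match scores 1.0; sibling terms score lower but above zero.
    print(jaccard_similarity("pectoral fin", "pectoral fin", TOY_PARENTS))
    print(jaccard_similarity("pectoral fin", "pelvic fin", TOY_PARENTS))
    print(jaccard_similarity("pectoral fin", "dorsal fin", TOY_PARENTS))
```

A graded score of this kind is what allows a curator, human or agent, to receive partial credit for choosing a closely related term instead of the exact Gold Standard term.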

source & further reading


Read original on arxiv.org → arxiv.org/abs/2605.28965




