Source: lesswrong.com

How far do open weights trail the frontier?

A new analysis using Epoch's ECI metric shows that open-weight AI models continue to trail the closed-model frontier, and that the gap has persisted over time. Because ECI is built on item response theory, it gives a more nuanced view than the AA Index. On this measure, open-weight models have not yet matched leading closed models such as those from OpenAI and Anthropic.

Published Jun 18, 2026

I saw this Twitter post today and really liked the idea. However, I think the AA Index is a rather crude measure, and I much prefer Epoch's ECI, which uses item response theory (IRT). The resulting graph diverges meaningfully from the one in the Twitter post (which seems to collapse oddly at the end, perhaps because it does not take logistic assumptions into account):

[See the linkpost to interact with the graphs, e.g. to see which model is which.]

For context, here are the two raw frontiers, i.e. the running best ECI over time for open-weight vs closed models:

Sadly, GLM-5.2 has not been scored yet, but I'll update the website when it is.
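To make the "running best" frontier concrete, here is a minimal sketch of how such a frontier could be computed from (release date, score) pairs. The scores below are made-up placeholders, not real ECI values, and the function names `running_frontier` and `lag_days` are hypothetical. The lag measure (days since the closed frontier first reached the same score) is just one possible way to quantify "how far behind", not necessarily the one used for the graphs.

```python
from datetime import date


def running_frontier(models):
    """Return [(release_date, score)] for each model that set a new best score."""
    frontier = []
    best = float("-inf")
    for released, score in sorted(models):  # chronological order
        if score > best:
            best = score
            frontier.append((released, best))
    return frontier


def lag_days(open_frontier, closed_frontier):
    """For each open-weight record, count the days since the closed
    frontier first reached at least the same score."""
    lags = []
    for released, score in open_frontier:
        matched = next((d for d, s in closed_frontier if s >= score), None)
        if matched is not None:  # skip scores the closed frontier never reached
            lags.append((released, (released - matched).days))
    return lags


# Placeholder data (assumption): NOT real ECI scores.
closed_models = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 6, 1), 120.0),
    (date(2024, 3, 1), 110.0),
    (date(2024, 9, 1), 115.0),  # below the running best, so not on the frontier
]
open_models = [
    (date(2024, 4, 1), 95.0),
    (date(2024, 8, 1), 110.0),
    (date(2024, 10, 1), 105.0),  # below the running best, so not on the frontier
]

closed_frontier = running_frontier(closed_models)
open_frontier = running_frontier(open_models)
lags = lag_days(open_frontier, closed_frontier)
```

With this placeholder data, the open-weight frontier trails by 91 days at its first record and 153 days at its second. Splitting the models by a different criterion (for example by lab rather than by open vs closed weights) only changes which two lists are passed in.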

You can also generalize to other criteria (though this is probably the most interesting one). One such example would be the OpenAI vs Anthropic rivalry:
