The open-source Pareto frontier over time: each point is an open-weights model that beat every open model released before it, plotted by how many months earlier a proprietary model reached the same level.
The vertical position is the gap, in months, to the earliest proprietary model with the same or higher Artificial Analysis Intelligence Index (hover any point to see which one). The gap peaked near 10 months around DeepSeek V3 (Dec 2024) and has since tightened to ~2–3.5 months as DeepSeek, Kimi, Z AI and MiniMax began shipping close behind the frontier.
Frontier-pusher. A model is included only if its Intelligence Index exceeded that of every open-weights model released before it. 23 models qualify. (Solar Mini, which a naive running maximum flags in early 2024, is excluded: at floor-level scores the retroactive v4.1 index ranks it above stronger contemporaneous open models, namely Mixtral 8×7B, Llama-2-70B and Qwen-72B, so it was not actually the leading open model at release.)
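The inclusion rule is a strict running maximum over release dates, which can be sketched as follows. This is a minimal illustration, not the chart's actual code: the function name `frontier_pushers`, the tuple layout, and the model names and scores are all hypothetical placeholders, and manual exclusions such as the Solar Mini case are not handled.

```python
from datetime import date

def frontier_pushers(models):
    """Return the models whose index strictly exceeds every earlier release.

    models: list of (name, release_date, index) tuples (hypothetical layout).
    """
    best = float("-inf")
    pushers = []
    # Walk the models in release order, keeping a running maximum.
    for name, released, score in sorted(models, key=lambda m: m[1]):
        if score > best:  # strict: ties with the current best do not qualify
            pushers.append((name, released, score))
            best = score
    return pushers

# Placeholder data for illustration only; not real models or scores.
open_models = [
    ("open-A", date(2024, 1, 10), 10),
    ("open-B", date(2024, 3, 1), 9),
    ("open-C", date(2024, 6, 15), 14),
    ("open-D", date(2024, 9, 1), 14),
]
pusher_names = [name for name, _, _ in frontier_pushers(open_models)]
```

With this placeholder data, `open-B` is dropped because it scores below `open-A`, and `open-D` is dropped because it only ties `open-C`.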
Metric. Artificial Analysis Intelligence Index (v4.1), applied consistently to every model across all dates. It is anchored to today's hard benchmarks (GPQA, HLE, terminal-bench, etc.), so 2023 models score low and their ordering at the floor is noisy.
Gap. For a frontier model released on date D with index S, gap = D − T, where T is the earliest date on which any proprietary model reached an index ≥ S. That proprietary model is shown in each point's tooltip.
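The gap definition can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the chart's actual implementation: the function name `gap_months`, the tuple layout, and the placeholder models are hypothetical, and converting days to months with an average month of 365.25 / 12 days is an assumption, since the source does not say how months are counted.

```python
from datetime import date

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length

def gap_months(released, score, proprietary):
    """Return (gap in months, matching proprietary model), or None.

    released, score: date D and index S of the open-weights model.
    proprietary: list of (name, release_date, index) tuples (hypothetical layout).
    """
    # Keep every proprietary model that had reached index >= S.
    reached = [(d, name) for name, d, s in proprietary if s >= score]
    if not reached:
        return None  # no proprietary model has reached this level
    t, name = min(reached)  # T: the earliest such release date
    return (released - t).days / DAYS_PER_MONTH, name

# Placeholder data for illustration only; not real models or scores.
closed_models = [
    ("closed-X", date(2024, 1, 1), 20),
    ("closed-Y", date(2024, 6, 1), 30),
]
gap_low, match_low = gap_months(date(2024, 7, 1), 20, closed_models)
gap_high, match_high = gap_months(date(2024, 7, 1), 25, closed_models)
unmatched = gap_months(date(2024, 7, 1), 40, closed_models)
```

Here the open model with index 20 is matched to `closed-X`, released 182 days earlier (about 6 months), while an index of 25 is only matched by `closed-Y`, released 30 days earlier.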
Note. This is the gap at a given capability level, not the gap to the absolute frontier. GLM-5.2 (II 51) matches GPT-5.4, released 3.4 months earlier, while the absolute closed frontier (Claude Fable 5, II 60) sits further ahead in raw capability.