Frontier Language Model Intelligence, over Time


[ARTICLE · art-26594] src=artificialanalysis.ai pub=2026-06-14T00:04Z topic=large-language-models verified=true sentiment=neutral


Artificial Analysis released its Frontier Language Model Intelligence index, tracking performance, cost, and execution time of leading AI models over time. The index evaluates models on agentic tasks, coding, reasoning, and knowledge, providing independent benchmarks for model selection.

2 min read · 16 views · published Jun 14, 2026

Understand the AI landscape to choose the best model and provider for your use case

Highlights

Personalized model recommender

Get personalized recommendations based on your priorities for intelligence, speed, and cost

Explore agents for general work, coding, customer support, and more

Compare AI agents across capabilities, pricing, and platform support


Intelligence

Intelligence of leading AI models based on our independent evaluations

Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index by Open Weights / Proprietary

Intelligence vs. Cost to Run Artificial Analysis Intelligence Index

Create custom visualizations: create your own charts and tables comparing models and providers, save groups of models, and export data. Go to Data Playground.

Frontier Language Model Intelligence, Over Time

Performance, cost, and execution time for leading coding agents on end-to-end software engineering tasks

Explore Artificial Analysis Coding Agent Index

Artificial Analysis Coding Agent Index

Image & Video Leaderboards

Top models from our Image Arena and Video Arena leaderboards, with 95% confidence intervals

Text to Image Leaderboard


Intelligence Evaluations

Agentic real-world work tasks, (Elo-500)/2000

Agentic coding & terminal use

Agentic tool use

Long context reasoning

Knowledge

1 - hallucination rate

Reasoning & knowledge

Scientific reasoning

Coding

Instruction following

Physics reasoning

Long-horizon agentic tasks

ITBench-AA (new): Kubernetes incident root-cause analysis

Visual reasoning
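Two labels in the list above carry their own arithmetic: "(Elo-500)/2000" rescales an Elo rating, and "1 - hallucination rate" flips an error rate so that higher is better. A minimal Python sketch of both follows. The function names are illustrative assumptions, and the source does not say whether the results are clamped or rescaled further.

```python
def normalize_elo(elo: float) -> float:
    """Rescale an Elo rating using the label's formula: (Elo - 500) / 2000.

    Hypothetical helper: the source gives only the formula, not any clamping.
    """
    return (elo - 500.0) / 2000.0


def non_hallucination_rate(hallucination_rate: float) -> float:
    """Return 1 - hallucination rate, so that a higher value is better.

    Assumes the rate is expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= hallucination_rate <= 1.0:
        raise ValueError("hallucination_rate must be between 0 and 1")
    return 1.0 - hallucination_rate


print(normalize_elo(1500.0))         # 0.5
print(non_hallucination_rate(0.25))  # 0.75
```

Under this reading an Elo of 1500 maps to 0.5, and a 25% hallucination rate maps to 0.75.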

AA-Omniscience is a knowledge and hallucination benchmark that rewards accuracy, punishes bad guesses and provides a comprehensive view of which models produce factually reliable outputs across different domains

AA-Omniscience Index
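One way to read "rewards accuracy, punishes bad guesses" is a score in which a correct answer adds a point, a wrong answer subtracts one, and an abstention counts as zero. The Python sketch below assumes that scheme and a -100 to 100 scale. The function name, outcome labels, and weights are illustrative assumptions, not the published AA-Omniscience definition.

```python
from collections import Counter
from typing import Iterable

# Assumed outcome labels for this sketch only.
VALID_OUTCOMES = {"correct", "incorrect", "abstain"}


def penalized_knowledge_score(outcomes: Iterable[str]) -> float:
    """Score answers so that wrong guesses cost as much as right answers earn.

    Assumed weights: correct = +1, incorrect = -1, abstain = 0.
    The result is scaled to the range -100 to 100.
    """
    counts = Counter(outcomes)
    unknown = set(counts) - VALID_OUTCOMES
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one outcome is required")
    return 100.0 * (counts["correct"] - counts["incorrect"]) / total


# A model that abstains on two questions it does not know...
cautious = ["correct"] * 6 + ["incorrect"] * 2 + ["abstain"] * 2
# ...versus one that guesses on those two and gets both wrong.
guesser = ["correct"] * 6 + ["incorrect"] * 4

print(penalized_knowledge_score(cautious))  # 40.0
print(penalized_knowledge_score(guesser))   # 20.0
```

With these assumed weights, both models answer six questions correctly, but the one that guesses wrongly instead of abstaining scores lower.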

GDPval-AA evaluates AI models on real-world, economically valuable tasks across a wide range of occupations

GDPval-AA Leaderboard

ITBench-AA (new): ITBench-AA evaluates AI agents on Kubernetes incident root-cause analysis from offline incident snapshots

ITBench-AA Average precision at full recall
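The source does not define "average precision at full recall". One plausible reading, sketched below in Python, is: for each incident, take the agent's ranked list of candidate root causes, find the rank at which every true root cause has appeared, compute precision at that rank, then average across incidents. All names and the sample data here are hypothetical, and ITBench-AA may compute the metric differently.

```python
from typing import Sequence, Set, Tuple


def precision_at_full_recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at the first rank where all relevant items have been seen.

    Assumes `ranked` has no duplicate entries. Returns 0.0 if full recall
    is never reached.
    """
    if not relevant:
        raise ValueError("relevant must not be empty")
    found = 0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            found += 1
            if found == len(relevant):
                return found / rank
    return 0.0


def mean_precision_at_full_recall(
    incidents: Sequence[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Average the per-incident precision at full recall."""
    if not incidents:
        raise ValueError("at least one incident is required")
    scores = [precision_at_full_recall(r, t) for r, t in incidents]
    return sum(scores) / len(scores)


# Hypothetical incidents: (ranked candidate causes, true root causes).
incidents = [
    (["db-pod", "cache", "ingress"], {"db-pod", "ingress"}),  # 2 of 3 -> 2/3
    (["node-a"], {"node-a"}),                                 # 1 of 1 -> 1.0
]
print(mean_precision_at_full_recall(incidents))
```

Under this reading, an agent is penalized for every irrelevant candidate it lists before it has named all the true root causes.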

Artificial Analysis Openness Index assesses how 'open' models are on the basis of their availability and transparency across different components.

Artificial Analysis Openness Index: Components

Artificial Analysis Openness Index vs. Artificial Analysis Intelligence Index

Output Tokens

Output tokens of leading AI models based on our independent evaluations

Output Tokens Used to Run Artificial Analysis Intelligence Index

Cost Efficiency

Cost of leading AI models based on our independent evaluations

Cost to Run Artificial Analysis Intelligence Index

Speed & Latency

Comparison of first-party API performance

Output Speed

Price (updated)

Price of leading AI models based on our independent evaluations

Pricing: Cache Hit, Input, and Output

Hardware Benchmarking (new): Comprehensive benchmarking of GPUs for language model inference

Video Arena & Leaderboard: Compare leading Text to Video and Image to Video models

Image Arena & Leaderboard: Compare leading Image Generation and Image Editing models

Speech Arena & Leaderboard: Compare leading Text to Speech models

source & further reading

artificialanalysis.ai — original article


