# Frontier Language Model Intelligence, Over Time

> Source: <https://artificialanalysis.ai/?model-creators=anthropic%2Cmistral#frontier-language-model-intelligence-over-time>
> Published: 2026-06-14 00:04:22+00:00

# Independent analysis of AI

Understand the AI landscape to choose the best model and provider for your use case

## Intelligence

Intelligence of leading AI models based on our independent evaluations

### Artificial Analysis Intelligence Index

### Artificial Analysis Intelligence Index by Open Weights / Proprietary

### Intelligence vs. Cost to Run Artificial Analysis Intelligence Index

Create custom visualizations: create your own charts and tables comparing models and providers, save groups of models, and export data. [Go to Data Playground](/data-playground)

### Frontier Language Model Intelligence, Over Time

Performance, cost, and execution time for leading coding agents on end-to-end software engineering tasks

[Explore Artificial Analysis Coding Agent Index](/agents/coding-agents)

### Artificial Analysis Coding Agent Index

## Image & Video Leaderboards

Top models from our Image Arena and Video Arena leaderboards, with 95% confidence intervals

### Text to Image Leaderboard

[See the full leaderboard here.](/image/leaderboard/text-to-image)

### Intelligence Evaluations

Agentic real-world work tasks, (Elo-500)/2000

Agentic coding & terminal use

Agentic tool use

Long context reasoning

Knowledge

1 - hallucination rate

Reasoning & knowledge

Scientific reasoning

Coding

Instruction following

Physics reasoning

Long-horizon agentic tasks

[ITBench-AA](/evaluations/itbench-aa)

Kubernetes incident root-cause analysis

Visual reasoning
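Two of the evaluation descriptors above carry small formulas: "(Elo-500)/2000" for agentic real-world work tasks and "1 - hallucination rate" for the hallucination measure. The sketch below shows one way these transformations could be computed so that higher values are better on a roughly 0 to 1 scale. The function names, default parameters, and sample inputs are hypothetical and chosen only for illustration; the page does not publish an implementation, so this is a reading of the labels rather than Artificial Analysis's actual code.

```python
def normalize_elo(elo: float, offset: float = 500.0, scale: float = 2000.0) -> float:
    """Rescale an Elo rating as (elo - offset) / scale.

    The defaults mirror the "(Elo-500)/2000" label on the page; the
    function name and signature are assumptions made for this sketch.
    """
    return (elo - offset) / scale


def non_hallucination_score(hallucination_rate: float) -> float:
    """Return 1 - hallucination_rate so that higher means more reliable.

    Assumes the rate is expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= hallucination_rate <= 1.0:
        raise ValueError("hallucination_rate must be between 0 and 1")
    return 1.0 - hallucination_rate


if __name__ == "__main__":
    # Illustrative inputs only; these are not real benchmark results.
    print(normalize_elo(1500.0))          # 0.5
    print(non_hallucination_score(0.25))  # 0.75
```

Under this reading, an Elo of 500 maps to 0 and an Elo of 2500 maps to 1, which puts the Elo-based evaluation on a scale comparable to the percentage-style scores of the other evaluations.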

AA-Omniscience is a knowledge and hallucination benchmark that rewards accuracy and penalizes bad guesses, giving a comprehensive view of which models produce factually reliable outputs across different domains.

### AA-Omniscience Index

GDPval-AA evaluates AI models on real-world, economically valuable tasks across a wide range of occupations.

### GDPval-AA Leaderboard

[ITBench-AA](/evaluations/itbench-aa)

ITBench-AA evaluates AI agents on Kubernetes incident root-cause analysis from offline incident snapshots.

### ITBench-AA Average precision at full recall

Artificial Analysis Openness Index assesses how 'open' models are on the basis of their availability and transparency across different components.

### Artificial Analysis Openness Index: Components

### Artificial Analysis Openness Index vs. Artificial Analysis Intelligence Index

## Output Tokens

Output tokens of leading AI models based on our independent evaluations

### Output Tokens Used to Run Artificial Analysis Intelligence Index

## Cost Efficiency

Cost of leading AI models based on our independent evaluations

### Cost to Run Artificial Analysis Intelligence Index

## Speed & Latency

Comparison of first-party API performance

### Output Speed

## Price

Price of leading AI models based on our independent evaluations

### Pricing: Cache Hit, Input, and Output

[Hardware Benchmarking](/benchmarks/hardware)

Comprehensive benchmarking of GPUs for language model inference

[Video Arena & Leaderboard](/video/arena)

Compare leading Text to Video and Image to Video models

[Image Arena & Leaderboard](/image/arena)

Compare leading Image Generation and Image Editing models

[Speech Arena & Leaderboard](/text-to-speech/arena)

Compare leading Text to Speech models