[ARTICLE · art-25719] src=cryptobriefing.com pub= topic=ai-chips verified=true sentiment=· neutral

AA-AgentPerf releases initial results for DeepSeek V4 Pro benchmark, showing NVIDIA Blackwell dominance

Artificial Analysis released initial results for its AA-AgentPerf benchmark, showing NVIDIA's Blackwell systems outperforming AMD's Instinct MI355X GPUs on power-efficient agentic inference using DeepSeek V4 Pro. The benchmark measures concurrent agent support per megawatt, highlighting NVIDIA's efficiency lead in power-constrained data centers.

2 min read · published Jun 12, 2026

Artificial Analysis launches the first multi-vendor open benchmark for agentic coding tasks, and the early numbers paint a clear picture of NVIDIA's power efficiency lead over AMD

Artificial Analysis has dropped something the AI hardware world has been quietly waiting for: an actual benchmark that measures how well chips handle agentic AI workloads in the real world. The benchmark is called AA-AgentPerf, and its initial results running DeepSeek V4 Pro tell a story that AMD probably would rather not hear right now.

NVIDIA’s Blackwell systems, specifically the B200 and GB300, consistently outperformed AMD’s Instinct MI355X GPUs on power-efficient agentic inference.

What AA-AgentPerf actually measures

AA-AgentPerf is the first multi-vendor open benchmark from Artificial Analysis designed specifically to measure hardware performance on agentic coding tasks.

The benchmark evaluates how many concurrent agents a system can support while meeting specific service-level objectives. Those SLOs cover output token speeds ranging from 20 to 300 tokens per second and time-to-first-token (TTFT) targets between 3 and 10 seconds.
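To make the SLO idea concrete, here is a minimal Python sketch of how a "maximum concurrent agents under an SLO" figure could be derived from a concurrency sweep. It is an illustration only, not Artificial Analysis's published methodology: the `Slo` and `Measurement` types, the function name, and the simple pass/fail rule (every tested level is judged on one speed figure and one TTFT figure) are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Slo:
    """Hypothetical service-level objective for a single agent."""
    min_output_tps: float  # minimum output tokens per second
    max_ttft_s: float      # maximum time to first token, in seconds


@dataclass(frozen=True)
class Measurement:
    """Hypothetical result of running the system at one concurrency level."""
    concurrency: int   # number of agents running at once
    output_tps: float  # observed per-agent output tokens per second
    ttft_s: float      # observed time to first token, in seconds


def max_agents_meeting_slo(measurements: Iterable[Measurement], slo: Slo) -> int:
    """Return the highest tested concurrency that satisfies both SLO limits.

    Returns 0 when no tested concurrency level meets the SLO.
    """
    passing = [
        m.concurrency
        for m in measurements
        if m.output_tps >= slo.min_output_tps and m.ttft_s <= slo.max_ttft_s
    ]
    return max(passing, default=0)
```

A stricter SLO (faster tokens, lower TTFT) shrinks the set of passing concurrency levels, which is why a single system can report very different agent counts across the 20 to 300 tokens-per-second range.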

Rather than relying on synthetic evaluation methods, the benchmark leverages actual coding trajectories. Results are then normalized per accelerator and per megawatt, which creates a comparison framework that accounts for both raw performance and energy consumption.
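The two normalizations are simple ratios. The Python sketch below shows the arithmetic under the assumption that the benchmark divides the supported agent count by the number of accelerators and by total system power; the function names and the choice of what counts toward "system power" are hypothetical, and no real benchmark figures are used.

```python
def agents_per_accelerator(agents: int, num_accelerators: int) -> float:
    """Concurrent agents supported per accelerator (hypothetical helper)."""
    if num_accelerators <= 0:
        raise ValueError("num_accelerators must be positive")
    return agents / num_accelerators


def agents_per_megawatt(agents: int, system_power_watts: float) -> float:
    """Concurrent agents supported per megawatt of system power.

    Assumes system_power_watts is the total power draw of the tested system;
    how the real benchmark measures power is not stated in the article.
    """
    if system_power_watts <= 0:
        raise ValueError("system_power_watts must be positive")
    # 1 MW = 1,000,000 W, so scale the agent count before dividing.
    return agents * 1_000_000 / system_power_watts
```

Because the second ratio divides by power rather than chip count, a system that supports fewer agents per accelerator can still come out ahead per megawatt if it draws proportionally less power.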

DeepSeek V4 Pro enters the chat

The model at the center of this benchmark is DeepSeek V4 Pro, which has been turning heads since its release around April 2026. It scored 1554 on the GDPval-AA benchmark, placing it firmly among the top-performing open-weights models available today.

DeepSeek V4 Pro (Max) also earned a score of 52 on the Artificial Analysis Intelligence Index, ranking it second among open-weights reasoning models.

NVIDIA vs. AMD and what it means for the data center market

The initial AA-AgentPerf results paint a clear picture of competitive positioning. NVIDIA’s Blackwell architecture, represented by the B200 and GB300 systems, delivered superior performance per watt compared to AMD’s MI355X across the tested agentic workloads.

The per-megawatt normalization is especially telling. Data centers are increasingly constrained not by rack space or capital budgets but by power availability. A chip that can support more concurrent agents per megawatt of power consumed has a tangible, quantifiable advantage that translates directly to the bottom line.

For NVIDIA, these results reinforce a narrative the company has been building around Blackwell’s efficiency characteristics. The timing is notable: the performance leadership data was reported relative to a June 12, 2026 crawl date, suggesting NVIDIA moved quickly to publicize favorable results through its developer blog.

Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
