Dear NYTimes: Vibe coding is not a metric

Source: coreyguitar.com · Published: 2026-05-29 · Topic: artificial-intelligence

The New York Times published an article claiming Anthropic's Claude Opus 4.8 outperforms all other publicly available technologies on "vibe coding," a subjective term for AI generating code from conversational prompts. Critics argue "vibe coding" is not a measurable metric, making the comparison meaningless and the reporting lazy. The article's benchmark compared Opus 4.8 only to previous Anthropic models, failing to provide objective proof of superiority.

May 29, 2026

Web Development Technology

Update: The digital edition of the New York Times article [1] in question has been updated to include an actual benchmark for the Opus 4.8 model with regard to vibe coding. That's an improvement. However, the cited benchmark compares Opus 4.8 only to the previous Anthropic model. The claim that it outperforms "all other publicly available technologies" at this subjective task remains bogus.

Anthropic's new Claude Opus 4.8 outperforms all other publicly available technologies on vibe coding, which is when A.I. technologies create software in response to prompts written in conversational English.

"Anthropic Tops OpenAI to Become the World’s Most Valuable A.I. Start-Up," The New York Times [2]

I facepalmed after reading that sentence in a recent article from the Business section of the New York Times.

"Vibe coding" is not a metric, as in you cannot measure it. A.I. models can generate computer code, and code quality is subjective. More isn't necessarily better. Less isn't necessarily better.

A quick example comes to mind: a programmer might intentionally choose to make a section of code "less efficient" in order to make it easier to read. Is this better or worse?
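
To make that trade-off concrete, here is a minimal Python sketch. Both function names are hypothetical and chosen only for illustration; nothing here comes from the NYT article or from any benchmark. The first version uses a bit trick, the second spells out the intent at the cost of extra work.

```python
# Illustrative only: two ways to answer the same question.

def is_power_of_two_fast(n: int) -> bool:
    """Bit trick: a positive power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def is_power_of_two_readable(n: int) -> bool:
    """Halve n until it is odd; slower, but the intent is spelled out."""
    if n < 1:
        return False
    while n % 2 == 0:
        n //= 2
    return n == 1


# The two versions agree on every input checked here.
for value in range(-4, 1025):
    assert is_power_of_two_fast(value) == is_power_of_two_readable(value)
```

Both functions return the same answers. Which one counts as "better" depends on who has to read and maintain the code, and no single score captures that.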

Saying that Opus 4.8 outperforms other models at vibe coding is lazy reporting. It'd be like saying that Argentina is better at "sporting" than France.[3] Which sport? At what age range? During what years?

A.I. companies are currently raising more money than God. (A real metric.) They are also consuming significant finite resources in the form of energy and water.

As investors and the public weigh these costs, journalists should not be handing these companies freebie metrics that sound impressive but mean nothing. If a product is better, or indeed good at all, they should have to prove it.

References:

- [1] NYTimes story: M. Isaac and C. Metz (2026, May 28). [Anthropic Tops OpenAI to Become the World’s Most Valuable A.I. Start-Up](https://www.nytimes.com/2026/05/28/technology/anthropic-tops-openai-valuation.html)
- [2] Print edition of the above article.
- [3] Argentina has won 3 FIFA World Cups: in 1978, 1986 and 2022. France has won 2 FIFA World Cups: in 1998 and 2018.

source & further reading

coreyguitar.com — original article "Vibe coding" is not a metric

Read original on coreyguitar.com → coreyguitar.com/blog/18/vibe-coding/
