How fast is 10 tokens per second really?


Source: simonwillison.net. Published 2026-05-20 17:57 UTC. Topic: large language models.


This article, published on May 20, 2026, highlights a simple HTML tool created by Mike Veerman that simulates different LLM token output speeds, ranging from 5 to 800 tokens per second. The tool is designed to help users visualize and understand the real-world feel of advertised speeds, such as "30 tokens per second."


20th May 2026 - Link Blog

How fast is 10 tokens per second really? Neat little HTML app by Mike Veerman (source code available) which simulates LLM token output speeds from 5/second to 800/second. Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like.
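To put rough numbers on those speeds, here is a minimal Python sketch of the underlying arithmetic: a steady rate of R tokens per second means a pause of 1/R seconds between tokens, so N tokens take N/R seconds. This is not Mike Veerman's implementation. The function names `seconds_for` and `stream` are hypothetical, and whitespace-separated words stand in for tokens, which real tokenizers do not guarantee.

```python
import sys
import time
from typing import Callable, Iterable


def seconds_for(token_count: int, tokens_per_second: float) -> float:
    """Return how long it takes to emit token_count tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return token_count / tokens_per_second


def _print_now(text: str) -> None:
    """Write text and flush so each token shows up immediately."""
    sys.stdout.write(text)
    sys.stdout.flush()


def stream(
    tokens: Iterable[str],
    tokens_per_second: float,
    sleep: Callable[[float], None] = time.sleep,
    write: Callable[[str], object] = _print_now,
) -> int:
    """Emit tokens one at a time, pausing 1/rate seconds after each one.

    sleep and write are injectable so the pacing can be checked without
    actually waiting. Returns the number of tokens emitted.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    delay = 1.0 / tokens_per_second
    emitted = 0
    for token in tokens:
        write(token)
        sleep(delay)
        emitted += 1
    return emitted


if __name__ == "__main__":
    # Assumption: words approximate tokens for demonstration purposes only.
    words = "Useful if you want to get a feel for what that looks like".split()
    stream((word + " " for word in words), tokens_per_second=10)
    _print_now("\n")
    # A 500-token answer at the slowest and fastest simulated speeds.
    for rate in (5, 30, 800):
        print(f"{rate:>3} tokens/s -> {seconds_for(500, rate):.2f} s for 500 tokens")
```

Under these assumptions, a 500-token reply takes 100 seconds at 5 tokens/second and well under a second at 800 tokens/second, which is the gap the app lets you see directly.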


source & further reading

simonwillison.net: original article
