[ARTICLE · art-29765] src=twitter.com ↗ pub= topic=artificial-intelligence verified=true sentiment=↑ positive

GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz

A developer reports implementing a full Transformer with KV cache as a custom, fully digital circuit, designed gate by gate and prototyped on an FPGA. According to the post, the design generates more than 56,000 tokens per second at a clock of only 80 MHz, with no GPU or CPU involved. The demo runs microGPT, spelling out names.

read 1 min · views 1 · published Jun 16, 2026

56,000+ tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on a FPGA. (No GPU. No CPU) Just pure digital silicon running @karpathy microGPT, spelling out names on a GPT 👇
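To put the headline numbers in perspective, the two figures quoted in the post (80 MHz and 56,000+ tokens per second) can be turned into a rough cycle budget per token. The sketch below is only back-of-the-envelope arithmetic on those two figures. The function names are illustrative, and nothing here describes how the actual design spends its cycles.

```python
# Rough cycle budget per token, using only the two figures quoted in the post.
CLOCK_HZ = 80_000_000        # 80 MHz clock, as stated
TOKENS_PER_SECOND = 56_000   # "56,000+" is a lower bound on throughput


def cycles_per_token(clock_hz: int, tokens_per_second: int) -> float:
    """Average number of clock cycles available per generated token."""
    return clock_hz / tokens_per_second


def microseconds_per_token(tokens_per_second: int) -> float:
    """Average wall-clock time per generated token, in microseconds."""
    return 1_000_000 / tokens_per_second


if __name__ == "__main__":
    cycles = cycles_per_token(CLOCK_HZ, TOKENS_PER_SECOND)
    latency = microseconds_per_token(TOKENS_PER_SECOND)
    # Throughput is quoted as a lower bound, so both values are upper bounds.
    print(f"at most ~{cycles:.0f} cycles per token")
    print(f"at most ~{latency:.1f} microseconds per token")
```

Under these figures, each token takes at most roughly 1,429 clock cycles, or about 17.9 microseconds.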
