{"slug": "gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz", "title": "GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz", "summary": "A developer implemented a full Transformer with KV cache on an FPGA, achieving over 56,000 tokens per second at only 80 MHz, without using a GPU or CPU. The design was created gate by gate as a custom digital integrated circuit, demonstrating extreme efficiency for AI inference.", "body_md": "56,000+ tokens/sec at just 80 MHz. 🤯\nI burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on an FPGA. (No GPU. No CPU.)\nJust pure digital silicon running [@karpathy](https://x.com/karpathy) microGPT, spelling out names on a GPT 👇", "url": "https://wpnews.pro/news/gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz", "canonical_source": "https://twitter.com/fguzmanai/status/2065832668172845209", "published_at": "2026-06-16 16:12:26+00:00", "updated_at": "2026-06-16 16:22:25.289728+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-infrastructure", "ai-chips", "ai-research"], "entities": ["FPGA", "Transformer", "KV cache", "microGPT", "Andrej Karpathy"], "alternates": {"html": "https://wpnews.pro/news/gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz", "markdown": "https://wpnews.pro/news/gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz.md", "text": "https://wpnews.pro/news/gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz.txt", "jsonld": "https://wpnews.pro/news/gategpt-56k-tokens-per-second-transformer-kv-cache-on-fpga-at-80-mhz.jsonld"}}