{"slug": "zeta2-1-3x-fewer-tokens-50ms-faster", "title": "Zeta2.1: 3x Fewer Tokens, 50ms Faster", "summary": "Zed Industries released Zeta2.1, an updated edit prediction model that emits 3x fewer output tokens than its predecessor, reducing response times by up to 50 milliseconds and requiring 30% fewer servers to handle the same traffic. The efficiency gains come from a new \"Multi-Region\" prompt format that outputs only the code region the model intends to change, rather than a large area around the cursor. Zeta2.1 is open-weight, available on Hugging Face, and is now the default edit prediction model in the Zed editor.", "body_md": "We [launched Zeta2](/blog/zeta2), Zed's edit prediction model, in March, and promised more improvements were on the way. Here they are.\n\nZeta2.1 emits 3x fewer output tokens than Zeta2, bringing predictions up to 50ms faster and requiring 30% fewer servers to serve the same traffic:\n\n| Metric | Zeta2 | Zeta2.1 |\n|---|---|---|\n| Output tokens (avg) | ~270 | ~90 (−67%) |\n| Response Time (p50) | 189ms | 136ms (−28%) |\n| Response Time (p90) | 401ms | 350ms (−13%) |\n| Acceptance rate | Baseline | +0.51% |\n| Explicit rejection rate | Baseline | −4.10% |\n\nThese efficiency gains came from a new prompt format we've dubbed \"Multi-Region\". While Zeta2 output a large region around your cursor with its edits applied, with the new Multi-Region format Zeta2.1 only outputs the region around the code it wants to change. 
This took several iterations to get right, but the result is even faster predictions on every keystroke.\n\nZeta2.1 is open-weight, just like Zeta1 and Zeta2.\nYou can see examples of the new prompt format and download the model on [Hugging Face](https://huggingface.co/zed-industries/zeta-2.1).\n\nAs with Zeta2, Zeta2.1 was trained entirely on opt-in data from open-source repositories.\nIf you'd like to contribute to future improvements, you can opt in by [toggling the data collection setting](zed://settings/edit_predictions.allow_data_collection).\n\n### Try It\n\nZeta2.1 is even better for running locally, and works [out of the box](https://zed.dev/docs/ai/edit-prediction#local-and-self-hosted-models).\nAdditionally, with this release we've begun publishing bindings to [PyPI](https://pypi.org/project/zed-zeta-bindings/) for the [Rust code](https://github.com/zed-industries/zed/tree/main/crates/zeta_prompt) we use in production to format prompts, making it even easier to self-host.\n\nZeta2.1 is the default edit prediction model in Zed today.\nYou can try it out for free, or check out [Zed Pro or Zed Business](/pricing) for unlimited edit predictions.", "url": "https://wpnews.pro/news/zeta2-1-3x-fewer-tokens-50ms-faster", "canonical_source": "https://zed.dev/blog/zeta2-1", "published_at": "2026-05-08 00:00:00+00:00", "updated_at": "2026-05-26 06:11:25.454919+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "large-language-models", "ai-products", "ai-tools"], "entities": ["Zeta2.1", "Zeta2", "Zeta1", "Zed", "Hugging Face"], "alternates": {"html": "https://wpnews.pro/news/zeta2-1-3x-fewer-tokens-50ms-faster", "markdown": "https://wpnews.pro/news/zeta2-1-3x-fewer-tokens-50ms-faster.md", "text": "https://wpnews.pro/news/zeta2-1-3x-fewer-tokens-50ms-faster.txt", "jsonld": "https://wpnews.pro/news/zeta2-1-3x-fewer-tokens-50ms-faster.jsonld"}}