# Sharded Inference of a 229B-Parameter MoE over the Internet at Interactive Speed

> Source: <https://twitter.com/c0mputeAI/status/2073150789640421537>
> Published: 2026-07-04 11:04:21+00:00

1/ We published our first technical report today.
We ran a 229B-parameter model split across five consumer GPUs in five countries over the public internet and measured 12.6 tok/s interactive and 194 tok/s batched.
With cryptographic receipts on every request.
doi.org/10.5281/zenodo…
