Google (@Google)'s Gemma 4 31B is being described as a local inference milestone after an Aligned News post on X said the model benchmarks near Anthropic's Claude Opus 4 while running on consumer laptop hardware through Ollama's new quantization-aware training (QAT) weights (https://x.com/itsPaulAi/status/2062973423303712949). The same post says a smaller E4B variant outperforms OpenAI (@OpenAI)'s GPT-4o on key benchmarks while running on a phone with 2 GB of RAM. Those are large claims packed into a short post. The post doe...