Gemma 4

mentions: 227 · type: Person

// recent coverage 227 mentions

05:09

2026-07-25

sourcefeed.dev

artificial-intelligence

Small Models That Know When to Phone Home

Cactus shipped Cactus Hybrid, a post-trained build of Google's Gemma 4 E2B that returns a calibrated confidence score with every answer, routing queries to a larger model when confidence falls below 0…

11:01

2026-07-23

promptcube3.com

artificial-intelligence

Gemma 4 vs Gemini 3.1 Flash-Lite: The Hybrid Win

Google's Gemma 4 model uses a 68k-parameter probe layer that predicts decoding errors by reading hidden states, achieving a 0.79-0.88 AUROC on audio benchmarks despite being trained on zero audio data…

04:00

2026-07-23

machinebrief.com

large-language-models

BaseRT: Advancing Best-in-Class LLM Inference with Apple M5 Neural Accelerators

Apple's M5 generation introduces a dedicated Neural Accelerator on every GPU core, and BaseRT, a native Metal inference runtime, exploits these units to deliver up to 6.4× higher prompt-processing thr…

19:41

2026-07-22

9to5google.com

artificial-intelligence

Gemini Intelligence and Gemini Nano 4 launch with Galaxy Z Fold 8, Flip 8

Samsung's Galaxy Z Fold 8 and Flip 8 are the first devices to launch with Gemini Intelligence and Gemini Nano 4, Google's latest on-device AI model based on Gemma 4 that supports over 140 languages wi…

07:00

2026-07-22

dotnetperls.com

large-language-models

Laguna XS Model in OpenCode

Poolside AI's Laguna XS model, a small local LLM, performed well in refactoring Rust code via OpenCode, making relatively few errors and handling tool calls correctly except for escaped quotes. The 20…

13:00

2026-07-21

android-developers.googleblog.com

artificial-intelligence

Build intelligent Android apps: On-device inference

Google announced that its Gemini Nano 4 model, now running on over 140 million devices, can be used through ML Kit's Prompt API to build on-device intelligent features in Android apps, such as summari…

03:48

2026-07-21

dev.to

artificial-intelligence

Gemma 4 E2B on a Single TPU v6e Chip: A Serving Deep Dive

Google's Gemma 4 E2B model serves efficiently on a single TPU v6e chip, achieving 213 tok/s for a single user and scaling to ~2,200 output tok/s across concurrent streams, while its QAT variants fail …

00:10

2026-07-21

dev.to

artificial-intelligence

tpu-management: a Claude Code skill for running Gemma 4 on Cloud TPUs

A developer has created tpu-management, a Claude Code skill and MCP server that automates the full lifecycle of serving Gemma 4 on Google Cloud TPUs, from zone selection and VM creation to benchmarkin…

06:00

2026-07-20

pub.towardsai.net

artificial-intelligence

Murati's 975B Model Fine-Tuned Itself on Launch Day — and Cracked the Top US Open-Weight Spot

Mira Murati's Thinking Machines Lab released Inkling, a 975B-parameter open-weight model, on July 15. Within hours, Artificial Analysis scored Inkling at 41 on its Intelligence Index, making it the to…

01:18

2026-07-20

marktechpost.com

large-language-models

Best Local LLMs You Can Run on a Single 24GB GPU in 2026: Qwen, Gemma, Mistral, DeepSeek Compared

A single 24GB GPU like the RTX 3090 or RTX 4090 is the practical floor for serious local inference in 2026, and the best strategy is to run modern 20B–35B-class models at Q4_K_M quantization rather th…

18:51

2026-07-18

dev.to

artificial-intelligence

Smash Stories: The Bug That Whispered for Two Weeks Before I Heard It

A developer building an AI agent that controls Android phones using Gemma 4 discovered that the agent was misreading financial data 20% of the time due to a single bad assumption: treating OCR output …

05:42

2026-07-18

pub.towardsai.net

ai-agents

Running Google ADK with Gemma 4 and Ollama

A developer successfully ran Google's Agent Development Kit (ADK) with the locally hosted Gemma 4 model via Ollama, building a weather agent that uses tool calling to orchestrate Google Maps and Open-…

14:36

2026-07-16

artificialanalysis.ai

artificial-intelligence

Inkling Benchmark Results

Thinking Machines has released Inkling, a 975B-parameter open weights model with 41B active parameters, debuting at 41 on the Artificial Analysis Intelligence Index and becoming the leading open weigh…

09:07

2026-07-16

the-decoder.com

large-language-models

Gemma 4 gets a stealth update that fixes tool calling bugs and truncated responses under the same name

Google shipped a stealth update to its open AI model Gemma 4 that fixes tool calling bugs and truncated responses, while speeding up performance on Nvidia Hopper GPUs by 25 to 70 percent with Flash At…

02:33

2026-07-16

rockyshikoku.medium.com

artificial-intelligence

Running Gemma4 on Apple Neural Engine

A developer successfully ran Google's Gemma 4 E2B and E4B models on Apple's Neural Engine (ANE) using CoreML, achieving a memory footprint under 1 GB on an iPhone 17 Pro. The project overcame ANE's la…

23:46

2026-07-15

dev.to

developer-tools

Post-Mortem: Building a Local MCP Server for Codebase Memory using Ollama and ChromaDB

A developer built a local MCP server for codebase memory using Ollama and ChromaDB, testing mistral:7b and ornith:9b models. The local mode runs entirely on-device, addressing privacy and billing conc…

17:03

2026-07-15

sourcefeed.dev

artificial-intelligence

Old Xeons Can Run Gemma 4 at Reading Speed

A 26-billion-parameter Gemma 4 mixture-of-experts model can run at reading speed on a dual-socket 2013 Ivy Bridge Xeon server with no GPU, achieving ~5.2 tokens per second decode on a Q8_0 quant, acco…

15:34

2026-07-15

neomindlabs.com

artificial-intelligence

Running Gemma 4 26B at 5 tokens/sec on a 13-year-old Xeon with no GPU

A developer running Google's Gemma 4 26B mixture-of-experts model on a 13-year-old HP StoreVirtual server with dual Xeon E5-2690 v2 CPUs and no GPU achieved approximately 5 tokens per second inference…

17:10

2026-07-14

9to5google.com

artificial-intelligence

Google announces Gemma 4 optimized for the Pixel 10’s TPU

Google announced Gemma 4 E2B for TPU, a model optimized to run natively on the Pixel 10's Tensor G5 TPU, at I/O Connect India. The multimodal model enables offline AI chat, image identification, and a…

16:50

2026-07-14

developers.googleblog.com

artificial-intelligence

Unlocking the Next Era of On-Device AI with Google Tensor and Pixel

Google unveiled Gemma 4 E2B for TPU and Functional Gemma at Google I/O India, demonstrating how Google Tensor's custom SoC and TPU enable 100% private on-device AI for the Pixel 10 family. The models …

// co-occurs with top 8 entities

Google (74), Ollama (30), Google DeepMind (22), Hugging Face (22), GitHub (16), llama.cpp (13), Apache 2.0 (12), Anthropic (12)

// topics top 6 topics

large language models (211), artificial intelligence (174), developer tools (102), machine learning (92), open source (72), ai tools (63)