09:55
2026-06-05
letsdatascience.com
large-language-models
Google LiteRT-LM Accelerates Gemma 4 Local Inference
Google added native support for Gemma 4 Multi-Token Prediction (MTP) to LiteRT-LM, its on-device LLM runtime built on LiteRT (formerly TensorFlow Lite). Google reports the integration yields MTP decod…