Google LiteRT-LM Accelerates Gemma 4 Local Inference
Google added native support for Gemma 4 Multi-Token Prediction (MTP) to LiteRT-LM, its on-device LLM runtime built on LiteRT (formerly TensorFlow Lite). Google reports the integration yields MTP decod…