Cactus

mentions: 2 | type: Organization

// recent coverage (2 mentions)

2026-07-10 19:57

news.ycombinator.com

artificial-intelligence

Show HN: Cactus v2 – On-device AI with cloud fallback

Cactus released version 2 of its on-device AI inference platform, featuring model confidence-based routing to cloud fallback, lossless 4-bit quantization, and GPU acceleration on Apple Metal. The plat…

2026-06-05 09:55

letsdatascience.com

large-language-models

Google LiteRT-LM Accelerates Gemma 4 Local Inference

Google added native support for Gemma 4 Multi-Token Prediction (MTP) to LiteRT-LM, its on-device LLM runtime built on LiteRT (formerly TensorFlow Lite). Google reports the integration yields MTP decod…

// co-occurs with (top 8 entities)

Google (1), LiteRT-LM (1), Gemma 4 (1), InfoQ (1), llama.cpp (1), MLX (1), ONNX (1), Gemma (1)

// topics (top 6)

artificial intelligence (2), ai infrastructure (2), ai tools (2), large language models (1), machine learning (1), ai products (1)