Friday Fixes: Housekeeping the Homelab and Hub
A developer updated a homelab's local LLM stack, catching up llama.cpp by 469 builds and upgrading the Qwen generation model from 3.5 to 3.6, the embedding model from nomic v1.5 to v2-moe, and adding …
A developer compiled a list of 10 open-source GitHub repositories that collectively replace over $123 per month in paid developer tools, including alternatives to Zapier, Calendly, and DocuSign. The r…
Google released the Google AI Edge Gallery for macOS, enabling Mac users to run Gemma models locally. The company also launched a new Gemma 4 12B variant and the Google AI Edge Eloquent dictation app …
A new open-source benchmark, "apple-silicon-llm-bench," reveals that Google's LiteRT-LM runtime outperforms MLX-Swift on the iPhone 17 Pro for Gemma 4 E2B inference, achieving 55.4 tok/s with 4.5x les…
Google launched Google AI Edge Gallery for macOS, allowing Mac users to run Gemma models locally on their devices. The company also released the Gemma 4 12B model, a multimodal AI with 12 billion para…
Konversio launched Pilot, an open-source AI customer support agent licensed under MIT, designed for self-hosting and deployment in EU data centers. The tool reads help articles, answers chat queries, …
TensorSharp, a new open-source C# inference engine, now enables developers to run large language models locally using GGUF files. The engine supports multiple model architectures including Gemma 4, Qw…
Trustpilot built a real-time streaming pipeline using fine-tuned Google Gemma models to process millions of user reviews under strict latency and cost constraints. The company deployed a suite of spec…
Google has released "gemma-skills," a curated repository of structured skill documents designed to keep AI assistants synchronized with the rapidly evolving Gemma open model ecosystem. The repository'…
Researchers at an undisclosed institution analyzed LoRA fine-tuning in Gemma-2-9B using sparse autoencoders, finding that adapter-specific feature dictionaries show weak geometric alignment with pretr…
Google hosted a Kaggle hackathon challenging developers to train non-reasoning Gemma-2-2B and Gemma-3-1B models into general reasoning models using Tunix and Kaggle TPUs. Over 11,000 entrants and 300+…
Ekorbia v0.2 introduces a comparison-chat mode that runs two to three local large language models against the same prompt in parallel. Testing Gemma 4, IBM Granite 4.1, and Qwen 3.5 on a 32 GB M1 Max …
A new arXiv study reveals that multimodal large language models (MLLMs) frequently produce hallucinated outputs in agricultural imaging tasks, generating biologically inconsistent or agronomically…
Quentin Merle, a web architect with 15 years of experience and founder of Vibrisse Studio, built Ping Prompt, an air hockey game where a small language model (SLM) running entirely locally in the brow…
Ollama and LM Studio now allow users to run open-weight AI models like Qwen 3, Gemma 3, and DeepSeek-R1 locally on consumer hardware without API keys or monthly fees. The tools support quantized model…
A new study of small language models reveals that chain-of-thought prompting for arithmetic relies on a positional shortcut: the model copies whichever number appears last before the answer delimiter,…
SimGemma is an offline-first, AI-powered platform built by tech lead and volunteer teacher Damodharan to generate interactive 3D science simulations using natural language. Designed for the Google Gem…
The article describes a web application called "GP-Online" built to address the gap in patient-facing healthcare software, particularly in Bulgaria where GPs manage over 2,000 patients. The app syncs …
According to the article, Gemma 4's most underrated feature is its built-in "thinking tokens," which provide a visible inner monologue where the model automatically reasons through tasks before genera…
The article describes CodeDNA, a tool built by the author that uses Google's Gemma 4 LLM with its "Thinking Mode" and 128K context window to analyze entire git histories. By feeding it the React repos…