{"slug": "memoria-a-local-ai-reading-companion-powered-by-gemma-4", "title": "Memoria - A Local AI Reading Companion Powered by Gemma 4", "summary": "Memoria is a local AI reading companion powered by the Gemma 4 model that helps readers stay connected to books through features like spoiler-safe recaps, contextual Q&A, and text simplification. The application runs entirely on the user's machine using llama.cpp, eliminating the need for a paid AI subscription or constant internet access. It processes books chapter-by-chapter, storing summaries and character memory to provide grounded, spoiler-free assistance while reading.", "body_md": "This is a submission for the Gemma 4 Challenge: Build with Gemma 4\nReading long books can be difficult even for people who love reading.\nReaders forget characters, lose track of earlier events, struggle with dense prose, or return to a book after a break and feel disconnected from the story. For readers with ADHD, memory difficulties, cognitive fatigue, or accessibility needs, this becomes even harder.\nMemoria is a local AI reading companion powered by Gemma 4 that helps readers stay connected to books through spoiler-safe recaps, contextual Q&A, character memory, speaker attribution, and text simplification — all while running locally on the user’s machine.\nThe app combines an EPUB reader with AI-powered reading support features including:\nEverything runs locally using Gemma 4 through llama.cpp, so readers do not need a paid AI subscription or constant internet access.\nGitHub Repository: https://github.com/Santhoshl2312/Gemma_book_reader\nMemoria uses Gemma 4 as the core local reasoning engine for the entire reading experience.\nI used the Gemma 4 E2B model through a local llama.cpp OpenAI-compatible server, allowing the application to run fully offline without relying on cloud APIs.\nI specifically chose Gemma 4 E2B because it was the best fit for a responsive local reading assistant.\nThe project needed:\nGemma 4 E2B delivered the right balance between speed and capability, making it possible to provide near real-time responses for recaps, contextual Q&A, text simplification, and chapter processing while still running locally through llama.cpp.\nThis was especially important because the app performs many smaller AI tasks continuously in the background while the user reads.\nGemma summarizes chapter chunks into structured summaries and key events that help readers quickly reconnect with the story.\nThe model updates persistent character descriptions and remembers important events tied to each character across chapters.\nGemma helps identify ambiguous dialogue speakers when rule-based systems fail.\nReaders can ask questions about the story, and Gemma answers using chapter-aware retrieval that avoids future spoilers.\nSelected passages can be rewritten into clearer modern English while preserving meaning and tone.\nThe frontend is a lightweight EPUB reader built with vanilla HTML, CSS, and JavaScript. It handles book uploads, chapter navigation, reading controls, themes, typography settings, and the AI interaction panel.\nThe backend is built with FastAPI and SQLite. It manages books, chapters, summaries, embeddings, character memory, retrieval, and streaming responses.\nThe AI stack runs fully locally using llama.cpp:\nThe app processes books chapter-by-chapter instead of trying to load entire novels into context at once. Intermediate artifacts like summaries, character memory, embeddings, and speaker metadata are stored and reused throughout the reading experience.\nThis pipeline-first design makes the system faster, more grounded, and more practical for long-form reading.\nOne of the biggest design goals was preventing accidental spoilers.\nWhen a reader asks a question, Memoria retrieves only information from chapters the user has already completed. The retrieval system filters vector search results using reading progress before sending context to Gemma 4.\nThis allows the app to help readers remember earlier story details without revealing future events.\nFull novels are too large to send directly into a local model context window. I solved this by chunking chapters into smaller sections while carrying forward rolling summaries and character memory.\nLocal models sometimes wrap JSON outputs in extra formatting or explanations. To make the pipeline reliable, prompts were heavily constrained and the backend extracts valid JSON blocks safely before processing.\nDialogue attribution in fiction is difficult because speakers are often implied instead of explicitly named. I used a hybrid approach where rules handle obvious cases while Gemma handles ambiguous dialogue using broader context.\nThe project depends on multiple services including Gemma 4, embedding models, Python environments, and vector databases. I automated the setup process using launcher scripts so the app can be started locally with minimal manual configuration.\nOne of the main goals of this project was accessibility and digital equity.\nReaders should not need:\nBy combining Gemma 4 with llama.cpp and local retrieval, Memoria creates a fully local AI reading companion that respects reader privacy while remaining accessible on consumer hardware.\nThis makes the project useful not only for individual readers, but also for classrooms, libraries, care settings, and offline learning environments.\nMemoria demonstrates how Gemma 4 can power practical, privacy-friendly accessibility tools beyond chatbots.\nInstead of replacing reading, the goal is to support readers — helping them stay connected to stories, remember context, and reduce cognitive load while preserving the experience of reading itself.\nBy combining Gemma 4 E2B, llama.cpp, retrieval, and structured processing pipelines, Memoria turns static EPUB books into adaptive reading experiences that can run entirely offline.", "url": "https://wpnews.pro/news/memoria-a-local-ai-reading-companion-powered-by-gemma-4", "canonical_source": "https://dev.to/santhosh2312/memoria-a-local-ai-reading-companion-powered-by-gemma-4-46l3", "published_at": "2026-05-23 13:09:21+00:00", "updated_at": "2026-05-23 13:34:42.540394+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "open-source", "products"], "entities": ["Gemma 4", "Memoria", "llama.cpp", "Gemma 4 E2B", "Santhoshl2312"], "alternates": {"html": "https://wpnews.pro/news/memoria-a-local-ai-reading-companion-powered-by-gemma-4", "markdown": "https://wpnews.pro/news/memoria-a-local-ai-reading-companion-powered-by-gemma-4.md", "text": "https://wpnews.pro/news/memoria-a-local-ai-reading-companion-powered-by-gemma-4.txt", "jsonld": "https://wpnews.pro/news/memoria-a-local-ai-reading-companion-powered-by-gemma-4.jsonld"}}