{"slug": "i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything", "title": "I Built a Local, GPU-Accelerated Voice Commander—And I Still Type Everything", "summary": "A developer built Voice Commander, a local GPU-accelerated voice transcription system using Whisper.cpp and the Gemini API, but still types everything due to cognitive habits. The tool transcribes speech, cleans up filler words, and auto-pastes text, yet the developer struggles to integrate it into a daily routine.", "body_md": "As developers, we love building productivity tools. We spend days optimizing our development environments, writing shell scripts, configuring keybindings, and building custom pipelines.\n\nA few months ago, I built [Voice Commander](https://github.com/MasihMoafi/Voice-commander), a local, GPU-accelerated voice transcription system. It uses a local CUDA-powered **Whisper.cpp** model to transcribe my speech, and hooks into the **Gemini API** to clean up filler words (\"um\", \"uh\"), fix grammar, structure the output, and auto-paste it directly at my cursor.\n\nIt works like a charm. It is fast, private, and precise.\n\nAnd yet, **I still type everything.**\n\nEvery time I need to write a long code comment, draft an issue, or even outline a plan, my hands immediately go to the keyboard. I catch myself typing away, while the microphone hotkey is right there, ready to save me hundreds of keystrokes.\n\nWhy is this? And why is it so hard to break the keyboard habit?\n\nThinking it through, I realized that typing isn't just a physical action; it's a cognitive buffer.\n\nRefusing to use a tool you built because it \"feels weird\" is a common trap. To break out of it, I'm forcing myself to follow a new rule: **if a thought requires more than two sentences, I must dictate it.**\n\nBy pushing through the initial awkwardness, I'm hoping to make voice-to-text as natural as reaching for the trackpad.\n\nHave you built tools that you struggle to integrate into your actual daily routine? 
How do you overcome the muscle memory of typing?\n\n*If you want to try out the project locally, check it out on GitHub: **[MasihMoafi/Voice-commander](https://github.com/MasihMoafi/Voice-commander)***\n\n*For more of my AI research and developer tools, visit my website: **[masihmoafi.tech](https://masihmoafi.tech)***", "url": "https://wpnews.pro/news/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything", "canonical_source": "https://dev.to/masihmoafi/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything-5aem", "published_at": "2026-06-16 18:12:29+00:00", "updated_at": "2026-06-16 18:17:50.055993+00:00", "lang": "en", "topics": ["developer-tools", "artificial-intelligence", "natural-language-processing", "large-language-models"], "entities": ["Voice Commander", "Whisper.cpp", "Gemini API", "MasihMoafi"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything", "markdown": "https://wpnews.pro/news/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything.md", "text": "https://wpnews.pro/news/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything.txt", "jsonld": "https://wpnews.pro/news/i-built-a-local-gpu-accelerated-voice-commander-and-i-still-type-everything.jsonld"}}