{"slug": "ai-vtuber-for-beginners-non-programmers-easy-to-setup", "title": "AI VTuber For Beginners/non-programmers Easy To setup", "summary": "A new open-source AI VTuber tool allows beginners and non-programmers to set up a fully local, free VTuber using Whisper for speech recognition, Ollama for LLM inference, and Chatterbox TTS, with VTube Studio integration for mouth animations and zero-shot voice cloning.", "body_md": "A 100% local AI VTuber for beginners and non-programmers that is 100% free to run, with instant zero‑shot voice cloning. Once it is set up, it uses VTube Studio’s API to make the mouth open and close and to play animations.\n\n----- Readme -----\n\nAI VTuber For Beginners/non-programmers Easy To Set Up\n\nAn AI VTuber that uses Whisper for speech recognition, Ollama for LLM inference, and Chatterbox TTS in a continuous listening loop.\n\nThis was made on an AMD GPU, but the code mainly targets CPU users, so it can be used without an AMD or NVIDIA GPU.\n\nThis uses Python 3.10.11. If that is not your main version, run: py -3.10 -m venv venv\n\n(you can check the version with python -V)\n\nFeatures\n\nWhisper (base.en model) - Real-time speech-to-text in English\n\nOllama (llama3.2) - AI model for generating VTuber responses\n\nChatterbox TTS - Text-to-speech to speak responses\n\nAutomatic silence detection - Only records when speech is detected\n\nContinuous listening loop - Runs forever until Ctrl+C\n\nVTube Studio integration - Controls mouth expressions via the VTube Studio API\n\nDependencies\n\n(IMPORTANT!!!) MAKE A VENV FIRST AND MAKE SURE YOU ARE INSIDE THE PROJECT FOLDER for example\n\nOptional RVC Voice Cloning (Advanced - Windows Build Required)\n\n```\n# Uncomment in requirements.txt or install manually (requires C++ build tools)\n# pip install torchaudio librosa onnxruntime onnx fairseq pyworld praat-parselmouth TTS edge-tts\n```\n\nNote: RVC voice cloning is optional. 
The VTuber works perfectly with just the core dependencies using Chatterbox TTS. RVC voice cloning requires C++ build tools (Visual Studio Build Tools) and can be challenging to install on Windows.\n\nQuick Start\n\nMAKE A VENV FIRST AND MAKE SURE YOU ARE INSIDE THE PROJECT FOLDER for example\n\nListening phase: Waits for speech with automatic silence detection\n\nSpeech detection: Only starts recording after minimum speech duration is confirmed\n\nTranscribing: Uses Whisper to convert speech to text\n\nAI response: Ollama generates a VTuber-appropriate response\n\nSpeaking: Chatterbox TTS speaks the response aloud\n\nMouth control: VTube Studio controls mouth expressions in sync with speech\n\nRepeat: Returns to listening mode\n\nNotes\n\nPress Ctrl+C to stop the VTuber at any time\n\nEnsure proper audio device permissions for microphone access\n\nFor GPU acceleration, install PyTorch CUDA versions\n\nAdjust silence_threshold, silence_duration, and min_speech_duration in the code for different environments\n\nTroubleshooting\n\nCommon Issues\n\n“Ollama not running” error:\n\nMake sure Ollama is installed and running with ollama serve\n\nVerify the model “llama3.2” is pulled\n\nVTube Studio connection failed:\n\nEnsure VTube Studio is running\n\nCheck that VTS_PORT (default: 8001) is correct\n\nMake sure VTube Studio plugins are enabled\n\nAudio permissions:\n\nGrant microphone permissions to this application\n\nOn Linux: pip install pyaudio might require additional system packages\n\nModel loading issues:\n\nWhisper uses “base.en” for faster performance\n\nEnsure all dependencies are installed from requirements.txt\n\nCustomization\n\nAdjusting Silence Detection\n\nEdit Aivtuber.py and modify these constants:\n\n```\nsilence_threshold = 0.01    # Lower = more sensitive, Higher = less sensitive\nsilence_duration = 1.5      # Seconds of silence before stopping recording\nmin_speech_duration = 0.5   # Minimum speech duration to trigger recording\n```\n\nChanging 
Models\n\nWhisper: Change in line 24: self.whisper_model = whisper.load_model(\"base\")\n\nOllama: Change in line 109: model=\"llama3.2\"\n\nAdding Emotions\n\nEdit the EMOTION_HOTKEYS dictionary in the code and add hotkeys to VTube Studio:\n\n```\nEMOTION_HOTKEYS = {\n    \"happy\": \"your_happy_hotkey_id\",\n    \"sad\": \"your_sad_hotkey_id\",\n    \"angry\": \"your_angry_hotkey_id\",\n    \"thinking\": \"your_thinking_hotkey_id\",\n    \"neutral\": \"your_neutral_hotkey_id\",\n}\n```\n\nFuture Enhancements\n\nPotential future improvements:\n\nLocal LLM alternatives: Support for other Ollama models or local LLM implementations\n\nMulti-language support: Whisper language switching and response localization\n\nContext memory: Maintain conversation history for more coherent interactions\n\nAdvanced emotion system: More nuanced emotion detection and expression control\n\nStream processing: WebSocket streaming for lower latency\n\nPlugin architecture: Easy addition of new features and integrations\n\nQuestions\n\nIs this Ali?\n\nNo, this is not Ali. In fact, Ali is a way more complicated program than this.\n\nThis also doesn’t use any of Ali’s original code, aside from how the mouth API works and some recreated parts, such as the way the API is used so you can play music without issues.\n\nDoes this contain any premade VTuber models I can download?\n\nNo, but I do have older VTuber models you can use with this, for example:\n\n-Probably not, unless you replaced Ollama with something else\n\nLicense\n\nThis project is open source. 
Feel free to modify and distribute it as long as you give appropriate credit, since that’s a really important habit to get into.", "url": "https://wpnews.pro/news/ai-vtuber-for-beginners-non-programmers-easy-to-setup", "canonical_source": "https://discuss.huggingface.co/t/ai-vtuber-for-beginners-non-programmers-easy-to-setup/177151#post_1", "published_at": "2026-06-25 06:28:46+00:00", "updated_at": "2026-06-25 06:49:17.604151+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-tools", "ai-agents", "developer-tools"], "entities": ["Whisper", "Ollama", "Chatterbox TTS", "VTube Studio", "llama3.2", "Python", "AMD", "NVIDIA"], "alternates": {"html": "https://wpnews.pro/news/ai-vtuber-for-beginners-non-programmers-easy-to-setup", "markdown": "https://wpnews.pro/news/ai-vtuber-for-beginners-non-programmers-easy-to-setup.md", "text": "https://wpnews.pro/news/ai-vtuber-for-beginners-non-programmers-easy-to-setup.txt", "jsonld": "https://wpnews.pro/news/ai-vtuber-for-beginners-non-programmers-easy-to-setup.jsonld"}}