Note
π€ 100% Coded by AI: This entire repository and application was engineered 100% autonomously by Antigravity, an agentic AI coding assistant.
A powerful, self-contained agentic interface built in Rust that allows you to control OBS Studio via natural language. The system features an advanced reasoning loop powered by Claude (claude-3-5-sonnet
) and OpenAI (gpt-4o
), automatically connected to OBS Studio via WebSocket v5.
π Real-Time Interactive Dashboard: Monitors streaming, recording, virtual camera, active program scenes, and audio mixer volume levels dynamically.π₯οΈ Canvas Size & Output Resolution Detection: Automatically queries your OBS Canvas and scaled output resolutions, displaying them in a dedicated status card and feeding them into the AI agent's system prompt context.π§ Intelligent Auto-Fallback Agent Loop:- Routes requests through a robust agent loop with support for both Claude and OpenAI. Dynamic failover: If the selected provider key or network request fails, the backend automatically switches to the alternative provider (Claude β OpenAI) and finishes your request seamlessly without duplicating message history.
π Windows Hotwire & WGC Binding:- Remotely scans, restores, focuses, and resizes target application windows on your host OS.
- Automatically binds windows into OBS using Windows Graphics Capture (WGC / Windows 10 method) to prevent black/grayed-out capture screens. - Uses an intelligent fuzzy resolver to match window titles/classes against OBS's active window pool to ensure 100% reliable binding.
ποΈ Real-Time Voice Activation ("OBSy"):- Uses browser-side VAD (Voice Activity Detection) and OpenAI's Whisper API for high-fidelity speech-to-text.
- Triggers on the wake word
"OBSy"
. Allows natural language control directly from your microphone.
ποΈ Manual Sidebar Controls: Instant buttons to trigger scene changes, volume adjustments, and transitions that stay fully in sync with the AI's state.
OBS Studio(v28.0 or later, featuring native WebSocket support).** Rust Toolchain**(pre-configured in this workspace environment).** GitHub CLI (**(for repository authentication and deployment).gh
)
- Open OBS Studio.
- Navigate to
Toolsβ** WebSocket Server Settings**. - Check
Enable WebSocket server. - Take note of the port (default:
4455
) and the password (or generate a new one).
To start the server, configure your credentials and connection endpoints as environment variables:
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export OPENAI_API_KEY="your-openai-api-key"
export ELEVENLABS_API_KEY="your-elevenlabs-api-key"
export OBS_HOST="host.docker.internal"
export OBS_PORT="4455"
export OBS_PASSWORD="your-obs-websocket-password"
To compile and launch the Axum web server, run the following in the workspace directory:
cargo run
The server will initialize and serve the frontend dashboard at http://localhost:8080
.
Open your browser, connect OBS, and start commanding your streams via text or voice!