{"slug": "i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works", "title": "I Built an AI-Powered Meeting Platform From Scratch — Here’s How It Actually Works", "summary": "A developer built Hoovik, an open-source AI-powered video meeting platform that combines WebRTC signaling, distributed Node.js with Redis, real-time emotion AI, and a Python transcription pipeline. The platform uses a multi-service architecture including a React/WebRTC frontend, distributed backend, transcription service, emotion recognition, and retrieval-augmented search on meeting transcripts. Hoovik detects participant emotions through RTP audio levels and dedicated Socket.IO connections, enabling AI summaries that highlight discrepancies between spoken content and observed emotions.", "body_md": "A complete breakdown of Hoovik: WebRTC signaling, distributed Node.js with Redis, real-time emotion AI, RAG on meeting transcripts, and a Python transcription pipeline — all wired together.\n\n👉 GitHub: [https://github.com/AnupamKumar-1/Hoovik](https://github.com/AnupamKumar-1/Hoovik)\n\n🌐 Live Demo: [https://hoovik.onrender.com](https://hoovik.onrender.com)\n\n🎮 Interactive Demo: [https://app.supademo.com/demo/cmpy5ggyv95b0qmy7ccrkd3ms?utm_source=link](https://app.supademo.com/demo/cmpy5ggyv95b0qmy7ccrkd3ms?utm_source=link)\n\nI've previously written about individual parts of Hoovik, including its emotion analysis system and WebRTC signaling architecture.\n\nThose articles focused on specific subsystems. This one focuses on the complete platform.\n\nHoovik is not a single application. 
It is a collection of services working together: a React/WebRTC frontend, a distributed Node.js backend, a transcription pipeline, a real-time emotion recognition service, and a retrieval-augmented search system built on meeting transcripts.\n\nThis article walks through how those systems interact, the architectural decisions behind them, and the tradeoffs encountered while building each component.\n\nHoovik is a multi-party video meeting platform that combines real-time communication, AI-assisted analysis, and transcript intelligence.\n\nThe platform includes:\n\nThe system is composed of four primary services.\n\nThe remainder of this article follows the lifecycle of a meeting and explains how each service participates.\n\nThe backend is responsible for:\n\nThe deployment runs as multiple PM2 processes connected through:\n\nRoom state cannot safely live in process memory when multiple Node.js instances are handling requests.\n\nInstead, mutable meeting state is stored in Redis.\n\nParticipants are stored in a Redis Hash:\n\n`meeting:participants:`\n\nEach field contains a serialized participant object.\n\nThis design allows:\n\nJoin order is stored separately and is used for WebRTC role assignment.\n\nJoining a room modifies shared state.\n\nTo prevent race conditions, room joins are serialized using a Redis-backed distributed lock.\n\n`await withRoomLock(meetingCode, async () => { /* join logic */ });`\n\nThe lock uses:\n\nThis guarantees that only one join operation mutates room state at a time.\n\nAuthentication uses JWT access tokens and refresh token rotation.\n\nLogin issues:\n\nRefresh tokens are rotated on every refresh request, reducing replay risk while preserving user sessions.\n\nThe frontend is a React application built around specialized hooks that manage independent subsystems.\n\nMajor responsibilities include:\n\nPeer connections are managed through dedicated React hooks and implement the perfect negotiation pattern.\n\nThe application 
supports:\n\nTwo independent detection paths exist.\n\nWhen available, `RTCRtpReceiver.getSynchronizationSources()` is used to obtain RTP audio levels directly.\n\nBrowsers without SSRC support use:\n\nThe application selects the appropriate method dynamically.\n\nThe host captures:\n\nCaptured media is sent directly to the emotion service using dedicated Socket.IO connections.\n\nEach participant receives an independent emotion-service connection, allowing participant-level media state tracking and backpressure control.\n\nThe emotion service can instruct the frontend to adjust capture rates through server status and backpressure events.\n\nEmotion events collected during a meeting are stored locally and later submitted when generating an AI summary.\n\nThe backend combines:\n\nThis enables AI summaries to highlight notable discrepancies between spoken content and observed participant emotions.\n\nThe transcript service is implemented in FastAPI.\n\nIts responsibilities include:\n\nThe service uses:\n\nfor transcription and emotion tagging.\n\nMeeting recordings are uploaded after a meeting ends.\n\nThe service immediately returns `202 Accepted` and performs processing in a background task.\n\nThe processing pipeline is:\n\n`Audio Upload`\n\n↓\n\nFFmpeg Conversion\n\n↓\n\nWhisper Transcription\n\n↓\n\nSegment Merging\n\n↓\n\nNLP Emotion Classification (DistilRoBERTa)\n\n↓\n\nTranscript Callback To Node Backend\n\nAfter processing completes, the transcript service sends structured transcript data back to the Node.js backend.\n\nRetry logic is used to improve reliability during temporary backend failures.\n\nThe emotion service performs real-time inference on participant media streams.\n\nThe frontend sends:\n\ndirectly to the service.\n\nThe service performs inference using:\n\nand emits `emotion.result` events back to the frontend.\n\nInference continues even when a participant disables one modality.\n\nExamples:\n\nThis allows emotion 
tracking to continue without requiring both media streams.\n\nThe service also emits:\n\nevents that allow the frontend to dynamically adjust capture rates and reduce load.\n\nAfter transcripts are stored, they can be indexed for semantic retrieval.\n\nThe indexing pipeline consists of:\n\nWhen speaker segments are available, chunks preserve:\n\nOtherwise, a sliding-window chunking strategy is used.\n\nEmbeddings are generated using `nomic-embed-text-v1.5`.\n\nEmbedding results are cached in Redis to avoid redundant computation.\n\nTranscript indexing runs asynchronously through BullMQ workers.\n\nThis prevents long-running embedding operations from blocking API requests.\n\nRetrieval combines:\n\nto balance relevance and diversity.\n\nRetrieved context is passed to Groq-hosted language models to generate answers.\n\nSession history is maintained to support multi-turn conversations over meeting data.\n\nAccess control follows the same authorization model as transcript access:\n\nSeveral known tradeoffs remain in the current architecture.\n\nThese decisions were acceptable for the current scale of the platform, but dedicated workers and queue-based processing would be natural next steps.\n\nHoovik evolved from a simple video meeting application into a distributed platform that combines WebRTC, real-time machine learning, transcript intelligence, and retrieval-augmented search.\n\nThe most interesting part of the project was not any single technology. 
It was designing the boundaries between services and making them work reliably together under real-world constraints.\n\nIf you'd like to explore the implementation, try the interactive demo or browse the source code on GitHub.", "url": "https://wpnews.pro/news/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works", "canonical_source": "https://dev.to/anupam_kumar/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works-31p", "published_at": "2026-06-03 19:33:42+00:00", "updated_at": "2026-06-03 19:41:40.216560+00:00", "lang": "en", "topics": ["ai-products", "ai-infrastructure", "ai-tools", "ai-startups", "natural-language-processing"], "entities": ["Hoovik", "WebRTC", "Redis", "Node.js", "Python", "GitHub", "Render", "Supademo"], "alternates": {"html": "https://wpnews.pro/news/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works", "markdown": "https://wpnews.pro/news/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works.md", "text": "https://wpnews.pro/news/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works.txt", "jsonld": "https://wpnews.pro/news/i-built-an-ai-powered-meeting-platform-from-scratch-heres-how-it-actually-works.jsonld"}}