Enigma Sound: Multi-Modal Emotion-to-Music Architecture Layout (Gradio + CNN/LSTM Walkthrough)

Developer ApurvaDev111 released Enigma Sound, a Gradio-based UI case study for a multi-modal emotion-to-music architecture that maps text, vocal frequencies, and facial micro-expressions into dynamic audio layers using Bi-LSTM, CNN, and Music21. The lightweight interface, hosted on Hugging Face Spaces, serves as a visual walkthrough for the heavy research pipeline.

Hey everyone, I wanted to share a UI case study layout I put together for a research project that maps text, vocal frequencies (Bi-LSTM), and facial micro-expressions (CNN) into dynamic audio layers via Music21. Because the underlying models are too heavy for basic free tiers, I built a lightweight Gradio interface to act as a zero-click visual production walkthrough and tech-stack overview. I would love any feedback on the layout structure, or optimization tips for multi-stream pipelines.

Link: Enigma Sound Ai, a Hugging Face Space by ApurvaDev111: https://huggingface.co/spaces/ApurvaDev111/enigma-sound-ai
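
For anyone who wants something concrete to react to on the multi-stream question, here is a minimal Python sketch of one common option: late fusion, where each stream (text, voice, face) produces its own emotion probabilities and they are combined by a weighted average before being mapped to musical settings. This is not the Enigma Sound implementation. The emotion labels, the stream weights, the tempo/mode table, and every function name below are hypothetical and chosen only for illustration. It uses the standard library only, so no model weights are needed to run it.

```python
from dataclasses import dataclass

# Hypothetical label set; the real project may use different emotions.
EMOTIONS = ("happy", "sad", "angry", "calm")

# Hypothetical emotion -> (tempo in BPM, mode) table, for illustration only.
MUSIC_MAP = {
    "happy": (128, "major"),
    "sad": (66, "minor"),
    "angry": (150, "minor"),
    "calm": (80, "major"),
}


@dataclass
class MusicParams:
    emotion: str
    tempo_bpm: int
    mode: str


def fuse(streams, weights):
    """Late fusion: weighted average of per-stream emotion probabilities.

    `streams` maps a stream name to a dict of emotion -> probability.
    Weights are renormalized over the streams that are actually present,
    so a missing stream (e.g. no camera input) is simply skipped.
    """
    if not streams:
        raise ValueError("at least one stream is required")
    total = sum(weights[name] for name in streams)
    fused = {emotion: 0.0 for emotion in EMOTIONS}
    for name, probs in streams.items():
        share = weights[name] / total
        for emotion in EMOTIONS:
            fused[emotion] += share * probs.get(emotion, 0.0)
    return fused


def to_music_params(fused):
    """Pick the most likely fused emotion and look up its musical settings."""
    emotion = max(fused, key=fused.get)
    tempo_bpm, mode = MUSIC_MAP[emotion]
    return MusicParams(emotion, tempo_bpm, mode)


# Made-up per-stream outputs standing in for the text, voice, and face models.
example_streams = {
    "text": {"happy": 0.6, "sad": 0.1, "angry": 0.1, "calm": 0.2},
    "voice": {"happy": 0.3, "sad": 0.4, "angry": 0.1, "calm": 0.2},
    "face": {"happy": 0.7, "sad": 0.1, "angry": 0.0, "calm": 0.2},
}
example_weights = {"text": 1.0, "voice": 1.0, "face": 1.0}

if __name__ == "__main__":
    print(to_music_params(fuse(example_streams, example_weights)))
```

The appeal of late fusion for a setup like this is that each model can run, be cached, or be dropped independently, which may matter when the full pipeline does not fit on a free tier. The resulting tempo and mode could then be handed to whatever builds the audio layers (Music21 in the project described above).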