# Enigma Sound: Multi-Modal Emotion-to-Music Architecture Layout (Gradio + CNN/LSTM Walkthrough)

> Source: <https://discuss.huggingface.co/t/enigma-sound-multi-modal-emotion-to-music-architecture-layout-gradio-cnn-lstm-walkthrough/176936#post_1>
> Published: 2026-06-18 05:24:24+00:00

Hey everyone,

I wanted to share a UI case-study layout I put together for a research project that maps three input streams (text, vocal frequencies via a Bi-LSTM, and facial micro-expressions via a CNN) into dynamic audio layers generated with Music21.
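To make the multi-stream idea concrete, here is a minimal sketch of one way the three streams could be combined: a weighted late fusion of per-modality emotion scores, followed by a lookup of musical parameters. Everything specific in it is an assumption for illustration and not taken from the project: the emotion labels, the equal weights, the example scores, and the tempo/mode table are all hypothetical.

```python
from typing import Dict

# Hypothetical emotion label set; the post does not list the project's labels.
EMOTIONS = ("happy", "sad", "angry", "calm")

# Hypothetical emotion-to-music table, chosen only for illustration.
MUSIC_PARAMS = {
    "happy": {"mode": "major", "tempo_bpm": 128},
    "sad": {"mode": "minor", "tempo_bpm": 66},
    "angry": {"mode": "minor", "tempo_bpm": 150},
    "calm": {"mode": "major", "tempo_bpm": 80},
}


def fuse(scores_by_modality: Dict[str, Dict[str, float]],
         weights: Dict[str, float]) -> Dict[str, float]:
    """Late fusion: weighted average of per-modality emotion probabilities."""
    total = sum(weights[m] for m in scores_by_modality)
    fused = {e: 0.0 for e in EMOTIONS}
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total  # normalize so the weights sum to 1
        for e in EMOTIONS:
            fused[e] += w * scores.get(e, 0.0)
    return fused


def pick_music_params(fused: Dict[str, float]) -> Dict[str, object]:
    """Select the parameters of the highest-scoring fused emotion."""
    top = max(fused, key=fused.get)
    return {"emotion": top, **MUSIC_PARAMS[top]}


# Made-up scores standing in for the text, Bi-LSTM (voice), and CNN (face) outputs.
scores = {
    "text": {"happy": 0.6, "sad": 0.1, "angry": 0.1, "calm": 0.2},
    "voice": {"happy": 0.3, "sad": 0.2, "angry": 0.1, "calm": 0.4},
    "face": {"happy": 0.5, "sad": 0.1, "angry": 0.2, "calm": 0.2},
}
weights = {"text": 1.0, "voice": 1.0, "face": 1.0}

fused = fuse(scores, weights)
params = pick_music_params(fused)
```

In a setup like this, the resulting `params` could then drive Music21 objects such as `tempo.MetronomeMark` and `key.Key`. Keeping the fusion step separate from the music mapping also lets each stream be swapped or reweighted independently.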

Because the underlying models are too heavy to run on basic free tiers, I built a lightweight Gradio interface that acts as a zero-click visual production walkthrough and tech-stack overview.
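For anyone curious what a model-free walkthrough like this can look like, below is a rough sketch of a static Gradio page that loads no models and only renders one tab per pipeline stage. It is not the code behind the Space: the `STAGES` descriptions, `render_stage`, and `build_demo` are hypothetical names, and the stage text is paraphrased from this post.

```python
# Hypothetical stage descriptions, paraphrased from the post.
STAGES = [
    ("Text", "Emotion cues from written input."),
    ("Voice", "Vocal frequencies processed by a Bi-LSTM."),
    ("Face", "Facial micro-expressions processed by a CNN."),
    ("Music", "Dynamic audio layers generated with Music21."),
]


def render_stage(title: str, body: str) -> str:
    """Format one pipeline stage as a Markdown snippet."""
    return f"### {title}\n\n{body}"


def build_demo():
    """Build a static, model-free Gradio page with one tab per stage."""
    import gradio as gr  # imported lazily so the helpers above work without it

    with gr.Blocks(title="Pipeline walkthrough") as demo:
        gr.Markdown("# Pipeline walkthrough")
        for title, body in STAGES:
            with gr.Tab(title):
                gr.Markdown(render_stage(title, body))
    return demo
```

Calling `build_demo().launch()` would serve the page. Since nothing here runs inference, memory use stays small, which is the point of a walkthrough-only Space.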

Would love any feedback on the layout structure or optimization tips for multi-stream pipelines!

Link: [Enigma Sound Ai - a Hugging Face Space by ApurvaDev111](https://huggingface.co/spaces/ApurvaDev111/enigma-sound-ai)