Building Perri: A Comic Strip Generator

A developer built Perri Comic Generator, a lightweight single-panel comic creator that combines an LLM (Meta-Llama-3-8B-Instruct) with a diffusion model (SDXL-Turbo) to generate comics from story seeds. The system uses a Gradio frontend and a serverless backend on Modal, with all models under 32 billion parameters for efficiency.

Meet Perri Comic Generator, a lightweight, single-panel comic creator that merges LLM-driven storytelling with real-time diffusion models. By pairing a Gradio frontend with a high-performance backend, Perri orchestrates a single pipeline: it takes a simple story seed, structures it into a panel description, generates the art, and burns the dialogue right onto the final image.

The best part? It achieves all of this without massive, resource-heavy infrastructure. Every AI model under Perri's hood is under 32 billion parameters, proving that you don't need giant, compute-heavy models to build something amazing.

Here is a look inside the architecture and tech stack that powers Perri.

Perri is built using a clean separation of concerns, splitting the heavy lifting of generation away from the user interface.

The frontend (`app.py`) is built using Gradio 6.16.0 and provides a sleek, user-friendly interface for inputting story seeds. To match the creative spirit of comics, the UI uses a custom theme with a vintage aesthetic, complete with star-twinkle CSS overlays. The frontend's main jobs are:

The orchestrator (`orchestrator.py`) acts as the brain of the operation, executing three distinct phases in the lifecycle of a single comic panel:

1. Structure the story seed into a panel description using `meta-llama/Meta-Llama-3-8B-Instruct`.
2. Generate the image using `stabilityai/sdxl-turbo` to synthesize the retro comic art.
3. Burn the dialogue onto the final image.

Modern AI development often leans toward massive foundational models, but Perri prioritizes speed, efficiency, and cost-effectiveness by using specialized models that punch well above their weight class.

| Model Role | Model Used | Parameter Size | Why It Was Chosen |
|---|---|---|---|
| Story & Scripting | `meta-llama/Meta-Llama-3-8B-Instruct` | 8 billion | Delivers precise, structured instruction-following for scripting without the latency of larger LLMs. |
| Art Generation | `stabilityai/sdxl-turbo` | ~3.5 billion | A single-step adversarial diffusion model that generates high-quality comic art in a fraction of a second. |
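To connect the two models to the orchestrator's three phases, the flow can be sketched as a chain of plain functions. This is a minimal sketch under stated assumptions, not Perri's actual `orchestrator.py`: the names `PanelScript` and `run_panel` are hypothetical, and the model calls are passed in as callables so the ordering can be read (and exercised) without a GPU.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PanelScript:
    """Structured output of phase 1 (hypothetical shape, assumed for this sketch)."""
    scene: str     # visual description handed to the diffusion model
    dialogue: str  # text to burn onto the finished image


def run_panel(
    seed: str,
    write_script: Callable[[str], PanelScript],   # phase 1: LLM scripting
    draw_scene: Callable[[str], bytes],           # phase 2: diffusion art
    add_dialogue: Callable[[bytes, str], bytes],  # phase 3: lettering
) -> bytes:
    """Run one comic panel through the three phases, in order."""
    seed = seed.strip()
    if not seed:
        raise ValueError("story seed must not be empty")
    script = write_script(seed)                    # seed -> panel description
    image = draw_scene(script.scene)               # description -> image bytes
    return add_dialogue(image, script.dialogue)    # image + dialogue -> final panel


# Stand-in callables; a real backend would call the LLM and the diffusion model here.
def fake_script(seed: str) -> PanelScript:
    return PanelScript(scene=f"retro comic panel: {seed}", dialogue="To the moon!")


def fake_draw(scene: str) -> bytes:
    return scene.encode("utf-8")


def fake_letter(image: bytes, dialogue: str) -> bytes:
    return image + b" | " + dialogue.encode("utf-8")


panel = run_panel("a cat finds a rocket", fake_script, fake_draw, fake_letter)
```

Passing the phases in as callables keeps the sequencing logic separate from the model code, which mirrors the separation of concerns described above.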
By keeping all models well under the 32B threshold, the entire pipeline can run on highly optimized, consumer-accessible cloud GPUs, keeping latency low and the user experience snappy.

Perri is configured to run in the cloud but is designed with a decoupled infrastructure: the frontend reaches the serverless backend through `MODAL_ENDPOINT_URL`.

To bridge the frontend and backend securely, the application relies on two key environment secrets:

- `HF_TOKEN`: For authenticating requests to Hugging Face hubs and spaces.
- `MODAL_ENDPOINT_URL`: Directs the frontend UI to the serverless backend worker.

Want to experiment with the theme or modify the layout? You can spin up the frontend locally in just a few steps.

1. Install `gradio`.
2. Create a `.env` file with your `MODAL_ENDPOINT_URL`.
3. Run `python app.py`.

Perri Comic Generator demonstrates how small, specialized models can be chained together to build rich, creative applications. By leveraging an 8B LLM for structuring thoughts and a fast Turbo diffusion model for generation, Perri delivers a nostalgic, automated comic-creation experience without the overhead of massive enterprise AI infrastructure.
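For local experiments, the two environment secrets mentioned above can be validated at startup so that a missing value fails early instead of at the first request. This is a minimal sketch, assuming the variables are named `HF_TOKEN` and `MODAL_ENDPOINT_URL` and that the endpoint is an HTTP(S) URL; `Settings` and `load_settings` are hypothetical names and not part of Perri's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse

REQUIRED = ("HF_TOKEN", "MODAL_ENDPOINT_URL")  # assumed variable names


@dataclass(frozen=True)
class Settings:
    hf_token: str
    modal_endpoint_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read both secrets from the environment and fail early if either is unusable."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("missing environment secrets: " + ", ".join(missing))

    url = env["MODAL_ENDPOINT_URL"].strip()
    parsed = urlparse(url)
    # Assumption: the backend is reached over HTTP(S).
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError("MODAL_ENDPOINT_URL must be an http(s) URL")

    return Settings(hf_token=env["HF_TOKEN"].strip(), modal_endpoint_url=url)
```

Python does not read a `.env` file on its own, so the values need to be exported into the process environment (or loaded by a helper such as python-dotenv) before `load_settings()` is called.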