Meet Perri Comic Generator, a lightweight, single-panel comic creator that merges LLM-driven storytelling with real-time diffusion models. By pairing a Gradio frontend with a high-performance backend, Perri orchestrates a seamless pipeline: it takes a simple story seed, structures it into a panel description, generates the art, and burns the dialogue right onto the final image.
The best part? It achieves all of this without massive, resource-heavy infrastructure. Every AI model under Perri's hood is under 32 billion parameters, proving that you don't need giant, compute-heavy models to build something amazing.
Here is a look inside the architecture and tech stack that powers Perri.
Perri is built using a clean separation of concerns, splitting the heavy lifting of generation away from the user interface.
Frontend (`app.py`): Built using Gradio 6.16.0, the frontend provides a sleek, user-friendly interface for inputting story seeds. To match the creative spirit of comics, the UI uses a custom theme with a vintage aesthetic, complete with star-twinkle CSS overlays.
The frontend's main jobs are:
Orchestrator (`orchestrator.py`): The orchestrator acts as the brain of the operation, executing three distinct phases in the lifecycle of a single comic panel:

1. Scripting: structures the story seed into a panel description using `meta-llama/Meta-Llama-3-8B-Instruct`.
2. Art generation: uses `stabilityai/sdxl-turbo` to synthesize the retro comic art.
3. Lettering: burns the dialogue onto the final image.

Modern AI development often leans toward massive foundational models, but Perri prioritizes speed, efficiency, and cost-effectiveness by using specialized models that punch well above their weight class.
| Model Role | Model Used | Parameter Size | Why It Was Chosen |
|---|---|---|---|
| Story & Scripting | `meta-llama/Meta-Llama-3-8B-Instruct` | 8 Billion | Delivers highly precise, structured instruction-following for scripting without the latency of larger LLMs. |
| Art Generation | `stabilityai/sdxl-turbo` | ~3.5 Billion | A single-step adversarial diffusion model that generates high-quality comic art in a fraction of a second. |
By keeping all models well under the 32B threshold, the entire pipeline can run on highly optimized, consumer-accessible cloud GPUs, keeping latency low and the user experience snappy.
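To make the flow concrete, here is a minimal sketch of how the three phases might be chained, using the model IDs and sizes from the table above. It is not Perri's actual `orchestrator.py`: the names `PanelScript`, `make_panel`, `check_model_budget`, and the three callables are hypothetical, and the real model calls are replaced by stand-in functions so the example runs without a GPU.

```python
from dataclasses import dataclass
from typing import Callable

# Model IDs and approximate sizes as listed in the table above.
SCRIPT_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ART_MODEL = "stabilityai/sdxl-turbo"
PARAMS_BILLIONS = {SCRIPT_MODEL: 8.0, ART_MODEL: 3.5}
MAX_PARAMS_BILLIONS = 32.0  # the ceiling stated in the article


@dataclass
class PanelScript:
    """Hypothetical shape for the output of the scripting phase."""
    description: str  # prompt text handed to the image model
    dialogue: str     # text to burn onto the finished panel


def check_model_budget() -> None:
    """Fail fast if any configured model reaches the 32B ceiling."""
    for model_id, size in PARAMS_BILLIONS.items():
        if size >= MAX_PARAMS_BILLIONS:
            raise ValueError(f"{model_id} exceeds the parameter budget")


def make_panel(
    story_seed: str,
    write_script: Callable[[str], PanelScript],
    draw_art: Callable[[str], bytes],
    letter: Callable[[bytes, str], bytes],
) -> bytes:
    """Run the three phases in order and return the final image bytes."""
    check_model_budget()
    script = write_script(story_seed)    # phase 1: LLM structures the seed
    art = draw_art(script.description)   # phase 2: diffusion model renders the art
    return letter(art, script.dialogue)  # phase 3: dialogue goes onto the image


if __name__ == "__main__":
    # Stand-ins for the real model calls (assumptions, for illustration only).
    panel = make_panel(
        "a cat finds a treasure map",
        write_script=lambda seed: PanelScript(f"retro comic panel: {seed}", "Hello!"),
        draw_art=lambda prompt: prompt.encode(),
        letter=lambda image, text: image + b"|" + text.encode(),
    )
    print(len(panel), "bytes")
```

Passing the phases in as callables keeps the sequencing logic separate from the model backends, so either model could be swapped without touching the pipeline itself.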
Perri is configured to run effortlessly in the cloud but is designed with a decoupled infrastructure: the frontend UI reaches a serverless backend worker through an endpoint URL (`MODAL_ENDPOINT_URL`).

To bridge the frontend and backend securely, the application relies on two key environment secrets:
- `HF_TOKEN`: For authenticating requests to the Hugging Face Hub and Spaces.
- `MODAL_ENDPOINT_URL`: Directs the frontend UI to the serverless backend worker.

Want to experiment with the theme or modify the layout? You can spin up the frontend locally in just a few steps.
1. Install the dependencies (`gradio`).
2. Create a `.env` file with your `MODAL_ENDPOINT_URL`.
3. Run `python app.py`.
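For local runs it helps to check the secrets before the UI starts. The sketch below shows one way a frontend might read and validate them. It is an illustration, not code from Perri's `app.py`: `PerriConfig` and `load_config` are hypothetical names, treating `HF_TOKEN` as optional locally is an assumption, and the values from `.env` are assumed to be already exported into the process environment (for example by a loader such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class PerriConfig:
    """Hypothetical container for the two secrets named in the article."""
    modal_endpoint_url: str
    hf_token: Optional[str]  # assumed optional for local experiments


def load_config(env: Optional[Mapping[str, str]] = None) -> PerriConfig:
    """Read the secrets from the environment and validate the endpoint URL."""
    env = os.environ if env is None else env

    url = env.get("MODAL_ENDPOINT_URL")
    if not url:
        raise RuntimeError("MODAL_ENDPOINT_URL is not set")

    # Reject values that are clearly not an http(s) endpoint.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError("MODAL_ENDPOINT_URL must be an http(s) URL")

    # An empty token is treated the same as a missing one.
    return PerriConfig(modal_endpoint_url=url, hf_token=env.get("HF_TOKEN") or None)
```

Calling `load_config()` at the top of the app turns a missing or malformed endpoint into a clear startup error rather than a failed request after the first story seed is submitted.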
Perri Comic Generator demonstrates how small, specialized models can be chained together to build rich, creative applications. By leveraging an 8B LLM for structuring thoughts and a fast Turbo diffusion model for generation, Perri delivers a nostalgic, automated comic-creation experience without the overhead of massive enterprise AI infrastructure.