[ARTICLE · art-14104] src=imagesv2.ai pub= topic=generative-ai verified=true sentiment=↑ positive

The Painter in the Mist — Understanding Diffusion Models Through a Fable

A young painter named Wu Sheng, living in the mist-shrouded town of Bailan, created images not by adding brushstrokes but by erasing chaos from a sheet of paper covered in dark grey paint. Over 49 days, he removed noise step by step to reveal a requested scene of a floating library at dusk with two moons, explaining that he had started with everything hidden inside the mist. The fable illustrates the core principle of modern AI diffusion models, which generate images by beginning with pure random noise and gradually removing it until a picture emerges.

5 min read · published Apr 29, 2026

Long ago, in a small town nestled between northern mountains, there was a place forever shrouded in heavy mist, called Bailan.

In Bailan lived a young painter named Wu Sheng — "Born of Mist."

He had a strange gift. Other painters first looked clearly at a thing, then drew it stroke by stroke. He worked the opposite way — he stared into a swirling mass of fog and slowly saw a painting emerge.

People thought it absurd. "How could there be a painting in mist?" Wu Sheng would only smile, and never explain.

A Library on the Sea

One day the old town governor summoned him. "I want a painting of something that has never existed: a library floating on the sea, at dusk, with two moons in the sky."

The room burst into laughter. "No such place exists. How could anyone paint it?"

But Wu Sheng simply nodded. "I can."

He took a sheet of white paper — and instead of putting brush to it, he covered the whole sheet in chaotic dark grey paint, like a window after a snowstorm. Nothing was visible. The onlookers were even more puzzled. "You're ruining it."

Wu Sheng replied: "A real painting must first learn to hide."

For the days that followed, he did only one thing: erase a tiny bit of chaos at a time. Not all at once. Not in great strokes. Just a little. One day he uncovered a faint patch of light. The next, a stretch of coastline. The day after, the suggestion of bookshelves. Later, two moons floated up out of the haze. He seemed to be in negotiation with the fog itself. Not creating, but constantly asking: "What was supposed to be here?" When he erased wrong, he reconsidered. When something stayed unclear, he kept watching.

For forty-nine days. In the end, the paper truly held a library floating on the sea. The water was still. The book pages turned. Dusk hung in the sky like a golden breath. Two moons drifted in the distance.

Where Does the Mist Come From?

The town was stunned. Someone asked, "How did you do it? You started with nothing."

Wu Sheng shook his head. "No — I started with everything. It was all just hidden inside the mist."

The governor pressed: "Then how did you know what to erase?"

Wu Sheng answered: "Because I first heard the names. Floating library. Two moons. Dusk. Sea. The words were like distant bells. I followed the sound through the fog and found the path."

Years later he took on an apprentice. The boy studied long but never grasped it. He kept thinking, I want to paint the result directly.

So Wu Sheng took him to the mountaintop. The morning fog rolled thick across the slopes.

"Do you see the tower?" Wu Sheng asked.

"No," said the apprentice.

"Then does it not exist?"

The apprentice fell silent.

Wu Sheng said: "Painting is the same. You don't create a world from nothing. You move, step by step, toward the most plausible world inside the chaos. Real painting isn't laying down brushstrokes. It's removing noise."

Years later, the people of Bailan still spoke of him. They said: he wasn't painting at all. He was teaching the world how to slowly grow order out of chaos.

The Real Idea: Diffusion Models

This whole story maps onto the core principle behind modern AI image generation: the diffusion model.

Stable Diffusion, Midjourney, and the GPT Image 2 model that powers this site all rely heavily on this idea.

In one sentence:

The model doesn't paint from scratch. It starts with pure random noise and removes noise step by step, until an image emerges.

Just like Wu Sheng: cover the page in chaos first (pure noise), then erase a little at a time (gradual denoising), until the painting is revealed.

Training: Teaching the Model How to Denoise

Training works like this.

Step 1: take a real image — say, a cat.

Step 2: add noise to it, repeatedly:

- After 1 round of noise: cat is still clear
- After 100 rounds: starting to blur
- After 500 rounds: barely visible
- After 1000 rounds: pure random TV static

Step 3: train the model to answer: "If it looks this messy now, what did the original image probably look like?"

That is, learn the reverse process: from chaos → clarity.

That's the core of it.
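
The training recipe above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any particular product is trained: the linear noise schedule and its endpoints (`1e-4` to `0.02` over 1000 rounds) are an assumed, commonly used choice, the 4-pixel `cat` is a made-up stand-in for a real image, and having the model predict the added noise is one common formulation that the article itself does not specify.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal's variance left after each round of noise.

    The linear schedule and its endpoints are assumptions made for illustration.
    """
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Jump straight to a noise level: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    xt = [signal * p + noise * e for p, e in zip(x0, eps)]
    return xt, eps


def recover_x0(xt, eps, alpha_bar):
    """Knowing the noise is equivalent to knowing the original image."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - noise * e) / signal for x, e in zip(xt, eps)]


def mse(pred, target):
    """Mean squared error, the loss used in this sketch."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
alpha_bars = make_alpha_bars()
cat = [0.9, 0.6, 0.2, 0.7]  # hypothetical 4-pixel "cat"

# How much cat is left at the noise levels listed in the article.
for step in (1, 100, 500, 1000):
    a = alpha_bars[step - 1]
    print(f"round {step:4d}: signal weight {math.sqrt(a):.4f}, "
          f"noise weight {math.sqrt(1.0 - a):.4f}")

# One training example: a noisy image plus the noise the model should predict.
step = 500
xt, eps = add_noise(cat, alpha_bars[step - 1], rng)
zero_guess = [0.0] * len(cat)  # an untrained model that predicts "no noise"
print("loss of a zero guess:   ", mse(zero_guess, eps))
print("loss of a perfect guess:", mse(eps, eps))
print("image recovered from the true noise:", recover_x0(xt, eps, alpha_bars[step - 1]))
```

The printed signal weight starts close to 1 and shrinks toward 0 by round 1000, which is the "clear cat to TV static" progression. `recover_x0` shows why predicting the noise answers the question "what did the original image probably look like?": once the noise is known, the original follows by simple arithmetic.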

Generation: Actually Painting an Image

When generating an image for real, the model has no picture to start from. It only has a blob of random noise, and a prompt:

An orange cat wearing an astronaut helmet, sipping coffee on the moon.

So it begins:

- Step 1: small denoising
- Step 30: a cat silhouette appears
- Step 80: the helmet shows up
- Step 150: the lunar background takes shape
- Step 300: details settle in

And the image is born.
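
The loop above can be sketched as a toy sampler. A real system would call a trained neural network at each step; here the hypothetical `oracle_noise` function stands in for it, and it is exact only because this toy "data set" contains a single known image. The update rule is a DDPM-style step, the linear schedule and the choice of `sqrt(beta)` as the re-injected noise level are assumptions, and the 300 steps simply mirror the list above.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule plus the cumulative signal terms."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def oracle_noise(xt, alpha_bar, target):
    """Stand-in for a trained network (hypothetical).

    When the only possible image is `target`, the noise inside `xt`
    can be computed exactly instead of learned.
    """
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - signal * c) / noise for x, c in zip(xt, target)]


def sample(target, num_steps=300, seed=0):
    """Start from pure noise and remove a little of it at every step."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # the blob of random noise
    for t in reversed(range(num_steps)):
        beta, a_bar = betas[t], alpha_bars[t]
        eps_hat = oracle_noise(x, a_bar, target)
        # Subtract this step's share of the predicted noise, then rescale.
        coef = beta / math.sqrt(1.0 - a_bar)
        x = [(xi - coef * e) / math.sqrt(1.0 - beta) for xi, e in zip(x, eps_hat)]
        if t > 0:
            # Every step except the last re-injects a little fresh noise.
            sigma = math.sqrt(beta)
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return x


picture = [0.2, 0.8, 0.5]  # hypothetical 3-pixel "image"
print(sample(picture))
```

Because the stand-in denoiser is exact, the sampler lands on `picture` up to floating-point error whatever the starting noise. With a learned denoiser the result is instead a plausible new image rather than a copy of a training example.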

Why Can Text Steer the Image?

Because there's another key module: the Text Encoder.

It turns "orange cat + astronaut + moon + coffee" into a vector of numbers (a conditioning signal). That vector is fed to the model at every denoising step, where it acts as a constant reminder:

- "Don't forget: orange cat, not black cat."
- "On the moon, not in a kitchen."

This is called Conditional Generation.
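
A toy version of that wiring is sketched below. Everything named here is hypothetical and exists only for illustration: `toy_text_encoder` is a bag-of-words lookup over a five-word vocabulary (real text encoders are learned neural networks), and `WORD_TO_PIXELS` is a hand-written table standing in for what a trained model would have learned. The point is the data flow: the prompt is encoded once, and the same vector is passed to the denoiser at every step.

```python
import math
import random

VOCAB = ["orange", "black", "cat", "moon", "kitchen"]  # hypothetical vocabulary

# Hand-written stand-in for learned weights: what each word adds to 3 "pixels".
WORD_TO_PIXELS = [
    [0.60, 0.30, 0.00],  # orange
    [0.05, 0.05, 0.05],  # black
    [0.20, 0.20, 0.20],  # cat
    [0.00, 0.00, 0.50],  # moon
    [0.20, 0.00, 0.00],  # kitchen
]


def toy_text_encoder(prompt):
    """Turn a prompt into a vector of numbers (one slot per vocabulary word)."""
    words = prompt.lower().split()
    return [1.0 if word in words else 0.0 for word in VOCAB]


def image_implied_by(cond):
    """The picture this toy model associates with a conditioning vector."""
    return [sum(c * row[j] for c, row in zip(cond, WORD_TO_PIXELS)) for j in range(3)]


def conditional_noise(xt, alpha_bar, cond):
    """Toy denoiser: its noise estimate depends on the conditioning vector."""
    target = image_implied_by(cond)
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - signal * c) / noise for x, c in zip(xt, target)]


def generate(prompt, num_steps=50, seed=0):
    rng = random.Random(seed)
    cond = toy_text_encoder(prompt)  # encoded once, before the loop
    betas = [1e-4 + (0.02 - 1e-4) * i / (num_steps - 1) for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    x = [rng.gauss(0.0, 1.0) for _ in range(3)]  # same starting noise per seed
    for t in reversed(range(num_steps)):
        eps_hat = conditional_noise(x, alpha_bars[t], cond)  # reminder at every step
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        x = [(xi - coef * e) / math.sqrt(1.0 - betas[t]) for xi, e in zip(x, eps_hat)]
        if t > 0:
            x = [xi + math.sqrt(betas[t]) * rng.gauss(0.0, 1.0) for xi in x]
    return x


print(generate("orange cat moon"))
print(generate("black cat kitchen"))
```

Both calls start from the same random noise because they share a seed, yet they end on different pictures. The only difference between them is the conditioning vector passed into each denoising step.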

Why Did Diffusion Beat GANs?

Earlier AI image generators relied on GANs (Generative Adversarial Networks). But GANs were notoriously hard to train: the generator and discriminator could destabilize each other, and the generator was prone to mode collapse, settling on a narrow range of outputs and losing diversity.

Diffusion training is more stable, the output is easier to steer and often higher in quality, and the approach scales better to large models. That is why it has quietly become the default of the modern era.

The One-Sentence Truth

AI image generation isn't "creation." It's:

Searching for the result that looks most like an image inside a probability space.

It's like asking, again and again, inside infinite chaos: "What's the most plausible next step here?"

That is the deepest idea in modern generative AI.
