What a Neural Net Actually Does — the Intuition, No Math

A developer explains that a neural network is not magic but a mechanical process of weighted detectors stacked in layers, turning raw input into higher-level features to make a decision. The network starts with random weights and learns from examples through backpropagation, allowing useful feature detectors to emerge on their own. An interactive demo on a 5x5 grid illustrates the flow from pixels to features to a vote, mirroring how real image models work.

People say a neural network "learns to see" or "understands images," and it sounds like sci-fi. It isn't. A neural net does something much more mechanical, and once you see the shape of it, the mystery evaporates. No math in this one, just the intuition. This is Day 4 of AIFromZero, my concept-a-day series explaining how AI actually works.

Forget the brain metaphor. A single artificial neuron looks at some numbers, weighs them up, and outputs one number that answers: "how much do I see the thing I look for?" That's it: a little detector with a dial for how strongly it fires.

Here's the whole trick, in three beats: Pixels → edges → parts → objects → label. Depth builds understanding, one simple step at a time.

Every connection carries a weight: a number saying how much that clue counts toward the next detector. "Has a closed loop" might count a lot toward the digit 8 and against the digit 1. A neural network is, at heart, nothing but a giant pile of these learned importances.

Nobody hand-writes those detectors. The network starts random and sees thousands of labelled examples. Each mistake nudges the weights so the helpful clues count more and the misleading ones count less. That's backpropagation (I build it from scratch over in DeepLearningFromZero). Train long enough and useful feature detectors emerge on their own. That's the part that still amazes me: we don't program the features, we let them grow.

In the interactive demo on this page, you draw a shape on a 5×5 grid. Watch it turn your pixels into a few simple measurements (top-heavy? has a centre cross? corners lit?) and then cast a vote for which shape it most resembles. It's a faked, hand-coded version, but the flow, pixels → features → vote, is the same one a real image model follows, just with millions of learned features instead of four hand-written ones.
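If you'd like to see that pixels → features → vote flow as code, here is a minimal Python sketch. It is not the demo's actual source. The three feature names come from the questions above; how each one is measured, the weights, and the shape labels (`plus`, `square`, `T`) are my own illustrative assumptions, and the unnamed fourth feature is left out.

```python
# Hypothetical sketch of the demo's flow: pixels -> features -> vote.
# Feature names come from the post; the formulas, weights, and shape
# labels below are illustrative assumptions, not the demo's real code.

def parse(rows):
    """Turn five strings of '#' (lit) and '.' (dark) into a 5x5 grid of 0/1."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def top_heavy(g):
    """Share of all lit pixels that sit in the top two rows."""
    total = sum(sum(row) for row in g)
    top = sum(sum(row) for row in g[:2])
    return top / total if total else 0.0

def centre_cross(g):
    """Share of the middle row and middle column that is lit."""
    cells = {(2, c) for c in range(5)} | {(r, 2) for r in range(5)}
    return sum(g[r][c] for r, c in cells) / len(cells)

def corners_lit(g):
    """Share of the four corner pixels that are lit."""
    return sum(g[r][c] for r in (0, 4) for c in (0, 4)) / 4

def features(g):
    """Pixels -> a handful of simple measurements, each between 0 and 1."""
    return {
        "top_heavy": top_heavy(g),
        "centre_cross": centre_cross(g),
        "corners_lit": corners_lit(g),
    }

# Hand-picked weights: how much each clue counts for (+) or against (-)
# each shape. A trained network would learn these instead.
WEIGHTS = {
    "plus":   {"top_heavy": 0.0, "centre_cross": 2.0,  "corners_lit": -2.0},
    "square": {"top_heavy": 0.0, "centre_cross": -1.0, "corners_lit": 2.0},
    "T":      {"top_heavy": 3.0, "centre_cross": 0.0,  "corners_lit": -1.0},
}

def vote(g):
    """Features -> one weighted score per shape; the highest score wins."""
    f = features(g)
    scores = {
        shape: sum(w[name] * f[name] for name in f)
        for shape, w in WEIGHTS.items()
    }
    return max(scores, key=scores.get), scores

PLUS = parse(["..#..", "..#..", "#####", "..#..", "..#.."])
SQUARE = parse(["#####", "#...#", "#...#", "#...#", "#####"])
T_SHAPE = parse(["#####", "..#..", "..#..", "..#..", "..#.."])

winner, scores = vote(PLUS)
print(winner, scores)
```

Each entry in `WEIGHTS` plays the role of one detector: it weighs up the clues and outputs a single number. With these made-up weights, the plus sign wins the `plus` vote because its centre cross is fully lit and none of its corners are. A real network would differ in two ways: it would have far more features, and it would learn both the features and the weights from examples rather than having them typed in.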
A neural network turns raw input into a stack of ever-higher-level features, then votes for an answer, and it learns those features from examples rather than being programmed with them. That sentence covers image recognition, speech, and a surprising amount of what's inside a language model too. Not magic. Just detectors, stacked and tuned.

👉 Try the demo (draw a shape, watch the features fire and the vote land): https://dev48v.infy.uk/ai/days/day4-neural-net-intuition.html

🌐 All concepts: https://dev48v.infy.uk/aifromzero.php

Tomorrow: how a neural net actually learns. Training, in plain words.