When Software Started Writing Software: A Developer’s History of AI

wpnews.pro

If you've shipped software in the last three years, you've probably watched your job description quietly rewrite itself. You went from writing code, to writing code with an autocomplete, to writing code with a collaborator, to increasingly writing a spec and watching an agent write, test, and ship the code for you. That didn't happen overnight. It's the latest chapter in a 70-year story that started with researchers trying to teach machines to play checkers. Let's walk through it, not as a dry timeline, but as the story of how "intelligence" kept getting redefined every time machines got good at the last definition.

The founding bet of AI, made official at the 1956 Dartmouth Workshop, was simple and audacious: thought is computation. If you could represent knowledge as symbols and rules, and manipulate those symbols correctly, you'd get intelligence.

This gave us expert systems: programs that encoded a specialist's knowledge as hand-written IF-THEN statements. MYCIN could diagnose bacterial infections about as well as a human specialist, using a few hundred hand-written rules.

This was weak, narrow intelligence in the most literal sense: a system that was a genius in one box and knew nothing outside it. The fatal flaw was scale: every rule had to be written by hand by a human expert. Knowledge didn't generalize, and it didn't learn from data. When funding agencies realized these systems couldn't handle the messiness of the real world, the money dried up. That funding drought became known as an AI winter.
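
To make the IF-THEN idea concrete, here is a minimal forward-chaining sketch in Python. The rules and fact names are invented purely for illustration; they are not taken from MYCIN (which had its own rule language and certainty factors) and carry no medical meaning.

```python
# Hypothetical toy rules: (facts that must all be known, fact to conclude).
RULES = [
    ({"gram_negative", "rod_shaped"}, "likely_enterobacteria"),
    ({"likely_enterobacteria", "hospital_acquired"}, "consider_broad_spectrum"),
]


def forward_chain(facts, rules=RULES):
    """Apply IF-THEN rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires only when every one of its conditions is already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known
```

Given facts the rules anticipate, the chain reaches a conclusion; given anything else, it returns the input unchanged. That is the "genius in one box" behavior in miniature, and every extra capability means another hand-written rule.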

While "AI" was a dirty word in grant applications, a different idea was gaining ground: instead of telling a machine the rules, show it examples and let it find the rules itself. This era's intelligence was statistical rather than logical: pattern recognition over labeled data. It worked well on narrow, well-defined tasks but needed mountains of hand-labeled examples and feature engineering done by humans. The "intelligence" was still mostly in the human designing the features; the model was just fitting a curve.
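
As a rough illustration of "show it examples and let it find the rules itself", here is a tiny perceptron in plain Python. The training data, function names, and hyperparameters are assumptions chosen for the sketch; the point is only that the decision rule comes from labeled examples rather than from a hand-written IF-THEN.

```python
def train_perceptron(examples, epochs=20, lr=1):
    """Learn weights for a linear threshold unit from (features, label) pairs."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = label - predicted  # -1, 0, or +1
            # Nudge the weights toward the correct answer only when wrong.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, features):
    """Classify one example with the learned weights."""
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0


# Hypothetical labeled data: the label is the logical AND of two features.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

Note that a human still decided which two features to feed in. The model only fits the boundary between them, which is the limitation the paragraph above describes.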

The ingredients for the next leap had been sitting around for a decade: more data (the internet), more compute (GPUs originally built for video games), and algorithmic tricks for training deeper networks without them collapsing into noise. None of this looked inevitable at the time. Most of the field had moved on from neural networks; deep nets were a fringe bet kept alive by a small number of labs who kept getting told they were wasting their careers.

The spark was AlexNet (2012), a deep convolutional neural network that crushed the ImageNet image-classification competition, cutting the top-5 error rate to about 15% against roughly 26% for the next-best entry. That one result told the field something important: stack enough layers, feed them enough data, and the network finds its own features, no human feature engineering required. It wasn't a smooth continuation of the field's direction; it was closer to a coup. Within a couple of years, techniques that had been a punchline became the default starting point for almost every computer vision paper.

What followed was a five-year sprint. One of its milestones was AlphaGo, which beat top human professionals using deep learning plus reinforcement learning plus tree search, on a game once thought too intuitive for machines.

Each of these systems was still task-bound (a vision model couldn't write a sentence, a Go-playing model couldn't recognize a cat), but the skill inside each one was no longer handed to it by a human; the model discovered its own representation of the problem. That shift in mechanism, more than any single result, is what made the next jump possible.

In 2017, a Google paper titled "Attention Is All You Need" introduced the transformer architecture. Instead of processing text sequentially like older recurrent networks, transformers let every word attend to every other word at once. It was a better way to model sequences and it turned out to scale beautifully.
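
A minimal sketch of "every word attends to every other word at once", in plain Python. It is deliberately simplified: queries, keys, and values are all the raw input vectors, whereas a real transformer first applies learned projection matrices and uses multiple attention heads. The function names are illustrative, not from any library.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = input."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this position against every position, including itself.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the input vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs
```

Each output row depends on the whole sequence, and no row has to wait for the previous one to finish. That independence between positions is what lets the computation run in parallel, unlike a recurrent network that steps through the text one token at a time.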

That architectural choice, combined with the realization that you could pretrain a single giant model on a huge slice of the internet and then adapt it to almost any language task, produced the GPT lineage.

This is the point where AI stopped meaning "deep in one box, blank everywhere else" and started meaning compositional: a single set of weights that could combine skills it was never explicitly trained to combine. One model could write a sonnet, debug Python, and explain the sonnet's meter, not because it was three different systems, but because language turned out to be a surprisingly good universal interface to a huge range of human tasks.

A language model that just answers questions is a powerful autocomplete. The next phase of the story is about giving that model hands: the ability to call tools, write and execute code, browse the web, remember state across steps, and chain its own reasoning into multi-step plans.

A few threads converged to make that possible.

This is the world a developer in 2026 actually lives in. Code isn't just suggested line-by-line; it's planned, written across multiple files, tested, and debugged in a loop that needs less hands-on steering than it used to, though it still needs real review, real guardrails, and real human judgment about when to trust the output. The same pattern shows up outside coding: research agents that browse, synthesize, and cite sources across dozens of pages; operations agents that read a ticket, check a calendar, and draft a response; design agents that take a brief and return a working prototype. None of this is autonomy in the strong sense yet. It's interactive capability: a model in a loop with tools and a feedback signal. It's powerful precisely because of how tightly that loop is engineered, not in spite of it.
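
The "model in a loop with tools and a feedback signal" can be sketched in a few lines of Python. Everything here is hypothetical: `scripted_model` stands in for a real LLM call, the action dictionaries are an invented format, and `run_tests` is a fake tool. Real agent frameworks differ in the details, but the shape of the loop is similar.

```python
def run_agent(task, model, tools, max_steps=5):
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = model(history)
        if action["type"] == "finish":
            return action["answer"], history
        tool = tools.get(action["tool"])
        if tool is None:
            # Report bad tool names back to the model instead of crashing.
            observation = f"error: unknown tool {action['tool']!r}"
        else:
            observation = tool(action["input"])
        history.append((action["tool"], observation))
    # Step budget exhausted: a guardrail, not a success.
    return None, history


def scripted_model(history):
    """Stand-in for an LLM call: run the tests once, then report the result."""
    if len(history) == 1:
        return {"type": "tool", "tool": "run_tests", "input": "all"}
    return {"type": "finish", "answer": history[-1][1]}


# Hypothetical tool registry; a real one would shell out, call APIs, and so on.
TOOLS = {"run_tests": lambda target: f"tests passed for {target}"}
```

The step budget and the explicit handling of unknown tools are the kind of engineering the paragraph above refers to: the loop, not just the model, decides how much can go wrong before a human is pulled back in.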

Easy to miss when you're staring at capability charts: every jump here was also an economics story. Expert systems died because expert time didn't scale. Statistical ML rode cheap labeled data and storage. Deep learning rode gaming GPUs. LLMs rode internet-scale data and transformer parallelism. Agents are having their moment because inference got cheap enough to run a model in a loop hundreds of times per task without the bill being absurd. The recurring question isn't "can we build it?" but "can we afford to run it enough times to be useful?" That bottleneck moved; it didn't disappear.

If you zoom out, the history of AI is a story about where the intelligence lives:

Era	Where the "smarts" lived
Symbolic AI	In rules a human expert wrote by hand
Statistical ML	In features a human engineer chose
Deep learning	In representations the model learned itself
Large language models	In patterns learned from most of the public internet
Agentic systems	In the model's own planning, tool use, and self-correction across time

Each era didn't replace the last so much as absorb it. Today's agents still do statistical pattern matching under the hood; they still occasionally fail in the brittle, overconfident ways the old expert systems did, just less often and less predictably.

If you compress the five eras above, there are really only three discontinuities that mattered. The five-era version tells a better story; the three-discontinuity compression is what actually changed underneath.

This isn't a straight march toward AGI. It's researchers repeatedly asking "what if the machine decided this part?" and finding that it worked, once the economics finally allowed it.

Whether the next chapter is "agents that reliably run entire businesses" or "another winter while the hype outpaces the engineering" is genuinely an open question, and depending on who you ask, both are already happening at once.

source & further reading

dev.to (original article)
