[ARTICLE · art-35537] src=dev.to ↗ pub= topic=artificial-intelligence verified=true sentiment=↑ positive

The Core of a Coding Agent Is 128 Lines of Python. So I Built One From Scratch.

A developer built a coding agent from scratch in 128 lines of Python, demonstrating that the core loop powering tools like Claude Code and Cursor is surprisingly simple. The agent autonomously reads files, runs tests, diagnoses failures, fixes code, and re-runs tests without hard-coded steps. The project is open source under the MIT license.

4 min read · Published Jun 21, 2026

128 lines of Python.

That's the entire core of a coding agent: the loop that powers tools like Claude Code and Cursor. I didn't believe it either, so I built one from scratch. Then I pointed it at a failing test, and it read the file, ran the test, saw the traceback, fixed the code, and re-ran it, choosing every step itself. No one hard-coded that.

It's open source (MIT), with a phased roadmap you can follow:

👉 github.com/osama96gh/coding-agent-from-scratch

I use coding agents every day. As an AI engineer, I think they're the breakout use case for LLMs right now. But using something and understanding it are different things.

Reading a production agent's source to learn the core is a trap: the essential logic is buried under prompt caching, retries, telemetry, and elaborate scaffolding. You can't see the engine for the bodywork.

So I built just the engine. No optimizations. Just the essence.

The numbers surprised me enough that I re-counted:

Piece | Size
Entire REPL + agent loop + permission gate (main.py) | 128 lines
The system prompt that steers all behavior (prompts.py) | 19 lines
Tools: read, list, grep, edit, write, run_bash | 6 files, smallest is 35 lines
Whole project, incl. 2 swappable providers + streaming | ~1,300 lines

The thing that feels like magic (an agent autonomously reading files, running your tests, fixing the failure, re-running) comes out of about a hundred lines of orchestration. The intelligence lives in the model. Your job is plumbing.

Strip away the streaming, the permission gate, and the UI, and the heartbeat of the whole thing is this:

conversation.append({"role": "user", "content": user_input})

while True:  # keep going until the model stops asking for tools
    turn = llm.call(conversation, tools=TOOL_SCHEMAS, system=SYSTEM_PROMPT)
    conversation.append(turn.to_message())

    if not turn.tool_calls:        # plain text → the model is done
        break

    for call in turn.tool_calls:   # otherwise, run each tool it asked for…
        result = run_tool(call.name, call.args)
        conversation.append({
            "role": "tool", "id": call.id,
            "name": call.name, "content": result,
        })

That's it. That's the agent.

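The loop lives in main.py. On each turn the model either replies in plain text or asks for a tool (for example, "run pytest"). The model decides which tool and in what order; the loop just keeps turning until the model stops asking.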

An agent is just an LLM, a loop, and some tools. Everything else in this repo is refinement on top of those three.
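To watch those three pieces work together without an API key, here is a minimal runnable sketch of the same loop with the model swapped for a scripted stand-in. ScriptedLLM, Turn, ToolCall, and the toy read_file tool are illustrative assumptions, not names taken from the repo; only the shape of the loop follows the excerpt above.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    args: dict


@dataclass
class Turn:
    text: str = ""
    tool_calls: list = field(default_factory=list)

    def to_message(self) -> dict:
        return {"role": "assistant", "content": self.text, "tool_calls": self.tool_calls}


class ScriptedLLM:
    """Stand-in for a real provider: replays canned turns and ignores its input."""

    def __init__(self, turns):
        self._turns = iter(turns)

    def call(self, conversation, tools=None, system=None) -> Turn:
        return next(self._turns)


def run_tool(name: str, args: dict) -> str:
    # Hypothetical tool table; a real tool would touch the filesystem or shell.
    tools = {"read_file": lambda path: f"<contents of {path}>"}
    return tools[name](**args)


def run_agent(llm, user_input: str) -> list:
    conversation = [{"role": "user", "content": user_input}]
    while True:  # keep going until the model stops asking for tools
        turn = llm.call(conversation, tools=[], system="You are a coding agent.")
        conversation.append(turn.to_message())
        if not turn.tool_calls:  # plain text: the model is done
            break
        for call in turn.tool_calls:  # run each requested tool, feed the result back
            result = run_tool(call.name, call.args)
            conversation.append(
                {"role": "tool", "id": call.id, "name": call.name, "content": result}
            )
    return conversation


llm = ScriptedLLM([
    Turn(tool_calls=[ToolCall("call-1", "read_file", {"path": "app.py"})]),
    Turn(text="The file looks fine."),
])
history = run_agent(llm, "Check app.py")
print([m["role"] for m in history])  # ['user', 'assistant', 'tool', 'assistant']
```

The scripted model asks for one tool and then answers in plain text, so the loop runs exactly twice. Swapping ScriptedLLM for a real provider client should not require touching run_agent, as long as the client returns something with the same two fields.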

This is also where "it can debug itself" comes from, for free. When the shell tool feeds exit codes and stderr back into the conversation, the model sees the failure on the next turn and proposes a fix. Nobody wrote "if tests fail, edit the code". It falls out of the loop.
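A shell tool that makes this possible might look roughly like the sketch below. It is an assumption about the general shape, not the repo's run_bash: the timeout parameter and the exact output layout are made up for illustration, and the example command assumes a POSIX shell.

```python
import subprocess


def run_bash(command: str, timeout: int = 30) -> str:
    """Run a shell command and return everything the model needs to see."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # Report the timeout as text so the model can react to it too.
        return f"exit_code: timeout after {timeout}s"
    # Exit code, stdout, and stderr all go back as one string. A failing test
    # run therefore lands in the conversation as a nonzero code plus traceback.
    return (
        f"exit_code: {proc.returncode}\n"
        f"stdout:\n{proc.stdout}\n"
        f"stderr:\n{proc.stderr}"
    )


print(run_bash("echo ok; echo boom 1>&2; exit 3"))
```

The point of the design is that failure is not an exception in the agent; it is just more text in the conversation for the model to read on the next turn.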

One file each: read_file, list_files, grep, edit_file, write_file, run_bash.

Each is just a function plus a JSON schema describing its arguments, and that schema is all the model needs to know the tool exists and how to call it. "Tool calling" sounds advanced; it's really "here's a function signature, fill in the arguments."
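As a rough sketch of "a function plus a JSON schema", a read_file tool could be written like this. The wrapper keys around the schema ("name", "description", "parameters") and the TOOLS registry are assumptions for illustration; providers differ on the exact envelope, and the repo's own layout may not match.

```python
import os
import tempfile


def read_file(path: str) -> str:
    """Return the text of a file, or an error string the model can react to."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except OSError as exc:
        return f"error: {exc}"


# JSON Schema for the arguments: this is the "function signature" the model sees.
READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Path of the file to read."}
        },
        "required": ["path"],
    },
}

TOOLS = {"read_file": read_file}    # name -> function (hypothetical registry)
TOOL_SCHEMAS = [READ_FILE_SCHEMA]   # what gets sent to the model


def run_tool(name: str, args: dict) -> str:
    """Dispatch a tool call by name; unknown names come back as an error string."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**args)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hi")
    print(run_tool("read_file", {"path": path}))  # hi

print(run_tool("delete_everything", {}))  # error: unknown tool 'delete_everything'
```

Returning errors as strings rather than raising keeps the loop simple: a bad path or a made-up tool name becomes feedback the model can correct on its next turn.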

run_bash alone is almost a superpower (with a shell you can stand in for most of the others), which is exactly why an agent needs a permission gate.
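One simple way to build such a gate is an allowlist of read-only command prefixes, with everything else requiring a yes from the user. The sketch below is an assumption about how that could work, not the repo's implementation: SAFE_PREFIXES, SHELL_METACHARS, needs_approval, and gate are all hypothetical names.

```python
import shlex

# Hypothetical allowlist: command prefixes treated as read-only and safe to auto-run.
SAFE_PREFIXES = [
    ["git", "status"],
    ["git", "diff"],
    ["git", "log"],
    ["ls"],
]

# Characters that let one command smuggle in another; anything containing them asks.
SHELL_METACHARS = set(";&|<>`$()\n")


def needs_approval(command: str) -> bool:
    """Return True unless the command clearly matches a read-only prefix."""
    if any(ch in SHELL_METACHARS for ch in command):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return True
    return not any(tokens[: len(prefix)] == prefix for prefix in SAFE_PREFIXES)


def gate(command: str, ask=input) -> bool:
    """Decide whether to run `command`; `ask` is injectable so it can be tested."""
    if not needs_approval(command):
        return True
    answer = ask(f"Run `{command}`? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


print(needs_approval("git status"))    # False
print(needs_approval("git push"))      # True
print(needs_approval("rm -rf build"))  # True
print(needs_approval("ls; rm -rf ~"))  # True
```

The gate defaults to asking: an unrecognized command, a parse failure, or any shell metacharacter all fall through to the prompt, so the allowlist only ever widens what runs unattended.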

These refinements sit on top of the core, and they're where most of the line count goes:

The permission gate is the clearest example: git status runs unprompted while git push still stops to ask. That's the difference between an assistant and rm -rf roulette.

That failing-test run from the top? I never scripted it. The model chose to read, run, diagnose, fix, and re-run entirely on its own: the same shape of behavior I pay for in Claude Code every day, out of ~128 lines I could read in a single sitting.

The gap between "toy" and "real" is smaller than the hype suggests. The production polish (caching, retries, sandboxing, a thousand handled edge cases) is genuine, hard engineering. But the core that makes an agent an agent is within any engineer's reach in an afternoon.

The repo is a phased roadmap: each phase runs on its own and teaches one concept, so you always have a working agent:

read_file
list_files, grep
edit_file, write_file
run_bash: where it gets powerful (and dangerous)

A learning project: build a simple but real coding agent (think a tiny Claude Code / Cursor / Codex), step by step, from nothing, to understand how complex AI agents are actually structured under the hood.

The one-sentence mental model: An agent is just an LLM, a loop, and some tools. Everything else is refinement. ([source])

This repository is an educational, from-scratch Python implementation of a terminal coding agent. It shows the core mechanics behind modern AI coding tools: a model-driven agent loop, tool calling, file exploration, targeted code edits, shell command execution, permission checks, streaming responses, usage reporting, context compaction, and pluggable OpenAI/Gemini providers.

It is meant to be read, modified, and learned from. It is not a production coding agent, but a small reference implementation for understanding how production coding agents are structured under the hood.

Build it, break it, extend it (a new tool, a web UI, a third provider), and tell me how it goes. The fastest way to stop an AI tool from feeling like magic is to build a small one yourself.
