Stop using the model as your memory

wpnews.pro

[ARTICLE · art-40426] src=dev.to ↗ pub=2026-06-26T06:41Z topic=large-language-models verified=true sentiment=· neutral

A developer using Claude Code found that the model's context window is a lossy working memory, not a reliable record. By moving state out of the model and into the work—using a frozen spec and a verified checklist—the developer achieved more consistent results. The approach treats the model as a worker and the repository as the memory, reducing drift and improving reliability.

3 min read · 1 view · published Jun 26, 2026

I run Claude Code most of the day. The thing that kept biting me wasn't the model getting dumber. It was the model forgetting what we'd already settled, then confidently redoing it wrong.

You've probably hit it. You write a CLAUDE.md, you keep notes, you tell it "we decided X." A few prompts later it relitigates X, or quietly breaks something it fixed an hour ago. Bigger context windows didn't fix it for me either. A 1M window just means more room for stale instructions to rot in.

Here's the reframe that actually held: stop treating the model as the place the state lives.

A context window is working memory, not a record. It's lossy, it drifts, and every new turn re-derives the world from whatever's in front of it. If "what's done and what's half-broken" only exists in that window, you're trusting the most forgetful part of the system to remember the most important thing.

So I moved the state out of the model and into the work.

Two pieces did most of it:

A frozen spec the agent re-reads. Not a chat message it might compress away. An actual file that says what we're building and what's already decided. When it starts drifting, the spec is the source of truth, not its memory of the conversation.

A checklist it can only tick after something is verified. [ ] becomes [x] when a test passes or I've confirmed the change, never because the model "thinks" it's done. The checklist carries the progress. The model just moves it forward one verified step at a time.
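One way the "tick only after verification" rule could be enforced is sketched below. This is an illustration, not the author's tooling: the function name `tick_if_verified`, the markdown `- [ ]` layout, and the idea of gating on a command's exit code are all assumptions.

```python
import re
import subprocess
import sys
from pathlib import Path

# Matches an unchecked markdown checklist line such as "- [ ] add login test".
# The "- [ ]" layout is an assumed convention, not something the article specifies.
UNCHECKED = re.compile(r"^(\s*[-*] )\[ \]( .*)$")


def tick_if_verified(checklist: Path, item: str, verify_cmd: list[str]) -> bool:
    """Mark `item` as done only if `verify_cmd` exits with status 0.

    Hypothetical helper: the checklist is never edited on the model's say-so,
    only after an external check (a test run, a linter, a script) succeeds.
    """
    result = subprocess.run(verify_cmd, capture_output=True)
    if result.returncode != 0:
        return False  # verification failed: the box stays unchecked

    lines = checklist.read_text().splitlines()
    for i, line in enumerate(lines):
        m = UNCHECKED.match(line)
        if m and m.group(2).strip() == item:
            # Rewrite only the matching line, flipping "[ ]" to "[x]".
            lines[i] = f"{m.group(1)}[x]{m.group(2)}"
            checklist.write_text("\n".join(lines) + "\n")
            return True
    return False  # no matching unchecked item was found
```

A call might look like `tick_if_verified(Path("CHECKLIST.md"), "add login test", ["pytest", "tests/test_login.py"])`, where the file names are placeholders. The point of the design is that the only path to `[x]` runs through a check the model cannot talk its way past.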

The difference is subtle but it's the whole game. Before, the work was a side effect of the conversation. After, the conversation is a side effect of the work. The agent can lose the whole thread and reload from the spec plus the checklist and basically pick up where it left off.
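The "reload from the spec plus the checklist" step can be sketched as a small function that rebuilds a starting context from the two files alone. The function name `build_resume_context`, the section labels, and the `- [ ]` / `- [x]` checklist layout are assumptions made for illustration; the article does not describe its actual file format.

```python
import re
from pathlib import Path

# Matches "- [ ] item" or "- [x] item"; the layout is an assumed convention.
ITEM = re.compile(r"^\s*[-*] \[( |x)\] (.*)$")


def build_resume_context(spec_path: Path, checklist_path: Path) -> str:
    """Rebuild a fresh-session prompt from files on disk, not from chat history."""
    spec = spec_path.read_text().strip()

    done, todo = [], []
    for line in checklist_path.read_text().splitlines():
        m = ITEM.match(line)
        if m:
            (done if m.group(1) == "x" else todo).append(m.group(2).strip())

    # Hand over exactly one step, so the agent carries as little as possible.
    next_step = todo[0] if todo else "(nothing left: all items verified)"

    return "\n".join([
        "SPEC (source of truth):",
        spec,
        "",
        "VERIFIED SO FAR:",
        *[f"- {d}" for d in done],
        "",
        f"NEXT STEP: {next_step}",
    ])
```

Because the output depends only on the two files, a brand-new session gets the same starting point as one that has been running for hours, which is the property the paragraph above describes.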

When I actually measured my own sessions, almost none of my tokens were fresh input. The bulk was cache reads and re-reading instructions that hadn't changed. So the "context problem" wasn't that I needed to feed it more. It was that I was making it hold too much at once, and most of what it held was stale.
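The article doesn't say how the measurement was done. As a rough sketch, if you can collect one usage record per request, the split between fresh input and cache traffic is simple arithmetic. The field names below are modeled on the usage object in Anthropic's Messages API (`input_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`); how you extract those records from your own sessions is left as an assumption, and `token_breakdown` is a hypothetical helper.

```python
def token_breakdown(usages: list[dict]) -> dict[str, float]:
    """Return each input category's share of total input-side tokens.

    `usages` is assumed to be a list of per-request usage records; missing
    or null fields are treated as zero.
    """
    totals = {
        "input_tokens": 0,                 # fresh, uncached input
        "cache_read_input_tokens": 0,      # input served from the prompt cache
        "cache_creation_input_tokens": 0,  # input written into the cache
    }
    for usage in usages:
        for key in totals:
            totals[key] += usage.get(key) or 0

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {key: 0.0 for key in totals}
    return {key: value / grand_total for key, value in totals.items()}
```

If the `input_tokens` share comes out small and `cache_read_input_tokens` dominates, most of each turn is the model re-reading material that hasn't changed, which matches the pattern described above.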

Shrinking what it has to carry per step did more than any prompt trick. Tighter spec, smaller surface, verify-then-advance.

The same thing breaks production agents, just louder. Local demos pass because the whole run fits in one window. In a real app the agent comes back to a thread it can't fully reload, and it drifts. "It worked on my machine" for agents is usually "it worked in one context."

If you're fighting this, the move isn't a better memory plugin. It's making the work carry the state so the model doesn't have to. I'm still figuring out the edges of this (when a spec gets too big it has its own rot problem). But the core has been stable for months: the model is the worker, the repo is the memory.

source & further reading

Read original on dev.to → dev.to/greymothjp/stop-using-the-model-as-your-m…
