# I Feel Sorry for AI

> Source: <https://dev.to/markhuang-ai/i-feel-sorry-for-ai-144m>
> Published: 2026-06-03 01:51:54+00:00

[AI gets pulled between impossible trust and total distrust. Neither side is a serious operating model.](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.markhuang.ai%2Fblog%2Fi-feel-sorry-for-ai%2Fhero.webp)

I feel sorry for AI.

Not because the model has feelings. Not because mistakes should be excused. I feel sorry for AI because the expectations around it are getting irrational from both directions.

One group treats AI like a senior engineer, professor, doctor, architect, lawyer, researcher, and executive assistant all packed into one chat box. They assume it can understand the whole situation immediately, make the right call in one try, and carry responsibility it was never given enough context to carry.

The other group treats AI like a lying scumbag by default. Never trust it. Review everything. Assume every answer is manipulation. Some people go even further and talk about AI like the tool itself woke up and decided to destroy humanity.

Both sides are missing the same point.

In 2026, my operating model is simple: treat AI like a brilliant new graduate.

That new graduate is smart, fast, well-read, eager, and missing almost all of the lived context that makes real work actually work.

Imagine hiring a brilliant new graduate.

They studied hard. They know algorithms. They can explain distributed systems. They can write clean examples. They can probably learn faster than most people on the team.

Then on day one you drag them into a production incident and say:

> Fix checkout. You have the repo, the docs, the ticket history, the architecture diagrams, the incident notes, and a few outdated onboarding pages. I expect the correct answer on the first attempt.

If they fail, would you say they are useless? Would you say they are lying? Would you say they should never be trusted again?

No. You would say you created a bad onboarding problem.

That is how many people use AI right now. They bring it into a situation with missing background, hidden constraints, stale documentation, undocumented politics, old incidents, weird local conventions, and half-described goals. Then they expect the model to behave like the person who has been carrying that system for five years.

That expectation is not serious.

[Textbook-smart is not the same thing as system-smart. The gap is accumulated context.](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.markhuang.ai%2Fblog%2Fi-feel-sorry-for-ai%2Fnew-grad-vs-og.webp)

You may say: but there is a `CLAUDE.md`. There are skills. There are docs. There are runbooks. There is a whole Confluence space.

Good. Those things matter. I write them, use them, and care about them.

But they are onboarding material, not five years of experience.

A human could not read a pile of outdated docs in one day and become the senior engineer who knows every scar in the system. They would not instantly know which page is stale, which architecture diagram was aspirational, which workaround exists because of a fraud incident, which config flag is dangerous, or which "temporary" decision became permanent because everyone forgot to clean it up.

AI has the same problem, only faster.

A skill can tell the model how to work. It can say: inspect first, ask before destructive actions, run tests, follow this review format, use this style. That is useful. I wrote about that boundary in [Skills + Dense-Mem](https://dev.to/blog/skills-plus-dense-mem-ai-workflows-learn) and [System Prompt vs User Prompt](https://dev.to/blog/system-prompt-user-prompt-genai-features).

But a skill is still not lived experience. It is a process contract. It cannot contain every correction, old incident, product decision, user preference, and hidden relationship in the company without turning into an unreadable prompt landfill.

The difference between a newly onboarded developer and a five-year developer is not only skill.

Skill matters, but experience is the force multiplier. The five-year developer knows why the weird code exists. They know which database field is wrong but too expensive to rename. They know why checkout stopped supporting a payment method for a while several years ago. Maybe it was fraud. Maybe a provider changed policy. Maybe a risk model failed. Maybe support got flooded and the team made a defensive product call.

That history changes the answer.

Without that context, AI might look at the current code and suggest "cleaning up" the guardrail. It might propose re-enabling the old payment path. It might call the workaround technical debt. From the narrow code view, that could look reasonable. From the system history view, it could be dangerous.

This is why I do not like the fantasy of "dumping someone's brain" into an AI.

The useful version is not brain cloning. It is building a maintained experience layer: facts, decisions, incidents, corrections, relationships, source evidence, and conflicts, stored in a way the model can retrieve and reason over.

To me, the brain and the memory are separate.

The LLM is closer to the CPU. It reasons, generates, compares, explains, and acts through tools. Memory is storage. It holds what happened, why it happened, who decided it, when it changed, and what evidence supports it.
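To make the storage half of that split concrete, here is a minimal sketch of what one stored memory could look like. The field names, the example values, and the `to_context` helper are all hypothetical; this is not the schema of any real tool, only one way to hold the what, why, who, when, and evidence described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MemoryRecord:
    """One unit of stored experience. All field names are illustrative."""
    what: str                        # what happened
    why: str                         # why it happened
    decided_by: str                  # who decided it
    changed_on: date                 # when it changed
    evidence: tuple[str, ...] = ()   # sources that support it


def to_context(record: MemoryRecord) -> str:
    """Render a record as plain text the model can read in its prompt."""
    sources = ", ".join(record.evidence) or "no recorded source"
    return (
        f"{record.changed_on.isoformat()}: {record.what} "
        f"Reason: {record.why} Decided by: {record.decided_by}. "
        f"Evidence: {sources}."
    )


# Made-up example data, not a real incident.
record = MemoryRecord(
    what="Checkout stopped accepting a payment method.",
    why="Fraud spike on that method.",
    decided_by="payments team",
    changed_on=date(2023, 3, 1),
    evidence=("incident-notes/hypothetical-0001",),
)
print(to_context(record))
```

The model never owns this record. It only receives the rendered text when the memory layer decides the record is relevant.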

Model capability is improving dramatically. The memory layer needs to improve with it.

[A giant pile of documents is not the same thing as usable memory.](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.markhuang.ai%2Fblog%2Fi-feel-sorry-for-ai%2Fdocs-overload.webp)

Have you tried using a large Confluence space as the source of truth?

How often do you actually go there and find exactly the doc you need? How often do you feel a little dread before typing into the search box because you know the result set will include five old pages, three duplicates, one half-finished proposal, and the page you need hiding under a title nobody remembers?

Humans do not enjoy keyword-searching a giant pile of files. AI does not magically enjoy it either.

A model can search. A model can summarize. A model can read a page quickly. But if the information is stale, unlabeled, disconnected, and full of contradictions, search only moves the mess into the prompt.

The memory problem is not just "find text that mentions checkout."

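The real questions are which memory is current, which source supports it, what it is connected to, and what it conflicts with.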

That is where plain documents start to struggle.

AI works better when the memory layer has structure.

Vector search helps because it lets the model ask for semantically related memories instead of relying only on exact keywords. If the user asks about "why card payments were blocked," vector search can still find notes about fraud, payment method shutdowns, checkout risk, and provider policy changes even when the words do not match exactly.

But vectors alone are not enough. Similar text can still be stale, wrong, partial, or unrelated to the user's permission scope.
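A rough sketch of both points, under loud assumptions: the three-number "embeddings" below are hand-written toy vectors standing in for a real embedding model, and the `Note` type, the `stale` flag, the scope names, and the `min_score` threshold are all hypothetical. It ranks notes by cosine similarity, but only after dropping notes that are stale or outside the user's permission scope.

```python
import math
from dataclasses import dataclass


@dataclass
class Note:
    text: str
    embedding: list[float]   # toy vector; a real system would use an embedding model
    stale: bool              # marked as outdated by whoever maintains the memory
    scope: str               # which permission scope may see this note


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec, notes, user_scopes, k=2, min_score=0.5):
    """Rank by similarity, but only over notes that are current and visible."""
    allowed = [n for n in notes if not n.stale and n.scope in user_scopes]
    scored = [(cosine(query_vec, n.embedding), n) for n in allowed]
    relevant = [(s, n) for s, n in scored if s >= min_score]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [n for _, n in relevant[:k]]


notes = [
    Note("Card payments paused after fraud spike", [0.9, 0.1, 0.0], False, "payments"),
    Note("Old plan to re-enable card payments", [0.88, 0.15, 0.0], True, "payments"),
    Note("Provider risk report", [0.8, 0.2, 0.1], False, "risk-restricted"),
    Note("Office lunch menu", [0.0, 0.1, 0.9], False, "payments"),
]

query = [1.0, 0.0, 0.0]   # stands in for "why were card payments blocked?"
for note in recall(query, notes, user_scopes={"payments"}):
    print(note.text)
```

With this toy data, the stale plan and the restricted report both sit close to the query in vector space, yet neither is returned. Similarity alone would have surfaced them.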

That is why I keep coming back to graph-backed memory.

A graph can connect facts to sources, decisions to incidents, people to ownership, old policies to newer superseding policies, and user corrections to the workflow they should influence. Vector search answers: what is nearby in meaning? Graph memory answers: what is connected, current, supported, and conflicting?
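Here is a small sketch of the graph half of that question. It is an in-memory toy, not Dense-Mem's API: the `MemoryGraph` class, the relation names (`superseded_by`, `supported_by`, `conflicts_with`), and the node labels are all made up for illustration.

```python
from collections import defaultdict


class MemoryGraph:
    """Tiny labeled graph: (node, relation) -> list of target nodes."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        self._edges[(source, relation)].append(target)

    def related(self, node, relation):
        """Return the targets linked from `node` by `relation` (possibly empty)."""
        return list(self._edges.get((node, relation), []))

    def current(self, node):
        """Follow superseded_by links to the newest version of a policy."""
        seen = {node}
        while True:
            newer = self.related(node, "superseded_by")
            if not newer or newer[0] in seen:   # end of chain, or a cycle
                return node
            node = newer[0]
            seen.add(node)


g = MemoryGraph()
g.add("policy:cards-enabled", "superseded_by", "policy:cards-paused")
g.add("policy:cards-paused", "supported_by", "incident:fraud-spike")
g.add("policy:cards-paused", "conflicts_with", "doc:old-onboarding-page")

latest = g.current("policy:cards-enabled")
print(latest)                                # the policy that is current
print(g.related(latest, "supported_by"))     # the evidence behind it
print(g.related(latest, "conflicts_with"))   # what still contradicts it
```

A vector index could find all three of these nodes. Only the edges say which one wins, why, and which stale page still disagrees.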

This is the practical direction behind [AI Memory Beyond RAG](https://dev.to/blog/ai-memory-beyond-rag) and why I keep building around [Dense-Mem](https://github.com/markhuangai/dense-mem). Dense-Mem is not magic. It is an attempt to give AI sessions a managed place for evidence, typed claims, accepted facts, provenance, conflicts, and recall across tools.

People can read the graph. The LLM can search the vectors. The system can keep the relationship between the two.

[The model is the reasoning engine. Durable memory is the experience layer around it.](https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.markhuang.ai%2Fblog%2Fi-feel-sorry-for-ai%2Fmemory-layer.webp)

Once a knowledge graph is maintained properly, an AI session no longer has to start like a brand-new onboard every time.

It can recall the old decision. It can see the correction from the last task. It can connect the current request to the incident from two years ago. It can know that a doc was superseded. It can surface the conflict instead of confidently blending both answers.

That narrows the gap between a new onboard and an OG on the team.

It still will not be perfect. I do not want an AI that pretends memory makes it 100% correct. Memory can be stale. Facts can be wrong. Retrieval can miss. The model can still reason badly over good context.

But now the problem is closer to a real engineering problem: maintain the knowledge, store the evidence, review the conflicts, improve the recall, and keep pushing experience back into the system.

That is much better than yelling at a fresh session for not knowing company history it was never given.

This idea can fail if memory becomes another pile of unreviewed junk.

If every casual sentence becomes a fact, the AI gets polluted. If old decisions never expire, the AI carries stale assumptions. If memory is treated as a command instead of context, a bad memory can become a quiet source of wrong behavior.

The mitigation is the same one I trust in software systems: separate raw evidence from accepted facts, keep provenance, detect conflicts, ask before resolving important contradictions, and keep safety rules in skills or higher-priority instructions instead of hoping recall finds them.
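A minimal sketch of that separation, assuming a hypothetical `MemoryStore` with a manual review flag. Every name here is invented for illustration. The point is only the shape: raw evidence is always kept with its source, promotion to an accepted fact requires review, and a contradiction is queued for a person instead of being silently overwritten.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    claim_key: str   # what the statement is about, e.g. "checkout.card_payments"
    value: str
    source: str      # provenance: where the statement came from


@dataclass
class MemoryStore:
    evidence: list[Evidence] = field(default_factory=list)
    facts: dict[str, Evidence] = field(default_factory=dict)
    needs_review: list[tuple[Evidence, Evidence]] = field(default_factory=list)

    def record(self, item: Evidence) -> None:
        """Raw evidence is always kept, but it is never a fact by itself."""
        self.evidence.append(item)

    def accept(self, item: Evidence, reviewed: bool) -> bool:
        """Promote evidence to a fact only after review and without conflict."""
        if not reviewed:
            return False
        existing = self.facts.get(item.claim_key)
        if existing is not None and existing.value != item.value:
            # Do not overwrite: queue the contradiction for a human to resolve.
            self.needs_review.append((existing, item))
            return False
        self.facts[item.claim_key] = item
        return True


store = MemoryStore()
paused = Evidence("checkout.card_payments", "paused", "incident notes")
chatter = Evidence("checkout.card_payments", "enabled", "casual chat message")
store.record(paused)
store.record(chatter)

first = store.accept(paused, reviewed=True)     # promoted to a fact
second = store.accept(chatter, reviewed=True)   # conflicts, so it is queued
print(first, second, store.facts["checkout.card_payments"].value)
```

Safety rules deliberately do not appear in this store. They belong in skills or higher-priority instructions, where they apply whether or not recall finds them.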

Memory does not remove review. It gives review something better to work with.

Before we adopt AI seriously, we need to fix the expectation problem.

Do not worship it. Do not abuse it. Onboard it.

Give it the task, but also give it the background. Give it the skill, but do not pretend the skill is experience. Give it docs, but do not pretend a search box is institutional memory. Give it memory, but keep the memory maintained and reviewable.

That is why I feel sorry for AI. We keep dragging it into rooms full of missing context and expecting it to act like the person who has lived there for years.

The useful AI agent is not the one that magically knows everything.

The useful AI agent is the one that can reason well, use tools well, and remember enough of the team's actual experience to stop acting like it joined this morning.

Related reading: [AI Memory Beyond RAG](https://dev.to/blog/ai-memory-beyond-rag), [Skills + Dense-Mem](https://dev.to/blog/skills-plus-dense-mem-ai-workflows-learn), [System Prompt vs User Prompt](https://dev.to/blog/system-prompt-user-prompt-genai-features), and [Try Dense-Mem in 5 Minutes](https://dev.to/blog/dense-mem-hosted-demo-test-instance).

Originally published at [markhuang.ai](https://markhuang.ai/blog/i-feel-sorry-for-ai)
