# AI Agent Failure Modes Beyond Hallucination

> Source: <https://dev.to/maximsaplin/ai-agent-failure-modes-beyond-hallucination-208g>
> Published: 2026-05-22 14:59:09+00:00

AI can make mistakes, models hallucinate, models make stuff up: those are well-known complaints. Yet they are of little practical use when it comes to agentic engineering. What does knowing that models make mistakes leave you with, other than distrusting every output or expecting every line to be double-checked, which kills all the productivity?
I do use agentic tools a lot, and I am fascinated by how much they have improved over the past half year. At the same time, I am often pissed off by how badly many large tasks drift from common sense and the spirit of the task.
Lately, while reading plenty of material about AI agents, I have been paying more attention to the failure modes people call out. Often they resonate with me heavily. It is gold when someone distills a pattern into a short label for a trait of models or AI agents, such as "jaggedness." This sort of knowledge helps you build your own intuition about AI agent capabilities and reasonable ways to shape your work around agents. It helps set healthy expectations without buying into the oversold dark factories and other made-up BS claims about AI capabilities around us.
Below is my attempt to categorize and outline the failure modes called out in a few blog posts and conference talks that align with my observations.
Two related problems do not quite belong in the failure-mode table, but they explain why the whole thing gets so tiring so fast.
First, generation outruns review. Mario's "slow the f.ck down" is not just a mood; it is an operational constraint. Once agents can produce code, tests, issues, and PRs faster than humans can read them, the bottleneck moves from typing to judgment. A review agent catches some issues, but it does not restore ownership. If nobody reads the code, nobody knows what is critical, and when users start screaming there is no human understanding left in the room.
Second, the same dynamic leaks outside your repo. AI issues, AI PRs, synthetic comments, generated docs, generic posts: some of them can be useful, but the channel fills with plausible text faster than people can sort it. That is the wider AI slop problem. The cognitive residue is fatigue, cynicism, AI brainrot, and eventually all-caps prompts begging the machine to stop being cute and do the actual job.
This is why "slow down" is not nostalgia or moral scolding. It is a practical rule: keep generated work inside reviewable bounds, use agents where verification is cheap, and preserve enough human understanding to say no.
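To make "reviewable bounds" a bit more concrete, here is a minimal sketch of a gate you could run on agent-generated changes before a human looks at them. Everything in it is my assumption for illustration: the `ChangeSet` shape, the `review_verdict` name, and the two thresholds are hypothetical and do not come from Mario's talk or from any particular tool. Passing tests are used only as a crude proxy for "verification is cheap".

```python
from dataclasses import dataclass

# Hypothetical thresholds: tune them to what your reviewers can actually read.
MAX_CHANGED_LINES = 400
MAX_CHANGED_FILES = 10


@dataclass
class ChangeSet:
    """Summary of one agent-generated change, for example a PR."""
    changed_lines: int
    changed_files: int
    has_passing_tests: bool  # crude proxy for "verification is cheap"


def review_verdict(change: ChangeSet) -> str:
    """Return 'review', 'split', or 'reject' for a generated change."""
    # No cheap verification available: do not spend human attention on it yet.
    if not change.has_passing_tests:
        return "reject"
    # Too big to read properly: ask the agent to break it into smaller pieces.
    if (change.changed_lines > MAX_CHANGED_LINES
            or change.changed_files > MAX_CHANGED_FILES):
        return "split"
    # Small and verified: a human can realistically read and own this.
    return "review"


if __name__ == "__main__":
    print(review_verdict(ChangeSet(120, 3, True)))
    print(review_verdict(ChangeSet(2500, 40, True)))
    print(review_verdict(ChangeSet(50, 1, False)))
```

The exact numbers matter less than the fact that a limit exists: the gate only routes work to a human reader, it does not replace one.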
Mario, the creator of Pi Agent, uses the word "f.ck" too often in his talk. I find myself in a similar position with all caps and lots of F.CK in my prompts. I guess that is the AI fatigue from too many AI outputs manifesting :)