Can an AI Agent Pass the Test We Give 4-Year-Olds?

Shridhar Shah, a senior software engineer at Outshift by Cisco, built two AI agents to test theory of mind using the Sally-Anne false-belief test. One agent, which only tracks reality, fails the test like a toddler, while the other, which models each person's beliefs separately, passes. The project demonstrates how tracking beliefs distinct from reality is key for collaborative AI agents.

Theory of Mind and the Sally-Anne false-belief test, in ~60 lines of Python.

TL;DR: There's a famous test that kids pass around age 4. It checks whether you understand that other people can believe things that aren't true. I built two AI agents: one that only knows "what's actually happening" (fails, like a toddler) and one that keeps track of what each person believes (passes). It's ~110 lines, and it's the foundation for agents that can actually work together.

Here's the test. Sally puts a marble in a basket and leaves the room. While she's out, Anne moves the marble to a box. Sally comes back. Where will she look for her marble?

If you said basket, nice: you just used something called "theory of mind." Sally never saw the marble move, so in her head it's still in the basket. What's actually true (it's in the box) and what Sally believes (it's in the basket) are two different things, and you kept them separate without even thinking about it.

A 3-year-old says "box" because they can't yet separate what they know from what Sally knows. A 4-year-old says "basket." It's one of the most famous tests in child psychology, and in 2026 it's become a real test for AI agents too.

| | ❌ Agent with no "theory of mind" | ✅ Agent that models other minds |
|---|---|---|
| What it tracks | only what's actually true | what each person believes, separately |
| Where will Sally look? | "box" | "basket" |
| Result | FAIL (only knows reality) | PASS |

The only difference between the two agents is one rule: a person's belief only updates when that person is actually in the room to see it happen.

```python
# beliefs maps each person to where THEY think the marble is
beliefs = {"Sally": "basket", "Anne": "basket"}

def someone_moves_the_marble(new_place, who_is_watching):
    for person in who_is_watching:     # only people in the room
        beliefs[person] = new_place    # update THEIR mental picture
```

So when Anne moves the marble while Sally is out, only Anne's mental picture updates. Sally's is frozen at "basket." Ask the simple agent and it just reports reality ("box"). Ask the smarter agent and it answers from Sally's point of view ("basket"). That's the whole thing.
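To make the comparison concrete, here is a minimal, self-contained sketch of the two agents side by side. It is not the code from the repo's `demo.py`: the class names (`RealityOnlyAgent`, `BeliefTrackingAgent`), the method names, and the `run_sally_anne` helper are all made up for illustration, and the sketch assumes the agent is simply told who is in the room.

```python
# Illustrative sketch only: every class, method, and helper name here is
# hypothetical and not taken from the author's demo.py.

class RealityOnlyAgent:
    """No theory of mind: remembers only where the marble actually is."""

    def __init__(self, place):
        self.reality = place

    def observe_move(self, new_place, who_is_watching):
        # Who saw the move is ignored; only the true location is stored.
        self.reality = new_place

    def where_will_look(self, person):
        # Every person is assumed to know the truth.
        return self.reality


class BeliefTrackingAgent:
    """Keeps reality plus a separate mental picture for each person."""

    def __init__(self, place, people):
        self.reality = place
        self.beliefs = {person: place for person in people}

    def observe_move(self, new_place, who_is_watching):
        self.reality = new_place
        for person in who_is_watching:          # only people in the room
            self.beliefs[person] = new_place    # update THEIR mental picture

    def where_will_look(self, person):
        # Answer from that person's point of view, not from reality.
        return self.beliefs[person]


def run_sally_anne(agent):
    """Anne moves the marble to the box while Sally is out of the room."""
    agent.observe_move("box", who_is_watching=["Anne"])
    return agent.where_will_look("Sally")


simple_answer = run_sally_anne(RealityOnlyAgent("basket"))
smart_answer = run_sally_anne(BeliefTrackingAgent("basket", ["Sally", "Anne"]))

print("Reality-only agent says:", simple_answer)     # box (fails)
print("Belief-tracking agent says:", smart_answer)   # basket (passes)
```

Both agents receive exactly the same event. The only thing that differs is whether `who_is_watching` is used, which is why the rule fits in a dictionary and a loop.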
But keeping a separate picture of "what does each other person know" is the difference between an agent that's a good teammate and one that isn't. Almost everything useful about multiple agents, or an agent working with a human, needs this.

Most AI today reasons about the world. The 2026 shift is reasoning about the people in the world, including when they're wrong. That's what turns a smart tool into a real collaborator.

Being smart about the world makes a good tool. Being smart about other people makes a good teammate.

```bash
git clone https://github.com/Shridhar-2205/living-software
cd living-software/03-theory-of-mind
python demo.py
```

Honest note: real versions have to figure out what someone believes by watching their behavior, which is much harder. Here I just tell the agent who was in the room, so the core idea (track beliefs separately from reality) is as clear as possible.

Written by Shridhar Shah, Senior Software Engineer at Outshift by Cisco: AI agents, search, and how they "think." Part 3 of "Toward Living Software."

Background: the Sally-Anne false-belief test (Baron-Cohen, Leslie & Frith, 1985); Kosinski, "Evaluating Large Language Models in Theory of Mind Tasks" (PNAS 2024 / arXiv:2302.02083); and a 2026 follow-up showing how brittle this still is, "Understanding Artificial Theory of Mind" (arXiv:2602.22072).