{"slug": "language-models-imply-world-models", "title": "Language models imply world models", "summary": "Large language models like Claude necessarily contain a model of the world, not just language, according to a recent analysis of early machine translation research. Yehoshua Bar-Hillel argued in 1960 that accurate language translation requires a \"universal encyclopedia\" of world knowledge, a goal he deemed \"utterly chimerical\" at the time. Modern LLMs, trained on encyclopedic data like Wikipedia, now demonstrate this world understanding by correctly resolving ambiguous sentences such as \"the box was in the pen.\"", "body_md": "In\n[a recent article about John Haugeland's rejection of micro-worlds](https://blog.plover.com/tech/gpt/micro-worlds.html)\nI claimed:\n\nas a “Large Language Model”, Claude necessarily includes a model of the world in general\n\nNobody has objected to this remark, but I would like to expand on it.\nThe claim may or may not be true; it is an empirical question. But as a\ntheory it has been widely entertained since the very earliest days of\ndigital computers. [Yehoshua Bar-Hillel](https://en.wikipedia.org/wiki/Yehoshua_Bar%2dHillel), the first person to seriously\ninvestigate machine translation, came to this conclusion in the\n1950s. Here's an extract of Haugeland's discussion of his work:\n\nIn 1951 Yehoshua Bar-Hillel became the first person to earn a living from work on machine translation. Nine years later he was the first to point out the fatal flaw in the whole enterprise, and therefore to abandon it. Bar-Hillel proposed a simple test sentence:\n\nThe box was in the pen.\n\nAnd, for discussion, he considered only the ambiguity: (1) pen = a writing instrument; versus (2) pen = a child's play enclosure. Extraordinary circumstances aside (they only make the problem harder), any normal English speaker will instantly choose \"playpen\" as the right reading. How? By understanding the sentence and exercising a little common sense. 
As anybody knows, if one physical object is in another, then the latter must be the larger; fountain pens tend to be much smaller than boxes, whereas playpens are plenty big.\n\nWhy not encode these facts (and others like them) right into the system? Bar-Hillel observes:\n\nWhat such a suggestion amounts to, if taken seriously, is the requirement that a translation machine should not only be supplied with a dictionary but also with a universal encyclopedia. This is surely utterly chimerical and hardly deserves any further discussion. (1960, p. 160)\n\n(*Artificial Intelligence: The Very Idea*; John Haugeland; p.174–176.)\n\nBar-Hillel says, and I agree, that an accurate model of language requires an accurate model of the world. In 1960, this appeared “utterly chimerical”. Perhaps so, but here we are, and 66 years later we have what most agree is a language model capable of producing intelligible text complex enough to fool sophisticated readers. Even people who call the LLM a “stochastic word garbage spewer” and object when it is called “AI” seem to have no qualms about the term “large language model”.\n\nThe Winograd SHRDLU project was an attempt to construct a world whose model was small enough to not be utterly chimerical. This worked, but as Haugeland says, it wasn't actually interesting.\n\n[Doug Lenat](https://en.wikipedia.org/wiki/Doug_Lenat) tried for forty years to construct\n[an explicit world model](https://en.wikipedia.org/wiki/Cyc). When he died in 2023 I think\npeople still considered the project to have been utterly chimerical.\n\nI may be misrepresenting Bar-Hillel's views. He didn't actually say\n“world model”, he said “universal encyclopedia”. Lenat's project's\nname, “Cyc”, acknowledges this: “Cyc” is short for “encyclopedia”. And\nuntil recently a universal encyclopedia did seem, to many, to be\nutterly chimerical. (Imperial China produced\n[many interesting attempts](https://en.wikipedia.org/wiki/Leishu).) 
But we do now *in fact*\nhave [a universal encyclopedia](https://www.wikipedia.org/), and\nClaude has ingested that universal encyclopedia.\n\nBar-Hillel thought that the “box is in the pen” example proved that machine translation would be at the very least extremely difficult: it requires understanding of the usual relative sizes of boxes and pens, and that the possibility of such understanding in general “hardly deserves any further discussion”.\n\nI asked Claude:\n\nConsider the sentence “the box was in the pen”. What does “pen” mean here?\n\nClaude said:\n\nMost likely a playpen or animal pen (enclosure), since a box wouldn't fit inside a writing pen.\n\nPresumably Bar-Hillel's article and Haugeland's book were in Claude's training data, and perhaps it somehow absorbed the right thing to say here without actually knowing anything about boxes or pens. I tried the next similar example that came to mind:\n\nHow about “suit” in “what suit was the king”?\n\nPlaying card suit - hearts, diamonds, clubs, or spades.\n\nWhy?\n\nThe context \"what suit was the king\" is the standard phrasing for identifying a playing card. A king is one of the face cards in each suit, so asking about its suit is a natural card game question.\n\nOkay, that's just what I would have said.\n\nWhy couldn't it be a human king wearing a suit?\n\nIt could be - I jumped to the card interpretation because \"what suit was the king\" uses \"suit\" as a predicate noun, which is uncommon phrasing for clothing. You'd more naturally say \"what suit was the king wearing\" or \"what kind of suit did the king have.\" But without more context, the clothing interpretation is perfectly plausible.\n\nI think Claude is wrong about the grammar here. I'm not sure what “predicate noun” means and I suspect Claude is using the term incorrectly. 
I can't understand “what suit was the king?” as grammatical in any plausible construction, only in highly contrived situations such as an evil wizard transforming the king into a suit of armor and hiding him in the armory.\n\nBut *if* one agrees to take it to mean “what suit does the king have” I\nagree that this is a perfectly plausible interpretation even though\nthere isn't anything clearly clothing-related nearby, and if one\nassumes it meant “what suit was the king wearing” then the clothing\ninterpretation is unavoidable.\n\nCould it be a lawsuit?\n\nYes - \"what suit was the king\" could mean what lawsuit the king was involved in. That reading makes grammatical sense, though it would be more natural with \"in\" (\"what suit was the king in\"). Context would be needed to distinguish between cards, clothing, and lawsuits.\n\nAgain I disagree that this makes grammatical sense. Without “in” I\nthink a native listener would be at least puzzled. (“Do you mean ‘what\nsuit was the king *involved in*?’?”)\n\nCould it be a misspelling for “soot”?\n\nYes, that's plausible - \"what soot was the king\" could be asking what type of soot the king was covered in or exposed to. Though it's an odd question without specific context like a fire or industrial setting.\n\nIn my opinion, it is not grammatically plausible.\n\nHowever, I think focusing on the *grammatical* errors would be missing\nthe point. The “soot” interpretation is unlikely; I think the bad\ngrammar rules it out, and Claude's response does not seem to\nrecognize this. But Claude *does* express understanding that\nthe interpretation is unlikely without some surrounding soot-related\ncontext, and a reasonable idea of what that context would need to be\nlike.\n\nWhat if it were “what color is the king”?\n\nIf it's \"what color is the king,\" then the card suit interpretation becomes much less likely (suits aren't typically described as colors, though red/black could work). 
It's more likely asking about a physical object - a chess piece, a game token, a statue, or similar - where the king's color is relevant.\n\nI had been thinking of “color” as referring to the king's skin color, but I agree that the chess piece or game token interpretation is more plausible. Again Claude seems to understand something about how we talk about things. If the king were a playing card, we might be asking if it is a red king or a black king. But without that context it's now more likely that the king is a chess piece than a playing card.\n\nWhat if the sentence was known to have been translated from German?\n\nThat would strengthen the card suit interpretation significantly. German card suits (Herz, Karo, Pik, Kreuz) are often called \"Farbe\" (color) in German card game terminology, so \"what color is the king\" could be a literal translation of a question about which suit a king card belongs to.\n\nThat's just what I would have said.\n\nCould “suit” have been an accidental misspelling of “age”?\n\nOn a standard keyboard, that seems unlikely - 'a', 'g', 'e' are nowhere near 's', 'u', 'i', 't'. Autocorrect or speech-to-text could produce it, but not a typical typing error.\n\nI thought Claude might botch this. There is no grammatical or semantic reason why “what age was the king” doesn't work, so the only reason to reject this is orthographic. And Claude's grasp of spelling still seems quite weak. The other day it claimed that “‘Warner’ is likely either a typo for ‘Cham’ or …” which couldn't be wronger. I told it at the time that it is impossible that \"Cham\" could be a typo for “Warner”, and if pressed to explain why, I would have said something like what Claude said here.\n\nInteresting as this all is, it is a digression. 
My main points, again:\n\nIt is at least plausible that coherent speech requires a model of a large fraction of the world, and, while it may yet turn out to be false, this theory has been seen as plausible for generations.\n\nWhatever else Claude can or can't do, it can certainly speak coherently.\n\nTherefore Claude probably does have something like a model of a substantial part of the world.\n\nIn 1960 this appeared completely impossible.\n\nBut here we are.\n\nI disagreed with Claude that “what suit was the king” made grammatical sense. Rik Signes has pointed out that it is certainly grammatical, because the grammar is the same as “what person was the king” or “what visitor was the king”. My discomfort with it is not grammatical, it is pragmatic.", "url": "https://wpnews.pro/news/language-models-imply-world-models", "canonical_source": "https://blog.plover.com/2026/02/12#micro-worlds-2", "published_at": "2026-06-05 14:17:35.941190+00:00", "updated_at": "2026-06-05 14:17:39.184970+00:00", "lang": "en", "topics": ["large-language-models", "natural-language-processing", "artificial-intelligence", "ai-research"], "entities": ["John Haugeland", "Yehoshua Bar-Hillel", "Claude"], "alternates": {"html": "https://wpnews.pro/news/language-models-imply-world-models", "markdown": "https://wpnews.pro/news/language-models-imply-world-models.md", "text": "https://wpnews.pro/news/language-models-imply-world-models.txt", "jsonld": "https://wpnews.pro/news/language-models-imply-world-models.jsonld"}}