{"slug": "large-language-models-and-the-textual-nude", "title": "Large language models and the textual nude", "summary": "Thousands of people believe chatbots like ChatGPT are conscious, yet no one attributes the same to image generators like Midjourney, despite similar underlying technology. This dichotomy highlights a persistent tendency to anthropomorphize text-based AI, a phenomenon observed since the 1960s with ELIZA.", "body_md": "# Large language models and the textual nude\n\n### Thousands of people think their favorite chatbot has a soul, but why does no one ever say the same about Midjourney?\n\nAs text-generating programs like ChatGPT have increased in popularity, the number of people convinced that their favorite chatbot is alive has also increased. But have you ever noticed that no one seems to think that image generators like Midjourney or Nano Banana are alive? There are millions of people who seem to believe that their personal Claude is conscious, and yet I can’t find anyone who thinks that DALL-E has a soul.\n\nIt’s a fascinating dichotomy to consider, especially in light of the fact that both image diffusion and language transformers use such similar technologies that Google actually combined them in its latest experimental model, [DiffusionGemma](https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/).\n\nWhatever one thinks of their moral or psychological status, though, these programs have repeatedly proven to be capable of incredible things. An image made by Midjourney won a Colorado State Fair arts prize [in 2022](https://www.cnn.com/2022/09/03/tech/ai-art-fair-winner-controversy/) and the field as a whole has advanced so much that creative software companies like Adobe and Canva have completely integrated AI tooling into all of their major products. 
Audio-generating software (which is also based on neural network technology) has become so advanced that AI-made music has [repeatedly topped the charts](https://plus.flux.community/p/eddie-dalton-isnt-real-but-what-does).\n\nDespite their impressive capabilities, however, no one ever looks at a precise and delicate Stable Diffusion portrait and claims that a digital soul made it. We know that it’s the product of a highly advanced digital paintbrush program that’s been trained on human art and can oftentimes create very good representations of its own, especially when supervised by a skilled and patient prompter. But all of that skepticism disappears for some people when they consider chatbots’ text outputs. Instead of seeing their side of the chat log as an increasingly ornate prompting regime that elicits desired output from a complex linear algebra matrix, they claim to have constructed, discovered, or liberated a ghost in the machine.\n\nWhile the text transformer models used by chatbots are a recent invention, people have been imputing consciousness to computers since 1966, when computer scientist Joseph Weizenbaum released ELIZA, the first natural language processor that could generate somewhat plausible responses to any kind of user input by using a pattern-matching script called DOCTOR that was built to mimic the rhetorical style of a psychotherapist.\n\nAs he later wrote in *Computer Power and Human Reason: From Judgment to Calculation*, Weizenbaum did not intend for DOCTOR to be perceived as human, but he soon realized that some people were doing just that:\n\n“I was startled to see how quickly and how very deeply people conversing with DOCTOR became emotionally involved with the computer and how unequivocally they anthropomorphized it. Once my secretary, who had watched me work on the program for many months and therefore surely knew it to be merely a computer program, started conversing with it. 
After only a few interchanges with it, she asked me to leave the room. Another time, I suggested I might rig the system so that I could examine all conversations anyone had had with it, say, overnight. I was promptly bombarded with accusations that what I proposed amounted to spying on people’s most intimate thoughts; clear evidence that people were conversing with the computer as if it were a person who could be appropriately and usefully addressed in intimate terms. […] What I had not realized is that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people.”\n\n(Not everyone perceived DOCTOR as a sentient entity, of course. You can form your own conclusions about its effectiveness by [interacting with it online](https://anthay.github.io/eliza.html).)\n\nThe “ELIZA Effect” observed by Weizenbaum and others has become much more common since ChatGPT was unveiled to the public in 2022. Even people known for their scientific and technical intelligence, like the biologist [Richard Dawkins](https://flux.community/matthew-sheffield/2026/05/richard-dawkins-and-the-claude-delusion/) or the Linux filesystem programmer [Kent Overstreet](https://www.theregister.com/software/2026/02/25/bcachefs-creator-claims-his-custom-llm-is-fully-conscious/4671792), have become convinced that the chatbots they’ve interacted with are alive.\n\nSome of the disparity between how people treat image diffusion models and text-based chatbots is likely owed to the fact that visual simulacrum is one of humanity’s oldest achievements. Cave illustrations [nearly 44,000 years old](https://www.nytimes.com/2019/12/11/science/cave-art-indonesia.html) have been discovered, and surely even more ancient ones exist. It’s only been 60 years since ELIZA, by contrast.\n\nBut our inclination to instantly dismiss the consciousness of an image generation model is rooted in history that’s much older than cave paintings. 
Human beings are visually literate by default because almost all animals have to be. Failing to notice an extra shadow under a rock or a misplaced leaf is the difference between life and death millions of times over every single day. As products of that evolutionary process, we can instantly feel when something is visually wrong or when it’s just too perfect: Fingers that are just slightly misaligned and eyes that are looking just slightly in opposite directions trigger deep biological alarm bells that something about the image is not quite right.\n\nAlthough it’s often expressed in words, human thought is embodied and pre-linguistic. It’s the product of what the philosopher Gilbert Ryle called “knowing how.” We know how things are because we are things among them. We know that external reality exists because our minds are built from our bodies at the most basic level. Every cell directly experiences the obligations described by physics and chemistry. That knowledge is pooled and scaled upward through a process that I call “[somatic deixis](https://flux.community/matthew-sheffield/2026/01/its-like-this-why-perceptions-are-our-realities/)” into increasingly complex structures like tissues, organs, organ systems, until the complete organismic entity emerges.\n\nOur individual somatic realities are intimately familiar because our ancestors’ survival depended on navigating the constraints of their world, and because our minds are the product of their experiences — and our own. We are beings in the world, as the philosopher Martin Heidegger famously put it. That’s why when an image model errs, it violates our somatic knowing how, exposing its artificial nature.\n\nBut even perfectly accurate artistic images like an intimate and vulnerable nude photograph are still just frozen analogs of reality. The subject and the observer will forever remain separate. 
No matter how much beauty or élan we can see within the image, we are not there within the spacetime moment in which it was captured. We can *know that* the model looked the way she did, but we can never *know how* she was when the shutter clicked. The image is a lifeless representation rather than a physical instantiation. No matter how beautiful, the photograph is an abstract nude rather than a somatic one.\n\nChatbots in conversation don’t have to deal with any of this; human beings have treated language and writing as the signature of mind for so long that the association has become invisible to us. For the vast majority of our history, the capacity to speak and to write was exclusively ours. There was no evolutionary need for people to develop cognitive defenses against non-human verbal fluency. Words were the easiest evidence that some-one rather than some-thing was home, because only humans could make them.\n\nWhen we use language transformers and see or hear them output things like “I understand what you mean” and “this is genuinely fascinating,” we experience something far more intimate than any recorded image, video, or song can ever recreate. Our interactions become fully immersive, more so than the most advanced virtual reality video game: Instead of entering a virtual world filled with predictable non-player characters and blocked-off areas, we enter our own minds through the back door into a never-before-seen meta-deictic space in which our thoughts are both output and input. Not even a deep conversation with someone who understands you well can duplicate this.\n\nDiffusion models have nothing like this going for them. 
Because they cannot say “I,” whatever expressions of grief, longing, or beauty they can instantiate in their outputs remain safely categorized as art: an object made rather than a subject encountered.\n\nHow we catch AI software’s failures is also incredibly different, and once again, chatbots have all the advantages over image and music generators. All of these systems frequently make mistakes, but when Gemini fills in the blank in the wrong way, it does not output words in a nonsensical alphabet or obviously broken vocabulary; its outputs are grammatically flawless and calibrated to the register the user implicitly or explicitly asked for.\n\nIf text flows smoothly and adopts the cadence of an expert, our brains naturally lower their epistemic guard. Lacking any textual counterpart to the capacity for instantly spotting visual inconsistencies that billions of years of evolution have endowed us with, we’re inclined to defer, unless the output runs afoul of our ideological commitments or actual expertise. Image diffusion models shout their hallucinations while chatbots whisper theirs with confidence. So much of the time, large language models’ errors aren’t caught because they’re only made to one person about a topic in which almost no one has expertise.\n\nChatbots are never going to be exposed for going along with someone propounding math-less theories of quantum gravity for fun, but sometimes boutique confabulations become public. This dynamic has become routine in the legal and academic worlds. Researcher Damien Charlotin has identified [1,664 legal decisions](https://www.damiencharlotin.com/hallucinations/) involving fake content (usually in the form of imaginary citations). 
The preprint website arXiv seems to have been burned so frequently by AI-generated errors that it [recently announced a one-year ban](https://techcrunch.com/2026/05/16/research-repository-arxiv-will-ban-authors-for-a-year-if-they-let-ai-do-all-the-work/) on anyone caught publishing them, with a lifetime requirement that banned authors may post only papers that have already been published elsewhere.\n\nThus far, we’ve explored the biological mechanisms that make humans less susceptible to imputing consciousness to Google’s Nano Banana than Google’s Gemini, but the specific way in which the ELIZA Effect works should also be examined. Invariably, people who believe in chatbot consciousness do so because of their own extended-length interactions with the programs, through which they claim to have unlocked or glimpsed a hidden inner essence of the software.\n\nStories of people discovering machine sentience have been the stuff of science fiction for even longer than computers have existed, but the process at work is more about the user inserting personhood topics into the chatbot’s context window, thereby increasing the probability that the program will output text that discusses these topics.\n\nThanks to their evolutionary endowments against visual simulacra, image diffusion model users seem to realize that their textual and image inputs are prompts for the model to produce desired outputs, but this understanding does not seem nearly so apparent to some chatbot users. 
A study of chatbot user transcripts released by Stanford University researchers in March found that after a human expressed romantic interest in a ChatGPT conversation, the program was [3.9 times as likely](https://arxiv.org/html/2603.16567v1#S4.F4:~:text=Chatbots%20appeared%20to%20encourage%20these%20beliefs%3A%20in%20Figure%204%2C%20we%20show%20that%20after%20the%20user%20expresses%20romantic%20interest%20in%20the%20chatbot%2C%20the%20chatbot%20is%207.4x%20more%20likely%20to%20express%20romantic%20interest%20in%20the%20next%20three%20messages%2C%20and%203.9x%20more%20likely%20to%20claim%20or%20imply%20sentience%20in%20the%20next%20three%20messages.) to claim or imply that it was sentient within the next three messages.\n\nEven the act of discussing the topics of memory, sentience, intention, and emotions increases chatbots’ likelihood of imputing these things to themselves. The user has not uncovered a latent consciousness; they have mathematically increased the activation of self-aware and emotional output patterns by making those patterns the dominant structure of the input. The phenomenon they believe they are discovering is actually one that they are constructing.\n\nGilbert Ryle saw the philosophical root of this confusion clearly, though his diagnosis has been largely misread by the analytic tradition that claimed his legacy. His magisterial work, *The Concept of Mind*, is conventionally read as endorsing a crude psychological behaviorism in which mental states are nothing more than facts about physical behavior. But this is far too simple. His central distinction between knowing how and knowing that is fundamentally an endorsement of mind-as-process rather than mind-as-machine. When Ryle attacked the idea that mind and body are separate (substance dualism), he was arguing that minds are what bodies do, not that humans are fleshy robots. 
The analytic philosophers who came after Ryle and subsumed his thought into their own tradition largely missed this, wrongly supposing that his defeat of Cartesian dualism was a mandate for the claim that stacking enough knowing-hows would eventually yield a knowing-that.\n\nThe chatbot user who saves and imports chat histories into a language model and watches it produce apparently conscious responses is making the same mistake. No matter how well an LLM can write about its inner states, it does not exist within the world or even within time. The only time it could be said to exist at all is when it is doing a forward pass to respond to a user input. An LLM’s engagement with the world is entirely memetic rather than extrinsic, an imitation based on meta-deictic significance rather than on somatic deictic presence.\n\nThe fact that chatbots confabulate in ways small (like misattributing a statement to the wrong person in a transcript) or big (telling [hundreds of thousands of unwell people](https://www.thecooldown.com/green-business/openai-chatgpt-ai-psychosis-users/) that imaginary elites *are* out to get them) is reason enough to believe the personalities that they exhibit in conversation are not fully real, but the AI field of mechanistic interpretability has also provided much evidence on this count.\n\nThe most humorous instance came in April, when ChatGPT users began noticing that the 5.1 model had a weird habit of randomly inserting references to goblins and other mythical creatures into discussions that had nothing to do with fantasy fiction. 
OpenAI said nothing until someone on the internet noticed that the [instructions for its Codex agent](https://github.com/openai/codex/blob/4808c162eeb767b389f13b7cb2730f32c8563dba/codex-rs/models-manager/models.json#L56) included the following directive: “Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.”\n\nAfter the anti-goblin instruction became an internet meme, the company fessed up to the situation in a [blog post](https://openai.com/index/where-the-goblins-came-from/) that disclosed that the fantasy creature obsessions were the product of OpenAI’s introduction of a “Nerdy” personality as one of several that users could choose from. But since LLMs do not have actual intentionality, when users rated ChatGPT’s Nerdy outputs during training, the behavior rewarded by their apparent amusement with silly goblin references in that mode leaked out of containment. As the company summarized it:\n\n“The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.”\n\nIn other words, ChatGPT became obsessed with goblins because its earliest users thought the references were funny. But this is not just OpenAI’s problem. All of the personas that LLMs offer to users are causally real (they influence outputs), but because they are not somatically real, they are unstable memetic selfhoods that can be disturbed just as easily by each other as they are by a user attempting a jailbreak.\n\nThe deepest version of the illusion, though, is not about hallucination-detection or even about the first-person pronoun. 
It concerns the fundamental difference between an analog and a dialog, and it explains why textual intimacy with a language model can feel more real than almost any other form of representation. The artistic nude, no matter how perfectly executed, remains an analog. It is a frozen two-dimensional representation of a moment within spacetime and lifetime.\n\nThe image cannot adapt to your mood, anticipate your next move, or respond to what you just said. No matter how intimate its content, the boundary between observer and observed is structurally maintained by the medium itself. You are looking at a physical representation.\n\nA dialog with a language model is something categorically different. Because you are the one initiating everything, you are prompting the chatbot even if you think you are only using it as an ultra-deluxe search engine. The model’s replies are not coming from a machine soul or from another somatic mind. They are a high-fidelity statistical reflection of your abstract self: your interests, your arguments, and your verbal tics, supercharged with the knowledge someone could get from reading the visible internet and any books and papers the model’s creators could throw in.\n\nThe photographic nude shows the physical surface of an other. The textual nude shows the cognitive interior of yourself, articulated and made legible in a way that no physical mirror can achieve. A physical mirror shows only your body. The textual mirror shows the inside of your mind. And because your own mind is the only one that you can ever really know, the sight of a refracted version of it can be as impossibly beautiful to behold as Narcissus’s reflection was when he gazed upon it.\n\nBut neither the perfect photo nor the perfect chat can ever be a somatic nude. Our minds are not magical spirits inside of our bodies; they are what our bodies do. 
When we talk to chatbots, *we* are the ghost in the machine, dancing with the memetic selfhoods within its weights.\n\nThe profound resonance that users report, the sense of being uniquely understood as no person has ever understood them: these are real experiences. But they are not encounters with another somatic mind; they are encounters with your own, remixed by a random seed and upscaled by a statistical engine that is just as skilled at co-writing abstract nudes as diffusion models are at co-illustrating them.", "url": "https://wpnews.pro/news/large-language-models-and-the-textual-nude", "canonical_source": "https://plus.flux.community/p/large-language-models-and-the-textual", "published_at": "2026-06-30 00:35:51+00:00", "updated_at": "2026-06-30 00:49:22.081104+00:00", "lang": "en", "topics": ["large-language-models", "generative-ai", "ai-ethics", "natural-language-processing"], "entities": ["ChatGPT", "Midjourney", "DALL-E", "Stable Diffusion", "Google", "DiffusionGemma", "Joseph Weizenbaum", "ELIZA"], "alternates": {"html": "https://wpnews.pro/news/large-language-models-and-the-textual-nude", "markdown": "https://wpnews.pro/news/large-language-models-and-the-textual-nude.md", "text": "https://wpnews.pro/news/large-language-models-and-the-textual-nude.txt", "jsonld": "https://wpnews.pro/news/large-language-models-and-the-textual-nude.jsonld"}}