AI Companies Are Learning an Ironic Lesson as the People They Pay to Improve Their Chatbots Are Just Feeding AI Slop Into Them

AI companies paying workers to generate fresh training data are discovering that many contractors are using other AI chatbots to create that data, a practice known as AI cannibalism that can destabilize large language models. Workers say low pay and poor contracts drive them to cut corners, making it difficult for companies to stop the behavior.

For tech companies racing to be the king of the AI hill, there are few things more precious than raw, original data. To keep the large language models underlying our favorite AI chatbots up to date, tech companies have to feed them reams of fresh inputs. As one study found https://ourworldindata.org/scaling-up-ai , the amount of data being used to train AI has doubled every nine months since 2010, exponential growth that may soon hit a wall as stores of clean data run critically low https://www.forbes.com/sites/joemckendrick/2026/04/14/ai-may-be-running-out-of-data-stanford-report-warns/ .

With little original content left to pilfer, companies have started paying workers to generate fresh training data, offering them low-quality contracts https://www.somo.nl/big-tech-sets-unfair-terms-and-conditions-for-ai-data-workers-globally/ to train AI in hyper-specific tasks like running weekly payroll for Broadway musicians https://www.businessinsider.com/chatgpt-openai-training-niche-occupations-specialized-llm-handshake-data-labeling-2026-4 . Others have been hired to film themselves doing degrading or menial chores like folding laundry https://ktla.com/morning-news/gig-workers-are-getting-paid-to-train-robots/ or distinctly adult activities https://futurism.com/artificial-intelligence/ai-company-cranking-hog .

Predictably, this growing workforce behind the AI boom has started cutting corners en masse, turning to other AI chatbots to supply the data meant to feed AI chatbots. Talking to New Scientist https://www.newscientist.com/article/2531050-people-training-new-ai-models-admit-they-just-get-chatbots-to-do-it/ , numerous insiders said this practice of AI cannibalism, which experts have long warned can destabilize LLMs https://futurism.com/the-byte/ai-trained-with-ai-generated-data-gibberish , is shockingly commonplace.

“It’s very widespread,” a worker identified as Alice told NewSci.
“Every company I’ve worked for has had explicit guidelines around it and they clearly do try to catch people out, so I think they do care. But I don’t think they can stop it.”

In other words, AI companies are learning an ironic lesson: after purloining everybody else’s https://futurism.com/the-byte/facebook-trained-ai-pirated-books content https://futurism.com/anthropic-shredded-millions-of-physical-books without permission to create a product that threatens employment across the economy, the new precariat they’ve created are using the same tech to do the few human tasks they still need in as lazy a fashion as possible.

Though workers have to be careful not to be too obvious, Alice says it isn’t hard to pass AI-generated data off as her own, provided she scrubs the obnoxious linguistic tics https://futurism.com/chatgpt-weird-way-talking-see-it-everywhere of chatbots like ChatGPT before she submits it.

“It’s only the sloppiest of users that get caught,” the AI contractor told NewSci. “Anyone with a modicum of awareness around AI hallmarks can tell their output not to use them, and at that point what are you going to do?”

“If these companies want quality data, then they should offer quality contracts,” Alice continued. “Instead they’re low-balling struggling people, employing them for the barest possible amount of time and tossing them aside as projects are finished with no warning.”

Other contractors told NewSci they use LLMs in order to avoid making mistakes and losing their gig entirely.

“I was terrified of not having an income source, and then after that, it just became easier to run everything through LLMs,” one explained. “For a lot of the projects that I do now, it’s creating scenarios, so I will use one LLM to help me create the scenario and then I’ll use a different LLM to help me create the files that go along with the scenario.
I do feel guilty but like I said, in the beginning it was more about trying to make sure I wasn’t making any errors.”

Whatever the reason, it’s clear workers aren’t above feeding AI companies a taste of their own slop, a situation that could have drastic consequences https://futurism.com/training-data-ai-misinformation-compromised for the AI race as a whole.