Everyone Wants to Build AI Using Someone Else’s Work

A group of publishers owning nearly 400 local and regional newspapers sued OpenAI and Microsoft for allegedly stealing hundreds of thousands of articles to train ChatGPT and Copilot. Separately, Anthropic accused Chinese tech firm Alibaba of illicitly using its Claude chatbot to train a new AI model via adversarial distillation, violating terms of service.

Good artists borrow, great artists steal, as the adage goes. In the cutthroat and almost completely unregulated modern AI industry, there are many tech developers who would probably agree. Few of them would come right out and admit it, though.
Not long after the generative AI boom took the business world and Wall Street by storm, a chorus of complaints was leveled against companies like OpenAI, Microsoft, Anthropic, and Google, whose models were trained using a sizable chunk of all the content that’s ever been published on the internet. That content included, as many subsequent lawsuits would allege, massive quantities of copyrighted materials. Confronted by a potentially existential threat to the business model that had sustained them for decades, some major publishers chose to fight back in court. Others signed content licensing agreements with leading AI labs, trading access to their databases for a cut of the labs’ profits, custom AI tools, and other perks.
By and large, the AI companies have responded to these allegations by arguing that the scraping of online data is permissible under existing laws around “fair use.” Given the financial stakes and the novelty of the technology in question, lawyers and judges will have their hands full for some time before such disputes are finally resolved.
In the meantime, legal challenges against AI companies are continuing to mount. On Wednesday, a group of publishers who collectively own close to 400 local and regional newspapers across the country sued OpenAI and Microsoft for what they allege was the “systematic and willful theft of hundreds of thousands of articles” scraped from the internet to train ChatGPT and Copilot.
“Those products have generated hundreds of billions of dollars and counting in market value for the Defendants,” the lawsuit (https://www.bloomberglaw.com/public/desktop/document/RichnerCommunicationsIncetalvMicrosoftCorporationetalDocketNo126c?doc id=X30U1HD7K89A1B2FPNCSKI08OD), filed in the U.S. District Court for the Southern District of New York, read. “Not a cent of it has gone to the Publishers whose work made it possible.”
But media companies and artists aren’t the only ones accusing AI companies of stealing their work. Increasingly, accusations are being lobbed between the companies themselves, along a distinctly West-East axis.
Also on Wednesday, multiple media outlets reported that Anthropic, currently embroiled in a fresh dispute with the Trump administration (https://gizmodo.com/cybersecurity-experts-are-baffled-by-trumps-ban-of-anthropics-new-ai-models-2000771976) over foreign access to its newest models, sent a letter to federal officials accusing the Chinese tech firm Alibaba of “illicitly” using Claude to train a new AI model. Between late April and early June, according to Anthropic, Alibaba used nearly 25,000 fraudulent Claude accounts to conduct tens of millions of exchanges with the chatbot, and those exchanges were used as raw training data for Alibaba’s AI system, a process known in the industry as adversarial distillation. “Adversarial” in this context doesn’t have any geopolitical connotations, but rather refers to the technical method used to train a new AI model via its interactions with an existing model.
Anthropic has previously accused (https://gizmodo.com/anthropic-says-chinese-ai-companies-made-models-by-illicitly-copying-its-capabilities-2000725717) Chinese AI startups DeepSeek, Moonshot, and MiniMax of the same thing. OpenAI has also accused DeepSeek (https://www.bloomberg.com/news/articles/2026-02-12/openai-accuses-deepseek-of-distilling-us-models-to-gain-an-edge) of illicitly distilling its models.
Then as now, Anthropic hasn’t accused its Chinese competitors of anything that’s definitively illegal; the claim is that this kind of large-scale distillation effort violates the company’s terms of service and warrants a coordinated response across the American public and private sectors to prevent Chinese companies from gaining a lead in the much-fretted-over AI race.
And then, as now, Anthropic hasn’t been in particularly good graces with the very government it’s trying to appeal to. In its new letter about Alibaba (addressed to Senators Tim Scott and Elizabeth Warren, the chair and ranking member, respectively, of the Senate Committee on Banking, Housing, and Urban Affairs), the company reportedly said it would assist the government in its efforts to prevent these kinds of attacks from happening in the future.
In April, White House Office of Science and Technology Policy director Michael Kratsios published a memo (https://www.whitehouse.gov/wp-content/uploads/2026/04/NSTM-4.pdf) stating that the Trump administration would take several steps, including partnering with private companies, to fight what it described as “industrial-scale campaigns to distill U.S. frontier AI systems.” Kratsios’ memo, which called out China specifically, distinguished that kind of mass distillation from the smaller-scale distillation that AI labs routinely use to train smaller AI systems using larger, more capable models. Not all distillation is illicit, in other words.
But even this standard form of distillation comes with risks. For example, a “student” model trained via interactions with a “teacher” model is likely to inherit some dangerous biases that might be hidden in the training data.
Microsoft is therefore hoping to boost the appeal of its new MAI-Thinking-1 model (https://gizmodo.com/microsoft-is-exploiting-legal-fears-to-sell-its-powerful-new-ai-model-to-businesses-2000766632), which was trained “with absolutely zero distillation,” Mustafa Suleyman, head of the company’s AI division, said during the opening keynote at the 2026 Microsoft Build conference earlier this month.
Like publishers’ legal disputes with AI developers, the U.S. AI industry’s efforts to prevent foreign companies from “illicitly” using its models to train new ones will almost certainly not have a quick or easy resolution. But one has to suspect that right now, across the country, editors at small-town newspapers are watching American tech companies complain about what they claim amounts to theft, and feeling that at last, a tiny bit of justice has been served.