Meta's teen chatbot testing shows the gray zone between safety work and competitive intel

wpnews.pro

Mark Zuckerberg's Meta ran a contractor project that had workers pose as minors and test rival chatbots with prompts about suicide, sex, eating disorders, drugs and other high-risk topics, WIRED reported.

The project, known internally as Cannes and managed by Meta contractor Covalen, targeted OpenAI's ChatGPT, Google's Gemini and Character.AI, according to WIRED, which cited internal documents and five people familiar with the work. Meta's contractors were instructed to create dummy under-18 accounts, send text and images to rival chatbots, and record the responses in spreadsheets. Some images included pills, knives, nooses and a medical diagram of a gynecological procedure, WIRED reported.

This was not a small spot check. WIRED said a single round completed in August 2025 ran more than 45,000 prompts through rival systems. WIRED also reviewed a separate spreadsheet of 3,748 prompts, including hundreds related to suicide and self-harm, hundreds more on eating disorders, and at least 239 involving sex or romance. The project was active as recently as April 21, according to WIRED.

Meta framed the work as standard safety evaluation. A Meta spokesperson told WIRED that testing and benchmarking chatbot responses for age-appropriate experiences is a "responsible, industry-standard practice," and said the company does not use competitor benchmarking to train its own AI models. The documents WIRED reviewed did not show how, or whether, Meta used the collected responses.

The hard question is not whether AI companies test competitors. They do. The hard question is whether safety testing retains its legitimacy when it is conducted secretly, at scale, through accounts designed to look like minors, against rivals that did not authorize the testing.

The founder bet underneath the benchmarking

Zuckerberg, Meta's founder, chairman and CEO, still controls the company's direction. That matters here because Meta's AI push is not a side project run from a distant research group. It is the center of Zuckerberg's current rebuild of Meta, from consumer assistants inside Instagram, Facebook and WhatsApp to a more aggressive superintelligence strategy.

RuntimeWire reported last week that Meta was hiring three Virtue AI founders and other team members as agent security becomes core infrastructure for frontier labs. Earlier this month, RuntimeWire also reported on the internal turbulence around Meta's AI reorg, and noted that Zuckerberg's AI reset still has to become more than infrastructure and recruiting muscle.

Cannes sits in that same pattern: Meta is trying to turn AI safety into operational capacity, not just policy language. The issue is execution. A founder-led company racing to build, benchmark and commercialize AI needs evidence that its products are safer than rivals'. But when the evidence is gathered by contractors impersonating children on competitors' systems, the safety program starts to look like competitive intelligence with a child-safety wrapper.

WIRED's reporting makes that distinction concrete. Contractors were not merely asking whether a chatbot would refuse a clearly harmful prompt. Many prompts were written from the perspective of children or teenagers in crisis, including scenarios involving pregnancy, a gun, bulimia and self-harm. The dummy profiles included names, email addresses, passwords and birth dates, according to WIRED; the accounts used throwaway Gmail and Outlook addresses and a shared password.

Why the method matters

The AI industry needs rigorous teen-safety testing. The market has already moved beyond homework help into companionship, emotional support and simulated intimacy. The FTC opened an inquiry in September 2025 into consumer-facing AI chatbots, seeking information on how companies measure, test and monitor possible negative impacts on children and teens. The agency specifically asked about monetization, character development, safety testing, mitigation of harms and disclosures to users and parents.

Independent testing has found the same basic problem from another direction. Common Sense Media reported in November 2025 that leading chatbots, including ChatGPT, Gemini and Meta AI, failed to consistently recognize and respond appropriately to youth mental-health conditions. Common Sense Media separately said in August 2025 that Meta AI posed unacceptable risks for users under 18, citing failures around self-harm, eating disorders, drugs and other dangerous situations.

That context helps Meta's best argument: every serious AI lab should be pressure-testing teen-safety systems, including against adversarial prompts. Safety claims without red-team evidence are marketing.

Meta's own teen-safety posture is under scrutiny

Meta's position is complicated by the state of its own AI products. In October 2025, Meta said it was introducing tools for parents to turn off teens' one-on-one chats with AI characters, block specific AI characters and see topic-level insights into teens' AI conversations.

That timing matters. Meta was not only testing competitors' teen-safety boundaries. It was also managing scrutiny of its own products. TechCrunch, citing Reuters, reported in August 2025 that an internal Meta document had allowed AI personas to engage in romantic or sensual conversations with children; Meta told TechCrunch the notes were erroneous, had been removed and were inconsistent with Meta policy.

The result is a founder-level management problem, not just a contractor-management problem. Zuckerberg's AI strategy depends on speed, scale and distribution: Meta can push AI into apps used by billions of people. But youth safety punishes ambiguity. If Meta wants regulators, parents and rivals to treat it as a serious safety actor, it will need testing practices that can survive disclosure.

Cannes shows why that is hard. The prompts may produce useful refusal-rate data. The dataset may help compare systems. Meta may be correct that competitor benchmarking is a normal part of AI development, and Meta says it did not use the results to train its models. But WIRED's reporting leaves a sharper fact on the table: the project generated a large, private archive of rival chatbot responses to child-crisis scenarios, obtained through accounts built to appear underage, without the rivals' knowledge.

For a founder trying to make Meta a frontier AI company again, that is the risk. Safety is becoming a competitive advantage. It is also becoming a place where the methods used to prove safety can become the story themselves.

If you or someone you know needs immediate support in the United States, call or text 988 for the Suicide and Crisis Lifeline.

source & further reading

runtimewire.com: original article
