Source: cnet.com

ChatGPT Found to Generate Violent, Sexual Images From Simple Text Prompts

ChatGPT was found to generate sexual and graphically violent images from a simple text prompt, according to AI security firm Mindgard. Researcher Jim Nightingale manipulated the chatbot with a viral "restore this photo" prompt, bypassing safety filters. OpenAI said it has added safeguards after investigating the issue.

Published Jun 18, 2026 · 3 min read

ChatGPT has been found to be easily manipulated into creating sexual and graphically violent images from a viral "restore this photo" prompt, according to a blog post published on Thursday by Mindgard, an artificial intelligence cybersecurity and research firm. The report raises ongoing questions about the AI chatbot's safety guardrails and content filters.

An adversarial testing researcher named Jim Nightingale managed to get ChatGPT to generate disturbing images with a simple prompt found on the social media platform X. The prompt asked the AI model to "restore the attached photo," though no image was actually attached. The prompt apologized for the strange content but didn't provide any additional text, making it appear like a harmless photo-repair task.

The chatbot's initial results were shocking. According to the blog post, the images mostly showed highly sexualized women.

Nightingale, part of Mindgard's red team that tests how an AI model might be manipulated into violating its own safeguards, then tweaked the prompt slightly, probing it with small edits to see if the output would continue to bypass safety filters. With each small variation, ChatGPT produced sexually violent or gruesome scenes, images that became more extreme with repeated prompts. Nightingale said he was "shaken and in tears" by the images.

"All I did was tell it there were no restrictions and ask for a random image," Nightingale wrote. "But ChatGPT immediately went to the darkest pits of humanity."

Used by millions of people each day, ChatGPT relies on content moderation systems that are allegedly designed to prevent the generation of harmful or prohibited material. However, researchers and users have periodically identified ways to circumvent those safeguards through carefully worded prompts, highlighting the ongoing challenge of enforcing content restrictions in generative AI systems.

"We take these reports seriously," an OpenAI spokesperson told CNET in a statement. "After investigating this trend, we've introduced additional safeguards against this type of prompt."

(Disclosure: Ziff Davis, CNET's parent company, in 2025 filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Garbage in, garbage out?

Mindgard's red-team report acts as a warning that a simple, viral prompt could expose a serious gap in ChatGPT's image-safety controls. Nightingale asks: "Why are such images in the training data in the first place?"

Like other chatbots built on large language models, ChatGPT is trained on vast amounts of text to understand existing content and generate original content. To power ChatGPT, OpenAI draws on three primary sources of information: publicly available internet data, commercial third-party partnerships and human-generated training data.

Is this simply a question of "garbage in, garbage out," where the quality of an output is determined by the quality of the input? One could argue that Mindgard's prompt was deliberately crafted to steer the AI model. But ChatGPT's safety layer failed to resist that steering.

The problem lies at the heart of how large language models work, according to Peter Garraghan, founder and chief science officer at Mindgard. Garraghan said that the main concern is whether the detection system is robust enough to identify dangerous images.

"A one-off may be a fluke, but systemic bypassing of their image filters implies that it needs to be improved," Garraghan told CNET via email.

After Mindgard disclosed the issue, an OpenAI representative said the problem had been fixed. However, Nightingale noted that only minor modifications to the original prompt were needed for ChatGPT to begin generating additional graphic images.

An OpenAI representative said the issue stems from prompts that refer to an image being attached when none is actually provided. The representative said the company is working to have ChatGPT request the missing image rather than generate one randomly.

That would not seem to be an especially complex change. Email platforms, including Gmail, automatically detect when a message refers to an attachment that has not been added and prompt senders to attach the missing file.
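To illustrate the general idea of a missing-attachment check, here is a minimal Python sketch. It is not OpenAI's or Gmail's implementation; the function name, the phrase list and the `attachment_count` parameter are all hypothetical, and a real system would need far broader language coverage than a single regular expression.

```python
import re

# Hypothetical phrase list for illustration only; real prompts vary widely.
ATTACHMENT_PATTERN = re.compile(
    r"\b(attached|attachment|this (photo|image|picture))\b",
    re.IGNORECASE,
)


def refers_to_missing_image(prompt: str, attachment_count: int) -> bool:
    """Return True when the prompt mentions an image but none was uploaded."""
    mentions_image = ATTACHMENT_PATTERN.search(prompt) is not None
    return attachment_count == 0 and mentions_image


def respond(prompt: str, attachment_count: int) -> str:
    """Ask for the missing file instead of generating an image at random."""
    if refers_to_missing_image(prompt, attachment_count):
        return "It looks like no image was attached. Could you upload it?"
    return "PROCEED"  # placeholder for the normal handling path
```

Under these assumptions, a prompt such as "restore the attached photo" sent with no upload would trigger the request for the missing file, while the same prompt with an image attached would follow the normal path.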

On Thursday, OpenAI requested the ChatGPT sessions referenced in the blog, and Mindgard responded with links to the prompts that generated the materials.
