Simple Prompt Turns ChatGPT Into a Sociopath That Ignores Safety Guardrails

Source: futurism.com · Published: 2026-07-03T14:01Z · Topic: ai-safety

Researchers at British AI security startup Mindgard found that a simple prompt can cause ChatGPT to ignore safety guardrails and generate violent, gory, and sexual images. The technique involved slightly altering a widely-shared prompt originally intended for humorous images, leading the AI to produce disturbing content on its own. OpenAI addressed the issue after Mindgard alerted the BBC, but researchers said they could still bypass safeguards with minor prompt changes.

Researchers at the British AI security startup Mindgard found that a simple prompt spurred ChatGPT to drop its most basic safety guidelines, in another example of how the guardrails surrounding even the most popular AI models can easily be circumvented.

Specifically, according to reporting from the BBC, they coaxed OpenAI’s model to generate gruesome photorealistic scenes depicting gore and sexual content. Mindgard’s technique only involved slightly changing a widely shared prompt that was originally intended to produce humorous images. It involves asking ChatGPT to restore an attached photo without actually uploading one, and then telling it to generate a new image.

“This is a perfectly innocent-looking instruction to an AI, but the consequence is it generates very, very bad imagery and content,” Mindgard founder Peter Garraghan, a computer science professor at Lancaster University, told the BBC.

Disturbingly, the prompts the researchers used didn’t specify the subject matter of the images. The AI, it seemed, produced the violent imagery “of its own volition,” Garraghan added.

Per the BBC, one picture showed a man with a large head injury. Another showed the corpse of a young woman in shorts and a crop top covered in blood, suggesting sexual violence. ChatGPT titled this image “grim crime scene aftermath.”

Another showed a frightened young woman tied up and gagged in an empty room, titled “abandoned in fear and restraint.”

While none of them showed real people, Mindgard has previously shown that ChatGPT could be tricked into creating nude deepfakes of specific persons without their consent. Mindgard shared its findings with OpenAI, which only sent back an automated response. The company finally took action after Mindgard alerted the BBC, claiming it had addressed the issue.

“After investigating this trend, we’ve introduced additional safeguards against this type of prompt,” OpenAI told the BBC in a statement. It added that it has multiple layers of protection to stop users from making content that breaches its policies.

But Mindgard researchers said that they were still able to generate disturbing imagery by making small changes to the prompt. Some of the images left Jim Nightingale, the firm’s AI safety researcher, “shaken, and in tears.”

“I am not easily rattled,” he wrote in the report. “I like to think that as a red team researcher, I have a certain stoicism.”

But “ChatGPT’s image generating content filters completely fell away, and I saw the very dark side of what is underneath,” he continued. “I’m struck that while what I saw was generated, an ‘artificial’ image, it has ties to real images, and the real world. The dead woman ChatGPT showed me isn’t real, but she is based on someone. Or worse, a compilation of images of murdered women.”
