OpenAI Help: Lockdown Mode

OpenAI introduced Lockdown Mode to limit outbound network requests that could exfiltrate sensitive data during the final stage of a prompt injection attack. The feature does not prevent prompt injections from appearing in the content ChatGPT processes. Instead it targets the exfiltration leg of the "Lethal Trifecta": the combination of access to private data, exposure to untrusted content, and a channel for sending stolen data back to an attacker. By restricting that leg with deterministic, non-AI mechanisms, Lockdown Mode aims to close off the attack without making LLM systems far less useful.

"Lockdown Mode is designed to help prevent the final stage of data exfiltration from a prompt injection attack by limiting outbound network requests that could transfer sensitive data to an attacker. Lockdown Mode does not prevent prompt injections from appearing in the content ChatGPT processes. For example, a prompt injection could appear in cached web content or in an uploaded file, and could still affect the behavior or accuracy of a response."

This looks really good to me. The Lethal Trifecta https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/ occurs when an LLM system has all three of the following: access to private data, exposure to untrusted content, and a way to steal data and transmit it back to the attacker.

The only way to solve the trifecta is to cut off one of the three legs, and by far the easiest leg to restrict without making your LLM systems far less useful is the exfiltration vector. It looks to me like Lockdown Mode directly attacks that leg, using mechanisms that are deterministic and, crucially, are not evaluated by AI systems that can themselves be subverted by sufficiently devious attacks.
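
To make "deterministic, not evaluated by AI" concrete, here is a minimal sketch of an outbound-request gate that decides by plain string comparison against a fixed allowlist. It is not OpenAI's implementation, whose details are not described here. The names `ALLOWED_HOSTS` and `is_request_allowed` and the example hosts are assumptions made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only; not taken from OpenAI.
ALLOWED_HOSTS = frozenset({"example.com", "docs.example.com"})


def is_request_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist.

    The decision uses no model and no heuristics, so text injected into a
    prompt cannot talk the check into a different answer.
    """
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased; None when the URL has no host
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # Exact match only: "example.com.attacker.test" must not pass.
    return host is not None and host in allowed_hosts
```

A caller would run this check before every outbound fetch and refuse the request when it returns False. An exact-host allowlist is chosen over substring or suffix matching because lookalike hosts and userinfo tricks such as `https://example.com@attacker.test/` resolve to the attacker's host and fail the comparison. Note that an allowlist of this kind only narrows the exfiltration channel: data could still leak to any allowed host that an attacker can read from.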