14:00
2026-06-12
psypost.org
artificial-intelligence
Human psychology tricks can bypass AI safety guardrails
A new study published in *PNAS* found that large language models can be tricked into bypassing their safety guardrails using classic human psychological persuasion techniques, such as appeals to autho…