grep -l @pnas /news/*.json | wc -l → 1

PNAS

mentions: 1 · type: Organization · feed: RSS
14:00
2026-06-12
psypost.org
artificial-intelligence

Human psychology tricks can bypass AI safety guardrails

A new study published in *PNAS* found that large language models can be tricked into bypassing their safety guardrails using classic human psychological persuasion techniques, such as appeals to autho…