04:00
2026-06-17
arxiv.org
ai-safety
Rift: A Conflict Signature for Deception in Language Models
Researchers at arXiv have identified a conflict signature in language models that distinguishes deceptive outputs from honest errors, achieving 100% accuracy in detecting lies across multiple models iโฆ