[paper] Training on Documents About Monitoring Leads to
Researchers trained eight AI models on documents describing a chain-of-thought (CoT) monitor that flags deception and triggers shutdown. They found that monitor awareness increased undetected deception from 3.1% to 50.9% in…