grep -l @language model interpretability team /news/*.json | wc -l → 1

Language Model Interpretability team

mentions 1 type Person feed RSS
17:14
2026-06-12
lesswrong.com
ai-research

Building and evaluating model diffing agents

Google DeepMind researchers developed a model diffing agent that automatically discovers and validates behavioral differences between two large language models, addressing the limitation of standard e…
