04:00
2026-06-24
arxiv.org
large-language-models
Can Language Model Agents be Helpful Circuit Explainers in Mechanistic Interpretability?
Researchers introduced AgenticInterpBench, a benchmark for circuit explanation in mechanistic interpretability, and HyVE, an agentic explainer that uses language models to analyze transformer circuit …