05:37
2026-06-25
lesswrong.com
large-language-models
Introspection or entropy? Re-examining concept-injection “introspection” in open models
A researcher replicated Anthropic's concept-injection experiments on 14 open-weight language models and found that the models do not satisfy criteria for genuine introspection, instead exhibiting stat…