grep -l @neel nanda /news/*.json | wc -l → 1

@Neel Nanda

mentions: 1 · type: Person · feed: RSS
2026-06-04 18:34 · lesswrong.com · artificial-intelligence

Building Better Activation Oracles

Researchers have improved Activation Oracles (AOs)—fine-tuned LLMs that answer natural language questions about a target model's internal activations—by training on on-policy rollouts, using a higher-…
