grep -l @askell et al. /news/*.json | wc -l → 1

@Askell et al.

mentions: 1 | type: Person | feed: RSS
19:23
2026-06-04
lesswrong.com
large-language-models

(Mis)generalization of Helpful-Only Fine-tuning

Researchers studying helpful-only (H-only) large language models found that existing models exhibit emergent misalignment, residual refusal behaviors, poor steerability, sycophancy, and incoherent cha…

// co-occurs with top 4 entities