04:00
2026-06-12
arxiv.org
large-language-models
Quantifying Subliminal Behavioral Transfer Ratios in Language Model Distillation
A new study quantifies the rate at which undesirable behaviors transfer from teacher to student language models during distillation, a phenomenon known as subliminal learning. Researchers steered Llam…