02:44
2026-06-05
arxiv.org
machine-learning
OPRD: On-Policy Representation Distillation
Researchers have introduced On-Policy Representation Distillation (OPRD), a method that aligns student and teacher model representations across selected layers during training, bypassing the language โฆ