grep -l @sdpg /news/*.json | wc -l → 1

@SDPG

mentions: 1 | type: Organization | feed: RSS
04:00
2026-06-04
arxiv.org
machine-learning

Self-Distilled Policy Gradient

Researchers introduced SDPG, a self-distilled policy-gradient framework that combines group-relative verifier advantages with normalized standard deviation and full-vocabulary on-policy self-distillat…
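
The summary above mentions group-relative advantages normalized by a standard deviation. As a rough illustration only, the sketch below shows the common GRPO-style form of that idea: each response's verifier reward is centered on its group mean and divided by the group standard deviation. The function name, the `eps` term, and the example rewards are assumptions made for illustration; the summary does not say how SDPG itself defines this step.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Center rewards on the group mean and scale by the group std.

    Hypothetical helper: `rewards` holds verifier scores for several
    responses sampled for the same prompt. `eps` (an assumption) guards
    against division by zero when every response gets the same score.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up example: four sampled responses, scored 1.0 (pass) or 0.0 (fail).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Passing responses receive a positive advantage and failing ones a negative advantage, and a group in which every response earns the same reward yields all-zero advantages.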
