DiPOD

Mentions: 1 · Type: Organization

// recent coverage: 1 mention

2026-06-15 04:00 · arxiv.org · machine-learning

Diffusion Policy Optimization without Drifting Apart

Researchers identified the double-drift phenomenon causing instability in diffusion policy-gradient methods and proposed DiPOD, a framework that interleaves self-distillation with policy-improving gra…