ATOD

mentions: 1 · type: Organization

// recent coverage 1 mentions

2026-06-29 04:00 · arxiv.org · large-language-models

ATOD: Annealed Turn-aware On-policy Distillation for Multi-turn Autonomous Agents

Researchers propose ATOD, a hybrid online distillation algorithm that combines on-policy distillation and reinforcement learning to train small language-model agents for multi-turn tasks. ATOD uses an…

// co-occurs with top 5 entities

OPD (1) · GRPO (1) · ALFWorld (1) · WebShop (1) · Search-QA (1)