RIFT

Mentions: 1 | Type: Organization

// recent coverage: 1 mention

2026-06-24 04:00 | arxiv.org | machine-learning

Weight-Space Geometry of Offline Reasoning Training

Researchers compared six offline reinforcement-learning methods for distilling reasoning from large language models into smaller ones, finding that SFT, RFT, and RIFT produce nearly identical weight u…

// co-occurs with top 7 entities

Qwen3-4B (1), GSM8K (1), AIME26 (1), SFT (1), RFT (1), DFT (1), DPO (1)