AC-ODM

mentions: 1, type: Organization

// recent coverage: 1 mention

2026-07-01 05:00, dev.to, machine-learning

RL-driven data mixing boosts evaluation scores

AC-ODM, a reinforcement-learning-driven data scheduler, improves MMLU performance by 27.5% (relative) and HumanEval pass@1 by 2.23× on a Pythia-1B model, with only a 0.4% per-step wall-clock increase and 2…

// co-occurs with top 3 entities

Pythia-1B: 1, MMLU: 1, HumanEval: 1