02:01
2026-06-23
arxiv.org
large-language-models
VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO
Researchers developed VibeThinker-3B, a 3-billion-parameter language model that achieves reasoning performance matching or exceeding models orders of magnitude larger, scoring 94.3 on AIME26 and 80.2 …