IFEval

Mentions: 1 | Type: Organization

// recent coverage 1 mentions

2026-06-23 02:01 | arxiv.org | large-language-models

VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

Researchers developed VibeThinker-3B, a 3-billion-parameter language model that achieves reasoning performance matching or exceeding models orders of magnitude larger, scoring 94.3 on AIME26 and 80.2 …

// co-occurs with top 7 entities

VibeThinker-3B (1), DeepSeek V3.2 (1), GLM-5 (1), Gemini 3 Pro (1), AIME26 (1), LiveCodeBench v6 (1), LeetCode (1)