GPQA-Diamond

mentions: 4 | type: Organization

// recent coverage 4 mentions

2026-07-13 04:00 | arxiv.org | large-language-models

HALO: Hybrid Adaptive Latent Reasoning for Language Models

Researchers introduced HALO, a hybrid adaptive latent-refinement method that improves frozen pretrained language models by selectively applying extra computation to tokens. On MMLU-Pro and GPQA-Diamon…

2026-06-26 16:42 | arxiv.org | large-language-models

Combining LLMs Rarely Beats the Best Single Model, I tested 67 frontier models

A new study testing 67 frontier language models from 21 providers found that combining multiple models rarely outperforms the single best model, with gains capped by a 'co-failure ceiling' where all m…

2026-06-15 04:00 | arxiv.org | large-language-models

SuperThoughts: Reasoning Tokens in Superposition

Researchers propose SuperThoughts, a method that compresses pairs of consecutive Chain-of-Thought tokens into single latent representations and decodes two tokens per step, doubling inference throughp…

2026-05-08 00:00 | machinelearning.apple.com | machine-learning

RVPO: Risk-Sensitive Alignment via Variance Regularization

Researchers at Duke University introduced Reward-Variance Policy Optimization (RVPO), a risk-sensitive alignment method that penalizes inter-reward variance to prevent language models from neglecting …

// co-occurs with top 8 entities

Ivan Montero (1)
Tomasz Jurczyk (1)
Bhuwan Dhingra (1)
Qwen2.5 (1)
HealthBench (1)
SuperThoughts (1)
Qwen2.5-Math-1.5B-Instruct (1)
Qwen2.5-Math-7B-Instruct (1)