SuperThoughts: Reasoning Tokens in Superposition



Researchers propose SuperThoughts, a method that compresses pairs of consecutive Chain-of-Thought (CoT) tokens into single latent representations and decodes two tokens per step, which doubles inference throughput. With a confidence-based fallback to standard decoding, it shortens CoT by roughly 20-30% at a cost of 1-2 accuracy points on most tasks. The authors fine-tune three Qwen2.5-Math Instruct models and evaluate on MATH500, AMC, OlympiadBench, and GPQA-Diamond.

1 min read · published Jun 15, 2026

arXiv:2606.13862v1
Abstract: Long Chain-of-Thought (CoT) reasoning improves LLM problem-solving but is computationally expensive due to sequential token generation. While recent works explore reasoning in continuous latent spaces to bypass discrete token generation, they often struggle with training stability and fail to scale to complex, long-horizon tasks due to a lack of supervision signal. We propose SuperThoughts, which compresses pairs of consecutive CoT tokens into single latent representations and decodes two tokens per step via a lightweight Multi-Token Prediction (MTP) module. This preserves discrete token supervision at training time while doubling throughput at inference time. We finetune Qwen2.5-Math-1.5B-Instruct, Qwen2.5-Math-7B-Instruct, and Qwen2.5-Math-14B-Instruct, and evaluate on MATH500, AMC, OlympiadBench, and GPQA-Diamond. With a confidence-based adaptive mechanism that falls back to standard decoding when uncertain, SuperThoughts achieves a roughly 20-30% CoT length reduction while maintaining accuracy with minimal degradation (a 1-2 point accuracy drop on most tasks).
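
The two ideas in the abstract, pairing consecutive tokens into one representation and falling back to single-token decoding when confidence is low, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how pairs are merged or how confidence is computed, so the mean-of-embeddings merge, the `propose` callback, the 0.8 `threshold`, and every function name below are assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: given the tokens emitted so far, return the next
# token, a guess for the token after it, and a confidence in [0, 1] for that
# second guess. A real system would get these from the model and its MTP head.
PairProposer = Callable[[Sequence[str]], Tuple[str, str, float]]


def compress_pairs(embeddings: List[List[float]]) -> List[List[float]]:
    """Merge consecutive embedding pairs into one vector each.

    Averaging is an assumption; the paper's compression operator may differ.
    """
    merged = []
    for i in range(0, len(embeddings) - 1, 2):
        a, b = embeddings[i], embeddings[i + 1]
        merged.append([(x + y) / 2.0 for x, y in zip(a, b)])
    if len(embeddings) % 2 == 1:
        merged.append(list(embeddings[-1]))  # trailing odd token kept as-is
    return merged


def adaptive_decode(
    propose: PairProposer,
    max_tokens: int,
    threshold: float = 0.8,  # assumed cutoff, not taken from the paper
    eos: str = "<eos>",
) -> Tuple[List[str], int]:
    """Emit two tokens per step when confident, otherwise only one."""
    tokens: List[str] = []
    steps = 0
    while len(tokens) < max_tokens:
        first, second, confidence = propose(tokens)
        steps += 1
        tokens.append(first)
        if first == eos:
            break
        # Fallback to standard decoding: drop the second token when unsure.
        if confidence >= threshold and len(tokens) < max_tokens:
            tokens.append(second)
            if second == eos:
                break
    return tokens, steps


def make_toy_proposer(target: List[str], confidences: List[float]) -> PairProposer:
    """Stand-in for a model: replays a fixed sequence with preset confidences."""
    def propose(prefix: Sequence[str]) -> Tuple[str, str, float]:
        i = len(prefix)
        second = target[i + 1] if i + 1 < len(target) else "<eos>"
        return target[i], second, confidences[i]
    return propose


target = ["2", "+", "3", "=", "5", "<eos>"]
confidences = [0.9, 0.1, 0.3, 0.95, 0.2, 0.5]
tokens, steps = adaptive_decode(make_toy_proposer(target, confidences), len(target))
print(f"{len(tokens)} tokens in {steps} decoding steps")
```

In this toy run the second and third steps differ: one falls back to a single token because its confidence is below the threshold, the other emits a pair. How the real method counts "CoT length" and where it sets the fallback cutoff are details only the paper itself can settle.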

source & further reading


Read original on arxiv.org → arxiv.org/abs/2606.13862



