
Efficient and Trainable Language Model Test-Time Scaling via Local Branch Routing

Researchers introduced Local Branch Routing (LBR), a token-level test-time scaling framework that expands a small local lookahead tree and uses a lightweight router to select the best branch, enabling efficient and trainable language model reasoning. LBR improves Pass@1 and Pass@32 on mathematical reasoning benchmarks over chain-of-thought and other baselines, suggesting a new efficient approach to test-time scaling.

Published Jun 25, 2026

arXiv:2606.25354v1

Abstract: Test-time scaling improves language-model reasoning, but existing approaches often face a difficult trade-off: long chain-of-thought sampling remains single-threaded, while sentence- or solution-level search can be computationally expensive and hard to train end-to-end. We introduce Local Branch Routing (LBR), a token-level test-time scaling framework that expands a small local lookahead tree, forwards all sampled branches through the language model, and uses a lightweight router to select the depth-1 subtree to commit. By routing over the hidden states of candidate local futures, LBR allows each token decision to use evidence beyond the root next-token distribution while avoiding full solution-level search. The resulting prune-shift-grow decoding process preserves discrete branch identities and defines a tractable tree-trajectory likelihood: newly grown nodes are counted when first sampled, and router decisions are assigned explicit probabilities. This enables end-to-end reinforcement learning with verifiable rewards, jointly optimizing the base model and router under the same likelihood-ratio principle as discrete-token RLVR. On synthetic hierarchical-planning tasks, LBR shows that post-candidate hidden states provide useful routing evidence. On mathematical reasoning benchmarks, LBR improves both Pass@1 and Pass@32 over discrete chain-of-thought, vanilla discrete-token RLVR, and RL-compatible soft-token branching baselines. These results suggest that lightweight local branching offers an efficient, trainable, and discrete form of language-model test-time scaling.
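The prune-shift-grow loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_lm` is a hypothetical stand-in for a real language model, the router is assumed to be a single weight vector applied to mean-pooled hidden states, and the tree depth, width, and all names are illustrative choices. It only shows how a log-likelihood might accumulate when each newly sampled node is counted once and each router decision contributes an explicit probability.

```python
import math
import random

VOCAB = 5    # toy vocabulary size (assumption)
HIDDEN = 4   # toy hidden-state width (assumption)

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def toy_lm(prefix):
    """Hypothetical stand-in for an LM forward pass on a token prefix.
    Returns (next-token probabilities, hidden state), deterministic per prefix."""
    rng = random.Random(repr(tuple(prefix)))
    logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB)]
    hidden = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN)]
    return softmax(logits), hidden

def sample(probs, rng):
    """Sample an index; return it together with its log-probability."""
    idx = rng.choices(range(len(probs)), weights=probs)[0]
    return idx, math.log(probs[idx])

class Node:
    def __init__(self, prefix):
        self.prefix = prefix                       # tokens from start to this node
        self.probs, self.hidden = toy_lm(prefix)   # "forward" this branch
        self.children = []

def grow(node, depth, width, rng):
    """Extend the tree below `node` to `depth` levels.
    Only nodes sampled in this call add to the returned log-probability.
    Duplicate tokens are kept as distinct branches in this sketch."""
    if depth == 0:
        return 0.0
    logp = 0.0
    if not node.children:
        for _ in range(width):
            tok, lp = sample(node.probs, rng)
            node.children.append(Node(node.prefix + [tok]))
            logp += lp
    return logp + sum(grow(c, depth - 1, width, rng) for c in node.children)

def subtree_hiddens(node):
    out = [node.hidden]
    for child in node.children:
        out.extend(subtree_hiddens(child))
    return out

def route(root, weights, rng):
    """Score each depth-1 subtree from its mean hidden state, then sample one."""
    scores = []
    for child in root.children:
        hs = subtree_hiddens(child)
        mean = [sum(h[i] for h in hs) / len(hs) for i in range(HIDDEN)]
        scores.append(sum(w * m for w, m in zip(weights, mean)))
    return sample(softmax(scores), rng)

def decode(prompt, steps, depth=2, width=2, seed=0):
    rng = random.Random(seed)
    weights = [0.5] * HIDDEN                   # hypothetical router parameters
    root = Node(list(prompt))
    logp = grow(root, depth, width, rng)       # initial lookahead tree
    for _ in range(steps):
        idx, router_lp = route(root, weights, rng)
        logp += router_lp                      # explicit router probability
        root = root.children[idx]              # prune siblings, shift the root
        logp += grow(root, depth, width, rng)  # grow only the new leaves
    return root.prefix, logp

tokens, logp = decode([0], steps=3)
print(tokens, logp)
```

In a reinforcement-learning setting, a trajectory log-probability of this kind is the quantity a likelihood-ratio (policy-gradient) update would differentiate. The toy model here has no trainable parameters, so the sketch covers decoding only.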
