{"slug": "granularity-regulated-adaptive-computational-efficiency-for-optimal-verification", "title": "Granularity-Regulated Adaptive Computational Efficiency for Optimal Verification in Test-Time Scaling", "summary": "Researchers introduced GRACE, a theoretical framework that determines the optimal verification granularity for test-time scaling in large language models based on problem difficulty, verifier accuracy, and compute budget. The framework proves a phase transition where fine-grained verification is optimal for hard problems or large budgets, while coarse-grained verification is better for easy problems or low budgets. An adaptive strategy based on GRACE outperformed fixed-granularity baselines by up to 3.1% accuracy on math benchmarks.", "body_md": "arXiv:2606.19354v1 Announce Type: new\nAbstract: Test-time scaling (TTS) has emerged as a powerful paradigm for improving the reasoning performance of large language models (LLMs) by investing additional compute at inference time. A central component of TTS is the *verifier*, which selects or scores candidate solutions to guide the search process. While prior work has explored the benefit of verification, a fundamental question remains underexplored: *what is the optimal granularity of verification under a given compute budget?* Coarse-grained outcome reward models (ORMs) and fine-grained process reward models (PRMs) represent two extremes, yet neither alone achieves compute-optimality across all regimes. In this paper, we establish a unified theoretical framework, called **GRACE** (Granularity-Regulated Adaptive Computational Efficiency), that characterizes the optimal verification granularity as an explicit function of problem difficulty, verifier accuracy, and compute budget. 
We prove that there exists a phase transition: fine-grained verification dominates when either the compute budget is large or the problem is hard, whereas coarse-grained verification is preferred in the low-budget, easy-problem regime. Our theory unifies Best-of-$N$, beam search, and step-level MCTS within a single Pareto-optimality framework, and motivates an adaptive granularity strategy that provably achieves the compute-performance Pareto frontier. Empirical results on MATH-500, GSM8K, and AIME benchmarks corroborate all four theoretical claims, with our adaptive strategy outperforming fixed-granularity baselines by up to 3.1\\% accuracy at matched compute.", "url": "https://wpnews.pro/news/granularity-regulated-adaptive-computational-efficiency-for-optimal-verification", "canonical_source": "https://arxiv.org/abs/2606.19354", "published_at": "2026-06-19 04:00:00+00:00", "updated_at": "2026-06-19 04:05:20.828495+00:00", "lang": "en", "topics": ["large-language-models", "ai-research", "ai-infrastructure"], "entities": ["GRACE", "MATH-500", "GSM8K", "AIME"], "alternates": {"html": "https://wpnews.pro/news/granularity-regulated-adaptive-computational-efficiency-for-optimal-verification", "markdown": "https://wpnews.pro/news/granularity-regulated-adaptive-computational-efficiency-for-optimal-verification.md", "text": "https://wpnews.pro/news/granularity-regulated-adaptive-computational-efficiency-for-optimal-verification.txt", "jsonld": "https://wpnews.pro/news/granularity-regulated-adaptive-computational-efficiency-for-optimal-verification.jsonld"}}