A Spectral Phase Diagram for Binary Few-Shot Classification: Intrinsic Dimensionality, Geometric Saturation, and Representational Diagnosis

wpnews.pro


[ARTICLE · art-38811] src=arxiv.org ↗ pub=2026-06-25T04:00Z topic=machine-learning verified=true sentiment=· neutral


Researchers introduced the saturation index S(K) to help decide when to stop collecting labeled examples in binary few-shot classification, proving that it falls below a threshold when the covariance estimator is well-concentrated and the linear discriminant has stabilized. Across 246 doubling-pair observations from 17 tasks, the median within-task Spearman correlation between the index and marginal accuracy gain was 0.811, and the index achieved an AUC of 0.752 as a stopping rule, supporting annotation decisions without test labels or a trained classifier.

Published Jun 25, 2026

arXiv:2606.24903v1. Abstract: Deciding when to stop collecting labeled examples is a fundamental but undertheorized problem in applied machine learning. The saturation index $S(K) = \operatorname{erank}(\widehat{\Sigma}_W^{(K)}) / K$ measures the ratio of the effective rank of the pooled within-class sample covariance to the shot count; we prove it falls below a threshold precisely when the covariance estimator is well-concentrated around the population covariance and the linear discriminant has stabilized. The index is computable in $O(d^3)$ time from support features alone, requiring no test labels or trained classifier. Evaluated across $N = 246$ doubling-pair observations from seventeen binary tasks and six datasets, sixteen of seventeen tasks have a positive within-task Spearman correlation between $S(K)$ and marginal accuracy gain (median $\rho = 0.811$). The pooled Spearman correlation is $\rho = 0.548$ ($p = 1.1 \times 10^{-20}$, $N = 246$). A three-phase diagram (exploration, transition, saturation) with mean marginal gains of $3.48\%$, $2.40\%$, and $0.82\%$ is supported by all pairwise significance tests ($p \leq 0.008$). As a binary stopping rule, the index achieves AUC $= 0.752$, providing meaningful probabilistic guidance for annotation decisions. Asymptotic effective rank and peak accuracy show no significant monotone relationship across tasks (Spearman $r_s = 0.380$, $p = 0.133$, $N = 17$). A small saturation index paired with low accuracy diagnoses representational inadequacy. All results are for binary classification with a fixed linear classifier; extensions to $N$-way settings and pretrained backbone representations are discussed as future work.
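To make the definition of $S(K)$ concrete, here is a minimal Python sketch of how the index might be computed from support features. It rests on several assumptions that the abstract does not confirm: the effective rank is taken to be the entropy-based definition (the exponential of the Shannon entropy of the normalized eigenvalues), the pooled covariance divides the summed within-class scatter by $n - C$ degrees of freedom, and $K$ is the number of shots per class. The function names `effective_rank`, `pooled_within_cov`, and `saturation_index` are hypothetical and are not taken from the paper.

```python
import numpy as np


def effective_rank(cov, eps=1e-12):
    """Entropy-based effective rank of a symmetric PSD matrix.

    ASSUMPTION: erank = exp(Shannon entropy of normalized eigenvalues).
    The abstract does not state which definition the authors use.
    """
    eigvals = np.linalg.eigvalsh(cov)        # O(d^3) eigendecomposition
    eigvals = np.clip(eigvals, 0.0, None)    # clip tiny negative round-off
    total = eigvals.sum()
    if total <= eps:
        return 0.0                           # degenerate (all-zero) covariance
    p = eigvals / total
    p = p[p > eps]                           # treat 0 * log(0) as 0
    return float(np.exp(-(p * np.log(p)).sum()))


def pooled_within_cov(features, labels):
    """Pooled within-class sample covariance of the support features.

    ASSUMPTION: summed per-class scatter divided by (n - number of classes).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    d = features.shape[1]
    scatter = np.zeros((d, d))
    for c in classes:
        x = features[labels == c]
        centered = x - x.mean(axis=0)        # center each class on its own mean
        scatter += centered.T @ centered
    dof = len(features) - len(classes)
    return scatter / max(dof, 1)


def saturation_index(features, labels, k):
    """S(K) = erank(pooled within-class covariance) / K.

    ASSUMPTION: k is the shot count per class.
    """
    return effective_rank(pooled_within_cov(features, labels)) / k
```

The sketch needs only the support features and their labels, in line with the abstract's claim that no test labels or trained classifier are required. The threshold below which labeling should stop is not stated in the abstract, so none is hard-coded here.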

source & further reading

arxiv.org — original article

