Spectral Asymptotics of Neural Network Loss Landscapes: An Exact Decomposition of the Curvature Exponent

Source: arxiv.org · Published: 2026-06-03T04:00Z · Topic: machine-learning

A new study proves the Spectral Alignment Decomposition, which explains why the curvature exponent $\alpha$ in $h_k \propto \sigma_k^\alpha$ (governing how Hessian eigenvalues scale with gradient singular values) varies across neural network layer types: $\alpha \approx 2$ for convolutions, $\alpha \approx 1$ for transformer attention, and $\alpha < 1$ for MLP up-projections. The decomposition reduces this variation to a geometric question about the alignment between Kronecker factor eigenbases and gradient singular directions. It also yields a spectral transfer identity, $s = \alpha\gamma$, linking the curvature exponent $\alpha$, the effective gradient rank-decay $\gamma$, and the Hessian decay exponent $s$. With $\alpha$ and $\gamma$ fit on independent data, the identity recovers $s$ to ~2% median error across 93 layers with no free parameters. As a proof of concept, the researchers derive an architecture-adaptive preconditioner and show that Spectral Newton outperforms AdamW on vision benchmarks where $\alpha \approx 2$.

arXiv:2606.02596v1
Abstract: The curvature exponent $\alpha$ in $h_k \propto \sigma_k^\alpha$ -- governing how Hessian eigenvalues scale with gradient singular values -- varies systematically across layer types ($\alpha \approx 2$ for convolutions, $\approx 1$ for transformer attention, $< 1$ for MLP up-projections). Why? We prove the Spectral Alignment Decomposition: $\alpha = 2 + d\log\Phi_k / d\log\sigma_k$, where $\Phi_k$ measures alignment between Kronecker factor eigenbases and gradient singular directions. This reduces "why does $\alpha$ vary?" to a geometric question we answer for LayerNorm, residual connections, and softmax heads. The decomposition implies a spectral transfer identity $s = \alpha\gamma$ linking curvature exponent, effective gradient rank-decay $\gamma$, and Hessian decay exponent $s$. The identity is algebraic; its empirical content is that $\alpha$ and $\gamma$, fit on independent data (HVPs vs. SVD), recover $s$ to ~2% median error across 93 layers, five architectures, and three datasets -- with no free parameters. A zeta-function bound on participation ratio shows curvature concentrates onto effectively one direction per layer. As a proof of concept, we derive the architecture-adaptive preconditioner $T(\sigma;\alpha)$ and show that Spectral Newton -- implementing $T$ in the gradient singular basis -- outperforms AdamW on vision benchmarks where $\alpha \approx 2$.
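To make the spectral transfer identity concrete, here is a minimal numerical sketch on synthetic data. It is not the paper's code. It assumes, as a reading of the abstract's wording rather than something the abstract states, that the gradient singular values decay as $\sigma_k \propto k^{-\gamma}$ and the Hessian eigenvalues as $h_k \propto k^{-s}$. Under those assumptions, substituting into $h_k \propto \sigma_k^\alpha$ gives $h_k \propto k^{-\alpha\gamma}$, hence $s = \alpha\gamma$. The spectrum size, noise level, helper name `fit_slope`, and the chosen values of $\alpha$ and $\gamma$ are all illustrative.

```python
import numpy as np

# Synthetic spectra; every constant here is an illustrative assumption.
rng = np.random.default_rng(0)
K = 200                       # number of spectral indices k = 1..K
alpha_true, gamma_true = 2.0, 0.8
k = np.arange(1, K + 1)

# True singular values follow the assumed power law sigma_k ∝ k^{-gamma}.
sigma_true = k ** (-gamma_true)
# Curvature follows h_k ∝ sigma_k^alpha, with multiplicative noise.
h = sigma_true ** alpha_true * np.exp(rng.normal(scale=0.05, size=K))
# Observed singular values carry their own independent noise,
# mimicking separate measurements of sigma_k and h_k.
sigma = sigma_true * np.exp(rng.normal(scale=0.05, size=K))


def fit_slope(x, y):
    """Least-squares slope of log(y) against log(x)."""
    slope, _intercept = np.polyfit(np.log(x), np.log(y), 1)
    return slope


alpha_fit = fit_slope(sigma, h)    # exponent in h_k ∝ sigma_k^alpha
gamma_fit = -fit_slope(k, sigma)   # decay exponent of sigma_k in k
s_fit = -fit_slope(k, h)           # decay exponent of h_k in k

rel_err = abs(s_fit - alpha_fit * gamma_fit) / s_fit
print(f"alpha={alpha_fit:.3f} gamma={gamma_fit:.3f} "
      f"s={s_fit:.3f} alpha*gamma={alpha_fit * gamma_fit:.3f} "
      f"rel_err={rel_err:.4f}")
```

In this toy setup the match between `s_fit` and `alpha_fit * gamma_fit` is close by construction, since the data were generated from exact power laws. The abstract's empirical claim is that the same agreement holds when the exponents are fit on real layers from independent measurements (HVPs vs. SVD).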

Read original on arxiv.org → arxiv.org/abs/2606.02596
