{"slug": "muon-p-muon-with-fractional-spectral-powers", "title": "Muon$^p$: Muon with Fractional Spectral Powers", "summary": "Researchers introduced Muon$^p$, a new optimizer that uses fractional spectral-power updates to interpolate between Muon and gradient descent, improving finetuning performance on billion-scale models. The method preserves Muon's matrix-multiplication-only structure and compute complexity, and its update maximizes the linear improvement in loss under the Schatten $q$-norm with $q=1+\\frac{1}{p}$.", "body_md": "arXiv:2606.13867v1\nAbstract: Muon is an increasingly widely used optimizer that replaces a gradient $G=USV^\\top$ with its polar factor $UV^\\top$, thereby flattening the singular spectrum. However, full flattening discards singular-value information that may matter for adaptation. We introduce Muon$^p$, a Muon-style optimizer that instead uses fractional spectral-power updates $US^pV^\\top$ for rational $p\\in(0,1)$, interpolating between Muon and gradient descent. To make it practical, we prove that fractional spectral powers cannot be computed by any fixed univariate polynomial iteration, and furthermore derive low-degree odd bivariate recurrences that approximate $US^pV^\\top$ using only matrix multiplications, preserving Muon's matrix-multiplication-only structure and compute complexity. We show that Muon$^p$ maximizes the linear improvement in loss under the Schatten $q$-norm for $q=1+\\frac{1}{p}$. Empirically, Muon$^p$ is especially effective for finetuning: on billion-scale models, Muon$^p$ improves validation perplexity and downstream task performance. We further analyze when Muon$^p$ is less suitable, through the lens of spectral geometry. 
Our results reveal important insights on when preserving the singular spectrum can bring significant gains, and introduce a principled way to achieve them.", "url": "https://wpnews.pro/news/muon-p-muon-with-fractional-spectral-powers", "canonical_source": "https://arxiv.org/abs/2606.13867", "published_at": "2026-06-15 04:00:00+00:00", "updated_at": "2026-06-15 04:20:29.362129+00:00", "lang": "en", "topics": ["machine-learning", "neural-networks", "ai-research"], "entities": ["Muon$^p$", "Muon", "arXiv"], "alternates": {"html": "https://wpnews.pro/news/muon-p-muon-with-fractional-spectral-powers", "markdown": "https://wpnews.pro/news/muon-p-muon-with-fractional-spectral-powers.md", "text": "https://wpnews.pro/news/muon-p-muon-with-fractional-spectral-powers.txt", "jsonld": "https://wpnews.pro/news/muon-p-muon-with-fractional-spectral-powers.jsonld"}}