{"slug": "gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks", "title": "Gradient Descent with Large Step Size Restores Symmetry in Deep Linear Networks with Multi-Pathway", "summary": "A new study shows that discrete Gradient Descent (GD) with a large step size restores symmetry in multi-pathway Deep Linear Networks, counteracting the \"winner-takes-all\" specialization predicted by Gradient Flow. The authors prove that single-path solutions are sharp minima, whereas distributing signals across pathways reduces sharpness. As a result, oscillations at the Edge of Stability override the early symmetry breaking and redistribute signals across pathways. These findings explain why large-step GD favors shared representations over persistent single-pathway dominance and clarify how depth shapes pathway competition.", "body_md": "arXiv:2606.05219v1 Announce Type: new\nAbstract: Recent analyses of multi-pathway Deep Linear Networks use Gradient Flow (GF) to predict a \"winner-takes-all\" specialization in which path symmetry breaks and each feature concentrates in a single pathway. In this work, we show that discrete Gradient Descent (GD) with a large step size tells a different story. We prove that single-path solutions are sharp minima, whereas distributing signals across pathways reduces sharpness by a factor that decreases with both the number of pathways and depth. Consequently, while early training reproduces the depth-driven symmetry breaking predicted by GF, oscillations at the Edge of Stability subsequently override this tendency and drive the network into a re-balancing phase, where signals redistribute across pathways. Together, these results clarify how depth shapes pathway competition and explain why large-step GD favors shared representations rather than persistent single-pathway dominance.", "url": "https://wpnews.pro/news/gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks", "canonical_source": "https://arxiv.org/abs/2606.05219", "published_at": "2026-06-05 04:00:00+00:00", "updated_at": "2026-06-05 04:36:28.019971+00:00", "lang": "en", "topics": ["machine-learning", "neural-networks", "ai-research"], "entities": ["Gradient Descent", "Deep Linear Networks", "Edge of Stability"], "alternates": {"html": "https://wpnews.pro/news/gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks", "markdown": "https://wpnews.pro/news/gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks.md", "text": "https://wpnews.pro/news/gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks.txt", "jsonld": "https://wpnews.pro/news/gradient-descent-with-large-step-size-restores-symmetry-in-deep-linear-networks.jsonld"}}