Yann LeCun’s paper reveals conditions for LeJEPA to learn world models

A new formal proof submitted to arXiv on May 25, 2026, by David Klindt, Yann LeCun, and Randall Balestriero establishes the precise mathematical conditions under which LeJEPA, a variant of the Joint Embedding Predictive Architecture, can recover the true hidden variables driving observations. The theorem proves that LeJEPA achieves linear identifiability of world models only when latent variables follow an isotropic Gaussian distribution and evolve under stationary, additive-noise transitions. The findings strengthen LeCun's argument that self-supervised prediction in embedding space can produce AI systems with provable world-modeling capabilities, while also highlighting the architecture's limitations when real-world latent dynamics are non-Gaussian.

A new formal proof shows that the Joint Embedding Predictive Architecture can reliably recover hidden causes of the world, but only when specific mathematical conditions are met.

Yann LeCun has been talking about world models for years. Now his team has a formal proof showing exactly when their architecture actually learns one.

A paper submitted to arXiv on May 25, 2026, titled “When Does LeJEPA Learn a World Model?” lays out the precise mathematical conditions under which LeJEPA, a variant of the Joint Embedding Predictive Architecture, can recover the real hidden variables driving observations.

The short version: the latent variables need to be Gaussian and evolve under stable, predictable dynamics. If those conditions hold, LeJEPA doesn’t just learn useful representations. It learns the actual structure of the world generating the data.

What the paper actually proves

The core result is a theorem about “linear identifiability.” In English: if you feed LeJEPA nonlinear observations (think raw sensor data, pixels, or any messy real-world input), it can untangle the underlying causes and recover them up to a simple linear transformation.

That’s a strong guarantee. Most self-supervised learning methods can learn representations that are useful for downstream tasks, but they can’t promise those representations correspond to anything real. This paper says LeJEPA can, under the right circumstances.

The catch, and it’s a meaningful one, is the “if and only if” nature of the theorem. The latent variables must follow an isotropic Gaussian distribution. They must also evolve according to stationary, additive-noise transitions.

The authors of the paper are David Klindt, Yann LeCun, and Randall Balestriero. LeCun is Meta’s Chief AI Scientist.

The “if and only if” structure is worth pausing on. It doesn’t just say “Gaussian latents are sufficient.” It says they’re necessary.
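
To make "recover them up to a simple linear transformation" concrete, here is a minimal Python sketch. It does not implement LeJEPA and reproduces nothing from the paper. It only simulates latents that satisfy the stated conditions (isotropic Gaussian, stationary, additive noise) and shows how a linear-fit score separates a linearly mixed copy of those latents from a nonlinearly warped one. The function names `simulate_latents` and `linear_r2`, the AR(1) transition with coefficient `rho`, and both toy "embeddings" are illustrative assumptions, not the authors' setup.

```python
import numpy as np


def simulate_latents(n_steps, dim, rho=0.9, seed=0):
    """Stationary AR(1) latents with additive Gaussian noise (assumed toy model).

    z[t] = rho * z[t-1] + sqrt(1 - rho**2) * eps,  eps ~ N(0, I),
    so every z[t] keeps an isotropic N(0, I) marginal distribution.
    """
    rng = np.random.default_rng(seed)
    z = np.empty((n_steps, dim))
    z[0] = rng.standard_normal(dim)
    noise_scale = np.sqrt(1.0 - rho**2)
    for t in range(1, n_steps):
        z[t] = rho * z[t - 1] + noise_scale * rng.standard_normal(dim)
    return z


def linear_r2(features, targets):
    """R^2 of the best affine least-squares map from features to targets."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    residual = targets - design @ coef
    ss_res = (residual**2).sum()
    ss_tot = ((targets - targets.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot


z = simulate_latents(n_steps=5000, dim=3)

# Hypothetical embedding that equals the latents up to an affine map.
mixing = np.random.default_rng(1).standard_normal((3, 3))
linear_embedding = z @ mixing.T + 0.5

# Hypothetical embedding that keeps all information but warps it nonlinearly.
warped_embedding = z**3

r2_linear = linear_r2(linear_embedding, z)
r2_warped = linear_r2(warped_embedding, z)
print(f"linear embedding R^2: {r2_linear:.4f}")
print(f"warped embedding R^2: {r2_warped:.4f}")
```

In this toy setup the first score should sit very close to 1 and the second noticeably lower, even though the cubed embedding is an invertible function of the latents and loses no information. That gap is the difference between a representation that is merely informative and one that is linearly identifiable.
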
If your hidden variables aren’t Gaussian and stationary, this particular architecture loses its guarantee of recovering them faithfully.

Building on earlier work

LeJEPA was first introduced in 2025 by LeCun and Balestriero, combining a predictive loss function with a technique called Gaussian regularization, specifically SIGReg. The regularization was designed to solve one of the oldest headaches in self-supervised learning: representation collapse, where the model learns to map everything to the same useless point in embedding space.

The 2025 version showed that Gaussian regularization worked empirically. It kept training stable and produced useful embeddings. But the theoretical question lingered: does it actually recover the true latent structure, or just a convenient approximation? The new paper answers that question definitively for the Gaussian-stationary case.

The team also launched LeWorldModel in 2026, a full end-to-end implementation of the JEPA framework that learns directly from pixel inputs. It uses the same Gaussian regularization approach to maintain training stability.

Why this matters beyond academia

LeCun has argued repeatedly that self-supervised prediction in embedding space, not scaling up language models, is the path to more capable AI systems. This paper strengthens his hand by showing that at least one architecture in the JEPA family has provable world-modeling capabilities, not just empirical ones.

The Gaussian assumption is both the paper’s greatest strength and its most obvious vulnerability. Real-world latent dynamics are often non-Gaussian. Financial markets have fat tails. Physical systems have phase transitions. Biological processes have nonlinear feedback loops. The theorem tells us precisely where LeJEPA’s guarantees hold, and by implication, where they might not.

For researchers and engineers building autonomous systems, robotics platforms, or predictive models, the practical takeaway is nuanced.
If your domain’s latent structure plausibly approximates Gaussian stationary dynamics, LeJEPA offers unusually strong theoretical guarantees for representation learning. If it doesn’t, you’re back to empirical validation without the safety net of a formal proof.

The “if and only if” nature of the current theorem suggests that extending identifiability to non-Gaussian or non-stationary settings would require architectural changes, not just different hyperparameters.

Disclosure: This article was edited by the Editorial Team. For more information on how we create and review content, see our Editorial Policy: https://cryptobriefing.com/editorial-policy/