arXiv:2605.24113v1 (announce type: new)
Abstract: Classical archetypal analysis is appealing for its interpretability, but its linear geometry can limit performance on data with strongly non-linear structure; at the same time, existing neural extensions improve flexibility while often weakening the geometric meaning of archetypes and interpolations. In this work, we develop a Riemannian version of archetypal analysis based on data-driven pullback geometry for real-valued data, with the goal of combining the interpretability of classical archetypal analysis with the expressive power of modern non-linear models. We introduce a class of deformed star distributions together with associated pullback Riemannian geometry to provide a statistical interpretation of the resulting manifold mappings, define the Riemannian archetypal mapping (RAM) as a projection onto the manifold of geodesically convex combinations of archetypes, and propose a practical optimization scheme based on convex relaxation followed by non-convex refinement. We further propose a learning scheme that yields reasonable, albeit generally suboptimal, deformed star distributions from data. Experiments on synthetic examples and MNIST show that the resulting framework produces meaningful geodesics, useful denoising projections, and geometry-aware classifications, while also clarifying where current optimization limitations remain.
Riemannian Archetypal Analysis: Interpretable non-linear data analysis on deformed star distributions
Researchers developed Riemannian Archetypal Analysis, a method that combines the interpretability of classical archetypal analysis with the flexibility of non-linear models by using data-driven pullback geometry on deformed star distributions. The approach, which defines a Riemannian archetypal mapping as a projection onto geodesically convex combinations of archetypes, produced meaningful geodesics, denoising projections, and geometry-aware classifications in experiments on synthetic data and MNIST. This work addresses the trade-off between interpretability and expressive power in data analysis, offering a framework that preserves geometric meaning while handling strongly non-linear structures.
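To make the idea of projecting onto geodesically convex combinations of archetypes concrete, here is a minimal Python sketch under a strong simplifying assumption that is not taken from the paper: the geometry is the pullback of the Euclidean metric under a known diffeomorphism phi. In that special case geodesics are the phi-preimages of straight lines, and the projection reduces to a classical simplex-constrained least-squares problem in phi-coordinates. The toy map `phi`, all helper names, and the projected-gradient solver are illustrative choices. The paper's learned deformed star distributions and its convex relaxation followed by non-convex refinement are not reproduced here.

```python
import numpy as np

# Hypothetical toy diffeomorphism of R^2 (an assumption for illustration):
# it straightens the parabola y = x^2 onto the horizontal axis.
def phi(x):
    x = np.asarray(x, dtype=float)
    return np.stack([x[..., 0], x[..., 1] - x[..., 0] ** 2], axis=-1)

def phi_inv(u):
    u = np.asarray(u, dtype=float)
    return np.stack([u[..., 0], u[..., 1] + u[..., 0] ** 2], axis=-1)

def pullback_geodesic(x, y, t):
    """Geodesic of the pulled-back Euclidean metric: a straight line in phi-coordinates."""
    return phi_inv((1.0 - t) * phi(x) + t * phi(y))

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def archetypal_weights(z, Z, n_iter=2000):
    """Minimize 0.5 * ||Z^T w - z||^2 over the simplex by projected gradient descent."""
    k = Z.shape[0]
    w = np.full(k, 1.0 / k)
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = Z @ (Z.T @ w - z)
        w = project_to_simplex(w - step * grad)
    return w

def pullback_archetypal_projection(x, archetypes):
    """Project x onto the geodesically convex hull of the archetypes (pullback case)."""
    Z = phi(archetypes)                 # archetypes in phi-coordinates, shape (k, d)
    w = archetypal_weights(phi(x), Z)   # convex weights found in phi-coordinates
    return phi_inv(w @ Z), w            # map the convex combination back

# Three archetypes lying on the parabola y = x^2.
archetypes = np.array([[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0]])
projected, weights = pullback_archetypal_projection(np.array([0.5, 1.0]), archetypes)
midpoint = pullback_geodesic(archetypes[0], archetypes[2], 0.5)
```

In this toy setting the noisy point (0.5, 1.0) is mapped to approximately (0.5, 0.25), which lies on the parabola through the archetypes, and the geodesic midpoint of the two outer archetypes is (0, 0) rather than the Euclidean midpoint (0, 1). The convex weights are not unique here because the three archetypes are collinear in phi-coordinates; only the projected point is.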