Neural Voxel Dynamics: Learning Implicit 3D Physics via Volumetric Feature Advection

Researchers have introduced Neural Voxel Dynamics, a self-supervised framework that learns implicit 3D physics from video by lifting 2D features into a volumetric latent space. The authors report long-term structural stability and physical plausibility on the CLEVRER, PhysInOne, and PhysGaia benchmarks without relying on explicit simulators, and present the approach as a scalable path toward general-purpose dynamic world models.

Published Jun 26, 2026 · 1 min read

arXiv:2606.26410v1. Abstract: We present a self-supervised framework for learning implicit 3D physical dynamics directly from video-derived supervisory signals. While current generative video models achieve high visual fidelity, they lack a 3D geometric foundation, which often results in physical inconsistencies and a failure to maintain object permanence. We address this by shifting the predictive bottleneck from 2D image space to a "lifted" 3D Volumetric Latent Space. Our method unprojects semantic features from a Video Joint-Embedding Predictive Architecture (V-JEPA) into a voxelized grid, grounded by monocular depth priors. This lifting enables Volumetric Feature Advection, which learns an action-conditioned transition operator that treats physics as a spatio-temporal state advection problem; in other words, the model learns implicit 3D physics. Unlike state-of-the-art hybrid models that rely on explicit classical simulators for training and/or inference, our architecture tracks material states implicitly within high-dimensional V-JEPA features. This allows for the emergent simulation of heterogeneous phenomena (e.g., rigid body motion in fluid flow) within a single, unified pipeline. Supervised solely by end-to-end video-derived signals and action conditions, without access to physics engine internal states, labels, or surrogate models, our model demonstrates good long-term structural stability and physical plausibility on multiple benchmarks (CLEVRER, PhysInOne, PhysGaia). We believe that this work opens a scalable pathway toward general-purpose dynamic world models that internalize the 3D invariants of the physical world solely through passive observation of monocular videos.
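
To make the "lift, then advect" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a per-pixel metric depth map, and a sparse voxel grid stored as a dictionary; all of these names, along with `lift_to_voxels` and `advect`, are hypothetical. In particular, the hand-written `displacement` callable only stands in for the learned, action-conditioned transition operator described in the abstract.

```python
import math
from collections import defaultdict


def _mean_pool(pairs):
    """Average feature vectors that share the same voxel index."""
    sums, counts = {}, defaultdict(int)
    for key, feat in pairs:
        acc = sums.setdefault(key, [0.0] * len(feat))
        for c, value in enumerate(feat):
            acc[c] += value
        counts[key] += 1
    return {key: [s / counts[key] for s in acc] for key, acc in sums.items()}


def lift_to_voxels(features, depth, fx, fy, cx, cy, voxel_size):
    """Unproject per-pixel features into a sparse voxel grid.

    features: H x W nested list of feature vectors (lists of floats).
    depth:    H x W nested list of depths along the optical axis;
              values <= 0 are treated as invalid and skipped.
    Returns a dict {(i, j, k): mean feature vector}.
    """
    def pairs():
        for v, row in enumerate(depth):
            for u, z in enumerate(row):
                if z <= 0:
                    continue
                # Pinhole back-projection of pixel (u, v) at depth z.
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                key = (
                    math.floor(x / voxel_size),
                    math.floor(y / voxel_size),
                    math.floor(z / voxel_size),
                )
                yield key, features[v][u]

    return _mean_pool(pairs())


def advect(grid, displacement):
    """Move every voxel's features by an integer voxel offset.

    displacement: callable mapping a voxel index (i, j, k) to an
    offset (di, dj, dk). ASSUMPTION: this fixed rule is a placeholder
    for a learned transition operator. Features that land in the same
    destination voxel are averaged.
    """
    def pairs():
        for (i, j, k), feat in grid.items():
            di, dj, dk = displacement((i, j, k))
            yield (i + di, j + dj, k + dk), feat

    return _mean_pool(pairs())
```

A call such as `advect(lift_to_voxels(feats, depth, fx, fy, cx, cy, 0.05), lambda key: (1, 0, 0))` would lift one frame and shift the whole volume by one voxel along x. The sparse dictionary keeps the sketch short; a dense tensor grid would be the more likely choice for a trainable model.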
