Beyond MoCap: Scaling Motion Tokenizers with Synthetic Human Motion for Generative Modeling

wpnews.pro

Source: arxiv.org · Published: 2026-06-29T04:00Z · Topic: generative-ai

Researchers propose a framework to expand human motion generation by using large-scale synthetic motion data and a redesigned VQ-VAE tokenizer, overcoming the limited diversity of motion capture datasets. The approach improves coverage and compositionality of motion vocabulary, leading to gains in text-to-motion and motion continuation tasks.

arXiv:2606.27547v1

Abstract: Human motion generation models are fundamentally constrained by the limited diversity of motion capture datasets. These datasets predominantly contain common, repetitive actions and fail to cover the long tail of complex human movements. The result is a restricted motion vocabulary in learned latent representations and poor generalization to rare, compositional, and highly dynamic motions.

In this work, we propose a framework for expanding the motion representation space by leveraging large-scale synthetic human motion. We introduce a data generation pipeline that produces diverse, physically plausible motion sequences beyond the distribution of existing datasets, and we integrate it with a redesigned VQ-VAE tokenizer that adapts to this expanded motion space. Unlike conventional tokenizers trained on narrow data distributions, our approach jointly scales both the training distribution and the discrete codebook, enabling the model to capture a significantly richer set of motion primitives.

We demonstrate that training with synthetic motion substantially improves the coverage and compositionality of the learned motion vocabulary. This leads to consistent gains across motion generation tasks such as text-to-motion and motion continuation, while remaining fully compatible with existing frameworks including MotionGPT. Our results suggest that the primary bottleneck lies in the limited support of the learned motion representation, rather than in model architecture alone. Scaling synthetic motion in tandem with representation learning offers a principled path toward more expressive, controllable, and generalizable human motion synthesis.
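To make "motion vocabulary" and "coverage" concrete, here is a minimal sketch of the quantization step that any VQ-VAE-style tokenizer performs, plus one common way to measure how much of a codebook is actually used. It is an illustration only, not the paper's method: the abstract does not say how coverage is measured, and the function names `quantize` and `codebook_usage`, the toy 2-D features, and the 4-entry codebook are all hypothetical.

```python
import math
from collections import Counter


def quantize(frames, codebook):
    """Map each frame feature vector to the index of its nearest codebook entry."""
    tokens = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every code vector.
        dists = [sum((f - c) ** 2 for f, c in zip(frame, code)) for code in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens


def codebook_usage(tokens, codebook_size):
    """Return (fraction of codes used, perplexity of the token distribution)."""
    counts = Counter(tokens)
    total = len(tokens)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return len(counts) / codebook_size, math.exp(entropy)


# Toy 2-D "pose features" (assumption: real motion features are far higher-dimensional).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 0.2], [0.8, 0.1], [0.2, 0.9]]

tokens = quantize(frames, codebook)
coverage, perplexity = codebook_usage(tokens, len(codebook))
print(tokens, coverage, round(perplexity, 3))
```

In this toy run, one of the four codes is never selected, so coverage is 0.75. A tokenizer trained on narrow data can show the same pattern at scale: enlarging the codebook alone adds entries that the data never activates, which is consistent with the abstract's argument for scaling the training distribution and the codebook together.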

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.27547
