LooseControlVideo: Directorial Video Control using Spatial Blocking


Source: arxiv.org · Published: 2026-06-19T04:00Z · Topic: computer-vision


Researchers introduced LooseControlVideo, a framework that uses sparse, oriented 3D boxes as a proxy for spatial blocking in text-to-video generation, giving users intuitive control over layout and trajectory. Built by fine-tuning a Wan 2.2 backbone on video annotated with the DNOCS encoding, it reports a 1.2x to 3x improvement in Trajectory Error, a 2x improvement in Rigid Motion Consistency, and a 1.5x to 2x increase in Occlusion Accuracy over layout-conditioned baselines on the nuScenes, HO-3D, and BEHAVE benchmarks.


arXiv:2606.19495v1. Abstract: Precise 3D spatial orchestration in text-to-video generation remains a significant challenge, particularly for multi-object scenes where semantic layout and temporal dynamics are often entangled. While existing depth-conditioned models achieve good structural fidelity, they necessitate dense, frame-accurate guidance that is labor-intensive to author for dynamic events involving deformable objects. We present LooseControlVideo, a framework that enables intuitive and expressive control by using sparse, oriented 3D boxes as a "blocking" proxy. This allows users to author high-level layout and trajectory while leveraging a video generative model to generate realistic occlusions, dynamics, and interactions. We achieve this by fine-tuning a Wan 2.2 backbone on a video dataset annotated with DNOCS, a novel encoding for 3D size, orientation, and depth-ordered occlusions. Furthermore, our method allows for localized refinement, such as adjusting a jump trajectory or adding an interaction, with minimal disruption to the global scene context. Extensive evaluations on the nuScenes, HO-3D, and BEHAVE benchmarks demonstrate that LooseControlVideo significantly outperforms existing 2D-box and flow-based baselines. Our findings indicate a 1.2x to 3x improvement in Trajectory Error, a 2x improvement in Rigid Motion Consistency, and a 1.5x to 2x increase in Occlusion Accuracy over current state-of-the-art layout-conditioned models, demonstrating that oriented 3D primitives provide a good geometric prior for complex, multi-agent video authoring.
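To make the idea of sparse box "blocking" concrete, here is a minimal sketch of how a user-authored track might be represented: a few keyframed oriented boxes expanded into one box per frame. This is an illustration only, not the paper's method. The `OrientedBox` fields, the linear interpolation, and the `mean_center_error` helper are all assumptions, and the abstract does not specify how DNOCS encodes boxes or how Trajectory Error is computed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OrientedBox:
    """Hypothetical blocking primitive (not the paper's DNOCS encoding)."""
    center: tuple[float, float, float]  # x, y, z in scene units
    size: tuple[float, float, float]    # width, height, depth
    yaw: float                          # rotation about the up axis, radians


def lerp(a: float, b: float, t: float) -> float:
    return a + (b - a) * t


def interpolate(k0: OrientedBox, k1: OrientedBox, t: float) -> OrientedBox:
    """Blend two keyframe boxes linearly; yaw follows the shortest arc."""
    center = tuple(lerp(a, b, t) for a, b in zip(k0.center, k1.center))
    size = tuple(lerp(a, b, t) for a, b in zip(k0.size, k1.size))
    dyaw = (k1.yaw - k0.yaw + math.pi) % (2 * math.pi) - math.pi
    return OrientedBox(center, size, k0.yaw + dyaw * t)


def sample_track(keyframes: dict[int, OrientedBox], num_frames: int) -> list[OrientedBox]:
    """Expand sparse {frame_index: box} keyframes into one box per frame."""
    frames = sorted(keyframes)
    track = []
    for f in range(num_frames):
        if f <= frames[0]:
            track.append(keyframes[frames[0]])      # hold first keyframe
        elif f >= frames[-1]:
            track.append(keyframes[frames[-1]])     # hold last keyframe
        else:
            lo = max(k for k in frames if k <= f)   # keyframe at or before f
            hi = min(k for k in frames if k >= f)   # keyframe at or after f
            if lo == hi:
                track.append(keyframes[lo])
            else:
                t = (f - lo) / (hi - lo)
                track.append(interpolate(keyframes[lo], keyframes[hi], t))
    return track


def mean_center_error(pred: list[OrientedBox], target: list[OrientedBox]) -> float:
    """Assumed stand-in metric: mean Euclidean distance between box centers."""
    return sum(math.dist(p.center, q.center) for p, q in zip(pred, target)) / len(pred)


# Two keyframes describe a box sliding 4 units along x while turning 90 degrees.
keys = {
    0: OrientedBox((0.0, 0.0, 5.0), (1.0, 2.0, 1.0), 0.0),
    4: OrientedBox((4.0, 0.0, 5.0), (1.0, 2.0, 1.0), math.pi / 2),
}
track = sample_track(keys, num_frames=5)
```

Under this sketch, two keyframes are enough to describe a five-frame motion. Per the abstract, the generative model is left to fill in realistic occlusions, dynamics, and interactions around such coarse guidance.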

source & further reading

Original article on arxiv.org: arxiv.org/abs/2606.19495

