
Unpaired RGB-Thermal Gaussian-Splatting Using Visual Geometric Transformers

Researchers have developed a framework for unpaired RGB-thermal novel view synthesis. It uses VGGT, a 3D feed-forward transformer, to estimate camera poses independently for each modality, then aligns the two pose sets using the Procrustes algorithm together with a cross-modal feature matcher. The method enables multi-modal 3D Gaussian Splatting without precisely calibrated image pairs or stereo setups, and the authors report competitive thermal view synthesis while maintaining RGB fidelity. The team also introduced a benchmarking framework to evaluate both per-modality image synthesis and cross-modal coherence in reconstructed scenes.

read 1 min · published Jun 5, 2026

arXiv:2606.05491v1 · Announce Type: new

Abstract: Multi-modal novel view synthesis (NVS) combining RGB and thermal imagery enables precise 3D scene reconstruction with visual and thermal information. However, existing methods typically rely on precisely calibrated RGB-thermal image pairs or stereo setups, limiting scalability and practical deployment. To address this, we introduce a framework for unpaired RGB-thermal NVS that leverages VGGT, a 3D feed-forward transformer architecture, to independently estimate camera poses for each modality. The pose sets are then aligned using the Procrustes algorithm with a cross-modal feature matcher, enabling joint registration without paired calibration. Building on this alignment, we further propose a multi-modal 3D Gaussian Splatting approach that learns directly from unpaired RGB and thermal images. Experiments on diverse scenes demonstrate that our method achieves competitive performance in thermal view synthesis while maintaining RGB fidelity. Moreover, we show that existing reconstruction approaches can produce modality-specific reconstructions that lack cross-modal consistency. We thus introduce a benchmarking framework to rigorously evaluate both per-modality image synthesis and the multi-modal coherence of reconstructed scenes.
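
The abstract names the Procrustes algorithm as the step that brings the two independently estimated pose sets into one frame. The sketch below illustrates the general idea only, and it is not the authors' implementation. It assumes that corresponding 3D points in the two frames are already available (for example, derived from cross-modal feature matches) and that the two reconstructions differ by a similarity transform (scale, rotation, translation). The function name `similarity_procrustes` and the synthetic data are hypothetical, and the closed-form solution used is the standard Umeyama variant of Procrustes.

```python
import numpy as np


def similarity_procrustes(src, dst):
    """Least-squares s, R, t such that dst_i ~= s * R @ src_i + t.

    Hypothetical helper (Umeyama-style closed form); the paper's exact
    formulation is not given in the abstract.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Guard against reflections so that det(R) = +1.
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0

    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


# Synthetic stand-in for matched points in the thermal and RGB frames.
rng = np.random.default_rng(0)
thermal_pts = rng.normal(size=(8, 3))

angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
s_true = 2.5
t_true = np.array([0.3, -1.0, 4.0])
rgb_pts = s_true * thermal_pts @ R_true.T + t_true

s_est, R_est, t_est = similarity_procrustes(thermal_pts, rgb_pts)
aligned = s_est * thermal_pts @ R_est.T + t_est
print("max alignment error:", np.abs(aligned - rgb_pts).max())
```

With noise-free correspondences the recovered transform matches the one used to generate the data. Real cross-modal matches are noisy, so a robust wrapper (for example RANSAC around this solver) would likely be needed in practice; that too is an assumption and not something the abstract states.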
