{"slug": "geometry-consistent-endoscopic-representations-for-image-guided-navigation-via", "title": "Geometry-Consistent Endoscopic Representations for Image-Guided Navigation via Structured Foundation Model Adaptation", "summary": "Researchers propose a unified framework for learning geometry-consistent and domain-robust image representations in monocular endoscopy, combining a synthetic data pipeline with Hierarchy-Aware Geometry-Semantic Adaptation to improve pose estimation, depth prediction, and image-to-anatomy alignment. The authors report improved pose and depth estimation on public and proprietary datasets, along with favorable synthetic-to-real transfer on clinical bronchoscopy.", "body_md": "arXiv:2606.17340v1\nAbstract: Accurate vision-based navigation in monocular endoscopy is difficult due to limited depth cues, weak tissue texture, non-rigid deformation, and substantial appearance variation across domains, all of which complicate pose estimation, depth prediction, and image-to-anatomy alignment. Although recent vision foundation models have shown promise, their learned representations often remain insufficiently geometry-consistent, hindering stable feature correspondence and limiting their reliability for downstream navigation tasks. We propose a unified framework for learning geometry-consistent and domain-robust image representations for monocular endoscopy. The framework combines a synthetic data pipeline that provides accurate geometric supervision with Hierarchy-Aware Geometry-Semantic Adaptation, a structured alternative to standard LoRA that inserts low-rank adapters selectively across the transformer hierarchy and couples them with layer-wise training objectives to encourage geometric correspondence in intermediate features and semantic consistency in deeper features. 
Experiments on public and proprietary datasets show improved geometric and semantic representation quality, leading to better performance on downstream navigation tasks including pose estimation and monocular depth estimation. The learned representations show favorable synthetic-to-real transfer on clinical bronchoscopy and provide a useful initialization for adaptation to sinus endoscopy and colonoscopy under limited supervision. The framework also shows favorable scaling with model size and training data. These results support hierarchy-aware, geometry-guided adaptation as a practical approach for endoscopic representation learning.", "url": "https://wpnews.pro/news/geometry-consistent-endoscopic-representations-for-image-guided-navigation-via", "canonical_source": "https://arxiv.org/abs/2606.17340", "published_at": "2026-06-17 04:00:00+00:00", "updated_at": "2026-06-17 04:25:54.417593+00:00", "lang": "en", "topics": ["computer-vision", "machine-learning", "ai-research"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/geometry-consistent-endoscopic-representations-for-image-guided-navigation-via", "markdown": "https://wpnews.pro/news/geometry-consistent-endoscopic-representations-for-image-guided-navigation-via.md", "text": "https://wpnews.pro/news/geometry-consistent-endoscopic-representations-for-image-guided-navigation-via.txt", "jsonld": "https://wpnews.pro/news/geometry-consistent-endoscopic-representations-for-image-guided-navigation-via.jsonld"}}