{"slug": "querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval", "title": "QueryGaussian: Scalable and Training-Free Open-Vocabulary 3D Instance Retrieval", "summary": "Researchers propose QueryGaussian, a training-free framework for open-vocabulary 3D instance retrieval that decouples semantic understanding from geometric representation, enabling efficient retrieval in city-scale scenes. The method reduces GPU memory usage by over 70% and accelerates inference by 180x compared to state-of-the-art approaches, allowing operation on consumer-grade hardware.", "body_md": "arXiv:2606.19733v1 Announce Type: new\nAbstract: Efficiently retrieving specific 3D instances from large-scale scenes via natural language prompts remains a formidable challenge in multimedia analysis. Existing approaches predominantly follow a \"scene-level embedding\" paradigm, which requires distilling high-dimensional semantic features into every 3D primitive. This strategy suffers from a fundamental architectural bottleneck: memory and computational costs scale linearly with scene complexity, inevitably triggering out-of-memory (OOM) failures in city-scale environments. To address this barrier, we propose QueryGaussian, a training-free framework for expeditious and scalable open-vocabulary 3D instance retrieval. Unlike holistic semantic distillation, QueryGaussian employs an instance-level query mechanism that decouples semantic understanding from geometric representation. Specifically, we leverage pre-trained 2D vision models to interpret user prompts and lift segmentation masks into 3D via a concurrent maximum-weight association strategy, ensuring semantic-visual consistency. To mitigate projection ambiguity, we introduce a temporal fusion module with multi-stage adaptive density clustering. Experimental results demonstrate that QueryGaussian not only matches the accuracy of state-of-the-art methods but also delivers a decisive efficiency leap, reducing GPU memory usage by over 70% and accelerating inference by 180x. Crucially, QueryGaussian enables expeditious instance retrieval on city-scale scenes containing tens of millions of Gaussians using consumer-grade hardware.", "url": "https://wpnews.pro/news/querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval", "canonical_source": "https://arxiv.org/abs/2606.19733", "published_at": "2026-06-19 04:00:00+00:00", "updated_at": "2026-06-19 04:01:35.873974+00:00", "lang": "en", "topics": ["computer-vision", "natural-language-processing", "ai-research", "ai-infrastructure"], "entities": ["QueryGaussian"], "alternates": {"html": "https://wpnews.pro/news/querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval", "markdown": "https://wpnews.pro/news/querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval.md", "text": "https://wpnews.pro/news/querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval.txt", "jsonld": "https://wpnews.pro/news/querygaussian-scalable-and-training-free-open-vocabulary-3d-instance-retrieval.jsonld"}}