{"slug": "intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via", "title": "Intra-Modal Neighbors Never Lie: Rectifying Inter-Modal Noisy Correspondence via Graph-Based Intra-Modal Reasoning", "summary": "Researchers have developed IN2R, a new framework that corrects mismatched image-text pairs in large web-harvested datasets by synthesizing continuous supervision signals from intra-modal data relationships rather than relying on discrete label selection. The method uses a Graph Refiner to reason over neighboring data points in a cross-modal memory, producing soft prototypes that reduce alignment errors. In tests on Flickr30K, MS-COCO, and CC152K, IN2R outperformed existing approaches for cross-modal retrieval tasks.", "body_md": "arXiv:2606.04061v1 Announce Type: new\nAbstract: Large-scale web-harvested datasets have fueled the progress of cross-modal retrieval but inevitably suffer from noisy correspondence, which severely degrades model generalization. Existing methods primarily address this by filtering out noise or seeking a substitute label, yet they predominantly remain bound by a \"Discrete Selection\" paradigm. We argue that relying on a single discrete proxy induces Single-Point Fragility and Discretization Error. To overcome these limitations, we propose a novel framework, Intra-modal Neighbor-aware Noise Rectification (IN2R), which shifts the paradigm from searching for a substitute to synthesizing a reliable supervision target. Leveraging the intrinsic geometric stability of intra-modal data, IN2R employs a Graph Refiner to perform relational reasoning over neighbors retrieved from a dynamic Cross-Modal Memory. Instead of propagating discrete labels, our method synthesizes a continuous, soft prototype that reflects the consensus of the local semantic neighborhood, effectively rectifying inter-modal misalignment. 
Extensive experiments on Flickr30K, MS-COCO, and CC152K demonstrate that IN2R significantly outperforms state-of-the-art methods. Our code and pre-trained models are publicly available at https://github.com/liuyyy111/IN2R.", "url": "https://wpnews.pro/news/intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via", "canonical_source": "https://arxiv.org/abs/2606.04061", "published_at": "2026-06-04 04:00:00+00:00", "updated_at": "2026-06-04 04:18:53.600127+00:00", "lang": "en", "topics": ["machine-learning", "computer-vision", "natural-language-processing", "neural-networks", "ai-research"], "entities": ["IN2R", "Flickr30K", "MS-COCO", "CC152K", "GitHub"], "alternates": {"html": "https://wpnews.pro/news/intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via", "markdown": "https://wpnews.pro/news/intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via.md", "text": "https://wpnews.pro/news/intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via.txt", "jsonld": "https://wpnews.pro/news/intra-modal-neighbors-never-lie-rectifying-inter-modal-noisy-correspondence-via.jsonld"}}