GRIP: Feedback-Guided Prompt Retrieval for Large Multimodal Models





Researchers introduced GRIP, a feedback-guided retrieval framework for multimodal in-context learning that selects examples based on their actual impact on model predictions rather than visual similarity. The system improved performance across classification, captioning, and visual question answering tasks on multiple large multimodal models. GRIP's retrievers can transfer between different models without retraining, including closed-source systems like GPT-4o and Gemini, enabling more efficient deployment of in-context learning.

Published Jun 12, 2026 · 1 min read

arXiv:2606.12744v1

Abstract: In-Context Learning (ICL) has become a powerful mechanism for adapting Large Language Models (LLMs) to new tasks without fine-tuning. Extending this concept to Large Multimodal Models (LMMs), Multimodal In-Context Learning (M-ICL) relies on retrieving relevant examples, such as images, captions, or question-answer pairs, to guide predictions across tasks like classification, captioning, and visual question answering (VQA). Most existing approaches select in-context examples based on feature-space similarity, assuming that semantically similar samples provide the most useful context. However, our systematic analysis reveals that this assumption does not always hold: visually similar examples are not necessarily those that most effectively enhance in-context learning performance. To address this, we propose the Guided Retrieval of In-context Prompts (GRIP), a learnable vision-only retrieval framework that leverages feedback from LMMs to identify examples that truly improve model predictions. GRIP learns to distinguish beneficial from detrimental in-context examples through contrastive training, refining retrieval beyond pure similarity. Across three multimodal tasks, namely classification, captioning, and VQA, GRIP improves consistently over similarity-based retrieval on Qwen2.5-VL-7B, with its strongest gains in classification on Idefics2-8B. Moreover, we demonstrate that retrievers trained with feedback from one open LMM can be transferred to other models without retraining, including closed-source GPT-4o and Gemini, enabling scalable and cost-efficient deployment of M-ICL. Code will be published upon acceptance.
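The contrast the abstract draws between similarity-based retrieval and feedback-guided contrastive training can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the labeling rule or the loss, so the threshold rule in `feedback_labels` and the InfoNCE-style form of `contrastive_loss` are assumptions, as are all function names, the toy 2-D embeddings, and the made-up feedback scores.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_top_k(query, pool, k):
    """Baseline retrieval: indices of the k candidates most similar to the query."""
    order = sorted(range(len(pool)), key=lambda i: cosine(query, pool[i]), reverse=True)
    return order[:k]


def feedback_labels(score_without, scores_with):
    """Assumed labeling rule: a candidate is beneficial (1) if the LMM's score
    with that candidate in context beats the score without it, else detrimental (0)."""
    return [1 if s > score_without else 0 for s in scores_with]


def contrastive_loss(query, pool, labels, temperature=0.1):
    """Assumed InfoNCE-style objective: low when beneficial candidates are
    closer to the query than detrimental ones."""
    logits = [cosine(query, c) / temperature for c in pool]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    positive_mass = sum(e for e, y in zip(exps, labels) if y == 1)
    if positive_mass == 0:
        raise ValueError("at least one beneficial candidate is required")
    return -math.log(positive_mass / sum(exps))


# Toy data (hypothetical): candidate 0 looks most like the query,
# but only candidate 2 actually helps the model.
query = [1.0, 0.0]
pool = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
labels = feedback_labels(score_without=0.5, scores_with=[0.2, 0.4, 0.9])

print("similarity picks:", similarity_top_k(query, pool, k=1))
print("feedback labels:", labels)
print("loss to minimize:", contrastive_loss(query, pool, labels))
```

In this toy setup the similarity baseline retrieves a candidate that the feedback marks as detrimental, so the loss is high; a trainable retriever would update its embeddings to lower that loss, moving the helpful candidate closer to the query.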

source & further reading


Read original on arxiv.org → arxiv.org/abs/2606.12744


