04:00
2026-05-27
arxiv.org
artificial-intelligence
RICE-PO: Turning Retrieval Interactions into Credit Signals for Reasoning Agents
Researchers have developed RICE-PO, a policy optimization framework that converts retrieval interactions into localized learning signals for training reasoning-based retrieval agents. The framework ad…