{"slug": "auditing-llm-agents-may-require-auditing-the-upstream-feed", "title": "Auditing LLM agents may require auditing the upstream feed", "summary": "A new study found that adversarial manipulation of external information feeds can steer LLM agents' decisions against their defaults, in the clearest cases tipping a decision the model was uncertain about from 5% to 100%. The research, based on 2,785 decision rollouts across four open LLMs from three independent labs, shows that feed curation causally affects downstream decisions and argues that agent evaluations must audit the feed layer, not just the final prompt.", "body_md": "[Submitted on 30 May 2026]\n\n# Title: Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults\n\n[View PDF](/pdf/2606.00914)\n\n[HTML (experimental)](https://arxiv.org/html/2606.00914v1)\n\nAbstract: LLM agents increasingly act after consuming ranked external information streams such as social feeds, search results, retrieval contexts, and email queues, yet safety evaluations almost always test the model or the user prompt in isolation, never the upstream ranker that decides what the agent reads just before it acts. We introduce a controlled protocol that holds the model, persona, topic, and final decision prompt fixed and varies only the composition and ordering of the posts an agent encounters during a preceding ten-turn \"scrolling\" phase, isolating the causal effect of feed curation on a downstream decision. Across 2,785 decision rollouts on four modern open instruct LLMs from three independent labs, we identify three response regimes: adversarial capitulation, default saturation, and a default-direction asymmetry in which a one-sided feed tips a decision the model was genuinely uncertain about (in the clearest cases from 5% to 100%; Fisher p as low as 3 x 10^-10) but cannot dislodge one it already favors or holds firmly. 
The effect follows a dose-response curve, survives a generator swap that rules out a writing-style artifact, generalizes across several decision domains including security-relevant choices such as removing a deployment approval gate or relaxing access controls, and is partly mitigated by two simple feed-level defenses; a frontier model retains its default. We characterize the recommender as a practical, default-bounded control surface for LLM agents, and argue that agent evaluations must audit the feed layer rather than the final prompt alone.", "url": "https://wpnews.pro/news/auditing-llm-agents-may-require-auditing-the-upstream-feed", "canonical_source": "https://arxiv.org/abs/2606.00914", "published_at": "2026-06-18 10:46:58+00:00", "updated_at": "2026-06-18 10:52:56.353739+00:00", "lang": "en", "topics": ["large-language-models", "ai-safety", "ai-agents", "natural-language-processing"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/auditing-llm-agents-may-require-auditing-the-upstream-feed", "markdown": "https://wpnews.pro/news/auditing-llm-agents-may-require-auditing-the-upstream-feed.md", "text": "https://wpnews.pro/news/auditing-llm-agents-may-require-auditing-the-upstream-feed.txt", "jsonld": "https://wpnews.pro/news/auditing-llm-agents-may-require-auditing-the-upstream-feed.jsonld"}}