Adversarial Concept Search: Predicting Compositional Errors From Feature Geometry

wpnews.pro

Source: arxiv.org · Published: 2026-06-15T04:00Z · Topic: large-language-models

Researchers have proposed a method, described in an arXiv preprint, that uses an LLM's representational geometry to predict which concept combinations the model will fail to compose. They report that concept pairs encoded near-orthogonally compose reliably, while pairs whose linear encodings are close interfere with each other and fail. According to the authors, the approach anticipates failure modes across compositional tasks without evaluating specific inputs, and it lays the groundwork for targeted stress testing and active learning.

1 min read · published Jun 15, 2026

arXiv:2606.13934v1

Abstract: Humans cannot always intuit what scenarios are most challenging to LLMs. Hoping to capture challenging edge cases, developers either design problems to be difficult for humans or curate extensive benchmarks. What if we could instead anticipate which scenarios a model will fail on? In this paper, we use an LLM's representational geometry to predict which concept combinations it will fail on. We attribute this compositional failure to interference between salient features. In tasks that require systematic composition (toy programmatic settings, multihop reasoning, multilingual factual recall) we find that when a pair of concepts is encoded near-orthogonally, the model reliably composes them. When their linear encodings are close, producing interference, the model fails to compose them. Our method reliably anticipates failure modes across different compositional tasks, without evaluating specific inputs. These results lay the groundwork to use representational geometry to identify high-risk examples, construct targeted stress tests, and provide a scalable foundation for active learning in real-world deployment.
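To make the geometric idea concrete, here is a minimal sketch in Python. It is not the paper's method, which the abstract does not spell out. It assumes each concept already has a linear feature direction (for example, from a probe) and uses absolute cosine similarity as a stand-in for "closeness" of encodings. The function names `cosine` and `rank_pairs_by_overlap` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_u * norm_v)


def rank_pairs_by_overlap(directions):
    """Return (score, concept_a, concept_b) tuples, most overlapping first.

    `directions` maps a concept name to an assumed linear feature
    direction. The score is |cos|: 0 means orthogonal, 1 means collinear.
    """
    scored = [
        (abs(cosine(directions[a], directions[b])), a, b)
        for a, b in combinations(sorted(directions), 2)
    ]
    return sorted(scored, reverse=True)


# Made-up 3-d directions; real ones would come from model activations.
toy_directions = {
    "color": [1.0, 0.0, 0.0],
    "shape": [0.0, 1.0, 0.0],
    "hue": [0.9, 0.1, 0.0],
}

ranking = rank_pairs_by_overlap(toy_directions)
for score, a, b in ranking:
    print(f"{a:>5} / {b:<5} overlap={score:.3f}")
```

Under the abstract's hypothesis, pairs at the top of such a ranking (here the nearly collinear "color" and "hue") would be the candidates to stress-test first, while near-orthogonal pairs such as "color" and "shape" would be expected to compose reliably. Whether cosine similarity is the right closeness measure is an assumption of this sketch.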

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.13934
