Why LLMs Hallucinate on Structured Knowledge: A Mechanistic Analysis of Reasoning over Linearized Representations


Source: arxiv.org · Published: 2026-05-27T04:00Z · Topic: large-language-models

A new study reveals that large language models hallucinate on structured knowledge tasks due to systematic internal failures, not random errors. Researchers found that attention mechanisms disproportionately focus on shortcut-like cues while feed-forward layers fail to ground provided information, causing models to revert to parametric memory. These mechanistic patterns, which generalize across graphs and tables, enable effective hallucination detection in structured knowledge formats.

arXiv:2605.26362v1

Abstract: In many reasoning tasks, large language models (LLMs) rely on structured external knowledge, such as graphs and tables, which is typically linearized into sequential token representations. However, even when sufficient knowledge is available, LLMs can still produce hallucinated outputs, and the underlying mechanisms behind such failures remain poorly understood. We investigate these mechanisms and find that hallucinations arise from systematic internal dynamics rather than random noise. First, attention concentrates disproportionately on shortcut-like structural cues rather than distributing across the full context. Second, feed-forward representations fail to ground the provided knowledge, causing the model to revert to parametric memory. Moreover, our results indicate that hallucination is consistently associated with failures in semantic grounding within feed-forward layers, while attention allocation exhibits greater task-dependent variability. Finally, we show that these mechanistic patterns generalize beyond single-hop graphs to multi-hop and tabular settings, enabling effective hallucination detection across structured knowledge formats.
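To make two ideas from the abstract concrete, here is a minimal sketch in Python. It is not the authors' code or their metric. The triple format, the separator strings, the function names (`linearize_triples`, `attention_entropy`, `cue_mass`), and the toy attention weights are all assumptions chosen for illustration. The sketch shows how a small graph might be linearized into one text sequence, and how one could quantify whether an attention distribution is spread across the context or concentrated on a few positions.

```python
import math

def linearize_triples(triples):
    """Flatten (head, relation, tail) triples into one text sequence.
    The ' | ' and ' ; ' separators are hypothetical, not the paper's format."""
    return " ; ".join(f"{h} | {r} | {t}" for h, r, t in triples)

def attention_entropy(weights):
    """Shannon entropy (in nats) of a single attention distribution.
    Lower entropy means attention is concentrated on fewer positions."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)

def cue_mass(weights, cue_positions):
    """Fraction of total attention mass that falls on the given positions."""
    total = sum(weights)
    return sum(weights[i] for i in cue_positions) / total

# Toy graph with two facts (illustrative data only).
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
]
context = linearize_triples(triples)

# Toy attention weights over four context positions (made up, not measured).
uniform = [0.25, 0.25, 0.25, 0.25]  # spread across the full context
peaked = [0.85, 0.05, 0.05, 0.05]   # concentrated on position 0

print(context)
print(attention_entropy(uniform), attention_entropy(peaked))
print(cue_mass(peaked, [0]))
```

In this toy setup the peaked distribution has lower entropy than the uniform one, which is one simple way to express "concentration" numerically. Which positions count as shortcut-like cues, and how the paper actually measures attention and feed-forward grounding, is not specified in the abstract, so `cue_positions` here is purely a placeholder.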

source & further reading

arxiv.org — original article
