
Why LLMs Hallucinate on Structured Knowledge: A Mechanistic Analysis of Reasoning over Linearized Representations

A new study reveals that large language models hallucinate on structured knowledge tasks due to systematic internal failures, not random errors. Researchers found that attention mechanisms disproportionately focus on shortcut-like cues while feed-forward layers fail to ground provided information, causing models to revert to parametric memory. These mechanistic patterns, which generalize across graphs and tables, enable effective hallucination detection in structured knowledge formats.

1 min read · published May 27, 2026

arXiv:2605.26362v1

Abstract: In many reasoning tasks, large language models (LLMs) rely on structured external knowledge, such as graphs and tables, which is typically linearized into sequential token representations. However, even when sufficient knowledge is available, LLMs can still produce hallucinated outputs, and the underlying mechanisms behind such failures remain poorly understood. We investigate these mechanisms and find that hallucinations arise from systematic internal dynamics rather than random noise. First, attention concentrates disproportionately on shortcut-like structural cues rather than distributing across the full context. Second, feed-forward representations fail to ground the provided knowledge, causing the model to revert to parametric memory. Moreover, our results indicate that hallucination is consistently associated with failures in semantic grounding within feed-forward layers, while attention allocation exhibits greater task-dependent variability. Finally, we show that these mechanistic patterns generalize beyond single-hop graphs to multi-hop and tabular settings, enabling effective hallucination detection across structured knowledge formats.
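
To make two ideas from the abstract concrete, here is a minimal sketch of (a) linearizing graph triples into a flat string and (b) scoring how concentrated an attention distribution is. This is an illustration, not the authors' method: the triple format, the function names, and the entropy-based concentration score are all assumptions chosen for simplicity, and the attention weights are hand-written toy values rather than model outputs.

```python
import math

def linearize_triples(triples):
    """Flatten (head, relation, tail) triples into one string.

    The "(h, r, t)" layout is an assumed format; the abstract does not
    specify how the graphs are linearized.
    """
    return " ".join(f"({h}, {r}, {t})" for h, r, t in triples)

def attention_entropy(weights):
    """Shannon entropy (in nats) of attention weights after normalization."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)

def concentration(weights):
    """Return 1 - normalized entropy.

    0.0 means attention is spread uniformly over all positions;
    1.0 means all attention sits on a single position.
    This score is a hypothetical proxy, not a metric from the paper.
    """
    n = len(weights)
    if n < 2:
        return 1.0
    return 1.0 - attention_entropy(weights) / math.log(n)

# Toy usage with hand-written weights over four context tokens.
context = linearize_triples([
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
])
spread = concentration([0.25, 0.25, 0.25, 0.25])   # uniform attention
peaked = concentration([0.97, 0.01, 0.01, 0.01])   # shortcut-like focus
```

In this toy setup a higher score would correspond to the "shortcut-like" concentration the abstract describes, although a real analysis would take per-head attention weights from a model and would need the paper's own definitions.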
