[ARTICLE · art-22182] src=arxiv.org pub= topic=large-language-models verified=true sentiment=· neutral

Trajectory Dynamics in Language Model Hidden States Predict Human Processing Costs Beyond Surprisal

Researchers introduced trajectory extrapolation error, a measure of how much a language model's hidden states deviate from a linear path during word processing, and found it independently predicts human reading times beyond surprisal. In experiments with GPT-2 and Pythia models on the Natural Stories corpus, the measure showed near-zero correlation with surprisal and was especially predictive for garden-path sentences. The findings reveal two dissociable components of processing cost: word-level prediction error and sensitivity to the local momentum of unfolding interpretation.

1 min read · Published Jun 5, 2026

arXiv:2606.05346v1. Abstract: Human language comprehension unfolds sequentially: each word is processed in the context of those that came before, and the interpretation builds incrementally over time. Surprisal, the negative log probability of a word given its context, has been the dominant predictor of incremental processing cost. But surprisal reduces rich sequential representations to a single scalar at each word, discarding information about the direction in which the interpretation has been evolving. Dynamical-systems approaches suggest that the trajectory of the evolving interpretive state, not just its position at each moment, should shape processing, and language itself may have local momentum, since speakers plan utterances a few words at a time. We introduce trajectory extrapolation error: at each word, we fit a linear trajectory to the preceding hidden states of a transformer language model and measure deviation from the extrapolated path. On the Natural Stories corpus, this measure is nearly orthogonal to surprisal (r = .044) and independently predicts self-paced reading times. The effect is especially pronounced in garden-path sentences, strengthens with model scale (GPT-2 Small to Large), and replicates across architectures with different positional encoding schemes (GPT-2 vs. Pythia/RoPE). A displacement control shows the effect is not reducible to representational change magnitude: displacement and extrapolation error predict in opposite directions. These findings reveal two dissociable components of processing cost: word-level prediction error (surprisal) and sensitivity to the local momentum of the unfolding interpretation (trajectory extrapolation error).
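
To make the measure concrete, here is a minimal sketch of one way a trajectory extrapolation error could be computed from a sequence of hidden states. The abstract does not give the window length, the fitting procedure, or the distance metric, so the choices below (a least-squares line over the last `window` states, one-step extrapolation, Euclidean distance) are illustrative assumptions, and the function name `extrapolation_error` is hypothetical rather than taken from the paper.

```python
import numpy as np

def extrapolation_error(hidden, window=3):
    """Illustrative sketch, not the paper's implementation.

    hidden: array of shape (T, d), one hidden-state vector per word.
    window: number of preceding states used for the linear fit (assumed value).
    Returns an array of length T; the first `window` entries are NaN because
    there is not enough preceding context to fit a line.
    """
    hidden = np.asarray(hidden, dtype=float)
    n_words = hidden.shape[0]
    errors = np.full(n_words, np.nan)

    # Design matrix [1, t] for time steps 0 .. window-1 of the local window.
    steps = np.arange(window, dtype=float)
    design = np.column_stack([np.ones(window), steps])

    for i in range(window, n_words):
        past = hidden[i - window:i]  # shape (window, d)
        # Least-squares line fitted independently in every hidden dimension.
        coef, *_ = np.linalg.lstsq(design, past, rcond=None)  # shape (2, d)
        # Extrapolate the fitted line one step beyond the window.
        predicted = coef[0] + coef[1] * window
        # Deviation of the actual state from the extrapolated path.
        errors[i] = np.linalg.norm(hidden[i] - predicted)
    return errors

# Toy usage: random vectors stand in for real model hidden states.
rng = np.random.default_rng(0)
toy_states = rng.normal(size=(10, 8))
toy_errors = extrapolation_error(toy_states, window=3)
```

With real data, `hidden` would hold the per-word hidden states of a model such as GPT-2 at a chosen layer. States that continue along a straight line yield an error near zero, while a sharp change of direction yields a large one, regardless of how probable the word itself was.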
