Source: discuss.huggingface.co · Topic: large-language-models

I analyzed hidden-state dynamics across 7 open-weight LLMs and found recurring functional patterns. Looking for feedback

An independent researcher analyzed hidden-state dynamics across seven open-weight LLMs and found recurring functional patterns encoded in representation geometry rather than individual neurons, with functional proxy states detectable across architectures. The findings suggest internal computation is organized through functional regimes evolving through computation, not fixed layers, and raise questions about how LLMs organize computation internally.

Published Jun 29, 2026 · 3 min read

I’ve spent the last few months trying to answer a question that initially looked much simpler than it actually is:

What actually happens inside an LLM while it is generating a response?

Most work evaluates language models through their outputs (benchmarks, perplexity, reasoning scores…). I decided to look at something different: the evolution of the hidden representations themselves.

I built a runtime framework that records hidden states layer-by-layer during inference and started running the same experiments across multiple open-weight models (GPT-2, DistilGPT2, OPT-125M, Qwen2.5-0.5B-Instruct, TinyLlama, Phi-1.5 and Llama-3.2-1B).
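For readers who want to try something similar, here is a minimal sketch of layer-by-layer recording. It is not the framework described above: it assumes PyTorch and the Hugging Face `transformers` library, uses the standard `output_hidden_states=True` flag, and runs on a tiny randomly initialised GPT-2 configuration so that nothing has to be downloaded. The helper name `record_hidden_states` and the configuration values are illustrative assumptions.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel


def record_hidden_states(model, input_ids):
    """Return a tensor of shape (n_layers + 1, seq_len, hidden_dim) for one sequence."""
    model.eval()
    with torch.no_grad():
        out = model(input_ids=input_ids, output_hidden_states=True)
    # out.hidden_states is a tuple: the embedding output, then one tensor per block,
    # each of shape (batch, seq_len, hidden_dim). Stack them and drop the batch axis.
    return torch.stack(out.hidden_states, dim=0)[:, 0, :, :]


# Tiny random model (assumption: stands in for a real checkpoint such as GPT-2)
config = GPT2Config(vocab_size=100, n_positions=32, n_embd=16, n_layer=3, n_head=2)
model = GPT2LMHeadModel(config)

input_ids = torch.randint(0, 100, (1, 8))  # one sequence of 8 random token ids
states = record_hidden_states(model, input_ids)
print(states.shape)  # layers (including embeddings) x tokens x hidden size
```

With a real checkpoint the same function should work unchanged after loading the model with `from_pretrained`; recording during token-by-token generation would need the call repeated per step.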

I expected a relatively straightforward result.

Instead, every new experiment generated a new question.

Some of the observations so far are:

• Hidden-state trajectories are not random. They exhibit reproducible internal dynamical regimes across architectures.

• Functional proxy states (syntax-like processing, decision-like behavior and output stabilization) can be detected consistently enough to cluster models according to their internal dynamics rather than simply their parameter count.

• These functional signatures remain reasonably stable across different prompt families, although not perfectly, suggesting that prompt content modulates the dynamics without completely changing the internal organization.

• Linear probes can decode several functional categories directly from hidden representations with surprisingly high accuracy.

At that point the obvious question became:

Are we just overfitting labels?

So I started adding progressively stronger negative controls.

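First: random (shuffled) labels.

Then: random Gaussian representations in place of the hidden states.

Then: feature permutation.

Finally: orthogonal rotations of the representation space.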

The results became much more interesting.

Random labels collapse the decoding performance.

Random Gaussian representations also collapse it.

Feature permutation destroys most of the signal.

However…

Orthogonal rotations preserve almost all decoding performance.

This strongly suggests that the relevant information is not encoded in individual neurons or embedding dimensions.

Instead, it appears to be encoded in the relative geometry of the representation.
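The four controls can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the code behind the results above: the "hidden states" are Gaussian clusters, the probe is a ridge regression onto one-hot labels, and all function names are hypothetical. Two caveats are worth keeping in mind. A single feature permutation shared by all samples is itself an orthogonal transform, so the sketch assumes the permutation is drawn independently per sample (the post does not say which variant was used). And a ridge or least-squares probe retrained on rotated features is mathematically equivalent to the original probe, so the rotation result is reproduced here by construction.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_data(n=2000, d=32, k=4):
    """Synthetic stand-in for hidden states: class signal spread over all dimensions."""
    means = rng.normal(size=(k, d))
    means -= means.mean(axis=1, keepdims=True)  # zero-sum rows: no class info in a sample's total
    y = rng.integers(0, k, size=n)
    X = means[y] + rng.normal(size=(n, d))
    return X, y


def probe_accuracy(X, y, k=4, ridge=1e-3, n_train=1400):
    """Fit a ridge-regression probe onto one-hot labels; return held-out accuracy."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    Xtr, Xte = Xb[:n_train], Xb[n_train:]
    ytr, yte = y[:n_train], y[n_train:]
    A = Xtr.T @ Xtr + ridge * np.eye(Xb.shape[1])
    W = np.linalg.solve(A, Xtr.T @ np.eye(k)[ytr])
    return float(np.mean((Xte @ W).argmax(axis=1) == yte))


X, y = make_data()
d = X.shape[1]
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal matrix

results = {
    "baseline": probe_accuracy(X, y),
    "random_labels": probe_accuracy(X, rng.permutation(y)),
    "gaussian_features": probe_accuracy(rng.normal(size=X.shape), y),
    # each row shuffled independently, so dimension identity is lost per sample
    "feature_permutation": probe_accuracy(rng.permuted(X, axis=1), y),
    "orthogonal_rotation": probe_accuracy(X @ Q, y),
}
for name, acc in results.items():
    print(f"{name:>20s}: {acc:.3f}")
```

On this toy data the first three controls should fall to roughly chance level (0.25 for four classes), while the rotated features should match the baseline. Because of the equivalence noted above, a stricter version of the rotation control would apply a probe trained on unrotated features to rotated inputs, or use a probe that is not rotation-invariant (for example one with an L1 penalty).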

That was not the result I expected.

Another unexpected finding concerns depth.

Initially I was looking for something like “syntax layers” or “semantic layers”.

The data doesn’t really support such a simple picture.

Instead, the same functional signatures seem capable of appearing at different absolute layers depending on the architecture.

This led me to think less in terms of fixed layers and more in terms of functional regimes evolving through computation.
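One simple way to compare models with different layer counts, sketched here as an assumption rather than as the method used in this project, is to index every per-layer curve by relative depth (0 for the first layer, 1 for the last) and resample it onto a shared grid. The accuracy values below are made-up toy numbers chosen only to show the mechanics, and the helper names are hypothetical.

```python
import numpy as np


def to_relative_depth(curve, n_points=11):
    """Resample a per-layer curve onto a shared grid of relative depth in [0, 1]."""
    curve = np.asarray(curve, dtype=float)
    layer_pos = np.linspace(0.0, 1.0, num=len(curve))  # position of each layer
    grid = np.linspace(0.0, 1.0, num=n_points)         # common grid for all models
    return np.interp(grid, layer_pos, curve)


def peak_depth(curve):
    """Relative depth at which the curve reaches its maximum."""
    curve = np.asarray(curve, dtype=float)
    return float(np.argmax(curve) / (len(curve) - 1))


# Hypothetical per-layer probe accuracies (NOT measured data)
shallow = [0.30, 0.45, 0.70, 0.90, 0.80, 0.60]              # 6-layer model
deep = [0.30, 0.35, 0.45, 0.55, 0.70, 0.80,
        0.90, 0.88, 0.80, 0.72, 0.65, 0.60]                 # 12-layer model

shallow_grid = to_relative_depth(shallow)
deep_grid = to_relative_depth(deep)

print("peak layer index:", int(np.argmax(shallow)), "vs", int(np.argmax(deep)))
print("peak relative depth:", round(peak_depth(shallow), 2), "vs", round(peak_depth(deep), 2))
print("curve correlation on shared grid:", round(float(np.corrcoef(shallow_grid, deep_grid)[0, 1]), 2))
```

In this toy case the peaks sit at different absolute layers but at similar relative depths, which is the kind of pattern the "functional regime" framing would predict. Relative depth is only one possible alignment; it would not capture regimes that stretch or compress unevenly between architectures.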

At this stage I am not claiming to have discovered a universal law of transformers.

These are empirical observations obtained on a limited set of open-weight models.

What I do believe is that they raise interesting questions about how computation is actually organized inside modern LLMs.

I’d really appreciate feedback from people working on:

• mechanistic interpretability

• representation learning

• probing methods

• transformer internals

• geometry of representations

In particular I’d like your opinion on three questions:

Which control experiment would you absolutely require before taking these observations seriously?

Have you seen previous work showing comparable evidence that functional information is primarily encoded in representation geometry rather than individual dimensions?

If you were extending this project, what would be your next experiment?

I’m not affiliated with a research lab; this is an independent research project. I’m sharing it because I would genuinely value critical feedback more than validation.

If there’s enough interest, I’m happy to share the methodology, code, and experimental reports.
