{"slug": "i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring", "title": "I analyzed hidden-state dynamics across 7 open-weight LLMs and found recurring functional patterns. Looking for feedback", "summary": "An independent researcher analyzed hidden-state dynamics across seven open-weight LLMs and found recurring functional patterns encoded in representation geometry rather than individual neurons, with functional proxy states detectable across architectures. The findings suggest internal computation is organized through functional regimes evolving through computation, not fixed layers, and raise questions about how LLMs organize computation internally.", "body_md": "I’ve spent the last few months trying to answer a question that initially looked much simpler than it actually is:\n\n**What actually happens inside an LLM while it is generating a response?**\n\nMost work evaluates language models through their outputs (benchmarks, perplexity, reasoning scores…). I decided to look at something different: the evolution of the hidden representations themselves.\n\nI built a runtime framework that records hidden states layer-by-layer during inference and started running the same experiments across multiple open-weight models (GPT-2, DistilGPT2, OPT-125M, Qwen2.5-0.5B-Instruct, TinyLlama, Phi-1.5 and Llama-3.2-1B).\n\nI expected a relatively straightforward result.\n\nInstead, every new experiment generated a new question.\n\nSome of the observations so far are:\n\n• Hidden-state trajectories are not random. 
They exhibit reproducible internal dynamical regimes across architectures.\n\n• Functional proxy states (syntax-like processing, decision-like behavior and output stabilization) can be detected consistently enough to cluster models according to their internal dynamics rather than simply their parameter count.\n\n• These functional signatures remain reasonably stable across different prompt families, although not perfectly, suggesting that prompt content modulates the dynamics without completely changing the internal organization.\n\n• Linear probes can decode several functional categories directly from hidden representations with surprisingly high accuracy.\n\nAt that point the obvious question became:\n\n**Are we just overfitting labels?**\n\nSo I started adding progressively stronger negative controls.\n\nFirst: random labels.\n\nThen: random Gaussian representations.\n\nThen: feature permutation.\n\nFinally: orthogonal rotations.\n\nThe results became much more interesting.\n\nRandom labels collapse the decoding performance.\n\nRandom Gaussian representations also collapse it.\n\nFeature permutation destroys most of the signal.\n\nHowever…\n\nOrthogonal rotations preserve almost all decoding performance.\n\nThis strongly suggests that the relevant information is **not encoded in individual neurons or embedding dimensions**.\n\nInstead, it appears to be encoded in the **relative geometry of the representation**.\n\nThat was not the result I expected.\n\nAnother unexpected finding concerns depth.\n\nInitially I was looking for something like “syntax layers” or “semantic layers”.\n\nThe data doesn’t really support such a simple picture.\n\nInstead, the same functional signatures seem capable of appearing at different absolute layers depending on the architecture.\n\nThis led me to think less in terms of fixed layers and more in terms of **functional regimes evolving through computation**.\n\nAt this stage I am **not claiming to have discovered a universal law of transformers**.\n\nThese are empirical observations obtained on a limited set of open-weight 
models.\n\nWhat I do believe is that they raise interesting questions about how computation is actually organized inside modern LLMs.\n\nI’d really appreciate feedback from people working on:\n\n• mechanistic interpretability\n\n• representation learning\n\n• probing methods\n\n• transformer internals\n\n• geometry of representations\n\nIn particular I’d like your opinion on three questions:\n\n• Which control experiment would you absolutely require before taking these observations seriously?\n\n• Have you seen previous work showing comparable evidence that functional information is primarily encoded in representation geometry rather than individual dimensions?\n\n• If you were extending this project, what would be your next experiment?\n\nI’m not affiliated with a research lab; this is an independent research project. I’m sharing it because I would genuinely value critical feedback more than validation.\n\nIf there’s enough interest, I’m happy to share the methodology, code, and experimental reports.", "url": "https://wpnews.pro/news/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring", "canonical_source": "https://discuss.huggingface.co/t/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring-functional-patterns-looking-for-feedback/177217#post_1", "published_at": "2026-06-29 01:30:02+00:00", "updated_at": "2026-06-29 02:06:13.815024+00:00", "lang": "en", "topics": ["large-language-models", "machine-learning", "neural-networks", "ai-research"], "entities": ["GPT-2", "DistilGPT2", "OPT-125M", "Qwen2.5-0.5B-Instruct", "TinyLlama", "Phi-1.5", "Llama-3.2-1B"], "alternates": {"html": "https://wpnews.pro/news/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring", "markdown": "https://wpnews.pro/news/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring.md", "text": "https://wpnews.pro/news/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring.txt", "jsonld": 
"https://wpnews.pro/news/i-analyzed-hidden-state-dynamics-across-7-open-weight-llms-and-found-recurring.jsonld"}}