Evidence for feature-specific error correction in LLMs

Source: arxiv.org · Published: 2026-06-25T04:00Z · Topic: large-language-models

Researchers propose an empirical test for error correction in large language models based on perturbing residual-stream activations. They find that activations are robust to small perturbations and that candidate feature directions are privileged over generic ones, consistent with feature-specific error correction and computation in superposition. The results replicate across six LLMs, including Gemma-2-9B and Llama-3.1-8B: the fitted L^p exponent is p>2 for feature directions (contrastive, MELBO, and SAE-decoder) and p≈2 for the random and PCA controls.

Published Jun 25, 2026 · 1 min read

arXiv:2606.24964v1
Abstract: Understanding the features of large language models (LLMs) is a central goal of interpretability. LLMs are commonly assumed to use superposition to represent more features than they have dimensions. They may not only represent features in superposition but also perform computation in superposition. Theory predicts that computing in superposition requires error correction that privileges feature directions over generic ones, but this prediction has not been tested empirically. We propose an empirical test of error correction in LLMs based on activation perturbations. Perturbing residual-stream activations, we find that they are robust to small perturbations--forming activation plateaus consistent with error correction--but less robust along candidate feature directions ("pure" directions, constructed from contrastive prompt pairs) than along mixtures of two such directions, indicating that the pure directions are privileged. We quantify this privilegedness by modeling the perturbation effect as a function of the $L^p$-norm of its decomposition into feature components. For $p=2$ the response is a quadratic form with at most as many nonzero eigenvalues as the residual-stream dimension, which cannot privilege the many feature directions superposition requires. $p>2$ lifts this constraint and is consistent with feature-specific error correction. We find $p>2$ for contrastive, MELBO, and SAE-decoder directions, and $p\approx2$ for random and PCA directions (controls). These results replicate across Gemma-2-9B, Qwen3-1.7B, Llama-3.1-8B, Mistral-7B-v0.3, Aya-Expanse-8B, and Yi-1.5-9B. We further validate our method on a toy model of error correction with known ground-truth features, recovering $p>2$ for true feature directions, degrading toward $2$ as we rotate away from them.
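To make the role of the exponent concrete, here is a minimal sketch of the $L^p$ idea, not the paper's actual fitting procedure. It assumes two orthonormal feature directions, so a perturbation's decomposition is just its pair of coefficients, and it assumes the perturbation effect grows monotonically with the $L^p$-norm of those coefficients. The helper names `lp_norm` and `mixture_coeffs` are hypothetical.

```python
import math

def lp_norm(coeffs, p):
    """L^p norm of a list of feature coefficients."""
    return sum(abs(c) ** p for c in coeffs) ** (1.0 / p)

def mixture_coeffs(theta):
    """Coefficients of a unit-L2 perturbation that mixes two feature
    directions (assumed orthonormal here) at angle theta.
    theta = 0 is a pure direction; theta = pi/4 is an equal mixture."""
    return [math.cos(theta), math.sin(theta)]

pure = mixture_coeffs(0.0)
mixed = mixture_coeffs(math.pi / 4)

for p in (2.0, 4.0):
    # Both perturbations have the same Euclidean length; only p > 2
    # assigns the mixture a smaller norm than the pure direction.
    print(f"p={p}: pure={lp_norm(pure, p):.4f}, mixed={lp_norm(mixed, p):.4f}")
```

Under these assumptions, $p=2$ gives the pure and mixed perturbations the same norm, so it cannot distinguish them. For $p=4$ the equal mixture has norm $2^{-1/4}\approx 0.84$, smaller than the pure direction's norm of 1, which matches the abstract's observation that activations are less robust along pure directions than along mixtures.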

source & further reading

Read original on arxiv.org → arxiv.org/abs/2606.24964
