Pythia-160m

mentions: 1, type: Organization

// recent coverage (1 mention)

2026-06-19 04:00 | arxiv.org | machine-learning

How Linear Is a Transformer Feed-Forward Block? Per-Block Linear Recoverability Is Learned, Not Architectural

A new study measures the linearity of transformer feed-forward blocks, finding that linear recoverability (R²_lin) varies widely across blocks and is a learned property, not an architectural one. The …
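To make the idea of linear recoverability concrete, here is a minimal sketch. It assumes that R²_lin is the coefficient of determination of the best affine least-squares map from a block's inputs to its outputs; the summary above does not give the study's exact definition, so this is an assumption. The helper `linear_r2`, the `toy_ffn` block, and the random weights are all hypothetical and are not taken from the paper or from Pythia-160m.

```python
import numpy as np


def linear_r2(x, y):
    """R^2 of the best affine least-squares map from x to y.

    Assumed stand-in for R^2_lin; the study's definition may differ.
    """
    # Append a bias column so the fit is affine rather than strictly linear.
    x_aug = np.hstack([x, np.ones((x.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
    residual = y - x_aug @ weights
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((y - y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy feed-forward block with random (untrained) weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 16))
w_in = rng.normal(size=(16, 64))
w_out = rng.normal(size=(64, 16))


def toy_ffn(inputs):
    # Two-layer block with a ReLU nonlinearity between the projections.
    return np.maximum(inputs @ w_in, 0.0) @ w_out


# The same block without the nonlinearity is exactly linear in its input.
r2_linear = linear_r2(x, x @ w_in @ w_out)
r2_relu = linear_r2(x, toy_ffn(x))
print(f"no nonlinearity: {r2_linear:.4f}, with ReLU: {r2_relu:.4f}")
```

In this toy setup the block without the nonlinearity is recovered almost perfectly, while the ReLU block scores lower. In a real model the score would depend on the trained weights and on the distribution of inputs to the block.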

// co-occurs with top 3 entities

GPT-2 (1), llama-160m (1), arXiv (1)