Pruning via Causal Attribution Preserves Reasoning Performance in Large Language Models

wpnews.pro

Researchers introduced Causal Attribution Pruning (CAP), a training-free method that identifies critical attention heads in large language models by measuring their causal impact on reasoning tasks. CAP achieved up to 61% relative accuracy gains over Wanda on ARC-Challenge at 20% sparsity, preserving reasoning performance better than correlational pruning criteria. The method was evaluated on Llama-3-8B-Instruct and Mistral-7B-Instruct across GSM8K, StrategyQA, and ARC-Challenge at various sparsity levels.
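The 61% figure is a relative gain, which is not the same as a 61-point jump in accuracy. A minimal sketch of the arithmetic follows; the function name `relative_gain` and the two accuracy values are hypothetical and chosen only for illustration, they are not results from the paper.

```python
def relative_gain(acc_method: float, acc_baseline: float) -> float:
    """Relative improvement of a method over a baseline, as a fraction of the baseline."""
    return (acc_method - acc_baseline) / acc_baseline

# Hypothetical accuracies, NOT taken from the paper.
baseline_acc = 0.50   # e.g. a baseline pruning method
method_acc = 0.60     # e.g. the method being compared

gain = relative_gain(method_acc, baseline_acc)
absolute_points = (method_acc - baseline_acc) * 100

# A 10-point absolute improvement on a 50% baseline is roughly a 20% relative gain.
print(f"relative gain: {gain:.0%}, absolute gain: {absolute_points:.0f} points")
```

The lower the baseline accuracy, the larger the relative gain that a given absolute improvement produces, so relative figures are best read alongside the raw accuracies.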

Published Jun 19, 2026 · 1 min read

arXiv:2606.19350v1

Abstract: Large language models (LLMs) excel at multi-step reasoning but incur substantial inference cost. We introduce Causal Attribution Pruning (CAP), a training-free method that identifies critical attention heads by measuring their causal impact on reasoning tasks and uses these head-level scores to guide fine-grained weight pruning. For each attention head, CAP estimates the expected performance degradation when the head is masked during forward passes on a small calibration set of reasoning problems. These causal scores are then converted into weight-level importance values for the corresponding projection matrices. Unlike magnitude-only or activation-based criteria, CAP's interventional measurement directly captures each head's functional contribution, yielding relative accuracy gains of up to 61% over Wanda on ARC-Challenge at 20% sparsity. We evaluate CAP on GSM8K, StrategyQA, and ARC-Challenge using Llama-3-8B-Instruct and Mistral-7B-Instruct at 10%, 20%, and 50% sparsity. At moderate sparsity (10-20%), CAP improves over Wanda in most model-benchmark configurations, with especially large gains on ARC-Challenge for Llama-3. Our results suggest that attention-head-level causal attribution can better preserve reasoning performance on downstream benchmarks than correlational pruning criteria at equivalent sparsity, while remaining limited by coarse MLP attribution at 50% sparsity.
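The two steps the abstract describes, masking each head to measure the performance drop and then turning head scores into weight-level importance, can be sketched on a toy model. This is a minimal illustration, not the authors' implementation. The three-"head" toy model, the calibration examples, and the `head_proj` weights are all made up, and the conversion rule used here (head score times absolute weight) is an assumption, since the abstract does not say how the conversion is done.

```python
# Toy stand-in for an LLM: head h passes feature h straight through,
# and the "logit" is the sum of the outputs of all unmasked heads.
def forward(x, mask):
    return sum(x_h for x_h, keep in zip(x, mask) if keep)

def accuracy(mask, calibration):
    """Fraction of calibration examples predicted correctly under a head mask."""
    correct = 0
    for x, label in calibration:
        pred = 1 if forward(x, mask) > 0 else 0
        correct += int(pred == label)
    return correct / len(calibration)

def causal_head_scores(n_heads, calibration):
    """Score each head by the accuracy drop observed when only that head is masked."""
    base = accuracy([True] * n_heads, calibration)
    scores = []
    for h in range(n_heads):
        mask = [i != h for i in range(n_heads)]  # intervene on head h only
        scores.append(base - accuracy(mask, calibration))
    return scores

def prune_by_importance(head_proj, scores, sparsity):
    """Zero the lowest-importance weights.

    ASSUMPTION: importance = head score * |weight|. The abstract only says
    head scores are converted to weight-level importance values.
    """
    entries = [(scores[h] * abs(w), h, j)
               for h, row in enumerate(head_proj)
               for j, w in enumerate(row)]
    entries.sort()  # ascending importance
    n_prune = int(sparsity * len(entries))
    pruned = [row[:] for row in head_proj]
    for _, h, j in entries[:n_prune]:
        pruned[h][j] = 0.0
    return pruned

# Hypothetical calibration set: (features, label).
calibration = [
    ((1.0, 0.0, 0.0), 1),
    ((2.0, 0.0, -1.0), 1),
    ((-1.0, 0.0, 0.0), 0),
    ((0.0, 1.0, 0.0), 1),
    ((-1.0, 0.0, 0.5), 0),
]

# Hypothetical per-head projection weights (illustrative only; they are
# not wired into the toy forward pass above).
head_proj = [[0.5, -0.1], [0.3, 0.05], [0.9, -0.8]]

scores = causal_head_scores(3, calibration)
pruned = prune_by_importance(head_proj, scores, sparsity=0.5)
print("head scores:", scores)
print("pruned weights:", pruned)
```

In this toy setup, masking the third head never changes a prediction, so its weights are pruned first even though they are the largest in magnitude. A magnitude-only criterion would instead keep them and remove the smaller weights of the heads that actually affect accuracy.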

source & further reading

Read original on arxiv.org → arxiv.org/abs/2606.19350
