SAGE: Retain-Aware Post-Hoc Sanitization of Final Unlearning Vector

wpnews.pro


[ARTICLE · art-32099] src=arxiv.org ↗ pub=2026-06-18T04:00Z topic=machine-learning verified=true sentiment=· neutral

Researchers propose SAGE, a post-hoc method to sanitize the final update vector in LLM unlearning, reducing the trade-off between forgetting and retention without rerunning the original pipeline. SAGE uses spectral activation geometry to suppress components aligned with retained knowledge while preserving forgetting capability, consistently improving retention across multiple methods and benchmarks.

1 min read · published Jun 18, 2026

arXiv:2606.18309v1 (announce type: new)

Abstract: Large Language Model (LLM) unlearning aims to remove undesirable knowledge or behaviors while preserving retained capabilities. All current unlearning methods trade off forgetting against retention. We find that the retention activation bias can also be used to quantify the damage an unlearning method inflicts on retention, independent of how the unlearning process is implemented. This makes it possible to restore retention performance for any unlearning method post hoc. We therefore propose a complementary post-hoc setting that sanitizes the final update vector without rerunning the original unlearning pipeline. In this setting, we design SAGE (Spectral Activation-GEometry Sanitization), a source-agnostic correction for final unlearning updates. SAGE collects real module inputs from a small retain proxy, extracts their dominant activation geometry, and solves a source-anchored optimization objective in closed form. The solution suppresses update components aligned with high-energy retained directions while preserving the source method's forgetting carrier. Across multiple unlearning methods, model scales, and benchmarks, SAGE consistently relieves the retain-forget trade-off, identifying post-hoc sanitization of final vectors as a practical and underexplored axis for machine unlearning.
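The abstract does not spell out SAGE's objective, so the following is only a rough sketch of the general idea (collect retain activations, take their dominant directions, shrink the update along them in closed form). It assumes a ridge-style stand-in objective, ||D - ΔW||² + λ Σᵢ eᵢ ||D uᵢ||², where uᵢ and eᵢ are the top-k activation directions and their energies. The function names `retain_subspace` and `sanitize_update` and the parameters `k` and `lam` are hypothetical and do not come from the paper.

```python
import numpy as np


def retain_subspace(X, k):
    """Top-k input directions and energies of retain activations X (n, d_in)."""
    # Rows of Vt are orthonormal directions in the module's input space.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    energy = (s[:k] ** 2) / X.shape[0]  # mean squared activation per direction
    return Vt[:k].T, energy  # shapes (d_in, k) and (k,)


def sanitize_update(delta_W, X, k=4, lam=10.0):
    """Closed-form minimizer of the ASSUMED objective
        ||D - delta_W||_F^2 + lam * sum_i e_i * ||D u_i||^2,
    which is D = delta_W (I + lam * U diag(e) U^T)^{-1}.
    This is an illustrative stand-in, not the paper's formulation.
    """
    U, energy = retain_subspace(X, k)
    # Because U has orthonormal columns, the inverse reduces to a
    # per-direction shrink factor lam*e / (1 + lam*e) in [0, 1).
    shrink = lam * energy / (1.0 + lam * energy)
    return delta_W - ((delta_W @ U) * shrink) @ U.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic retain activations with a few high-energy directions.
    scales = np.array([5.0, 4.0, 3.0, 1.0, 0.1, 0.1, 0.1, 0.1])
    X = rng.normal(size=(64, 8)) * scales
    delta_W = rng.normal(size=(3, 8))  # stand-in for a final unlearning update

    D = sanitize_update(delta_W, X, k=3, lam=10.0)
    print("retain output shift before:", np.linalg.norm(delta_W @ X.T))
    print("retain output shift after: ", np.linalg.norm(D @ X.T))
```

In this sketch the update is left untouched outside the top-k retain directions, which is one simple way to read "preserving the source method's forgetting carrier"; the paper's actual anchoring may differ.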

source & further reading

arxiv.org — original article

Read original on arxiv.org → arxiv.org/abs/2606.18309
