# CHEEM Improves Continual Learning Without Forgetting

> Source: <https://letsdatascience.com/news/cheem-improves-continual-learning-without-forgetting-85a3f67d>
> Published: 2026-05-27 06:01:37.974312+00:00

Per a revised arXiv paper and university press coverage, the framework CHEEM (Continual Hierarchical-Exploration-Exploitation Memory) is an exemplar-free, class-incremental continual learning method that adapts model structure during a stream of tasks. According to the arXiv submission (v5), CHEEM uses a Hierarchical Exploration-Exploitation guided neural architecture search (HEE-NAS) with four primitive operations (reuse, new, adapt, and skip) to update selected components across tasks. The authors report evaluations on vision transformer backbones across the MTIL and VDD benchmarks, with results that the paper describes as significantly outperforming prompting-based continual learning baselines and closely approaching the full fine-tuning upper bound. North Carolina State University coverage and multiple news outlets quote corresponding author Tianfu Wu on CHEEM's dual aims: reducing catastrophic forgetting and improving adaptive computation by modifying or skipping layers depending on task complexity.

## What happened

Per the arXiv paper (v5) titled "CHEEM: Continual Learning by Reuse, New, Adapt and Skip -- A Hierarchical Exploration-Exploitation Approach," the authors introduce CHEEM (Continual Hierarchical-Exploration-Exploitation Memory) as an exemplar-free class-incremental continual learning framework. The submission states that CHEEM implements a Hierarchical Exploration-Exploitation guided neural architecture search, HEE-NAS, which exposes four primitive operations (reuse, new, adapt, and skip) to dynamically update selected components as the model observes a stream of tasks. The paper reports experiments using Tiny and Base vision transformer backbones on the MTIL and VDD benchmarks and describes performance that significantly outperforms prompting-based continual learning baselines while closely approaching the full fine-tuning upper bound. North Carolina State University press materials and multiple media reports quote Tianfu Wu, the corresponding author, on the framework's aim of addressing both continual learning and adaptive intelligence.
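
As a rough illustration of how four such primitives could assemble a task-specific forward path, the sketch below applies one operation per component. It is not the authors' implementation: the names `Op`, `build_task_path`, and `forward`, the toy scalar "layers", and the reading of adapt as a residual add-on to a shared layer are all assumptions made for this example, and the search that chooses the operations (HEE-NAS) is not modeled.

```python
from enum import Enum
from typing import Callable, Dict, List, Sequence

# Toy stand-in for a network component: maps a float to a float.
Layer = Callable[[float], float]


class Op(Enum):
    REUSE = "reuse"
    NEW = "new"
    ADAPT = "adapt"
    SKIP = "skip"


def build_task_path(
    base_layers: Sequence[Layer],
    ops: Sequence[Op],
    new_layers: Dict[int, Layer],
    adapters: Dict[int, Layer],
) -> List[Layer]:
    """Assemble a per-task forward path from one operation per component."""
    if len(base_layers) != len(ops):
        raise ValueError("need exactly one operation per base layer")
    path: List[Layer] = []
    for i, (layer, op) in enumerate(zip(base_layers, ops)):
        if op is Op.REUSE:
            path.append(layer)  # share the existing component unchanged
        elif op is Op.NEW:
            path.append(new_layers[i])  # task-specific replacement component
        elif op is Op.ADAPT:
            adapter = adapters[i]
            # Assumption: "adapt" = shared layer plus a small residual adapter.
            path.append(lambda x, f=layer, g=adapter: f(x) + g(x))
        # Op.SKIP contributes nothing, so the component is bypassed.
    return path


def forward(path: Sequence[Layer], x: float) -> float:
    for layer in path:
        x = layer(x)
    return x


# Hypothetical three-component backbone and a hand-picked operation sequence.
base = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0]
task_ops = [Op.REUSE, Op.ADAPT, Op.SKIP]
path = build_task_path(base, task_ops, new_layers={}, adapters={1: lambda x: 0.5 * x})
result = forward(path, 1.0)
print(result)  # 5.0
```

In this toy setup the skipped third component drops out of the path entirely, which is the sense in which skipping can reduce compute for a given task.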

## Technical details

Per the arXiv submission, HEE-NAS functions as an internal memory that guides an efficient architecture search with the four primitive operations. The authors report that reuse lets the model leverage existing layers, new adds capacity, adapt modifies existing components, and skip omits components for efficiency. The paper also proposes a holistic Figure-of-Merit (FoM) metric to aggregate continual learning performance and compute-efficiency tradeoffs. Reported evaluations use both Tiny and Base vision transformer backbones; the submission states the method is exemplar-free and adopts an external memory of task centroids for task ID inference, following prior work.
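
Task ID inference from stored centroids can be sketched as a nearest-centroid lookup. The example below is a minimal illustration under stated assumptions, not the paper's procedure: it keeps a single mean feature vector per task and uses Euclidean distance, and the names `mean_vector`, `register_task`, and `infer_task_id` are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(features: Sequence[Vector]) -> List[float]:
    """Average a batch of feature vectors dimension by dimension."""
    if not features:
        raise ValueError("need at least one feature vector")
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def register_task(memory: Dict[str, List[float]], task_id: str,
                  features: Sequence[Vector]) -> None:
    """Store only the task's centroid; the raw samples are not kept."""
    memory[task_id] = mean_vector(features)


def infer_task_id(memory: Dict[str, List[float]], feature: Vector) -> str:
    """Return the task whose centroid is closest to the query feature."""
    if not memory:
        raise ValueError("no task centroids registered")
    return min(memory, key=lambda t: math.dist(memory[t], feature))


# Hypothetical 2-D features for two tasks seen one after the other.
memory: Dict[str, List[float]] = {}
register_task(memory, "task_a", [[0.0, 0.0], [2.0, 0.0]])
register_task(memory, "task_b", [[10.0, 10.0], [12.0, 10.0]])

predicted = infer_task_id(memory, [1.5, 0.5])
print(predicted)  # task_a
```

Because only the per-task means are stored, a memory of this kind holds no training samples, which is compatible with an exemplar-free setting.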

## Industry context

Editorial analysis: Continual learning research has two recurring technical needs: balancing stability against plasticity, and keeping compute efficient under variable task difficulty. Methods that combine architectural adaptation with selection mechanisms, as CHEEM does, fit within a growing trend toward adaptive computation and modular networks in both vision and language models. Observers in the field have increasingly prioritized exemplar-free approaches and task-agnostic evaluation metrics, which the arXiv paper explicitly addresses with its ExfCCL (exemplar-free class-incremental continual learning) framing and FoM proposal.

## Context and significance

Editorial analysis: The claim that CHEEM closely approaches a full fine-tuning upper bound is meaningful for practitioners because closing that gap while avoiding data replay or large exemplar stores reduces operational and privacy overhead in deployed systems. If reproductions confirm the paper's results, the approach could influence how teams design updateable models for long-lived vision deployments, where incremental labels arrive over time and compute budgets vary by request.

## What to watch

Editorial analysis: Key indicators for assessing CHEEM beyond the paper are:

- reproducibility of results across independent code releases and third-party benchmarks
- wall-clock compute and memory cost of HEE-NAS during streaming updates
- behavior on task sequences with abrupt distributional shifts versus gradual shifts

The arXiv submission indicates code availability; observers will want a full code release and replication reports to evaluate engineering tradeoffs in production settings.
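
For the cost indicator in particular, a small timing harness is often enough for a first look. The sketch below is a generic measurement pattern using Python's standard library, not anything taken from the paper: `measure_update` and `toy_update` are hypothetical names, and `tracemalloc` only tracks Python-level allocations, so accelerator or native tensor memory would need framework-specific tooling.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure_update(step: Callable[..., Any], *args: Any,
                   **kwargs: Any) -> Tuple[Any, float, int]:
    """Run one update step and report (result, seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = step(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_update(n: int) -> float:
    """Placeholder for a per-task update; it just allocates and sums a buffer."""
    buffer = [float(i) for i in range(n)]
    return sum(buffer)


total, seconds, peak_bytes = measure_update(toy_update, 10_000)
print(f"result={total}, time={seconds:.6f}s, peak={peak_bytes} bytes")
```

Wrapping each streaming update this way and logging the values per task would show whether search cost grows as more tasks arrive.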

## Takeaway for practitioners

Editorial analysis: CHEEM exemplifies a class of methods that combine neural architecture adaptation with continual learning desiderata. Teams evaluating continual learning options should treat the paper as a promising research direction but rely on reproduced benchmarks and measured update costs before adopting similar architecture-search-driven update pipelines in production. For academic and applied research, CHEEM contributes concrete primitives and an FoM metric that can be compared against replay, parameter-isolation, and prompting-based continual learning baselines.

## Scoring Rationale

The paper presents a substantive improvement in exemplar-free continual learning and introduces practical primitives and an FoM metric, which matter to practitioners designing long-lived models. The result is notable research rather than a paradigm shift, and the score reflects importance balanced against the need for reproduction and engineering-cost validation.