Metadata Routing


A couple of months ago, I stumbled upon this video by Vincent D. Warmerdam about metadata routing in scikit-learn. I'll be honest, I had no idea what "metadata routing" even meant, but Vincent's explanation completely changed how I think about building ML pipelines.

The video showed me that one of the most frustrating problems in scikit-learn, passing sample weights and groups through complex pipelines, finally had an elegant solution. It piqued my curiosity enough that I dove deep into the feature and tested it extensively, and I was surprised by how little coverage it gets in technical blogs and articles. So I figured, why not write about it myself and share what I learned?

If you've ever struggled with imbalanced datasets, grouped cross-validation, or just wanted to pass custom information through your pipelines, this article is for you.

Let's start from the very beginning with a concrete example. You're building a credit card fraud detection model with this data:

X = transaction_features  # Amount, merchant, time, location, etc.
y = is_fraud             # 0 = legitimate, 1 = fraud

sample_weights = [1.0, 1.0, 10.0, 1.0, ...]  # Fraud transactions weighted 10x
customer_ids = [101, 102, 101, 103, ...]      # Which customer made each transaction

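Metadata is the "extra information" beyond your features (X) and labels (y). In this example there are two pieces:

sample_weight: how much each transaction should count (here, fraud rows count 10x)

groups: which customer each transaction belongs to

Imagine you're building this fraud detection system for a financial company. Fraud is rare, so the classes are heavily imbalanced, and each customer contributes several transactions.

The Challenge: Your model needs to pay more attention to the rare fraud cases, and it needs to be validated without the same customer showing up in both the training and the test folds.

The problem? This "metadata" (weights, groups) isn't part of your feature matrix X or labels y. It's auxiliary information that needs to flow through your entire ML pipeline.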
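To make the idea concrete, here is a minimal sketch of how the two metadata arrays line up with the labels. The values are made up for illustration; the only point is that every array has exactly one entry per sample.

```python
import numpy as np

# Toy labels: 1 = fraud, 0 = legitimate (made-up values for illustration).
y = np.array([0, 0, 1, 0, 1, 0])

# One weight per sample: fraud rows count 10x.
sample_weight = np.where(y == 1, 10.0, 1.0)

# One group label per sample: the customer who made each transaction.
customer_ids = np.array([101, 102, 101, 103, 102, 103])

# Metadata must line up row-for-row with X and y.
assert sample_weight.shape == y.shape == customer_ids.shape
print(sample_weight)
```

Neither array is a column of X. Both travel next to X and y, which is exactly what the routing machinery has to keep track of.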

Before scikit-learn 1.3, this was nearly impossible. Let's see why.

Prior to metadata routing, you'd face multiple interconnected problems:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression())
])

# Weight fraud rows 10x
fraud_weights = np.where(y == 1, 10.0, 1.0)

pipe.fit(X, y, sample_weight=fraud_weights)  # ValueError: Pipeline.fit does not accept the sample_weight parameter
from sklearn.model_selection import cross_val_score, GroupKFold

customer_groups = df['customer_id'].values

scores = cross_val_score(
    pipe, X, y,
    cv=GroupKFold(n_splits=5),
    groups=customer_groups  # Reaches the splitter, but there is no way to send fraud_weights to the scorer alongside it
)
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(pipe, param_grid, cv=GroupKFold(n_splits=5))

grid.fit(X, y, sample_weight=fraud_weights, groups=customer_groups)  # Doesn't work

So you can begin to see the problem by now. Pipelines had no way to route this metadata to specific components. You'd have to use hacky workarounds like clf__sample_weight, which was inconsistent, got unwieldy with nested pipelines, and could not deliver weights to the scorer during cross-validation.

Metadata routing solves ALL three problems at once with a clean, explicit API. Here's how it transforms our fraud detection pipeline:

from sklearn import set_config
from sklearn.model_selection import GridSearchCV, GroupKFold

set_config(enable_metadata_routing=True)

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression())
])

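# StandardScaler.fit also accepts sample_weight, so say explicitly that it should not receive it
pipe['scaler'].set_fit_request(sample_weight=False)
pipe['clf'].set_fit_request(sample_weight=True)
pipe['clf'].set_score_request(sample_weight=True)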

pipe.fit(X, y, sample_weight=fraud_weights)

# With routing enabled, cross-validation helpers take metadata through params= (scikit-learn 1.4+)
scores = cross_val_score(
    pipe, X, y,
    cv=GroupKFold(n_splits=5),
    params={'groups': customer_groups}
)

grid = GridSearchCV(pipe, param_grid, cv=GroupKFold(n_splits=5))
grid.fit(X, y, sample_weight=fraud_weights, groups=customer_groups)  # Both work!

print(f"Best model handles imbalance AND respects customer grouping!")

What changed? Each component explicitly declares what metadata it needs using set_*_request() methods. The pipeline then automatically routes metadata to the right places. Simple, explicit, powerful.

Here's what you need to know:

set_fit_request() controls what is routed to fit()

set_score_request() controls what is routed to score()

set_predict_request() controls what is routed to predict()

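Important: The pipeline doesn't pass metadata to every step. Only components that explicitly call set_*_request(metadata=True) will receive that metadata. A component whose method doesn't accept the metadata at all never receives it. A component that could accept it but hasn't declared a request is not skipped silently: scikit-learn raises an error asking you to choose, so set the request to False for steps that should not receive it.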

pipe = Pipeline([
    ('scaler', StandardScaler()),        # Can accept sample_weight, but we opt out below
    ('clf', LogisticRegression())        # Requests sample_weight
])

pipe['scaler'].set_fit_request(sample_weight=False)  # Scaler is fitted unweighted
pipe['clf'].set_fit_request(sample_weight=True)      # Only clf gets weights

pipe.fit(X, y, sample_weight=weights)

Let's build a custom transformer that uses sample weights during fitting. This is useful for weighted feature scaling or selection.

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class WeightedStandardScaler(TransformerMixin, BaseEstimator):
    """StandardScaler that respects sample weights during fitting."""

    # Request sample_weight in fit() by default. Because fit() names sample_weight
    # explicitly, BaseEstimator also generates set_fit_request() for this class,
    # so there is no need to override get_metadata_routing().
    __metadata_request__fit = {"sample_weight": True}

    def fit(self, X, y=None, sample_weight=None):
        """Fit scaler using weighted mean and std."""
        X = np.asarray(X, dtype=float)
        if sample_weight is None:
            sample_weight = np.ones(X.shape[0])
        sample_weight = np.asarray(sample_weight, dtype=float)

        # np.average normalizes the weights itself
        self.mean_ = np.average(X, axis=0, weights=sample_weight)
        variance = np.average((X - self.mean_) ** 2, axis=0, weights=sample_weight)
        self.std_ = np.sqrt(variance)

        return self

    def transform(self, X):
        """Transform using fitted statistics."""
        return (np.asarray(X, dtype=float) - self.mean_) / self.std_

from sklearn import set_config
set_config(enable_metadata_routing=True)

X = np.random.randn(100, 5)
weights = np.random.rand(100)

scaler = WeightedStandardScaler()
X_scaled = scaler.fit_transform(X, sample_weight=weights)
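The weighted statistics are easy to check by hand. Here is a tiny worked example with two samples, using the same np.average calls as the transformer; the numbers are made up so the arithmetic stays readable.

```python
import numpy as np

X = np.array([[1.0], [3.0]])
weights = np.array([3.0, 1.0])

# Weighted mean: (3*1 + 1*3) / (3 + 1) = 1.5
mean = np.average(X, axis=0, weights=weights)

# Weighted variance: (3*(1 - 1.5)**2 + 1*(3 - 1.5)**2) / 4 = 0.75
variance = np.average((X - mean) ** 2, axis=0, weights=weights)
std = np.sqrt(variance)

print(mean, variance)  # [1.5] [0.75]
```

With uniform weights the mean would be 2.0, so the heavier first sample pulls the center toward 1.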

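Here's what matters when building custom estimators:

Accept sample_weight as an explicit parameter in the fit() method; that is what makes set_fit_request() available

Declare a default request with the __metadata_request__fit class attribute, or call set_fit_request() on the instance; overriding get_metadata_routing() and using add_self_request() is only needed for meta-estimators that route metadata to other objects

Handle the None case when metadata isn't provided

Now let's use our custom transformer in a pipeline with multiple metadata consumers.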

from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.random.randn(1000, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
sample_weights = np.random.rand(1000)

X_train, X_test, y_train, y_test, w_train, w_test = train_test_split(
    X, y, sample_weights, test_size=0.2, random_state=42
)

pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('classifier', LogisticRegression(max_iter=1000))
])

# Requests are set on the steps; Pipeline itself has no set_fit_request()
pipe['scaler'].set_fit_request(sample_weight=True)
pipe['classifier'].set_fit_request(sample_weight=True)

pipe.fit(X_train, y_train, sample_weight=w_train)

pipe['classifier'].set_score_request(sample_weight=True)
score = pipe.score(X_test, y_test, sample_weight=w_test)

print(f"Weighted accuracy: {score:.3f}")

Pipeline Routing Rules:

Each step receives only the metadata it has requested through its set_*_request() methods.

Metadata routing shines in hyperparameter tuning scenarios where you need to pass weights or groups to cross-validation.

from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=15,
    n_redundant=5, weights=[0.9, 0.1], random_state=42
)

sample_weights = np.where(y == 1, 10.0, 1.0)

pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    # liblinear supports both the 'l1' and 'l2' penalties searched below
    ('clf', LogisticRegression(max_iter=1000, solver='liblinear'))
])

pipe['scaler'].set_fit_request(sample_weight=True)
pipe['clf'].set_fit_request(sample_weight=True)
pipe['clf'].set_score_request(sample_weight=True)

param_grid = {
    'clf__C': [0.1, 1.0, 10.0],
    'clf__penalty': ['l1', 'l2']
}

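from sklearn.metrics import accuracy_score, make_scorer

# A scorer only receives sample_weight if it requests it
weighted_accuracy = make_scorer(accuracy_score).set_score_request(sample_weight=True)

grid_search = GridSearchCV(
    pipe,
    param_grid,
    cv=5,
    scoring=weighted_accuracy,
    n_jobs=-1
)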

grid_search.fit(X, y, sample_weight=sample_weights)

print(f"Best params: {grid_search.best_params_}")
print(f"Best weighted score: {grid_search.best_score_:.3f}")

best_pipe = grid_search.best_estimator_

GridSearchCV Routing Features:

Supports the groups parameter for GroupKFold and similar splitters

Using Groups for Cross-Validation:

from sklearn.model_selection import GroupKFold

groups = np.repeat(np.arange(100), 10)  # 100 groups, 10 samples each

grid_search = GridSearchCV(
    pipe,
    param_grid,
    cv=GroupKFold(n_splits=5),
    n_jobs=-1
)

grid_search.fit(X, y, groups=groups, sample_weight=sample_weights)

Sometimes you need to pass different metadata values to different pipeline steps. Metadata aliasing lets you route metadata under different names.

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression


pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('clf', LogisticRegression())
])

pipe['scaler'].set_fit_request(sample_weight='feature_weights')  # Alias
pipe['clf'].set_fit_request(sample_weight='sample_weights')      # Alias

pipe.fit(
    X, y,
    feature_weights=feature_importance_weights,  # Goes to scaler (still one value per sample)
    sample_weights=class_balance_weights         # Goes to classifier
)

Use cases for aliasing:

Important: The parameter name you use in fit() must match the alias, not the internal parameter name.

Metadata routing works seamlessly with nested pipelines, automatically propagating metadata through all levels.

from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest

preprocessing = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('features', FeatureUnion([
        ('pca', PCA(n_components=10)),
        ('select', SelectKBest(k=5))
    ]))
])

main_pipe = Pipeline([
    ('preprocess', preprocessing),
    ('clf', LogisticRegression())
])

main_pipe['preprocess']['scaler'].set_fit_request(sample_weight=True)
main_pipe['clf'].set_fit_request(sample_weight=True)

main_pipe.fit(X, y, sample_weight=weights)

What happens:

A few things to remember about nested pipelines:

Reach inner steps by chaining step names, as in pipe['outer']['inner'], and set their requests there.

Complex example with FeatureUnion:

feature_union = FeatureUnion([
    ('weighted_pca', WeightedPCA()),      # Hypothetical custom transformer whose fit() accepts sample_weight
    ('standard_select', SelectKBest())    # Doesn't need sample_weight
])

pipe = Pipeline([
    ('features', feature_union),
    ('clf', LogisticRegression())
])

pipe['features'].transformer_list[0][1].set_fit_request(sample_weight=True)
pipe['clf'].set_fit_request(sample_weight=True)

pipe.fit(X, y, sample_weight=weights)

1. Always Enable Metadata Routing Explicitly

from sklearn import set_config
set_config(enable_metadata_routing=True)

2. Use Descriptive Metadata Names

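# Descriptive: each name matches an explicit parameter of the estimator's fit()
estimator.set_fit_request(sample_weight=True, class_prior=True)

# Too generic: nobody can tell what "metadata" carries
estimator.set_fit_request(metadata=True)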

3. Configure Routing at Pipeline Creation

pipe = Pipeline([...])
pipe['step1'].set_fit_request(sample_weight=True)
pipe['step2'].set_fit_request(sample_weight=True)

4. Handle None Gracefully in Custom Estimators

def fit(self, X, y=None, sample_weight=None):
    if sample_weight is None:
        sample_weight = np.ones(len(X))

Pitfall 1: Forgetting to Enable Metadata Routing

pipe.fit(X, y, sample_weight=weights)  # Metadata routing not enabled!

Pitfall 2: Not Configuring All Steps

pipe['clf'].set_fit_request(sample_weight=True)
pipe.fit(X, y, sample_weight=weights)  # Scaler request left unset: it gets no weights, and scikit-learn raises if its fit() accepts sample_weight

Pitfall 3: Mixing Old and New APIs

pipe.fit(X, y, clf__sample_weight=weights)  # Old way
pipe['clf'].set_fit_request(sample_weight=True)  # New way

Pitfall 4: Forgetting to Request Metadata in score()

pipe['clf'].set_fit_request(sample_weight=True)
pipe['clf'].set_score_request(sample_weight=True)  # Without this line, the weights never reach clf.score()
pipe.score(X, y, sample_weight=weights)

Check Routing Configuration:

print(pipe['clf'].get_metadata_routing())

Verify Metadata is Being Used:

def fit(self, X, y=None, sample_weight=None):
    print(f"Received sample_weight: {sample_weight is not None}")

Test with and without Metadata:

estimator.fit(X, y)  # Without metadata
estimator.fit(X, y, sample_weight=weights)  # With metadata

Metadata routing works with n_jobs in GridSearchCV and similar meta-estimators, and with the memory parameter in Pipeline for caching.

Use metadata routing when you need to:

Don't use metadata routing when:

When I first started working with metadata routing, I struggled to clearly demarcate what should be a feature versus what should be metadata. For instance, in the earlier credit card fraud use case we saw, I kept asking myself: "Should customer fraud history be a feature? What about transaction timestamps? Customer IDs?"

The line felt blurry, and I made several mistakes before understanding the distinction. Let me share what I learned, so you can avoid the confusion I went through.

Features (X): Information the model uses to make predictions

Metadata: Information about how to train/evaluate the model, but not used for predictions

Let's look at some ambiguous cases:

X = [[100.50, 'online', 'electronics'],  # Amount is a feature
     [25.00, 'store', 'groceries']]
y = [0, 1]  # Fraud labels

Decision: Feature - The model uses amount to predict fraud

X = [[100.50, 'online', 'electronics'],
     [25.00, 'store', 'groceries']]
y = [0, 1]

sample_weights = [1.0, 5.0]

Decision: Metadata - Tells the model "pay more attention to this sample" but isn't used for prediction

But wait! You could also make this a feature:

X = [[100.50, 'online', 'electronics', 0.0],   # Added fraud_history
     [25.00, 'store', 'groceries', 0.5]]
y = [0, 1]

Decision: Feature - Now the model learns "customers with high fraud history are risky"

Ask yourself: "Should the model learn patterns from this, or does it tell the model how to learn?"

Scenario	Feature or Metadata?	Why?
Transaction amount	Feature	Model predicts based on amount
Customer ID	Metadata (groups)	For grouping in CV, not prediction
Time of day	Feature	Model learns "3 AM transactions are suspicious"
Data quality score	Metadata (weight)	"Trust this sample more/less"
Previous fraud count	Could be either!	See below
Geographic location	Feature	Model learns regional patterns
Sample collection date	Metadata (groups)	For time-based CV splits

Some information genuinely could be either. Here's how to decide:

Option 1: As a Feature

X = [[100, 'online', 2],  # 2 previous frauds
     [50, 'store', 0]]     # 0 previous frauds

Option 2: As Metadata (Sample Weight)

X = [[100, 'online'],
     [50, 'store']]
sample_weights = [5.0, 1.0]  # Weight based on fraud history

Option 3: Both!

X = [[100, 'online', 2],
     [50, 'store', 0]]
sample_weights = [5.0, 1.0]

Use as Feature when:

Use as Metadata when:

Use as Both when:

Mistake 1: Using customer ID as a feature

X = [[101, 100, 'online'],  # Customer ID as feature
     [102, 50, 'store']]

Problem: Model memorizes customers instead of learning patterns. Use as metadata (groups) instead!

Mistake 2: Using sample importance as a feature

X = [[100, 'online', 5.0],  # Importance score as feature
     [50, 'store', 1.0]]

Problem: Importance score won't be available at prediction time. Use as metadata (sample_weight)!

Better Approach: Separate concerns

X = [[100, 'online'],
     [50, 'store']]
sample_weights = [5.0, 1.0]  # Importance
groups = [101, 102]           # Customer IDs

Features = What the model learns from

Metadata = How the model learns

When in doubt, ask yourself: "Will this be available when making predictions on new data?" If no, it's probably metadata!
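To see why customer IDs belong in groups rather than in X, here is a small sketch that checks for customer overlap between train and test folds. The six-row dataset is made up, and the splitter is called directly, so no routing is needed.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
customer_ids = np.array([101, 101, 102, 102, 103, 103])

leaks = []
for train_idx, test_idx in GroupKFold(n_splits=3).split(X, y, groups=customer_ids):
    # Customers that appear on both sides of the split would be a leak.
    shared = set(customer_ids[train_idx]) & set(customer_ids[test_idx])
    leaks.append(len(shared))

print(leaks)  # [0, 0, 0]
```

The IDs decide how the rows are split, but the model never sees them as an input column.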

Looking back at everything we've covered, metadata routing really changes the game for building ML pipelines in scikit-learn. No more hacky workarounds with clf__sample_weight or struggling to pass groups through cross-validation. You just declare what each component needs, and the routing system handles the rest. It's cleaner, more explicit, and honestly just makes sense.

What you should remember:

Enable routing with set_config(enable_metadata_routing=True)

Use set_*_request() methods to declare metadata requirements

Handle None gracefully in custom estimators

Next Steps:

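Author's Note: This article covers scikit-learn 1.3+, and the Pipeline, GridSearchCV and cross_val_score examples need 1.4 or later. Metadata routing is opt-in through set_config(enable_metadata_routing=True), and the scikit-learn docs still describe the API as experimental, so check the release notes for your version. Legacy parameter passing (e.g., clf__sample_weight) still works while routing is disabled.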

source & further reading

dev.to (original article)