Metadata Routing

A developer discovered metadata routing in scikit-learn, a feature that elegantly solves the problem of passing sample weights and groups through complex ML pipelines. The feature, enabled via set_config(enable_metadata_routing=True), allows pipelines to route auxiliary information like fraud detection weights and customer IDs to specific components, eliminating hacky workarounds.

A couple of months ago, I stumbled upon this video by Vincent D. Warmerdam (https://www.youtube.com/watch?v=lQ_-Aja-slA) about metadata routing in scikit-learn. I'll be honest, I had no idea what "metadata routing" even meant, but Vincent's explanation completely changed how I think about building ML pipelines. The video showed me that one of the most frustrating problems in scikit-learn, passing sample weights and groups through complex pipelines, finally had an elegant solution. It piqued my curiosity enough that I dove deep into the feature and tested it extensively, and honestly, I was surprised by how little coverage it gets in technical blogs and articles. So I figured, why not write about it myself and share what I learned? If you've ever struggled with imbalanced datasets, grouped cross-validation, or just wanted to pass custom information through your pipelines, this article is for you.

Let's start from the very beginning, with a concrete example. You're building a credit card fraud detection model with this data:

- `X`: transaction features (amount, merchant, time, location, etc.)
- `y`: `is_fraud` (0 = legitimate, 1 = fraud)

But you also have additional information:

- `sample_weights = [1.0, 1.0, 10.0, 1.0, ...]`: fraud transactions weighted 10x
- `customer_ids = [101, 102, 101, 103, ...]`: which customer made each transaction

Metadata is the "extra information" beyond your features `X` and labels `y`, such as `sample_weight` and `groups`.

Imagine you're building a fraud detection system for a financial company. You have the transactions, a weight for each sample, and the ID of the customer behind each transaction. The challenge: your model needs to pay more attention to the rare fraud cases and keep each customer's transactions together during cross-validation.

The problem? This "metadata" (weights, groups) isn't part of your feature matrix `X` or labels `y`. It's auxiliary information that needs to flow through your entire ML pipeline. Before scikit-learn 1.3 introduced metadata routing, there was no consistent way to do this. Let's see why.
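To make the two kinds of metadata concrete, here is a small pure-Python sketch. The toy transactions and the `assign_group_folds` helper are made up for illustration, and the round-robin fold assignment is a simplification, not how scikit-learn's `GroupKFold` balances its folds. It only demonstrates the invariant that matters: every customer lands in exactly one fold.

```python
from collections import defaultdict

# Hypothetical toy transactions: (customer_id, amount, is_fraud)
transactions = [
    (101, 100.50, 0),
    (102, 25.00, 0),
    (101, 980.00, 1),
    (103, 12.75, 0),
    (104, 430.10, 1),
    (102, 61.20, 0),
]

# Features and labels
X = [[amount] for _, amount, _ in transactions]
y = [is_fraud for _, _, is_fraud in transactions]

# Metadata: one entry per row, aligned with X and y
sample_weights = [10.0 if label == 1 else 1.0 for label in y]
customer_ids = [cid for cid, _, _ in transactions]


def assign_group_folds(groups, n_folds):
    """Assign every distinct group to exactly one fold (simple round-robin)."""
    fold_of_group = {}
    for g in groups:
        if g not in fold_of_group:
            fold_of_group[g] = len(fold_of_group) % n_folds
    return [fold_of_group[g] for g in groups]


folds = assign_group_folds(customer_ids, n_folds=2)

# Check the invariant: no customer appears in more than one fold
folds_per_customer = defaultdict(set)
for cid, fold in zip(customer_ids, folds):
    folds_per_customer[cid].add(fold)
no_customer_is_split = all(len(f) == 1 for f in folds_per_customer.values())

print(sample_weights)
print(folds, no_customer_is_split)
```

Neither list is a column of `X`: the weights say how much each row should count, and the customer IDs say which rows must stay together.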
Prior to metadata routing, you'd face multiple interconnected problems:

Problem 1: passing weights through a pipeline

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Your fraud detection pipeline
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression()),
])

# You have fraud weights (fraudulent transactions weighted 10x)
fraud_weights = np.where(y == 1, 10.0, 1.0)

# This doesn't work
pipe.fit(X, y, sample_weight=fraud_weights)  # Error: Pipeline.fit does not accept a bare sample_weight
```

Problem 2: using groups in cross-validation

```python
from sklearn.model_selection import cross_val_score, GroupKFold

# You have customer IDs (can't split customers across folds)
customer_groups = df['customer_id'].values

# Awkward with pipelines: only the splitter ever sees the groups
scores = cross_val_score(
    pipe, X, y,
    cv=GroupKFold(n_splits=5),
    groups=customer_groups,  # Pipeline doesn't know what to do with this
)
```

Problem 3: combining weights and groups in GridSearchCV

```python
from sklearn.model_selection import GridSearchCV

# You need BOTH weights AND groups during hyperparameter tuning
grid = GridSearchCV(pipe, param_grid, cv=GroupKFold(n_splits=5))

# No clean way to pass both
grid.fit(X, y, sample_weight=fraud_weights, groups=customer_groups)  # Doesn't work
```

So you can begin to see the problem by now. Pipelines had no way to route this metadata to specific components. You'd have to use hacky workarounds like `clf__sample_weight`, which was inconsistent, got unwieldy with nested pipelines, and never reached the scorer during cross-validation.

Metadata routing solves ALL three problems at once with a clean, explicit API.
Here's how it transforms our fraud detection pipeline:

```python
from sklearn import set_config
from sklearn.model_selection import GridSearchCV, GroupKFold, cross_val_score

# Enable metadata routing globally
set_config(enable_metadata_routing=True)

# Build the fraud detection pipeline
# StandardScaler.fit also accepts sample_weight, so it has to opt out explicitly
pipe = Pipeline([
    ('scaler', StandardScaler().set_fit_request(sample_weight=False)),
    ('clf', LogisticRegression()),
])

# Configure metadata routing - declare what each component needs
pipe['clf'].set_fit_request(sample_weight=True)
pipe['clf'].set_score_request(sample_weight=True)

# Problem 1 SOLVED: Pass weights through pipeline
pipe.fit(X, y, sample_weight=fraud_weights)

# Problem 2 SOLVED: Use groups in cross-validation
# With routing enabled, cross_val_score takes metadata through `params`
scores = cross_val_score(
    pipe, X, y,
    cv=GroupKFold(n_splits=5),
    params={'groups': customer_groups},
)

# Problem 3 SOLVED: Combine weights AND groups in GridSearchCV
grid = GridSearchCV(pipe, param_grid, cv=GroupKFold(n_splits=5))
grid.fit(X, y, sample_weight=fraud_weights, groups=customer_groups)  # Both work
print("Best model handles imbalance AND respects customer grouping")
```

What changed? Each component explicitly declares what metadata it needs using `set_*_request` methods. The pipeline then automatically routes metadata to the right places. Simple, explicit, powerful.

Here's what you need to know:

- `set_fit_request` controls what reaches `fit`
- `set_score_request` controls what reaches `score`
- `set_predict_request` controls what reaches `predict`

Important: The pipeline doesn't pass metadata to every step. Only components that explicitly call `set_*_request(metadata=True)` will receive that metadata. Components that don't request metadata won't receive it, even if you pass it to the pipeline. One caveat: when a step's method accepts a given metadata name (as `StandardScaler.fit` does with `sample_weight`) and you pass that metadata without having set a request either way, scikit-learn raises an error instead of guessing, so opt out explicitly with `False`.
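The routing rule can be pictured with a toy model in plain Python. This is a conceptual sketch and not scikit-learn's implementation: `ToyStep` and `route_fit` are hypothetical names, and the three request states (`True`, `False`, unset) are my simplified reading of the behaviour described above. Unlike the real library, the toy silently skips metadata that a step cannot accept at all.

```python
class ToyStep:
    """Stand-in for an estimator; records the metadata its fit() received."""

    def __init__(self, name, accepts, requests=None):
        self.name = name
        self.accepts = set(accepts)           # metadata names fit() could use
        self.requests = dict(requests or {})  # name -> True / False; absent means unset
        self.received = None

    def fit(self, **metadata):
        self.received = metadata
        return self


def route_fit(steps, **metadata):
    """Send each step only the metadata it explicitly requested."""
    plan = []
    for step in steps:
        routed = {}
        for name, value in metadata.items():
            if name not in step.accepts:
                continue  # this step has no use for it
            decision = step.requests.get(name)  # True, False, or None (unset)
            if decision is None:
                raise ValueError(
                    f"{name!r} was passed but {step.name!r} has not said "
                    "whether it wants it"
                )
            if decision:
                routed[name] = value
        plan.append((step, routed))
    # Fit only after every request has been validated
    for step, routed in plan:
        step.fit(**routed)
    return steps


weights = [1.0, 10.0, 1.0]
scaler = ToyStep("scaler", accepts={"sample_weight"}, requests={"sample_weight": False})
clf = ToyStep("clf", accepts={"sample_weight"}, requests={"sample_weight": True})
route_fit([scaler, clf], sample_weight=weights)

# A step that accepts sample_weight but never declared a request is rejected
undecided = ToyStep("undecided", accepts={"sample_weight"})
try:
    route_fit([undecided], sample_weight=weights)
    unset_request_rejected = False
except ValueError:
    unset_request_rejected = True
```

The point of the explicit three-state design is that nothing is routed by accident: a step either asks for the weights, declines them, or forces you to decide.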
Example: Selective routing

```python
pipe = Pipeline([
    ('scaler', StandardScaler().set_fit_request(sample_weight=False)),  # Doesn't request sample_weight
    ('clf', LogisticRegression()),                                      # Requests sample_weight
])
pipe['clf'].set_fit_request(sample_weight=True)  # Only clf gets weights

# When you call:
pipe.fit(X, y, sample_weight=weights)

# What happens:
# - scaler.fit(X, y)                             → NO sample_weight (didn't request it)
# - clf.fit(X_scaled, y, sample_weight=weights)  → Gets sample_weight (requested it)
```

Let's build a custom transformer that uses sample weights during fitting. This is useful for weighted feature scaling or selection.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class WeightedStandardScaler(BaseEstimator, TransformerMixin):
    """StandardScaler that respects sample weights during fitting."""

    def fit(self, X, y=None, sample_weight=None):
        """Fit scaler using weighted mean and std."""
        X = np.asarray(X, dtype=float)
        if sample_weight is None:
            sample_weight = np.ones(X.shape[0])
        # Normalize weights
        sample_weight = np.asarray(sample_weight, dtype=float)
        sample_weight = sample_weight / sample_weight.sum()
        # Compute weighted statistics (fitted attributes end with an underscore)
        self.mean_ = np.average(X, axis=0, weights=sample_weight)
        variance = np.average((X - self.mean_) ** 2, axis=0, weights=sample_weight)
        self.std_ = np.sqrt(variance)
        return self

    def transform(self, X):
        """Transform using fitted statistics."""
        return (np.asarray(X, dtype=float) - self.mean_) / self.std_


# Usage
from sklearn import set_config
set_config(enable_metadata_routing=True)

X = np.random.randn(100, 5)
weights = np.random.rand(100)

# sample_weight is an explicit fit parameter, so BaseEstimator
# generates set_fit_request for this class automatically
scaler = WeightedStandardScaler().set_fit_request(sample_weight=True)
X_scaled = scaler.fit_transform(X, sample_weight=weights)
```

Here's what matters when building custom estimators:

- Accept a `sample_weight` parameter in the `fit` method. Because it is an explicit keyword, `BaseEstimator` provides `set_fit_request` without any extra code.
- Override `get_metadata_routing` to declare routing requirements only when you write a meta-estimator that forwards metadata to sub-estimators.
- In that case, use `MetadataRouter.add_self_request` and `add` to chain the routing configuration.
- Handle the `None` case when metadata isn't provided.

Now let's use our custom
transformer in a pipeline with multiple metadata consumers.

```python
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Create sample data
X = np.random.randn(1000, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
sample_weights = np.random.rand(1000)

X_train, X_test, y_train, y_test, w_train, w_test = train_test_split(
    X, y, sample_weights, test_size=0.2, random_state=42
)

# Build pipeline with metadata routing
pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('classifier', LogisticRegression(max_iter=1000)),
])

# Configure routing: both steps need sample_weight
# (requests are set on the steps; the Pipeline itself only routes)
pipe['scaler'].set_fit_request(sample_weight=True)
pipe['classifier'].set_fit_request(sample_weight=True)

# Fit with sample weights - they're routed to both steps
pipe.fit(X_train, y_train, sample_weight=w_train)

# Score also supports metadata routing
pipe['classifier'].set_score_request(sample_weight=True)
score = pipe.score(X_test, y_test, sample_weight=w_test)
print(f"Weighted accuracy: {score:.3f}")
```

Pipeline routing rules: each step receives only the metadata it has requested through its `set_*_request` methods.

Metadata routing shines in hyperparameter tuning scenarios where you need to pass weights or groups to cross-validation.
```python
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, make_scorer

# Generate imbalanced dataset
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=15, n_redundant=5,
    weights=[0.9, 0.1], random_state=42
)

# Create sample weights to handle imbalance
sample_weights = np.where(y == 1, 10.0, 1.0)

# Build pipeline (liblinear supports both the l1 and l2 penalties searched below)
pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('clf', LogisticRegression(solver='liblinear', max_iter=1000)),
])

# Configure metadata routing for both steps
pipe['scaler'].set_fit_request(sample_weight=True)
pipe['clf'].set_fit_request(sample_weight=True)
pipe['clf'].set_score_request(sample_weight=True)

# A scorer only receives the weights if it requests them as well
weighted_accuracy = make_scorer(accuracy_score).set_score_request(sample_weight=True)

# GridSearchCV with metadata routing
param_grid = {
    'clf__C': [0.1, 1.0, 10.0],
    'clf__penalty': ['l1', 'l2'],
}
grid_search = GridSearchCV(
    pipe, param_grid, cv=5, scoring=weighted_accuracy, n_jobs=-1
)

# Fit with sample weights - they're used in both fitting and scoring
grid_search.fit(X, y, sample_weight=sample_weights)

print(f"Best params: {grid_search.best_params_}")
print(f"Best weighted score: {grid_search.best_score_:.3f}")

# Access the best model
best_pipe = grid_search.best_estimator_
```

GridSearchCV routing features: metadata passed to `fit` is routed to the estimator, the scorer, and the CV splitter, which covers the `groups` parameter for `GroupKFold` and similar splitters.

Using Groups for Cross-Validation:

```python
from sklearn.model_selection import GroupKFold

# Create grouped data (e.g., multiple samples per patient)
groups = np.repeat(np.arange(100), 10)  # 100 groups, 10 samples each

# Configure the search to use a group-aware splitter
grid_search = GridSearchCV(
    pipe, param_grid, cv=GroupKFold(n_splits=5), n_jobs=-1
)

# Pass groups to ensure they're not split across folds
grid_search.fit(X, y, groups=groups, sample_weight=sample_weights)
```

Sometimes you need to pass different metadata values to different pipeline steps. Metadata aliasing lets you route metadata under different names.
```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Scenario: You have two types of weights
# - feature_weights: for weighted feature scaling
# - sample_weights: for weighted model training
# (both are arrays with one entry per sample)

# Create pipeline
pipe = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('clf', LogisticRegression()),
])

# Configure aliasing: route differently named arguments to each step's sample_weight
pipe['scaler'].set_fit_request(sample_weight='feature_weights')  # Alias
pipe['clf'].set_fit_request(sample_weight='sample_weights')      # Alias

# Now you can pass both types of weights
pipe.fit(
    X, y,
    feature_weights=feature_importance_weights,  # Goes to scaler
    sample_weights=class_balance_weights,        # Goes to classifier
)
```

Aliasing is the tool to reach for when two steps consume the same parameter name but need different values.

Important: The parameter name you use in `fit` must match the alias, not the internal parameter name.

Metadata routing works seamlessly with nested pipelines, automatically propagating metadata through all levels.

```python
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest

# Build nested pipeline: preprocessing pipeline inside main pipeline
preprocessing = Pipeline([
    ('scaler', WeightedStandardScaler()),
    ('features', FeatureUnion([
        ('pca', PCA(n_components=10)),
        ('select', SelectKBest(k=5)),
    ])),
])

main_pipe = Pipeline([
    ('preprocess', preprocessing),
    ('clf', LogisticRegression()),
])

# Configure routing at any level
main_pipe['preprocess']['scaler'].set_fit_request(sample_weight=True)
main_pipe['clf'].set_fit_request(sample_weight=True)

# Metadata routes through all levels automatically
main_pipe.fit(X, y, sample_weight=weights)
```

What happens: the weights travel from `main_pipe` into `preprocess`, reach `scaler`, and also reach `clf`. The `pca` and `select` transformers never see them.

A few things to remember about nested pipelines: requests are set on the innermost estimator, which you reach with chained indexing such as `pipe['outer']['inner']`.

Complex example with FeatureUnion:

```python
# FeatureUnion with different metadata needs
# (WeightedPCA stands for a custom transformer whose fit accepts sample_weight)
feature_union = FeatureUnion([
    ('weighted_pca', WeightedPCA()),     # Needs sample_weight
    ('standard_select', SelectKBest()),  # Doesn't need sample_weight
])

pipe = Pipeline([
    ('features', feature_union),
    ('clf', LogisticRegression()),
])

# Within the union, only weighted_pca gets the weights
pipe['features'].transformer_list[0][1].set_fit_request(sample_weight=True)
pipe['clf'].set_fit_request(sample_weight=True)

pipe.fit(X, y, sample_weight=weights)
# weights go to weighted_pca and clf, but not to standard_select
```

1. Always Enable Metadata Routing Explicitly

```python
from sklearn import set_config
set_config(enable_metadata_routing=True)
```

2. Use Descriptive Metadata Names

```python
# Good: clear purpose
estimator.set_fit_request(sample_weight=True, class_prior=True)

# Avoid: generic names
estimator.set_fit_request(metadata=True)
```

3. Configure Routing at Pipeline Creation

```python
# Configure immediately after creating pipeline
pipe = Pipeline([...])
pipe['step1'].set_fit_request(sample_weight=True)
pipe['step2'].set_fit_request(sample_weight=True)
```

4. Handle None Gracefully in Custom Estimators

```python
def fit(self, X, y=None, sample_weight=None):
    if sample_weight is None:
        sample_weight = np.ones(len(X))
    # ... rest of implementation
```

Pitfall 1: Forgetting to Enable Metadata Routing

```python
# Without set_config(enable_metadata_routing=True), this raises an error
pipe.fit(X, y, sample_weight=weights)  # Metadata routing not enabled
```

Pitfall 2: Not Configuring All Steps

```python
# Only configured classifier, scaler won't receive weights
pipe['clf'].set_fit_request(sample_weight=True)
# If the scaler's fit accepts sample_weight, its request is still unset here,
# and fit raises until you set it to True or False
pipe.fit(X, y, sample_weight=weights)  # Scaler doesn't get weights
```

Pitfall 3: Mixing Old and New APIs

```python
# Don't use both approaches
pipe.fit(X, y, clf__sample_weight=weights)       # Old way
pipe['clf'].set_fit_request(sample_weight=True)  # New way
```

Pitfall 4: Forgetting to Request Metadata in score

```python
pipe['clf'].set_fit_request(sample_weight=True)
# Forgot this: pipe['clf'].set_score_request(sample_weight=True)
pipe.score(X, y, sample_weight=weights)  # Weights never reach clf.score; expect an unset-request error
```

Check Routing Configuration:

```python
# Inspect what metadata a component requests
print(pipe['clf'].get_metadata_routing())
```

Verify Metadata is Being Used:

```python
# Add logging to custom estimators
def fit(self, X, y=None, sample_weight=None):
    print(f"Received sample_weight: {sample_weight is not None}")
    # ... rest of implementation
```

Test with and without Metadata:

```python
# Ensure your estimator works both ways
estimator.fit(X, y)                         # Without metadata
estimator.fit(X, y, sample_weight=weights)  # With metadata
```

On the performance side, remember `n_jobs` in GridSearchCV and similar tools, and the `memory` parameter in Pipeline for caching.

Use metadata routing when you need to handle sample weights for imbalanced data, run grouped cross-validation, or pass custom information through a pipeline. You don't need it when no step consumes anything beyond `X` and `y`.

When I first started working with metadata routing, I struggled to clearly demarcate what should be a feature versus what should be metadata. For instance, in the earlier credit card fraud use case, I kept asking myself: "Should customer fraud history be a feature? What about transaction timestamps? Customer IDs?" The line felt blurry, and I made several mistakes before understanding the distinction. Let me share what I learned, so you can avoid the confusion I went through.

- Features (`X`): information the model uses to make predictions
- Metadata: information about how to train/evaluate the model, but not used for predictions

Let's look at some ambiguous cases:

```python
# Transaction amount as a FEATURE
X = [
    [100.50, 'online', 'electronics'],  # Amount is a feature
    [25.00, 'store', 'groceries'],
]
y = [0, 1]  # Fraud labels
# The model learns: "Large electronics purchases online are suspicious"
```

Decision: Feature - The model uses amount to predict fraud

```python
# Customer's fraud history as METADATA (sample_weight)
X = [
    [100.50, 'online', 'electronics'],
    [25.00, 'store', 'groceries'],
]
y = [0, 1]
# Customer 1 has 0% fraud history  → weight = 1.0
# Customer 2 has 50% fraud history → weight = 5.0 (pay more attention)
sample_weights = [1.0, 5.0]
```

Decision: Metadata - Tells the model "pay more attention to this sample" but isn't used for prediction

But wait! You could also make this a feature:

```python
# Customer fraud history as a FEATURE
X = [
    [100.50, 'online', 'electronics', 0.0],  # Added fraud history
    [25.00, 'store', 'groceries', 0.5],
]
y = [0, 1]
```

Decision: Feature - Now the model learns "customers with high fraud history are risky"

Ask yourself: "Should the model learn patterns from this, or
does it tell the model how to learn?"

| Scenario | Feature or Metadata? | Why? |
|---|---|---|
| Transaction amount | Feature | Model predicts based on amount |
| Customer ID | Metadata (groups) | For grouping in CV, not prediction |
| Time of day | Feature | Model learns "3 AM transactions are suspicious" |
| Data quality score | Metadata (weight) | "Trust this sample more/less" |
| Previous fraud count | Could be either | See below |
| Geographic location | Feature | Model learns regional patterns |
| Sample collection date | Metadata (groups) | For time-based CV splits |

Some information genuinely could be either. Here's how to decide:

```python
# Option 1: As a Feature
X = [
    [100, 'online', 2],  # 2 previous frauds
    [50, 'store', 0],    # 0 previous frauds
]

# Option 2: As Metadata (Sample Weight)
X = [
    [100, 'online'],
    [50, 'store'],
]
sample_weights = [5.0, 1.0]  # Weight based on fraud history

# Option 3: Both
X = [
    [100, 'online', 2],
    [50, 'store', 0],
]
sample_weights = [5.0, 1.0]
```

- Use as a feature when the model should learn patterns from the value and it will be available at prediction time.
- Use as metadata when the value only tells the model how to learn, for example how much to trust or emphasize a sample.
- Use as both when both of the above hold.

```python
# Mistake 1: Using customer ID as a feature
X = [
    [101, 100, 'online'],  # Customer ID as feature
    [102, 50, 'store'],
]
# Problem: Model memorizes customers instead of learning patterns.
# Use as metadata (groups) instead

# Mistake 2: Using sample importance as a feature
X = [
    [100, 'online', 5.0],  # Importance score as feature
    [50, 'store', 1.0],
]
# Problem: Importance score won't be available at prediction time.
# Use as metadata (sample_weight)

# Better Approach: Separate concerns
X = [
    [100, 'online'],
    [50, 'store'],
]
sample_weights = [5.0, 1.0]  # Importance
groups = [101, 102]          # Customer IDs
```

- Features = What the model learns from
- Metadata = How the model learns

When in doubt, ask yourself: "Will this be available when making predictions on new data?" If no, it's probably metadata.

Looking back at everything we've covered, metadata routing really changes the game for building ML pipelines in scikit-learn. No more hacky workarounds with `clf__sample_weight` or struggling to pass groups through cross-validation.
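You just declare what each component needs, and the routing system handles the rest. It's cleaner, more explicit, and honestly just makes sense.

What you should remember:

- Enable routing with `set_config(enable_metadata_routing=True)`
- Use the `set_*_request` methods to declare metadata requirements
- Handle `None` gracefully in custom estimators

Author's Note: This article covers scikit-learn 1.3+, and some pieces used here (routing through Pipeline and the cross-validation helpers) need a release newer than 1.3 itself. The scikit-learn documentation still labels the metadata routing API as experimental, so check the release notes for your version before relying on it. Legacy parameter passing (e.g., `clf__sample_weight`) still works when routing is disabled but is discouraged.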