Fast Automatic ML Hyperparameter tuning Using Optuna (w. MLflow model registry and IRIS DB)

A developer demonstrated automatic hyperparameter tuning using Optuna with MLflow and InterSystems IRIS database. The approach efficiently optimizes LightGBM models on the California Housing dataset, leveraging Optuna's search algorithms and MLflow for experiment tracking and model registry. The integration with IRIS enables concurrent studies and scalable hyperparameter searches.

This article presents a straightforward approach to automatically and efficiently tune hyperparameters for machine learning models using Optuna as the optimisation framework. We explore how to use both Optuna’s native storage options and InterSystems IRIS as a database backend to track the progress of hyperparameter searches. We also show how MLflow can be used to monitor experiments and manage models through its tracking and model registry UI. This article is based on this Kaggle Notebook https://www.kaggle.com/code/jorgeivnjh/fast-automatic-ml-hyperparameter-tuning-w-optuna , which you can run and directly edit yourself. When training ML models, the choice of hyperparameters can strongly influence performance. They are not the only factor, but they can significantly affect both convergence and generalisation. Tuning hyperparameters manually takes a lot of effort. This is especially true because hyperparameters interact with each other, so tuning them independently is usually not enough. For example, higher regularisation may require a lower learning rate for more stable optimization. A more complex model may require stronger regularization to avoid overfitting, but at the same time, a very small learning rate on a complex model can make learning too slow. Optuna is an MIT-licensed open source library, which allows commercial use, that automates hyperparameter search for ML models developed with the most popular frameworks such as scikit-learn, PyTorch, TensorFlow, and LightGBM. It works by defining a search space and an objective metric to either minimize or maximize. Optuna then explores the search space efficiently to find well-performing configurations. Here we use Optuna to tune a LightGBM model on a dummy dataset and show how to scale the search using shared database storage. We will also use MLflow for experiment tracking and model registry, and IRIS DB as a possible Optuna storage backend for concurrent studies. We will use the California Housing dataset, commonly used in ML examples, to populate IRIS tables and run the tuning workflow. Note: For the last bit, you will need an existing IRIS instance that you can connect to. I am using the one created with Docker by running the docker-compose file from this repo https://github.com/JorgeIvanJH/IRIS and MLflow-Continuous-Training-Pipeline . I am also using the environment variables and requirements.txt from that repository, together with Python 3.12. python import os import dotenv import sklearn import pandas as pd import sqlalchemy from sqlalchemy import create engine import optuna import lightgbm as lgb from sklearn.model selection import cross val score from sklearn.model selection import KFold import seaborn as sns import matplotlib.pyplot as plt import datetime as dt import seaborn as sns import matplotlib.pyplot as plt dotenv.load dotenv Connection String to Existing IRIS Database server = os.getenv "IRIS SERVER" port = os.getenv "IRIS PORT" Standard InterSystems superserver port namespace = os.getenv "IRIS NAMESPACE" username = os.getenv "IRIS USERNAME" password = os.getenv "IRIS PASSWORD" print f"pandas version: {pd. version }" print f"sklearn version: {sklearn. version }" print f"sqlalchemy version: {sqlalchemy. version }" print f"optuna version: {optuna. version }" print f"lightgbm version: {lgb. version }" print f"seaborn version: {sns. version }" print f"matplotlib version: {plt.matplotlib. version }" pandas version: 2.3.3 sklearn version: 1.8.0 sqlalchemy version: 2.0.46 optuna version: 4.8.0 lightgbm version: 4.6.0 seaborn version: 0.13.2 matplotlib version: 3.10.8 Optuna https://optuna.org/ is a hyperparameter optimization framework that speeds up tuning by training multiple model configurations and learning from their results. It provides: For a richer intro to Optuna, see this video https://www.youtube.com/watch?v=P6NwZVl8ttc A practical approach to efficiently find good hyperparameters is: Important Hyperparameter tuning must use an appropriate validation setup. Otherwise, we may only find the configuration that best overfits the validation split, rather than one that generalizes well to the dataset at hand. The cell below loads scikit-learn's fetch california housing dataset, and changes the column names to snake case. Load California Housing Dataset X,y = sklearn.datasets.fetch california housing return X y=True,as frame=True X.columns = col.replace " ", " " for col in X.columns y.name = "median house value" df = X.copy df y.name = y It is essential to choose the right cross-validation strategy. This depends on the task, whether it is regression or classification, whether the target is imbalanced, whether the order of samples matters, and whether there are groups in the data. For example, if multiple rows belong to the same patient, we may want to avoid having samples from the same patient appear in both training and validation splits. Refer to this summary https://scikit-learn.org/stable/auto examples/model selection/plot cv indices.html sphx-glr-auto-examples-model-selection-plot-cv-indices-py of the options available in SKlearn for further guidance. For simplicity, we can use the following decision rules: if time order matters: use TimeSeriesSplit no shuffle equivalent else: if groups exist: if classification and classes are imbalanced: use StratifiedGroupKFold no shuffle equivalent else: use GroupKFold → or GroupShuffleSplit else: if classification and classes are imbalanced: use StratifiedKFold → or StratifiedShuffleSplit else: use KFold → or ShuffleSplit crossvalstrategy = KFold n splits=3, shuffle=True, random state=42 After choosing the model, in this case LightGBM, we define the hyperparameters that we want to tune and the metric that we want to optimize. The cells in this section can be run multiple times until we reach a satisfactory performance level. The variables marked as tweakable are the ones we are likely to adjust between studies. The general process is: Since this is a regression task, we use mean squared error as the metric to minimize. The metric is evaluated using the cross-validation strategy defined above. Note: When storage=storage url points to a supported database, such as SQLite or InterSystems IRIS, Optuna automatically creates the tables needed to track studies, trials, parameters, and results. Each study is identified by its study name. If the same study name and database are reused with load if exists=True, Optuna resumes from the existing study instead of starting from scratch. This shared storage is also what enables concurrent optimization: multiple processes, or even multiple machines, can connect to the same database and contribute trials to the same study. NUM TRIALS = 20 Tweak os.environ "LOKY MAX CPU COUNT" = str os.cpu count def objective trial : param = { "learning rate": trial.suggest float "learning rate", 0.001, 0.2,log=True , Tweak "max depth": trial.suggest int "max depth", 3, 50 , Tweak "n estimators": trial.suggest int "n estimators", 50, 1000 , Tweak "num leaves": trial.suggest categorical "num leaves", 16, 31, 63, 127, 255 , "lambda l2": trial.suggest float "lambda l2", 1e-8, 10.0, log=True , Tweak "max bin": trial.suggest categorical "max bin", 63, 127, 255 } model = lgb.LGBMRegressor param scores = cross val score model, X, y, cv=crossvalstrategy, scoring="neg mean squared error", n jobs=-1 return -scores.mean study = optuna.create study study name=f"lightgbm hyperparam tuning {dt.datetime.now .strftime '%Y-%m-%d %H-%M-%S' }", direction="minimize", storage=storage url, load if exists=True, sampler=optuna.samplers.TPESampler seed=42 , study.optimize objective, n trials=NUM TRIALS, show progress bar=True, n jobs=1 best params = study.best params print f"\nBest parameters: {best params}" print f"\nBest performance: {study.best value}" 32m I 2026-05-13 15:58:38,618 0m A new study created in memory with name: lightgbm hyperparam tuning 2026-05-13 15-58-38 0m 0%| | 0/20 00:00<?, ?it/s 32m I 2026-05-13 15:59:02,770 0m Trial 0 finished with value: 0.22124664870518 and parameters: {'learning rate': 0.00727491708802781, 'max depth': 48, 'n estimators': 746, 'num leaves': 255, 'lambda l2': 0.002570603566117598, 'max bin': 255}. Best is trial 0 with value: 0.22124664870518. 0m 32m I 2026-05-13 15:59:06,986 0m Trial 1 finished with value: 0.2059125561807643 and parameters: {'learning rate': 0.0823143373099555, 'max depth': 13, 'n estimators': 222, 'num leaves': 63, 'lambda l2': 0.0032112643094417484, 'max bin': 255}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 15:59:13,470 0m Trial 2 finished with value: 0.25714400572802726 and parameters: {'learning rate': 0.01120548642504815, 'max depth': 40, 'n estimators': 239, 'num leaves': 127, 'lambda l2': 3.850031979199519e-08, 'max bin': 127}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 15:59:22,415 0m Trial 3 finished with value: 0.26413921215873515 and parameters: {'learning rate': 0.0050225633119947675, 'max depth': 7, 'n estimators': 700, 'num leaves': 255, 'lambda l2': 2.133142332373004e-06, 'max bin': 63}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 15:59:28,245 0m Trial 4 finished with value: 0.20942294704047681 and parameters: {'learning rate': 0.01811326544803337, 'max depth': 11, 'n estimators': 972, 'num leaves': 31, 'lambda l2': 6.257956190096665e-08, 'max bin': 255}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 15:59:54,053 0m Trial 5 finished with value: 0.22529793459324102 and parameters: {'learning rate': 0.007840758945457348, 'max depth': 16, 'n estimators': 838, 'num leaves': 255, 'lambda l2': 4.6876566400928895e-08, 'max bin': 63}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 15:59:57,575 0m Trial 6 finished with value: 0.6243686001512612 and parameters: {'learning rate': 0.0010296901472345186, 'max depth': 42, 'n estimators': 722, 'num leaves': 31, 'lambda l2': 0.5860448217200517, 'max bin': 63}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 16:00:01,328 0m Trial 7 finished with value: 0.25616396880444836 and parameters: {'learning rate': 0.005194929407101736, 'max depth': 18, 'n estimators': 743, 'num leaves': 31, 'lambda l2': 0.0703178263660987, 'max bin': 127}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 16:00:02,230 0m Trial 8 finished with value: 0.4328137375744699 and parameters: {'learning rate': 0.015952322469109693, 'max depth': 23, 'n estimators': 74, 'num leaves': 63, 'lambda l2': 1.4726456718740824, 'max bin': 255}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 16:00:03,606 0m Trial 9 finished with value: 0.5036899804922363 and parameters: {'learning rate': 0.0033610226697378754, 'max depth': 6, 'n estimators': 325, 'num leaves': 31, 'lambda l2': 0.1710207048797339, 'max bin': 127}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 16:00:07,940 0m Trial 10 finished with value: 0.21142577467959092 and parameters: {'learning rate': 0.14804113057514628, 'max depth': 30, 'n estimators': 458, 'num leaves': 63, 'lambda l2': 3.757350306893132e-05, 'max bin': 255}. Best is trial 1 with value: 0.2059125561807643. 0m 32m I 2026-05-13 16:00:11,156 0m Trial 11 finished with value: 0.2017814916171883 and parameters: {'learning rate': 0.08309297264998405, 'max depth': 12, 'n estimators': 950, 'num leaves': 16, 'lambda l2': 0.0008326596975497944, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:12,488 0m Trial 12 finished with value: 0.20764432653610213 and parameters: {'learning rate': 0.10507813096831281, 'max depth': 28, 'n estimators': 508, 'num leaves': 16, 'lambda l2': 0.0016316751769423123, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:12,862 0m Trial 13 finished with value: 0.3044026543083153 and parameters: {'learning rate': 0.054273532006916266, 'max depth': 3, 'n estimators': 131, 'num leaves': 16, 'lambda l2': 6.119264662645272e-05, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:16,388 0m Trial 14 finished with value: 0.20646055020810183 and parameters: {'learning rate': 0.041057846227823123, 'max depth': 14, 'n estimators': 366, 'num leaves': 63, 'lambda l2': 0.007230065446525416, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:18,008 0m Trial 15 finished with value: 0.21268042685192567 and parameters: {'learning rate': 0.04807456550053136, 'max depth': 21, 'n estimators': 604, 'num leaves': 16, 'lambda l2': 6.458243615671745e-06, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:28,022 0m Trial 16 finished with value: 0.21844697644015332 and parameters: {'learning rate': 0.18423283160212306, 'max depth': 10, 'n estimators': 992, 'num leaves': 127, 'lambda l2': 9.015211997542714, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:29,373 0m Trial 17 finished with value: 0.20797590828555537 and parameters: {'learning rate': 0.08294987485804219, 'max depth': 33, 'n estimators': 188, 'num leaves': 63, 'lambda l2': 0.018231434623139052, 'max bin': 255}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:30,247 0m Trial 18 finished with value: 0.23633039578627624 and parameters: {'learning rate': 0.02831149820738454, 'max depth': 24, 'n estimators': 355, 'num leaves': 16, 'lambda l2': 0.00012197971292668617, 'max bin': 127}. Best is trial 11 with value: 0.2017814916171883. 0m 32m I 2026-05-13 16:00:35,660 0m Trial 19 finished with value: 0.21720640666066582 and parameters: {'learning rate': 0.07858633974467637, 'max depth': 13, 'n estimators': 879, 'num leaves': 63, 'lambda l2': 0.0007188574432995588, 'max bin': 63}. Best is trial 11 with value: 0.2017814916171883. 0m Best parameters: {'learning rate': 0.08309297264998405, 'max depth': 12, 'n estimators': 950, 'num leaves': 16, 'lambda l2': 0.0008326596975497944, 'max bin': 255} Best performance: 0.2017814916171883 Below we inspect the best-performing trials from the study. This gives us a quick view of which hyperparameter combinations performed best and helps guide future searches: trials df = study.trials dataframe trials df = trials df.sort values "value" trials df = trials df.loc :, trials df.columns.str.contains "params|value" top trials df = trials df.head 10 display top trials df display top trials df.describe | value | params lambda l2 | params learning rate | params max bin | params max depth | params n estimators | params num leaves | | |---|---|---|---|---|---|---|---| | 11 | 0.201781 | 8.326597e-04 | 0.083093 | 255 | 12 | 950 | 16 | | 1 | 0.205913 | 3.211264e-03 | 0.082314 | 255 | 13 | 222 | 63 | | 14 | 0.206461 | 7.230065e-03 | 0.041058 | 255 | 14 | 366 | 63 | | 12 | 0.207644 | 1.631675e-03 | 0.105078 | 255 | 28 | 508 | 16 | | 17 | 0.207976 | 1.823143e-02 | 0.082950 | 255 | 33 | 188 | 63 | | 4 | 0.209423 | 6.257956e-08 | 0.018113 | 255 | 11 | 972 | 31 | | 10 | 0.211426 | 3.757350e-05 | 0.148041 | 255 | 30 | 458 | 63 | | 15 | 0.212680 | 6.458244e-06 | 0.048075 | 255 | 21 | 604 | 16 | | 19 | 0.217206 | 7.188574e-04 | 0.078586 | 63 | 13 | 879 | 63 | | 16 | 0.218447 | 9.015212e+00 | 0.184233 | 255 | 10 | 992 | 127 | | value | params lambda l2 | params learning rate | params max bin | params max depth | params n estimators | params num leaves | | |---|---|---|---|---|---|---|---| | count | 10.000000 | 1.000000e+01 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | 10.000000 | | mean | 0.209896 | 9.047112e-01 | 0.087154 | 235.800000 | 18.500000 | 613.900000 | 52.100000 | | std | 0.005155 | 2.849745e+00 | 0.049444 | 60.715731 | 8.759122 | 313.844424 | 34.252169 | | min | 0.201781 | 6.257956e-08 | 0.018113 | 63.000000 | 10.000000 | 188.000000 | 16.000000 | | 25% | 0.206756 | 2.078945e-04 | 0.055703 | 255.000000 | 12.250000 | 389.000000 | 19.750000 | | 50% | 0.208699 | 1.232167e-03 | 0.082632 | 255.000000 | 13.500000 | 556.000000 | 63.000000 | | 75% | 0.212367 | 6.225365e-03 | 0.099582 | 255.000000 | 26.250000 | 932.250000 | 63.000000 | | max | 0.218447 | 9.015212e+00 | 0.184233 | 255.000000 | 33.000000 | 992.000000 | 127.000000 | After the first broad search, we can estimate which hyperparameters had the strongest impact on performance. This helps us decide which parameters deserve a more focused search in the next study. The cell below calculates the importance score for each hyperparameter on a scale from 0 to 1. Higher values indicate parameters that had more influence on the objective metric in this study. param importance dict = optuna.importance.get param importances study plt.figure figsize= 10, 6 sns.barplot x=list param importance dict.values , y=list param importance dict.keys plt.xlabel 'Importance Score' plt.ylabel 'Hyperparameter' plt.title 'Hyperparameter Importance' plt.tight layout plt.grid plt.show From the plot above, we can identify the most relevant hyperparameters. Next, we choose how many of the top parameters we want to compare. In this example, we select the two most important ones. The contour plot below helps us visualize how these two parameters interact and which regions of the search space produced better results. We can use this to define narrower ranges for future studies. numparamstocompare = 2 best2params = k for k, v in sorted param importance dict.items , key=lambda x: x 1 -numparamstocompare: optuna.visualization.matplotlib.plot contour study, params=best2params Every time we test a set of hyperparameters, we should evaluate it properly using cross-validation to avoid selecting a model that just overfits to a particular train/validation split. This means training as many models as the number of folds we choose. For example, using 5-fold or 10-fold cross-validation implies training 5–10 models per hyperparameter configuration. There is no strict rule for the number of folds, but 5 or 10 are commonly used depending on how expensive each model is to train. As a result, evaluating each set of hyperparameters becomes 5–10 times more time-consuming, and this cost increases further as the dataset grows. For this reason, we want to accelerate the hyperparameter search. One way to do this is by running multiple processes, each working on the same Optuna study and exploring the same search space in parallel. If a machine has 16 cores, we can run up to 16 workers concurrently, which can significantly reduce the total optimization time although not always perfectly linearly due to overhead and coordination between workers . An important advantage of Optuna is that if all workers point to a common storage database, the study is shared across processes. Optuna will create and manage the required tables in the database, and all workers will contribute trials to the same study. This means that: By default, you can specify "sqlite:///optuna lgbm.db" as the storage parameter, and Optuna will create a local database for the study. The same approach can also be extended to a centralized database such as InterSystems IRIS, enabling distributed hyperparameter tuning across multiple machines. We can combine Optuna for hyperparameter tuning and MLflow for experiment tracking and model registry. This way, we can leverage the same MLflow model registry capabilities shown in this repo https://github.com/JorgeIvanJH/IRIS and MLflow-Continuous-Training-Pipeline . One of the main advantages of Optuna is how easy it is to scale hyperparameter tuning across processes or even across machines. We can run the same optimization study from different machines, and as long as all of them point to the same storage database, all workers will contribute trials to the same study. As trials finish, Optuna can use the accumulated results to guide future samples. In the example below, we run multiple workers against the same Optuna study. Running this as a separate Python script, not in a standard Jupyter notebook, allows parallel hyperparameter tuning with MLflow tracking. MLflow keeps track of the parent run, each child trial run, the final best parameters, the best cross-validation score, and the final trained model. The cell below ran 3200 trials in 25 minutes on a Windows laptop with 16 cores, using 16 workers with 200 trials each. Each trial used 3 cross-validation splits. python import os import dotenv import optuna import lightgbm as lgb import multiprocessing as mp import mlflow import mlflow.lightgbm from mlflow.models import infer signature import numpy as np from sklearn.model selection import cross val score, KFold from sklearn.datasets import fetch california housing import datetime as dt dotenv.load dotenv STORAGE URL = "sqlite:///optuna lgbm.db" for local testing Hyperparameter tuning configuration NUM WORKERS = min 16, mp.cpu count NUM TRIALS PER WORKER = 200 BASE SEED = 42 NUM CV SPLITS = 3 5 or 10 would be better EXPERIMENT NAME = "LightGBM Hyperparameter Tuning with Optuna and MLflow" crossvalstrategy = KFold n splits=NUM CV SPLITS, shuffle=True, random state=BASE SEED Load dataset X, y = fetch california housing return X y=True, as frame=True X.columns = col.replace " ", " " for col in X.columns y.name = "median house value" def objective trial : params = { "learning rate": trial.suggest float "learning rate", 0.001, 0.2, log=True , CHANGEABLE "max depth": trial.suggest int "max depth", 3, 50 , CHANGEABLE "n estimators": trial.suggest int "n estimators", 50, 1000 , CHANGEABLE "num leaves": trial.suggest categorical "num leaves", 16, 31, 63, 127, 255 , "lambda l2": trial.suggest float "lambda l2", 1e-8, 10.0, log=True , CHANGEABLE "max bin": trial.suggest categorical "max bin", 63, 127, 255 , "random state": BASE SEED, "verbosity": -1, "n jobs": 1, } parent run id = os.getenv "MLFLOW PARENT RUN ID" with mlflow.start run run name=f"trial {trial.number}", nested=True, parent run id=parent run id, tags={"mlflow.parentRunId": parent run id} if parent run id else None, as child run: mlflow.log params params model = lgb.LGBMRegressor params scores = cross val score model, X, y, cv=crossvalstrategy, scoring="neg mean squared error", n jobs=1, crossval score = -scores.mean Log current trial's error metric mlflow.log metrics {"cv mse mean": crossval score} for fold idx, score in enumerate scores : mlflow.log metric f"fold {fold idx} mse", -score Make it easy to retrieve the best-performing child run later trial.set user attr "run id", child run.info.run id return crossval score def run worker args : worker id, study name, parent run id = args mlflow.set tracking uri os.getenv "MLFLOW TRACKING URI" mlflow.set experiment EXPERIMENT NAME os.environ "MLFLOW PARENT RUN ID" = parent run id study = optuna.load study study name=study name, storage=STORAGE URL, sampler=optuna.samplers.TPESampler seed=BASE SEED+worker id , study.optimize objective, n trials=NUM TRIALS PER WORKER, show progress bar=False, n jobs=1, return worker id if name == " main ": MLflow setup datetime str = dt.datetime.now .strftime "%Y-%m-%d %H:%M" RUN NAME = f"parent {datetime str}" STUDY NAME = f"optuna {datetime str}" tracking uri = os.getenv "MLFLOW TRACKING URI" mlflow.set tracking uri tracking uri mlflow.set experiment EXPERIMENT NAME experiment = mlflow.get experiment by name EXPERIMENT NAME experiment id = experiment.experiment id with mlflow.start run run name=RUN NAME, log system metrics=True as parent run: parent run id = parent run.info.run id os.environ "MLFLOW PARENT RUN ID" = parent run id optuna.create study direction="minimize", study name=STUDY NAME, storage=STORAGE URL, load if exists=False, mlflow.log params { "n trials": NUM TRIALS PER WORKER NUM WORKERS, "num workers": NUM WORKERS, "cv n splits": crossvalstrategy.n splits, "seed": BASE SEED, "study name": STUDY NAME, } worker args = worker id, STUDY NAME, parent run id for worker id in range NUM WORKERS with mp.Pool processes=NUM WORKERS as pool: pool.map run worker, worker args study = optuna.load study study name=STUDY NAME, storage=STORAGE URL, best params = study.best trial.params best value = study.best value best child run id = study.best trial.user attrs.get "run id" mlflow.log params {f"best {k}": v for k, v in best params.items } mlflow.log metric "best cv mse", float best value if best child run id: mlflow.log param "best child run id", best child run id Train final model on full dataset with best hyperparameters. Important: keep same seed final model = lgb.LGBMRegressor best params, random state=BASE SEED, verbosity=-1, n jobs=1, final model.fit X, y input sample = X.sample 100, random state=BASE SEED signature = infer signature input sample, final model.predict input sample mlflow.lightgbm.log model lgb model=final model, name="best model", signature=signature, input example=X.head 5 , The code above works as a proof of concept when working across different machines. Each machine or process can point to the same shared Optuna storage database and contribute trials to the same study. However, if we are using a single PC, the simpler version below is usually preferable. It runs the same study with parallel jobs controlled by Optuna's n jobs parameter. This approach is simpler and can achieve similar performance, although the exact trials and final best model are not guaranteed to be identical to the multiprocessing version. The code below also ran 3200 trials, in this case in 27 minutes. python import os import dotenv import optuna import lightgbm as lgb import multiprocessing as mp import mlflow import mlflow.lightgbm from mlflow.models import infer signature import numpy as np from sklearn.model selection import cross val score, KFold from sklearn.datasets import fetch california housing import datetime as dt dotenv.load dotenv STORAGE URL = "sqlite:///optuna lgbm.db" for local testing Hyperparameter tuning configuration NUM WORKERS = min 16, mp.cpu count NUM TRIALS PER WORKER = 200 BASE SEED = 42 NUM CV SPLITS = 3 5 or 10 would be better EXPERIMENT NAME = "LightGBM Hyperparameter Tuning with Optuna and MLflow 2" crossvalstrategy = KFold n splits=NUM CV SPLITS, shuffle=True, random state=BASE SEED Load dataset X, y = fetch california housing return X y=True, as frame=True X.columns = col.replace " ", " " for col in X.columns y.name = "median house value" def objective trial : params = { "learning rate": trial.suggest float "learning rate", 0.001, 0.2, log=True , CHANGEABLE "max depth": trial.suggest int "max depth", 3, 50 , CHANGEABLE "n estimators": trial.suggest int "n estimators", 50, 1000 , CHANGEABLE "num leaves": trial.suggest categorical "num leaves", 16, 31, 63, 127, 255 , "lambda l2": trial.suggest float "lambda l2", 1e-8, 10.0, log=True , CHANGEABLE "max bin": trial.suggest categorical "max bin", 63, 127, 255 , "random state": BASE SEED, "verbosity": -1, "n jobs": 1, } parent run id = os.getenv "MLFLOW PARENT RUN ID" with mlflow.start run run name=f"trial {trial.number}", nested=True, parent run id=parent run id, tags={"mlflow.parentRunId": parent run id} if parent run id else None, as child run: mlflow.log params params model = lgb.LGBMRegressor params scores = cross val score model, X, y, cv=crossvalstrategy, scoring="neg mean squared error", n jobs=1, crossval score = -scores.mean Log current trial's error metric mlflow.log metrics {"cv mse mean": crossval score} for fold idx, score in enumerate scores : mlflow.log metric f"fold {fold idx} mse", -score Make it easy to retrieve the best-performing child run later trial.set user attr "run id", child run.info.run id return crossval score if name == " main ": MLflow setup datetime str = dt.datetime.now .strftime "%Y-%m-%d %H:%M" RUN NAME = f"parent {datetime str}" STUDY NAME = f"optuna {datetime str}" tracking uri = os.getenv "MLFLOW TRACKING URI" mlflow.set tracking uri tracking uri mlflow.set experiment EXPERIMENT NAME experiment = mlflow.get experiment by name EXPERIMENT NAME experiment id = experiment.experiment id with mlflow.start run run name=RUN NAME, log system metrics=True as parent run: parent run id = parent run.info.run id os.environ "MLFLOW PARENT RUN ID" = parent run id optuna.create study direction="minimize", study name=STUDY NAME, storage=STORAGE URL, load if exists=False, mlflow.log params { "n trials": NUM TRIALS PER WORKER NUM WORKERS, "num workers": NUM WORKERS, "cv n splits": crossvalstrategy.n splits, "seed": BASE SEED, "study name": STUDY NAME, } study = optuna.load study study name=STUDY NAME, storage=STORAGE URL, study.optimize objective, n trials=NUM TRIALS PER WORKER NUM WORKERS, show progress bar=False, n jobs=NUM WORKERS, best params = study.best trial.params best value = study.best value best child run id = study.best trial.user attrs.get "run id" mlflow.log params {f"best {k}": v for k, v in best params.items } mlflow.log metric "best cv mse", float best value if best child run id: mlflow.log param "best child run id", best child run id Train final model on full dataset with best hyperparameters. Important: keep same seed final model = lgb.LGBMRegressor best params, random state=BASE SEED, verbosity=-1, n jobs=NUM WORKERS, final model.fit X, y input sample = X.sample 100, random state=BASE SEED signature = infer signature input sample, final model.predict input sample mlflow.lightgbm.log model lgb model=final model, name="best model", signature=signature, input example=X.head 5 , As a result of running either script, we get a parent run in MLflow with the final best model trained using the best hyperparameters found across the 3200 trials. The parent run also stores the best hyperparameters, the best cross-validation score, and the ID of the best child run. Each child run contains the parameters and metrics for one Optuna trial. All of this can be explored in the MLflow UI, for example at http://localhost:5000/ /experiments http://localhost:5000/ /experiments , where we can inspect the parent run, compare child runs, and download or register the final model. In the image below, we see two plots from MLflow's UI. On the left, we get a sense of the search space by comparing the mean cross-validation MSE across trials with different values of max depth and num leaves. On the right, we see the 100 worst models, meaning the trials with the highest mean squared error across cross-validation. The best found model achieved a score of approximately 0.199580. When trying to replicate the same process with IRIS DB as the Optuna storage backend, multiple issues arose when running more than 4 workers in parallel. This is likely related to how each worker process creates its own connection to IRIS and writes trial metadata concurrently to the same Optuna study. The code below worked fine with up to 3 workers running at the same time. Another option is to keep a single Python process pointing to IRIS and set Optuna's n jobs parameter to the number of concurrent jobs we want just as we did above . This approach uses threads inside one process, which can be simpler from a database-connection perspective because it avoids multiple independent Python processes creating separate connections to IRIS. However, this approach is not always equivalent to multiprocessing. Since Optuna's n jobs uses threads, CPU-bound Python code can be limited by Python's GIL. In this specific example, most of the expensive work is done by LightGBM and scikit-learn routines, so threading may still provide useful speedup, but it may not scale the same way as true multiprocessing. python import os import dotenv import optuna import lightgbm as lgb import multiprocessing as mp from sqlalchemy.pool import NullPool from sklearn.model selection import cross val score, KFold from sklearn.datasets import fetch california housing import datetime as dt dotenv.load dotenv NUM WORKERS = min 8, mp.cpu count CHANGEABLE NUM TRIALS PER WORKER = 20 CHANGEABLE STUDY NAME = f"IRIS lightgbm study {dt.datetime.now .strftime '%Y-%m-%d %H-%M-%S' }" CHANGEABLE BASE SEED = 42 CHANGEABLE server = os.getenv "IRIS SERVER" port = os.getenv "IRIS PORT" namespace = os.getenv "IRIS NAMESPACE" username = os.getenv "IRIS USERNAME" password = os.getenv "IRIS PASSWORD" STORAGE URL = f"iris://{username}:{password}@{server}:{port}/{namespace}" crossvalstrategy = KFold n splits=3, shuffle=True, random state=BASE SEED Load Dataset X, y = fetch california housing return X y=True, as frame=True X.columns = col.replace " ", " " for col in X.columns y.name = "median house value" def objective trial : param = { "learning rate": trial.suggest float "learning rate", 0.001, 0.2, log=True , CHANGEABLE "max depth": trial.suggest int "max depth", 3, 50 , CHANGEABLE "n estimators": trial.suggest int "n estimators", 50, 1000 , CHANGEABLE "num leaves": trial.suggest categorical "num leaves", 16, 31, 63, 127, 255 , "lambda l2": trial.suggest float "lambda l2", 1e-8, 10.0, log=True , CHANGEABLE "max bin": trial.suggest categorical "max bin", 63, 127, 255 , "random state": BASE SEED, "verbosity": -1, "n jobs": 1, } model = lgb.LGBMRegressor param scores = cross val score model, X, y, cv=crossvalstrategy, scoring="neg mean squared error", n jobs=1, return -scores.mean def run worker args : worker id, study name, = args worker storage = make storage study = optuna.load study study name=study name, storage=worker storage, sampler=optuna.samplers.TPESampler seed=BASE SEED + worker id , study.optimize objective, n trials=NUM TRIALS PER WORKER, show progress bar=False, n jobs=1 return worker id def make storage : return optuna.storages.RDBStorage url=STORAGE URL, engine kwargs={ "poolclass": NullPool, "connect args": { "timeout": 30 }, Helps with heavy concurrent writes }, if name == " main ": main storage = make storage optuna.create study direction="minimize", study name=STUDY NAME, storage=main storage, load if exists=False, if hasattr main storage, "get engine" : main storage.get engine .dispose worker args = worker id,STUDY NAME,None for worker id in range NUM WORKERS with mp.Pool processes=NUM WORKERS as pool: results = pool.map run worker, worker args final storage = make storage final study = optuna.load study study name=STUDY NAME, storage=final storage print f"\nOverall Best Value: {final study.best value}, Overall Best Params: {final study.best params}" Optuna saves the study metadata in IRIS for future reference. This includes studies, trials, trial parameters, trial values, intermediate values, and related metadata in the Optuna storage tables created in IRIS. For further performance analysis, we can query these tables directly or, preferably, load the study back through Optuna and use Optuna's built-in visualization and analysis tools to inspect the optimization history, parameter importance, and trial performance. The image below shows the Optuna storage tables created in IRIS DB.