Streamline external access to Amazon SageMaker MLflow using a REST API proxy

AWS announced a solution for organizations to integrate Amazon SageMaker MLflow with existing enterprise systems through a Flask-based REST API proxy. The proxy service provides HTTPS access to SageMaker MLflow without requiring the MLflow SDK, addressing security and network restrictions that prevent direct SDK usage. This enables teams to maintain compliance with corporate policies while adopting cloud-native ML workflows.

Machine learning (ML) teams use MLflow to manage their ML lifecycle effectively. Amazon SageMaker MLflow (https://docs.aws.amazon.com/sagemaker/latest/dg/mlflow.html) provides comprehensive ML experiment tracking and model management capabilities. However, many enterprises have existing infrastructure requirements that need HTTPS-based integrations rather than direct SDK usage.

Many organizations need to integrate Amazon SageMaker MLflow with their established systems while maintaining their security and infrastructure patterns. This integration challenge affects teams who can’t use the SDK directly because of corporate security policies, network restrictions, or legacy system constraints.

In this post, we demonstrate how to build a secure Flask-based MLflow proxy service that provides HTTPS access to Amazon SageMaker MLflow without requiring the MLflow SDK. This solution is for organizations undergoing cloud transformation who want to preserve their existing ML workflows while adopting cloud-native services.

This post covers the following topics:

- Implementing the MLflow proxy service for MLflow HTTPS requests.
- Configuring AWS Identity and Access Management (IAM) authentication for secure access.
- Managing URL pre-signing and request transformation.

After implementing this solution, you can:

- Access SageMaker MLflow securely through standard HTTPS endpoints.
- Maintain compliance with your organization’s security requirements.
- Integrate MLflow with existing enterprise systems.
- Reduce implementation complexity and maintenance overhead.

Solution overview

A lightweight Flask-based MLflow proxy architecture provides secure integration between enterprise systems and Amazon SageMaker MLflow through three key components.

Component 1: Application Load Balancer (ALB)

An AWS Application Load Balancer (https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html) serves as the upstream router, providing the following:

- Traffic distribution for MLflow UI and REST API requests.
- Initial request handling and routing.
- Support for custom domain names and SSL termination.

Note: This implementation uses ALB, but you can alternatively use other routing solutions such as Nginx based on your requirements.

Component 2: Flask MLflow Proxy Service

At the heart of the architecture, a Python-based Flask application handles the following:

- Intercepting and processing incoming HTTPS requests.
- Managing AWS authentication and request signing.
- Transforming URLs for secure MLflow endpoint access.
- Handling response routing back to clients.

Component 3: Amazon SageMaker MLflow

The AWS managed SageMaker MLflow service provides the following:

- Support for two MLflow deployment modes:
  - MLflow Tracking Server: managed MLflow tracking server.
  - MLflowApp: serverless MLflow application.
- Backend metadata store for tracking information.
- Storage for model files and data.

This architecture provides secure communication while maintaining compatibility with existing enterprise systems. The proxy service acts as a bridge, transforming standard HTTPS requests into authenticated AWS API calls that can interact with SageMaker MLflow.
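
To make the bridging step concrete, the following is a minimal sketch of how a plain HTTPS request can be turned into an AWS-authenticated one with Signature Version 4 (SigV4) signing, using only the Python standard library. It is an illustration under stated assumptions, not the proxy's actual code: the function name `sign_mlflow_request`, the host `mlflow.example.com`, and the credentials are hypothetical, and the signing service name `sagemaker-mlflow` is an assumption that should be confirmed against the SageMaker MLflow documentation. A real proxy would more likely delegate signing to botocore's `SigV4Auth` with credentials from its IAM role.

```python
import datetime
import hashlib
import hmac
from urllib.parse import parse_qsl, quote, urlsplit


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_mlflow_request(method, url, body, access_key, secret_key, region,
                        service="sagemaker-mlflow",  # assumed signing name
                        session_token=None, now=None):
    """Return SigV4 headers for one request (hypothetical helper).

    Simplification: the path is percent-encoded once, which is only
    equivalent to full SigV4 path handling when the path contains no
    characters that need encoding (true for typical MLflow API paths).
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    parts = urlsplit(url)

    # Headers that take part in the signature.
    headers = {"host": parts.netloc, "x-amz-date": amz_date}
    if session_token:  # present when using temporary role credentials
        headers["x-amz-security-token"] = session_token

    # Step 1: canonical request.
    canonical_uri = quote(parts.path or "/", safe="/-_.~")
    encoded_pairs = sorted(
        (quote(k, safe="-_.~"), quote(v, safe="-_.~"))
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    )
    canonical_query = "&".join(f"{k}={v}" for k, v in encoded_pairs)
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(
        f"{name}:{headers[name].strip()}\n" for name in sorted(headers)
    )
    payload_hash = hashlib.sha256(body).hexdigest()
    canonical_request = "\n".join([
        method.upper(), canonical_uri, canonical_query,
        canonical_headers, signed_headers, payload_hash,
    ])

    # Step 2: string to sign.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Step 3: derive the signing key and compute the signature.
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    k_signing = _hmac_sha256(k_service, "aws4_request")
    signature = hmac.new(
        k_signing, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

In a proxy, the returned headers would be attached to the outgoing request that is forwarded to the SageMaker MLflow endpoint, so the external client never handles AWS credentials itself.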

Architecture and request workflow

The following diagram shows how the Flask proxy service provides secure communication between external clients and Amazon SageMaker MLflow.

Figure 1: Architecture diagram showing the Flask proxy service integration with Amazon SageMaker MLflow

The architecture diagram shows three main components:

- An ALB that handles incoming traffic.
- A Flask proxy service that manages authentication and request transformation.
- Amazon SageMaker MLflow that processes ML operations.

Request workflow

Let’s explore how requests flow through this architecture to provide secure MLflow access. When a client initiates an HTTPS request, it first reaches the ALB, which acts as the entry point for all incoming traffic. The ALB then routes these requests to the MLflow proxy service.

When it receives the request, the MLflow proxy service performs several critical functions:

- Handles authentication through AWS IAM integration.
- Transforms URLs and pre-signs them for secure access.
- Processes the MLflow REST API endpoints as needed.

The MLflow proxy service transforms the incoming request into an authenticated AWS request before making the API call to SageMaker MLflow REST endpoints. After SageMaker MLflow processes the request, it returns a response, which the MLflow proxy service processes and routes back to the original client. This workflow maintains security while providing integration between enterprise systems and SageMaker MLflow.

Prerequisites

To follow this walkthrough, make sure you have the following:

- An AWS account (https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-creating.html).
- A workstation with the following tools installed:
  - AWS Command Line Interface (AWS CLI) (https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) configured with permissions to create:
    - Amazon Virtual Private Cloud (Amazon VPC) and associated networking components.
    - Amazon Elastic Compute Cloud (Amazon EC2) instances.
    - Amazon SageMaker AI resources.
    - Amazon Simple Storage Service (Amazon S3) buckets.
    - AWS Identity and Access Management (IAM) roles and policies.
    - AWS CloudFormation stacks.
    - AWS Application Load Balancers.
  - Node.js (https://nodejs.org/en/download) version 18.0.0 or later.
  - NPM (https://docs.npmjs.com/downloading-and-installing-node-js-and-npm).
  - AWS Cloud Development Kit (AWS CDK) CLI (https://docs.aws.amazon.com/cdk/v2/guide/cli.html) version 2.100.0 or later.
  - Python 3.x with pip or pip3.
- Required knowledge:
  - Basic understanding of AWS services and IAM permissions.
  - Familiarity with Python and Flask applications.
  - Understanding of MLflow concepts and operations.
- Cost considerations:
  - This solution creates AWS resources that might incur costs.
  - Key cost-driving resources include:
    - Amazon EC2 instances.
    - Application Load Balancer.
    - Amazon SageMaker AI resources.
    - Amazon S3 storage.

For information about AWS service pricing, see AWS Pricing Calculator (https://calculator.aws/).

Deploy the solution

This section walks you through deploying the solution in your AWS account and validating it. The deployment process takes approximately 40 minutes.

Step 1: Deploy the infrastructure using AWS CDK

- Download the solution code and install dependencies:
- Bootstrap your environment for AWS CDK (https://docs.aws.amazon.com/cdk/v2/guide/bootstrapping-env.html). Skip this step if your AWS account and Region are already bootstrapped for AWS CDK. Bootstrap the AWS account and Region for CDK:
- Deploy the required resources on your AWS account. The solution consists of four CDK stacks:
  - Networking stack: creates the VPC and networking components.
  - SageMaker AI domain stack: sets up the SageMaker domain.
  - SageMaker MLflow stack: deploys the MLflow tracking server or MLflow serverless app.
  - Flask application stack: deploys the MLflow proxy service.

  Deploy all the stacks with one of the following commands.
  For tracking server based deployment:
  For serverless app based deployment:

Step 2: Install and configure the Flask MLflow proxy service

- Connect to the EC2 instance:
  - Note the Amazon EC2 instance ID from the CDK output or from the sagemaker-infra-flaskapp-{mlflowType} AWS CloudFormation stack output section.
  - Use AWS Systems Manager Session Manager to connect. Follow the Session Manager connection guide (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-with-systems-manager-session-manager.html).
- Install Python 3.13 and dependencies. Install Python packages:

  Note: This script is designed for Ubuntu-based systems. For other Linux distributions, install Python 3.12+, pip3, and virtualenv using your system’s package manager.
- Install and start the MLflow proxy service:
- Check the Flask MLflow proxy service status:

  Note: If the service isn’t running, check logs with the following command:

Step 3: Validate MLflow REST API access

This section demonstrates how to interact with MLflow REST APIs through the ALB.

Note: These examples use the unsecured HTTP protocol. For production environments, we recommend HTTPS.

We use curl to make the API requests in this post, but you can use any tool you prefer. The provided curl commands work identically for both tracking server and serverless modes; the proxy service handles the differences transparently.

- Get your ALB DNS name by running the following command on your workstation:
- Test MLflow API endpoints by running the following commands on your workstation. Replace