How to Deploy Your ML Model to AWS (Step-by-Step Guide)


Source: dev.to

A developer provides a step-by-step guide to deploying machine learning models to AWS using SageMaker, covering model packaging, S3 upload, endpoint creation, and inference testing. The guide includes code examples for saving models, creating inference scripts, and invoking endpoints, along with troubleshooting tips and cost considerations.

Published Jun 22, 2026 · 2 min read

I've trained more ML models than I've deployed. There's something comforting about the local loop: model.fit(), model.evaluate(), hitting 94% accuracy, then staring at the screen wondering, "Okay, how do I make this actually useful?"

If you're stuck there right now, this guide will help.

Note: I wrote this based on AWS documentation and standard SageMaker patterns. If you try it, drop a comment about what worked (or broke).

What you need before starting:

- A trained model saved as model.pkl (or .joblib)
- A requirements.txt with your dependencies
- The AWS CLI configured (aws configure)

Save your trained model:

import joblib

# Serialize the trained estimator to disk
joblib.dump(model, 'model.pkl')

Create a requirements.txt file (note that the PyPI package is scikit-learn, not sklearn):

scikit-learn==1.2.0
pandas==1.5.0
numpy==1.23.0

Keep both files in the same folder.
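Pinned versions only help if they match what you trained with. Here is a minimal sketch of a local check that compares the pins against your installed packages. It uses only the standard library, the helper names `parse_pins` and `check_pins` are hypothetical, and it only understands plain `name==version` lines.

```python
from importlib import metadata


def parse_pins(text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop comments and whitespace
        if '==' in line:
            name, version = line.split('==', 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return a list of human-readable mismatches (empty means all good)."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f'{name}: not installed (want {wanted})')
            continue
        if installed != wanted:
            problems.append(f'{name}: installed {installed}, pinned {wanted}')
    return problems


# Usage:
#   with open('requirements.txt') as f:
#       print(check_pins(parse_pins(f.read())))
```

An empty list means your local environment matches the pins.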

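Next, create an S3 bucket and upload the model:

import tarfile
import boto3

s3 = boto3.client('s3')

bucket_name = 'my-unique-ml-bucket-12345'  # Bucket names are global, so make this unique
# In us-east-1, create_bucket must be called without a LocationConstraint.
# For any other region, pass CreateBucketConfiguration={'LocationConstraint': '<region>'}.
s3.create_bucket(Bucket=bucket_name)

# SageMaker expects model artifacts as a gzipped tarball, not a bare .pkl
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.pkl')

s3.upload_file('model.tar.gz', bucket_name, 'models/model.tar.gz')
s3.upload_file('requirements.txt', bucket_name, 'models/requirements.txt')

model_s3_path = f's3://{bucket_name}/models/model.tar.gz'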
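SageMaker expects `model_data` to point at a gzipped tarball and unpacks it into `model_dir`, so model.pkl should sit at the root of the archive. Here is a small sketch for building and inspecting that archive before uploading. The helper names `package_model` and `list_archive` are hypothetical.

```python
import os
import tarfile


def package_model(model_path, archive_path='model.tar.gz'):
    """Write model_path into a gzipped tarball at the archive root."""
    with tarfile.open(archive_path, 'w:gz') as tar:
        # arcname drops any local directories so the file sits at the root
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


def list_archive(archive_path):
    """Return the member names so the layout can be checked before uploading."""
    with tarfile.open(archive_path, 'r:gz') as tar:
        return tar.getnames()


# Usage:
#   package_model('model.pkl')
#   print(list_archive('model.tar.gz'))
```

If the listing shows a nested path instead of a bare model.pkl, a loader that joins `model_dir` with 'model.pkl' won't find the file.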

Save this as inference.py:

import json
import joblib
import numpy as np
import os

def model_fn(model_dir):
    return joblib.load(os.path.join(model_dir, 'model.pkl'))

def input_fn(input_data, content_type):
    if content_type == 'application/json':
        data = json.loads(input_data)
        return np.array(data['features'])
    raise ValueError(f"Unsupported content type: {content_type}")

def predict_fn(input_data, model):
    return model.predict(input_data)

def output_fn(prediction, accept):
    # The second argument is the response type the client asked for
    return json.dumps({'predictions': prediction.tolist()})

SageMaker calls model_fn once when the container starts, then input_fn, predict_fn, and output_fn, in that order, each time someone hits your endpoint.
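You can exercise that handler chain locally before paying for an endpoint. The sketch below runs the four functions in the order model_fn, input_fn, predict_fn, output_fn for a single request. `simulate_request` is a hypothetical helper, and the stub handlers are stand-ins so the example runs without scikit-learn. It approximates the serving flow rather than reproducing the container exactly.

```python
import json
from types import SimpleNamespace


def simulate_request(handlers, model_dir, body,
                     content_type='application/json',
                     accept='application/json'):
    """Run the four handlers once, in serving order, and return the response body."""
    model = handlers.model_fn(model_dir)           # once per container in production
    data = handlers.input_fn(body, content_type)   # deserialize the request
    prediction = handlers.predict_fn(data, model)  # run inference
    return handlers.output_fn(prediction, accept)  # serialize the response


# Stand-in handlers for demonstration; with the real script you would
# `import inference` and pass that module instead.
stub = SimpleNamespace(
    model_fn=lambda model_dir: (lambda rows: [sum(r) for r in rows]),
    input_fn=lambda body, content_type: json.loads(body)['features'],
    predict_fn=lambda data, model: model(data),
    output_fn=lambda prediction, accept: json.dumps({'predictions': prediction}),
)

print(simulate_request(stub, '.', json.dumps({'features': [[1, 2], [3, 4]]})))
# prints {"predictions": [3, 7]}
```

Swapping the stub for the real inference module, with model.pkl in the current directory, should catch missing imports and shape errors before deployment.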

Run this in a Python script:

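from sagemaker.sklearn.model import SKLearnModel
from sagemaker import get_execution_role

# get_execution_role() resolves the role inside SageMaker notebooks/Studio.
# When running elsewhere, pass your role ARN as a string instead.
role = get_execution_role()

sklearn_model = SKLearnModel(
    model_data=model_s3_path,
    role=role,
    entry_point='inference.py',
    framework_version='1.2-1',  # scikit-learn container version; match your pinned version
    py_version='py3'
)

# instance_type belongs on deploy(), not on the model constructor
sklearn_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',
    endpoint_name='my-model-endpoint'
)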

This takes 5–10 minutes. You'll see the endpoint status go from Creating to InService.
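If you'd rather not refresh the console, you can poll the status yourself. Here is a minimal sketch, assuming a boto3 `sagemaker` client is passed in. `wait_until_in_service` is a hypothetical helper, and the poll interval and timeout are arbitrary defaults.

```python
import time


def wait_until_in_service(client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        desc = client.describe_endpoint(EndpointName=endpoint_name)
        status = desc['EndpointStatus']
        if status == 'InService':
            return status
        if status == 'Failed':
            reason = desc.get('FailureReason', 'no reason given')
            raise RuntimeError(f'Endpoint {endpoint_name} failed: {reason}')
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f'Endpoint {endpoint_name} still {status} after {timeout_seconds}s'
            )
        time.sleep(poll_seconds)


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   wait_until_in_service(boto3.client('sagemaker'), 'my-model-endpoint')
```

boto3 also ships a built-in waiter, `client.get_waiter('endpoint_in_service')`, which may be all you need. The explicit loop just makes the failure reason easier to surface.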

import boto3
import json

runtime = boto3.client('sagemaker-runtime')

response = runtime.invoke_endpoint(
    EndpointName='my-model-endpoint',
    ContentType='application/json',
    Body=json.dumps({'features': [[5.1, 3.5, 1.4, 0.2]]})
)

result = json.loads(response['Body'].read().decode())
print(result)

If you see {'predictions': [...]}, it worked.
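For repeated calls, it can help to wrap the request in a function that checks the payload shape first, since a flat list instead of a list of rows is an easy mistake. This is a sketch only: `predict` is a hypothetical helper, and it assumes the request and response JSON shapes used in this guide.

```python
import json


def predict(runtime, endpoint_name, rows):
    """Send rows (a list of feature lists) and return the list of predictions."""
    if not rows or not all(isinstance(r, (list, tuple)) for r in rows):
        raise ValueError('rows must be a non-empty list of feature lists')
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=json.dumps({'features': rows}),
    )
    payload = json.loads(response['Body'].read().decode())
    return payload['predictions']


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   runtime = boto3.client('sagemaker-runtime')
#   print(predict(runtime, 'my-model-endpoint', [[5.1, 3.5, 1.4, 0.2]]))
```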

Endpoints cost money even when idle:

aws sagemaker delete-endpoint --endpoint-name my-model-endpoint
aws sagemaker delete-endpoint-config --endpoint-config-name my-model-endpoint
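The same cleanup can be done from Python, which also makes it easy to remove the model resource that deployment created. Here is a sketch, assuming a boto3 `sagemaker` client is passed in. `delete_endpoint_stack` is a hypothetical helper, and it looks up the config and model names rather than guessing them.

```python
def delete_endpoint_stack(client, endpoint_name):
    """Delete an endpoint, its endpoint config, and the models that config references."""
    endpoint = client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint['EndpointConfigName']
    config = client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [v['ModelName'] for v in config['ProductionVariants']]

    # The endpoint is what bills by the hour, so remove it first
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        client.delete_model(ModelName=name)
    return {'endpoint': endpoint_name, 'config': config_name, 'models': model_names}


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   delete_endpoint_stack(boto3.client('sagemaker'), 'my-model-endpoint')
```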

| Error | Fix |
|---|---|
| NoCredentialsError | Run aws configure again |
| InvalidRoleException | IAM role needs S3 + SageMaker permissions |
| ModelError | Check inference.py for missing imports |
| Endpoint stuck on Creating | Wait 5–10 more minutes |

Your IAM role needs:

- s3:GetObject, s3:PutObject
- sagemaker:CreateModel, sagemaker:CreateEndpointConfig, sagemaker:CreateEndpoint
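Here is a rough sketch of what those permissions could look like as an IAM policy document. `build_policy` is a hypothetical helper. I've included `sagemaker:CreateEndpointConfig` because deployment also creates an endpoint config. The `'*'` resource on the SageMaker statement is a simplifying assumption that you would likely want to narrow.

```python
import json


def build_policy(bucket_name):
    """Return an IAM policy document as a dict, with S3 access scoped to one bucket."""
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Effect': 'Allow',
                'Action': ['s3:GetObject', 's3:PutObject'],
                'Resource': f'arn:aws:s3:::{bucket_name}/*',
            },
            {
                'Effect': 'Allow',
                'Action': [
                    'sagemaker:CreateModel',
                    'sagemaker:CreateEndpointConfig',  # deploy() also creates an endpoint config
                    'sagemaker:CreateEndpoint',
                ],
                'Resource': '*',  # assumption: broad for illustration, narrow in practice
            },
        ],
    }


print(json.dumps(build_policy('my-unique-ml-bucket-12345'), indent=2))
```

This is deliberately minimal. A real setup often needs more, such as permissions to describe and delete endpoints and to write logs.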

| Resource | Cost |
|---|---|
| ml.m5.large | ~$0.20/hour (~$144/month if 24/7) |
| S3 storage | ~$0.02/GB/month |

Delete endpoints when you're not using them. I've seen $50 surprises from idle endpoints.
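The arithmetic is worth doing once, because hourly prices look small until you multiply them out. Here is a tiny sketch using the approximate $0.20/hour figure from this guide. Actual pricing varies by region and changes over time, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Estimate what an endpoint costs over a month at a flat hourly rate."""
    return round(hourly_rate * hours_per_day * days, 2)


print(monthly_cost(0.20))                             # 144.0 (always on)
print(monthly_cost(0.20, hours_per_day=8, days=22))   # 35.2 (working hours only)
```

At that rate, an endpoint left running around the clock passes $50 in under two weeks.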

If you're following this, check:

- boto3 and sagemaker are installed (pip show boto3 sagemaker)
- joblib and numpy are installed (os ships with Python)

If something breaks, comment below with the error. I'll update this guide.

Deploying ML feels intimidating until you do it once. SageMaker handles most of the complexity. You just upload your model to S3, point SageMaker at it, and deploy.

I've trained models that sat on my laptop for months because I didn't know how to deploy them. Now I tell people: "Just run this script, it's not that hard."

If you're building something with this, drop a comment. I love seeing what people deploy.


Read original on dev.to → dev.to/shresthapandey/how-to-deploy-your-ml-mode…
