I got very used to running agents locally.
The workflow was simple: run the agent, let it write outputs into my filesystem, then inspect everything in an ./outputs folder.
Markdown reports, JSON files, screenshots, charts — whatever the agent produced, it was right there.
Then I deployed it.
Same agent, same logic. But now the "output" lived in a container filesystem that vanished the second the task finished. A retry wrote report_20260313_103042.pdf next to report_20260313_103041.pdf. And when I wanted to share this with someone, I no longer had a clean link.
Nothing about the agent had changed.
Everything about the environment had.
If you build agents that produce files (reports, datasets, images, JSON dumps), you've probably hit this gap.
Local development hides it.
Production hands it to you on day one.
On your machine, persisting agent output is trivial:
from pathlib import Path

def save_report(content: bytes, run_id: str) -> Path:
    # One directory per run, created on demand
    out_dir = Path("./outputs") / run_id
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "summary.pdf"
    path.write_bytes(content)
    return path
That's it. Write bytes, get a path, move on.
You can list the directory, cat the file, open the PDF, or hand the path to the next script in your pipeline.
For one person running one agent on one laptop, this is perfectly fine.
The problem is not local development.
The problem is mistaking “it works on my laptop” for “I have a storage layer.”
Production agents do not get a reliable ./outputs/ folder.
They run in environments where the filesystem is temporary, isolated, or both.
Serverless functions may give you /tmp, but it is scoped to the execution environment and often limited in size. Containers lose local state when they restart. Background workers, queues, and orchestrators can run each task on a different machine.
And retries are not an edge case. They are part of the system.
Your orchestrator will eventually rerun a failed step, and now you have the same logical output produced twice.
Then there is the human in the loop.
Agents produce things people actually need to read: compliance PDFs, analysis summaries, generated slides, CSV exports, charts, screenshots, debug bundles.
Those people do not have SSH access to your worker node.
They need a link, not a filepath on a machine they will never see.
So the production checklist starts looking very different from local dev:
| Local | Production |
|---|---|
| path.write_bytes() | Upload to durable object storage |
| ./outputs/run_42/ | Queryable grouping by run/session |
| "It's in the repo" | Stable ID retrievable from any machine |
| You remember the filename | Idempotent retries that don't duplicate |
| Files live forever | TTL / lifecycle rules |
| You Slack the file manually | Shareable download URL with expiry |
I have talked to a few teams that hit the same wall.
The agent logic is done.
Now the artifact plumbing begins.
Here's the distinction that changed how I think about this:
A file is bytes at a path.
An artifact is a file plus context.
That context is what makes the output usable after the agent is done.
For example:
A PDF sitting on disk is a file.
A PDF tagged with session_id=pipeline_run_42, agent_id=report-writer, model=claude-sonnet-4, retrievable as art_2xk9f7v3m1p0, and set to expire in 30 days?
That is an artifact.
Your agent may still produce files.
But downstream agents, debug tools, production workflows, and the humans waiting in Slack all need artifacts.
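One way to picture the difference is a small record type. This is a minimal sketch, not any particular product's data model: the `Artifact` class and its field names are my own assumptions, chosen to mirror the example above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record: the bytes live in storage, the context lives here."""

    id: str
    filename: str
    session_id: str
    agent_id: str
    metadata: dict = field(default_factory=dict)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    ttl: timedelta = timedelta(days=30)

    @property
    def expires_at(self) -> datetime:
        # Expiry is derived from creation time plus the artifact's own TTL
        return self.created_at + self.ttl


report = Artifact(
    id="art_2xk9f7v3m1p0",
    filename="summary.pdf",
    session_id="pipeline_run_42",
    agent_id="report-writer",
    metadata={"model": "claude-sonnet-4"},
)
print(report.expires_at - report.created_at)  # 30 days, 0:00:00
```

The file itself never appears in the record. Everything here is the context that a bare path cannot carry.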
Most teams do not start by building an artifact store. They start with S3 (or R2, or GCS) and a slowly growing feeling that object keys aren't enough.
The pattern I keep seeing, including in our own user research, goes like this.
First, put the bytes in object storage:
import hashlib
import boto3

s3 = boto3.client("s3")
BUCKET = "my-agent-outputs"

def upload_file(local_path: str, tenant_id: str) -> str:
    # Read the whole file and close the handle promptly
    with open(local_path, "rb") as f:
        data = f.read()
    # Content-addressed key: identical bytes from one tenant map to the same object
    content_hash = hashlib.sha256(data).hexdigest()
    ext = local_path.rsplit(".", 1)[-1]
    key = f"{tenant_id}/{content_hash}/{ext}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=data)
    return key
Then you realize the object key is not enough.
You need to know which run produced the file, which agent created it, what kind of output it is, when it should expire, and how to find it later.
So you add a metadata table:
CREATE TABLE artifacts (
    id           text PRIMARY KEY,
    tenant_id    uuid NOT NULL,
    filename     text NOT NULL,
    content_type text NOT NULL,
    size_bytes   bigint NOT NULL,
    content_hash text NOT NULL,
    session_id   text,
    agent_id     text,
    metadata     jsonb NOT NULL DEFAULT '{}',
    expires_at   timestamptz NOT NULL,
    created_at   timestamptz NOT NULL DEFAULT now(),
    deleted_at   timestamptz
);

CREATE INDEX idx_artifacts_session
    ON artifacts (tenant_id, session_id, created_at DESC)
    WHERE deleted_at IS NULL;
Then wrap it in an API:
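import json
import mimetypes
import os

def create_artifact(file_path, session_id, agent_id, metadata=None):
    tenant_id = current_tenant()
    key = upload_file(file_path, tenant_id=tenant_id)
    content_hash = key.split("/")[1]  # upload_file puts the SHA-256 in the key
    artifact_id = f"art_{generate_id()}"
    db.execute(
        """
        INSERT INTO artifacts
            (id, tenant_id, filename, content_type, size_bytes,
             content_hash, session_id, agent_id, metadata, expires_at)
        VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, now() + interval '30 days')
        """,
        # One value per placeholder, covering every NOT NULL column
        (artifact_id, tenant_id, os.path.basename(file_path),
         mimetypes.guess_type(file_path)[0] or "application/octet-stream",
         os.path.getsize(file_path), content_hash,
         session_id, agent_id, json.dumps(metadata or {})),
    )
    return artifact_id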
Congratulations, you're on your way to building an artifact store.
Then the other 80% shows up.
I've watched engineers spend time building this type of wrapper and still not get dedup, TTL, or session semantics right.
This is not a knock on those teams. It is necessary plumbing. But necessary plumbing is still plumbing, and most teams should be spending that time on their product, not rebuilding agent infrastructure.
If you are deciding whether to build this yourself or use a purpose-built layer, this is the basic checklist I would use.
You need to answer one question quickly:
What did this pipeline run produce?
Not grep logs.
Not list an S3 prefix and hope the naming convention held.
One query:
artifacta ls --session pipeline_run_42
A session should be whatever your orchestrator already uses: pipeline_run_42, daily_batch_20260313, customer_report_8841.
It should not require a separate “create session” step just to group outputs.
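If you are rolling your own, the same question can be a single query against the metadata table. The sketch below is only an illustration: an in-memory SQLite database stands in for Postgres, the schema is cut down to the columns the query needs, and `list_session` is a hypothetical helper name.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the Postgres metadata table.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE artifacts (id TEXT PRIMARY KEY, session_id TEXT, "
    "agent_id TEXT, filename TEXT, deleted_at TEXT)"
)
db.executemany(
    "INSERT INTO artifacts VALUES (?, ?, ?, ?, ?)",
    [
        ("art_1", "pipeline_run_42", "extractor", "rows.csv", None),
        ("art_2", "pipeline_run_42", "report-writer", "summary.pdf", None),
        ("art_3", "pipeline_run_43", "extractor", "rows.csv", None),
    ],
)


def list_session(session_id: str) -> list:
    """Return (id, agent_id, filename) for every live artifact in a session."""
    cur = db.execute(
        "SELECT id, agent_id, filename FROM artifacts "
        "WHERE session_id = ? AND deleted_at IS NULL ORDER BY id",
        (session_id,),
    )
    return cur.fetchall()


print(list_session("pipeline_run_42"))
# [('art_1', 'extractor', 'rows.csv'), ('art_2', 'report-writer', 'summary.pdf')]
```

Note that the session is just a column value. Nothing had to create pipeline_run_42 before the first artifact was written to it.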
When a report looks wrong three weeks later, you need to know what produced it.
Which agent?
Which model?
Which stage of the workflow?
That means agent_id and metadata should be captured at upload time, not buried in logs you hope still exist.
client.push(
    "analysis.json",
    session_id="pipeline_run_42",
    agent_id="summarizer",
    metadata={"model": "claude-sonnet-4", "stage": "final"},
)
Object storage metadata is not enough.
Headers are limited, awkward to query, and easy to make inconsistent across a pipeline.
You want structured metadata stored with the artifact record and filterable when listing artifacts.
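"Filterable" can be as simple as a subset match on the metadata dictionary. Here is a minimal sketch over in-memory records. The `matches` helper and the sample records are hypothetical, and a real store would push this filter into the database.

```python
def matches(record_metadata: dict, wanted: dict) -> bool:
    """True when every wanted key/value pair is present in the record's metadata."""
    return all(
        key in record_metadata and record_metadata[key] == value
        for key, value in wanted.items()
    )


# Hypothetical artifact records, as a listing call might return them
records = [
    {"id": "art_1", "metadata": {"model": "claude-sonnet-4", "stage": "draft"}},
    {"id": "art_2", "metadata": {"model": "claude-sonnet-4", "stage": "final"}},
]

final_ids = [r["id"] for r in records if matches(r["metadata"], {"stage": "final"})]
print(final_ids)  # ['art_2']
```

With a jsonb column like the one in the table above, Postgres can express the same subset match with its containment operator, for example `metadata @> '{"stage": "final"}'`.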
Agent systems usually need two forms of deduplication: content hashing and idempotency keys.
These solve different problems.
Content hashing prevents duplicate storage. Idempotency prevents a retry from creating a second logical artifact.
Conflating the two is a common bug in homegrown wrappers.
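To make the difference concrete, here is a small in-memory sketch that keeps the two mechanisms separate. Everything in it is an assumption for illustration: the `push` function, the three dictionaries, and the shape of the idempotency keys are not any real store's API.

```python
import hashlib

blobs = {}               # content hash -> stored bytes
artifacts = {}           # artifact id -> record
by_idempotency_key = {}  # idempotency key -> artifact id


def push(data: bytes, session_id: str, idempotency_key: str) -> str:
    # Idempotency: a retry with the same key gets the original artifact back.
    if idempotency_key in by_idempotency_key:
        return by_idempotency_key[idempotency_key]

    # Content hashing: identical bytes are stored once, no matter how many
    # artifacts point at them.
    content_hash = hashlib.sha256(data).hexdigest()
    blobs.setdefault(content_hash, data)

    artifact_id = f"art_{len(artifacts) + 1}"
    artifacts[artifact_id] = {"session_id": session_id, "content_hash": content_hash}
    by_idempotency_key[idempotency_key] = artifact_id
    return artifact_id


first = push(b"report", "pipeline_run_42", "run_42:report")
retry = push(b"report", "pipeline_run_42", "run_42:report")  # same step, retried
other = push(b"report", "pipeline_run_43", "run_43:report")  # same bytes, new run

print(first == retry, first == other, len(blobs), len(artifacts))
# True False 1 2
```

The retry returns the existing artifact, while the second run gets its own artifact that shares the stored bytes. A wrapper that only hashes content would have treated both cases the same way.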
Artifacts should expire by default.
An experiment, batch run, or debug file should not live forever because nobody remembered to clean it up.
Storage lifecycle rules help, but they usually operate at the bucket or prefix level. They do not understand your artifact metadata, which makes per-artifact expiration harder than it should be.
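Per-artifact expiry means each record carries its own deadline, and a periodic sweep picks up whatever is past it. A minimal sketch, assuming hypothetical records with a `ttl_days` field and a fixed clock so the result is reproducible:

```python
from datetime import datetime, timedelta, timezone

# Fixed "now" so the example is deterministic
now = datetime(2026, 3, 13, tzinfo=timezone.utc)

# Two artifacts created on the same day with different TTLs (hypothetical data)
records = [
    {"id": "art_debug", "created_at": now - timedelta(days=8), "ttl_days": 7},
    {"id": "art_report", "created_at": now - timedelta(days=8), "ttl_days": 30},
]


def expired(records: list, at: datetime) -> list:
    """Return the ids of artifacts whose own TTL has elapsed at time `at`."""
    return [
        r["id"]
        for r in records
        if r["created_at"] + timedelta(days=r["ttl_days"]) <= at
    ]


print(expired(records, now))  # ['art_debug']
```

Both artifacts are the same age, but only the debug file is due for deletion. That is the part a rule keyed on a bucket or prefix cannot see.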
Humans need a link, not a file path.
A good artifact layer should make it easy to create a stable download URL with configurable expiry:
https://dl.example.com/lnk_...
That link should be separate from your internal storage details and easy to share with a teammate, customer, or workflow step.
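One common way to build such a link is to sign the link ID and expiry time with a server-side secret. The sketch below uses only the standard library and is an illustration under assumptions, not how Artifacta or any specific service does it: `SECRET`, the `exp` and `sig` query parameters, and the `lnk_demo` ID are all made up.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-me"  # hypothetical signing key, kept server-side


def sign(link_id: str, expires_at: int) -> str:
    """HMAC-SHA256 over the link id and its expiry timestamp."""
    message = f"{link_id}:{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def make_link(link_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Build a download URL that stops verifying after ttl_seconds."""
    issued = time.time() if now is None else now
    expires_at = int(issued) + ttl_seconds
    signature = sign(link_id, expires_at)
    return f"https://dl.example.com/{link_id}?exp={expires_at}&sig={signature}"


def verify(link_id: str, expires_at: int, sig: str, now: Optional[float] = None) -> bool:
    """Accept only unexpired links whose signature matches."""
    current = time.time() if now is None else now
    return current < expires_at and hmac.compare_digest(sig, sign(link_id, expires_at))


print(make_link("lnk_demo", ttl_seconds=3600))
```

The URL exposes neither the bucket nor the object key, so the storage layout can change without breaking links that have already been shared.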
Downstream agents should not coordinate through shared filesystem paths.
Agent A pushes an artifact and gets an ID. Agent B pulls by ID, or lists the session and filters by metadata.
export ARTIFACTA_SESSION_ID="pipeline_run_42"
python extract.py # pushes CSV
python analyze.py # lists session, pulls CSV, pushes report
python notify.py # creates download link for the human
Session sealing matters too.
Once a run is finalized, late uploads should fail clearly instead of silently corrupting the run:
409 Session 'pipeline_run_42' is sealed. No new artifacts can be added.
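The sealing rule is small enough to sketch in memory. The `SessionStore` class and `SessionSealedError` below are hypothetical names, and a real service would return the HTTP 409 shown above rather than raise a Python exception.

```python
class SessionSealedError(Exception):
    """Raised when an upload targets a session that has been finalized."""


class SessionStore:
    def __init__(self) -> None:
        self._artifacts = {}   # session id -> list of artifact names
        self._sealed = set()   # session ids that no longer accept uploads

    def push(self, session_id: str, name: str) -> None:
        # Fail loudly instead of quietly appending to a finished run
        if session_id in self._sealed:
            raise SessionSealedError(
                f"Session '{session_id}' is sealed. No new artifacts can be added."
            )
        self._artifacts.setdefault(session_id, []).append(name)

    def seal(self, session_id: str) -> None:
        self._sealed.add(session_id)

    def ls(self, session_id: str) -> list:
        return list(self._artifacts.get(session_id, []))


store = SessionStore()
store.push("pipeline_run_42", "rows.csv")
store.push("pipeline_run_42", "report.pdf")
store.seal("pipeline_run_42")

try:
    store.push("pipeline_run_42", "late_upload.csv")
except SessionSealedError as err:
    print(err)

print(store.ls("pipeline_run_42"))  # ['rows.csv', 'report.pdf']
```

The late upload is rejected and the sealed run still lists exactly the two artifacts it had when it was finalized.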
I'm building Artifacta, an artifact store purpose-built for AI agents.
It is not an orchestrator, search engine, or agent framework. It is the layer between your agent and object storage: session-aware, queryable artifact storage with a CLI, MCP, Python SDK, and REST API.
For example:
pip install artifacta-cli
export ARTIFACTA_API_KEY="ak_live_..."
artifacta push report.pdf --session earnings-q4-2025 --agent report-writer
artifacta ls --session earnings-q4-2025
artifacta link art_2xk9f7v3m1p0 # share with a human
Or from Python:
from artifacta import Client
client = Client()
artifact = client.push("report.pdf", session_id="earnings-q4-2025")
print(artifact.id) # art_2xk9f7v3m1p0
I’m sharing it because this is a problem I keep seeing in agent workflows, even if Artifacta is not the solution every team chooses.
I’m curious how other teams handle this today.
Drop your setup in the comments. I’m especially interested in approaches that are not just object storage plus glue code.