S3 objects have historically carried three types of native metadata: system-defined metadata, user-defined metadata (2 KB, immutable after upload), and object tags. With Amazon S3 Metadata, a separate bucket-level feature, you can aggregate all metadata types into managed Iceberg tables for SQL-based querying.
In June 2026, AWS introduced another metadata type: S3 Annotations. These are mutable, structured payloads (up to 1,000 per object, each up to 1 MB, totaling 1 GB) in JSON, XML, YAML, or plain text that can be added, updated, or deleted at any time without touching the underlying object.
Having worked closely with Agent infrastructure and being familiar with Agent architectures, I'd like to share three concrete use cases based on my experience.
Core point: S3 Annotations let a file's context (transcripts, summaries, action items) live on the file itself, so deletion, replication, and migration don't require separate mapping or cleanup logic. The meeting summary example illustrates this, but the pattern applies to many scenarios where processing context needs to stay bound to its source object.
Most Agent products today offer some form of meeting transcription and summarization. The Agent takes a recording (either captured live via a streaming model, or uploaded after the fact for batch processing), produces a full transcript, and then generates structured meeting minutes with action items.
In practice, companies building these Agents retain the recording alongside its transcript and minutes for future search, knowledge building, and audit purposes. That raises a straightforward question: how should these files live together, and how do you keep them linked?
Typically, the recording, transcript, and minutes end up as separate S3 objects grouped by a shared path prefix:
s3://agent-data/meetings/
├── 2026-06-15-standup/
│   ├── recording.mp4
│   ├── transcript.json
│   └── minutes.json
├── 2026-06-16-review/
│   ├── recording.mp4
│   ├── transcript.json
│   └── minutes.json
└── ...
Or their relationship is tracked in an external database (DynamoDB, RDS, etc.).
Either way, the context (transcript, minutes) lives separately from the data (recording), and their association depends on the application layer. At scale (thousands or tens of thousands of recordings), this starts to cost you:
Deletion requires coordination: removing a recording means you also need to clean up its associated files or database mappings. Miss one, and you've got orphaned data accruing storage charges.
Replication and migration need extra care: moving data across regions or buckets means ensuring every related file travels together, or updating path mappings in your external system.
The link isn't native to S3: S3 has no idea that recording.mp4 and transcript.json belong together. That relationship exists only in your path convention or external database; any gap in that layer, and the association breaks silently.
The underlying issue: context doesn't live with the data. They're separate objects, held together by convention.
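To make the cleanup burden concrete, here is a minimal sketch of the kind of orphan scan that the path-prefix convention forces on you. It assumes the folder layout shown above and works on a plain list of object keys; the function name and the sample keys are hypothetical, and a real job would page through a bucket listing instead.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed convention: each meeting folder holds one recording plus sidecar files.
RECORDING_NAME = "recording.mp4"


def find_orphaned_sidecars(keys):
    """Return keys that sit in a folder which no longer contains a recording."""
    by_folder = defaultdict(list)
    for key in keys:
        by_folder[str(PurePosixPath(key).parent)].append(key)

    orphans = []
    for members in by_folder.values():
        names = {PurePosixPath(k).name for k in members}
        if RECORDING_NAME not in names:
            orphans.extend(members)
    return sorted(orphans)


keys = [
    "meetings/2026-06-15-standup/recording.mp4",
    "meetings/2026-06-15-standup/transcript.json",
    "meetings/2026-06-15-standup/minutes.json",
    # The recording for this meeting was deleted, but its sidecars were not.
    "meetings/2026-06-16-review/transcript.json",
    "meetings/2026-06-16-review/minutes.json",
]
print(find_orphaned_sidecars(keys))
```

A scan like this has to be written, scheduled, and kept in sync with the naming convention, which is exactly the application-layer coupling described above.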
With S3 Annotations, you attach the transcript and minutes directly to the recording object itself:
transcript: the full transcript document
meeting_minutes: the generated minutes
action_items: structured action items
Each is an independent, named annotation that can be written or updated separately. What this gives you:
Lifecycle is automatic: delete the recording, and its annotations go with it. No orphans.
Copy and replication just work: annotations follow the object by default, so there are no extra rules to configure. (Large multipart copies are an exception; see the notes at the end.)
Take the offline case: a user uploads a 1 GB recording, and only then does the Agent begin transcription. S3's user-defined metadata (x-amz-meta-*) can't help because:
It's write-once: you set it at upload time, and it's locked. The transcript and minutes don't exist yet at that point, so there's nothing to write.
It's tiny: the 2 KB total limit can't hold a transcript that runs tens of KB.
Updating means copying the object: to change metadata, you'd have to copy the entire 1 GB file. That's not practical.
S3 Annotations solve all three: they're writable at any time, hold up to 1 MB each, and are mutable without touching the object.
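The size limits quoted in this article can be checked client-side before attempting a write. The sketch below is illustrative only: the constants come from the figures above, treating 1 MB as 1 MiB is an assumption, and `check_annotation_fits` is a hypothetical helper rather than part of any AWS SDK.

```python
# Limits as quoted in the article; treating 1 MB as 1 MiB is an assumption.
MAX_ANNOTATIONS_PER_OBJECT = 1_000
MAX_ANNOTATION_BYTES = 1024 * 1024
MAX_TOTAL_BYTES = 1024 * 1024 * 1024
USER_METADATA_BYTES = 2 * 1024  # x-amz-meta-* budget, for comparison


def check_annotation_fits(existing_sizes, name, payload):
    """Return a list of limit violations; an empty list means the payload fits.

    existing_sizes maps annotation name -> size in bytes for the target object.
    Writing to an existing name is treated as an overwrite.
    """
    problems = []
    if len(payload) > MAX_ANNOTATION_BYTES:
        problems.append(f"{name}: {len(payload)} bytes exceeds the per-annotation limit")
    after = dict(existing_sizes)
    after[name] = len(payload)
    if len(after) > MAX_ANNOTATIONS_PER_OBJECT:
        problems.append("too many annotations on this object")
    if sum(after.values()) > MAX_TOTAL_BYTES:
        problems.append("total annotation size exceeds the per-object limit")
    return problems


transcript = b"x" * 40_000  # a transcript of roughly 40 KB
print(len(transcript) > USER_METADATA_BYTES)  # too big for user-defined metadata
print(check_annotation_fits({}, "transcript", transcript))
```

The same 40 KB transcript that overflows the 2 KB user-defined metadata budget passes all three annotation checks.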
Core point: When Agents output structured JSON as annotations, whether as processing results (summaries, transcripts) or generation context (prompts, outlines), those files become precisely searchable via annotation tables without any additional vectorization pipeline. This isn't a full replacement for semantic search, but it means vector-based retrieval is no longer the only option for multimodal file discovery.
Today's Agent products handle multimodal content across the board: generating images, building presentations, summarizing PDFs, transcribing recordings. In doing so, Agents naturally produce structured outputs: summaries, classifications, outlines, prompts. If you write these as structured JSON annotations at processing time, those files become searchable immediately, with no separate vectorization pipeline needed and no repeated analysis when the file comes up again later.
| File Type | What the Agent Does | JSON Annotation | How You Search Later |
|---|---|---|---|
| Recording | Transcribes and generates structured minutes | {"date": "...", "topics": [...], "action_items": [...]} | "Meetings about EKS upgrade last week": filter by topic, date, owner |
| PDF | Extracts summary, entities, key insights | {"summary": "...", "entities": [...], "keywords": [...]} | "Reports mentioning customer X": filter by entity, no re-analysis |
| Image/Video | User generates via prompts with iterative refinement | {"prompt": "...", "iterations": [...], "style": "..."} | "That cyberpunk logo I made": search prompt keywords directly |
| PPT | Generates presentation from user requirements, producing an outline | {"outline": [...], "topics": [...], "slide_count": 12} | "My cost optimization deck": search by outline topics |
These scenarios split into two types:
Processing results (recordings, PDFs): the Agent's job is to analyze the file and produce output. That output becomes the annotation.
Generation context (images, PPT): the Agent's job is to create something for the user. The intermediate context (prompts, iteration history, outlines) is a byproduct that's worth preserving.
In both cases, the architectural principle is the same: design the Agent workflow to emit structured JSON rather than plain text for anything that should be searchable later. Structured JSON enables precise field-level queries through annotation tables; plain text limits you to fuzzy LIKE matching.
For meeting minutes, this means having the Agent produce:
// annotation name: "meeting_minutes"
{
  "date": "2026-06-15",
  "topics": ["EKS upgrade", "cost optimization"],
  "action_items": [
    {"owner": "Alice", "task": "Complete EKS 1.30 upgrade plan"},
    {"owner": "Bob", "task": "Submit RI purchase request"}
  ],
  "summary": "Discussed EKS cluster upgrade timeline and Q3 cost optimization goals..."
}
With structured fields, you can run json_extract_scalar queries (filtering by date, grouping by topic, searching by assignee) that are impractical against freeform text.
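The same idea can be shown outside SQL. The following is a minimal sketch, assuming the `meeting_minutes` payload shape from the example above; `action_items_for` is a hypothetical helper name. It contrasts an exact field lookup with a substring check on the raw text.

```python
import json

# The meeting_minutes payload from the example above, as the Agent would emit it.
minutes_text = json.dumps({
    "date": "2026-06-15",
    "topics": ["EKS upgrade", "cost optimization"],
    "action_items": [
        {"owner": "Alice", "task": "Complete EKS 1.30 upgrade plan"},
        {"owner": "Bob", "task": "Submit RI purchase request"},
    ],
    "summary": "Discussed EKS cluster upgrade timeline and Q3 cost optimization goals...",
})


def action_items_for(payload_text, owner):
    """Return the tasks assigned to one owner, using exact field matching."""
    minutes = json.loads(payload_text)
    return [
        item["task"]
        for item in minutes.get("action_items", [])
        if item.get("owner") == owner
    ]


print(action_items_for(minutes_text, "Bob"))

# A substring check only says the name appears somewhere in the payload;
# it cannot tell an assignee from a passing mention in the summary.
print("Bob" in minutes_text)
```

Field-level access answers "what is Bob responsible for", while the substring test can only answer "is Bob mentioned".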
To query across objects, you need S3 Metadata, a bucket-level feature (not the object-level user-defined metadata discussed earlier; the naming is confusing, but they're completely different things). Once enabled, S3 automatically streams all annotations into a managed Apache Iceberg table (annotation tables).
From there, Agents can query with Athena SQL:
-- Find meetings with EKS-related action items from last week
SELECT bucket, object_key, text_value
FROM annotation_table
WHERE name = 'meeting_minutes'
AND text_value LIKE '%EKS%'
AND json_extract_scalar(text_value, '$.date') >= '2026-06-09'
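An Agent would normally compute "last week" rather than hard-code the date. Here is a minimal sketch of building the query above from a keyword and a reference date. The function name and parameters are hypothetical, the table and column names are taken from the article's query, and the quote escaping shown is a simplification; a production system would likely prefer parameterized queries over string interpolation.

```python
from datetime import date, timedelta


def build_minutes_query(keyword, today, days_back=7, table="annotation_table"):
    """Build the meeting-minutes search query for a rolling date window."""
    since = (today - timedelta(days=days_back)).isoformat()
    # Escape single quotes so the keyword cannot terminate the SQL string literal.
    # Note: % and _ in the keyword still act as LIKE wildcards.
    safe_keyword = keyword.replace("'", "''")
    return (
        "SELECT bucket, object_key, text_value\n"
        f"FROM {table}\n"
        "WHERE name = 'meeting_minutes'\n"
        f"  AND text_value LIKE '%{safe_keyword}%'\n"
        f"  AND json_extract_scalar(text_value, '$.date') >= '{since}'"
    )


print(build_minutes_query("EKS", today=date(2026, 6, 16)))
```

Because the annotation stores dates in ISO format, comparing them as strings gives the same ordering as comparing them as dates.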
Or via S3 Tables MCP Server, Agents can simply ask: "Find all meeting minutes that discussed EKS upgrade last week." The MCP Server handles the SQL translation.
Traditional multimodal search requires a vectorization pipeline: embedding model, vector store (like S3 Vectors), index maintenance. It's great for semantic similarity. But in the scenarios above, the Agent is already producing structured content as part of its job. Writing that content as a JSON annotation gives you queryability for free, with no extra pipeline.
These aren't substitutes; they solve different problems. Vector search handles semantic queries (fuzzy, open-ended). Annotation tables handle structured queries (exact field matching, aggregation). Use them together or independently, depending on the scenario.
Core point: When Agents archive execution traces to S3, offloaded context can be attached as annotations rather than stored at separate paths, making each trace a self-contained unit that doesn't depend on external file references remaining valid over time.
In Agent context engineering, context offloading is standard practice. An Agent working on a document or code task pulls in content from many external files. Once that content is no longer needed in the active window, the Agent swaps it out, replacing the actual content with a file path or link. If it's needed again, the Agent reads from that path.
When the session ends, the full execution trace goes to S3 for long-term storage. At that point, the trace contains only path references; the offloaded content lives at a separate S3 location. This creates the same problem as Scenario 1: the trace and its referenced content are decoupled. Paths can go stale after bucket reorganization, and lifecycle management requires extra care.
Attach the offloaded content directly as annotations on the trace object. The relationship goes from "two separate objects linked by a path string" to "one self-contained object." The trace carries everything needed to reconstruct the full execution context, independent of any external file paths.
This applies specifically to post-session archival; active sessions still manage context in memory or a database. S3 Annotations address the archival problem: how to make stored traces fully self-contained.
The payoff is simple: when you need to retrace a past execution (for debugging, auditing, or session restoration), you don't need to chase down external paths and verify they still resolve. Just read the annotations on the trace object. Self-contained archives are dramatically easier to maintain long-term.
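The archival step can be sketched as a pure transformation. The trace schema here (`steps`, `offloaded_path`, `offloaded_annotation`) and the `offloaded_N` naming are assumptions made for illustration, not a format defined by S3 or any Agent framework. The reader callback stands in for whatever fetches the offloaded content, and the actual annotation upload is left out.

```python
import json


def bundle_trace(trace, read_offloaded):
    """Resolve path references in a trace into named annotation payloads.

    Returns (archived_trace, annotations), where annotations maps an
    annotation name to the content that used to live at an external path.
    """
    annotations = {}
    steps = []
    for index, step in enumerate(trace["steps"]):
        step = dict(step)  # copy so the caller's trace is not mutated
        path = step.pop("offloaded_path", None)
        if path is not None:
            name = f"offloaded_{index}"
            annotations[name] = read_offloaded(path)
            step["offloaded_annotation"] = name  # point at the annotation instead
        steps.append(step)
    return {**trace, "steps": steps}, annotations


# Hypothetical trace and an in-memory stand-in for the offloaded files.
store = {"ctx/design.md": "# Design notes"}
trace = {
    "session_id": "s-1",
    "steps": [
        {"action": "read_file", "offloaded_path": "ctx/design.md"},
        {"action": "reply"},
    ],
}
archived, annotations = bundle_trace(trace, store.__getitem__)
print(json.dumps(archived, indent=2))
print(annotations)
```

After this step the archived trace refers only to annotation names on its own object, so later bucket reorganization cannot invalidate it.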
S3 Annotations offer three levels of interaction, each suited to different needs:
| Method | Best For | How It Works |
|---|---|---|
| Direct API | Single-object read/write | CRUD operations on individual annotations |
| Annotation Tables + Athena | Cross-object batch queries | Requires S3 Metadata enabled; annotations auto-flow into Iceberg tables |
| MCP Server | Natural language retrieval | Agent queries without writing SQL |
The foundation. Agents use these when they know exactly which object to work with. Amazon S3 supports the following API operations for annotations:
PutObjectAnnotation: Creates or overwrites an annotation on an object. You specify the annotation name and payload in the request.
GetObjectAnnotation: Returns the payload of a specific annotation by name.
ListObjectAnnotations: Returns the list of annotations on an object. The response includes each annotation's name, size, ETag, and last modified date.
DeleteObjectAnnotation: Removes a specific annotation by name.
Example of writing a transcript annotation:
PUT /meeting-001.mp4?annotation&name=transcript HTTP/1.1
Content-Type: application/json
{"date": "2026-06-15", "speakers": [...], "segments": [...]}
These suit the case where the Agent just finished processing a file and needs to persist results, or a user referenced a specific file and the Agent needs to pull its annotation.
When you need to search across objects ("which recordings are missing minutes," "all meetings about EKS last week"), you can't call single-object APIs one by one.
Enable S3 Metadata (bucket-level) and annotations automatically flow into managed Iceberg tables:
Journal table: near-real-time, good for detecting fresh annotations quickly
Annotation table: ~1 hour refresh, good for batch analysis and auditing
Query with Athena:
-- Recordings that have a transcript but no minutes yet
SELECT a.bucket, a.object_key
FROM annotation_table a
WHERE a.name = 'transcript'
  AND NOT EXISTS (
    SELECT 1 FROM annotation_table b
    WHERE b.bucket = a.bucket  -- the same key can exist in more than one bucket
      AND b.object_key = a.object_key
      AND b.name = 'meeting_minutes'
  );
-- Find all meetings about EKS from last week
SELECT bucket, object_key, text_value
FROM annotation_table
WHERE name = 'meeting_minutes'
AND text_value LIKE '%EKS%'
AND json_extract_scalar(text_value, '$.date') >= '2026-06-09'
This is ideal for batch scheduling, pipeline monitoring, and any query that needs precise conditions.
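The "transcript but no minutes" anti-join can be tried locally before running it in Athena. The sketch below uses Python's built-in SQLite as a stand-in. The four-column table is a simplification that only mirrors the columns used in this article's queries, so the real annotation table schema may differ, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotation_table ("
    "bucket TEXT, object_key TEXT, name TEXT, text_value TEXT)"
)
conn.executemany(
    "INSERT INTO annotation_table VALUES (?, ?, ?, ?)",
    [
        # a.mp4 has both a transcript and minutes; b.mp4 only has a transcript.
        ("agent-data", "meetings/a.mp4", "transcript", "{}"),
        ("agent-data", "meetings/a.mp4", "meeting_minutes", "{}"),
        ("agent-data", "meetings/b.mp4", "transcript", "{}"),
    ],
)

query = """
SELECT a.bucket, a.object_key
FROM annotation_table a
WHERE a.name = 'transcript'
  AND NOT EXISTS (
    SELECT 1 FROM annotation_table b
    WHERE b.bucket = a.bucket
      AND b.object_key = a.object_key
      AND b.name = 'meeting_minutes'
  )
"""
missing = conn.execute(query).fetchall()
print(missing)  # [('agent-data', 'meetings/b.mp4')]
```

A scheduler could feed the resulting keys back to the Agent as its summarization backlog.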
S3 Tables MCP Server puts a natural language interface on top of annotation tables. Agents describe what they want, for example: "Find all meeting minutes that discussed EKS upgrade last week."
The MCP Server generates the SQL, runs it, and returns results. This fits conversational Agent scenarios where users ask questions in natural language and the Agent needs to search across its historical outputs.
This article explored three distinct ways S3 Annotations can be applied in Agent scenarios, each with a different focus:
Scenario 1, lifecycle binding: processing results (transcripts, summaries) stay attached to the source file. No separate mapping, no orphaned files, no manual cleanup on deletion or migration.
Scenario 2, structured multimodal retrieval: by outputting structured JSON during processing or generation, Agents gain searchable metadata for free, enabling precise retrieval across recordings, PDFs, images, and presentations without a dedicated vectorization pipeline. Semantic vector search is no longer the only option for multimodal file discovery.
Scenario 3, self-contained archival: offloaded context lives on the trace object itself, making long-term archives independent of external file paths.
The three scenarios address different needs, but share one insight: Agents already produce structured context as part of their work, and S3 Annotations give that context a place to live that's natively bound to the data, lifecycle-managed, and queryable. If you're building on S3 as your Agent's persistence layer, it's worth evaluating which outputs could benefit from this pattern.
Two behaviors worth noting in specific configurations (source):
Multipart upload copies (objects > ~8 MB): when the AWS CLI or SDK uses Transfer Manager to copy large objects, annotations are not copied by default. Specify --copy-props all in the CLI or the equivalent SDK configuration to include them.
Versioned buckets: a simple DELETE (without specifying a version ID) creates a delete marker, but annotations on the underlying version remain intact. This doesn't break the lifecycle binding described in Scenario 1; it simply means that in versioned buckets, you need to specify the version ID when deleting to fully remove both the object and its annotations.