Provisioning a Bedrock RAG knowledge base with S3 Vectors, without the hallucinated API calls.
If you've asked an AI coding agent to set up AWS, you've seen it confidently invent a parameter, reach for a deprecated service, or burn ten minutes retrying against a service it never saw in training. The failure mode that bites hardest is the silent one: the agent thinks it succeeded, and you find out an hour later.
I hit two of these while standing up the retrieval layer for a LangGraph support bot, an Amazon Bedrock Knowledge Base backed by Amazon S3 Vectors. I'd love to say I caught both with deep AWS expertise. I caught them because the Agent Toolkit for AWS read the docs I hadn't. Both would have shipped, and neither did.
The goal: take a folder of markdown product docs and make them queryable by meaning, so an agent can answer "is this safe for color-treated hair?" from the real docs instead of guessing. Think of it as giving the agent a library it can search instead of making things up. That's the retrieval half of RAG, the foundation a LangGraph agent will later call as a tool.
Four moving parts, wrapped in one managed service:
[…] retrieve call.
To follow along, you need an AWS account, a non-root IAM identity with credentials configured locally, uv installed, and the toolkit installed in your agent. The fastest path across Kiro, Claude Code, Cursor, and Codex is the AWS CLI installer, aws configure agent-toolkit; in Kiro you can instead add the […] .kiro/settings/mcp.json (pin the mcp-proxy-for-aws […] npx skills add aws/agent-toolkit-for-aws/skills. The toolkit plugs into the agent you already use and loads task-specific […] amazon-bedrock skill, which carries the validated, current procedure for building a Knowledge Base. That word, "current," is the whole story.
My first instinct, straight from an older tutorial, was anthropic.claude-3-5-sonnet-20240620-v1:0. Calling it returned:
ResourceNotFoundException: This model version has reached the end of its life.
The fix the toolkit's doc search surfaced: current Anthropic models on Bedrock are inference-profile only. You invoke them through a cross-region profile id like us.anthropic.claude-sonnet-4-5-20250929-v1:0, not the bare on-demand id.
On its own, an agent might not even diagnose this correctly. "Not found" reads like a permissions or region problem, so it could swap in another stale id and hit "on-demand throughput isn't supported" instead, flailing sideways. The toolkit got it right because it read the current model docs, not because it happened to remember them.
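The difference between the two ids is only the geography prefix, which a small helper can make explicit. This is a minimal sketch: the function name to_profile_id is hypothetical, and the "us" prefix is taken from the example id above. Whether a profile actually exists for a given model in your Region is something to confirm against the current docs rather than assume.

```shell
#!/bin/sh
# Hypothetical helper: turn a bare Bedrock model id into a cross-region
# inference profile id by adding a geography prefix (the article uses "us").
# Ids that already carry the prefix are returned unchanged.
to_profile_id() {
  geo="$1"
  model_id="$2"
  case "$model_id" in
    "$geo".*) printf '%s\n' "$model_id" ;;
    *)        printf '%s.%s\n' "$geo" "$model_id" ;;
  esac
}

to_profile_id us anthropic.claude-sonnet-4-5-20250929-v1:0
```

To see which profiles are available before hard-coding one, aws bedrock list-inference-profiles --region us-east-2 lists them for the Region.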
I created the vector bucket, pointed the Knowledge Base at an index name, and assumed Bedrock would create the index. It didn't:
ValidationException: The specified index could not be found (S3Vectors 404)
The real requirement, from the S3 Vectors docs: you create the index yourself, and it must declare two non-filterable metadata keys that Bedrock uses to store chunk text and metadata. Miss them and ingestion fails later with a cryptic error far from the cause. The working command:
aws s3vectors create-index \
--vector-bucket-name <VECTOR_BUCKET> \
--index-name <INDEX_NAME> \
--data-type float32 --dimension 1024 --distance-metric cosine \
--metadata-configuration '{"nonFilterableMetadataKeys":["AMAZON_BEDROCK_TEXT","AMAZON_BEDROCK_METADATA"]}' \
--region us-east-2
This is the one that best captures why current docs matter. S3 Vectors launched in 2025, so the requirement isn't in most models' training data. A toolkit-less agent would most likely create the index, think it succeeded, and only hit the wall at ingestion time, then burn an afternoon recreating it with the wrong config. The dimension (1024) and distance metric here aren't arbitrary either: they have to match the Titan embedding model, which is the kind of cross-resource constraint an agent gets wrong when it's guessing.
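The dimension constraint is easy to check before creating the index. The sketch below assumes the embedding model is Titan Text Embeddings V2 (the article only says "Titan"; 1024 is that model's default output size), and the function names are hypothetical.

```shell
#!/bin/sh
# Output dimensions for Titan text embedding models. Assumption: the article
# does not name the exact model; 1024 is the default for Titan Text
# Embeddings V2, which also supports 512 and 256.
supported_dimensions() {
  case "$1" in
    amazon.titan-embed-text-v2:0) echo "1024 512 256" ;;
    amazon.titan-embed-text-v1)   echo "1536" ;;
    *) return 1 ;;
  esac
}

# Succeeds only if the index dimension is one the model can emit.
dimension_matches() {
  model_id="$1"
  dimension="$2"
  dims=$(supported_dimensions "$model_id") || return 1
  for d in $dims; do
    [ "$d" = "$dimension" ] && return 0
  done
  return 1
}

if dimension_matches amazon.titan-embed-text-v2:0 1024; then
  echo "ok: index dimension matches the embedding model"
else
  echo "mismatch: recreate the index with a supported dimension" >&2
fi
```

Running a check like this before create-index is cheap. A mismatch found at ingestion time means recreating the index.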
With those two out of the way, the validated sequence ran clean: create the IAM service role (trust bedrock.amazonaws.com with confused-deputy conditions, so another customer can't trick the role into acting on their resources, plus least-privilege permissions to invoke Titan, read the bucket, and use the vector index), create the Knowledge Base, attach the S3 data source with fixed-size chunking (300 tokens, 20% overlap), and run ingestion. Result: 10/10 documents indexed, zero failures.
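The chunking settings in that sequence map onto a small JSON document. This is a sketch of how it could be generated. The helper names are hypothetical, and the overlap arithmetic is an approximation of what the service does, not a statement of its exact behavior.

```shell
#!/bin/sh
# Hypothetical helper: print the chunking JSON for a fixed-size strategy.
chunking_config() {
  max_tokens="$1"
  overlap_pct="$2"
  printf '{"chunkingConfiguration":{"chunkingStrategy":"FIXED_SIZE","fixedSizeChunkingConfiguration":{"maxTokens":%d,"overlapPercentage":%d}}}\n' "$max_tokens" "$overlap_pct"
}

# Rough number of tokens shared between neighbouring chunks:
# 20% of 300 is 60.
overlap_tokens() {
  echo $(( $1 * $2 / 100 ))
}

chunking_config 300 20
overlap_tokens 300 20
```

The JSON is the shape accepted by the --vector-ingestion-configuration option of aws bedrock-agent create-data-source. With these values, each 300-token chunk repeats roughly the last 60 tokens of the previous one, which keeps an answer that straddles a chunk boundary retrievable.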
The proof is a retrieval query:
aws bedrock-agent-runtime retrieve \
--knowledge-base-id <KB_ID> \
--retrieval-query '{"text":"Is the Curl Cream safe for color-treated hair?"}' \
--region us-east-2
Top hit came back at 0.86 similarity, on the exact product doc with the right answer. The library is stocked.
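Hand-writing the JSON for --retrieval-query works until a question contains a double quote. A minimal sketch of a wrapper that escapes the text first follows. The function name retrieval_query is hypothetical, and it handles only backslashes and double quotes, not control characters.

```shell
#!/bin/sh
# Hypothetical helper: wrap a question in the JSON shape that
# --retrieval-query expects, escaping backslashes and double quotes.
retrieval_query() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$escaped"
}

retrieval_query 'Is the Curl Cream safe for color-treated hair?'

# With credentials configured, the result could be passed straight to the CLI
# (<KB_ID> is the placeholder from the article). The --query expression
# prints only the top hit's score:
#   aws bedrock-agent-runtime retrieve \
#     --knowledge-base-id <KB_ID> \
#     --retrieval-query "$(retrieval_query 'Is the Curl Cream safe for color-treated hair?')" \
#     --query 'retrievalResults[0].score' \
#     --region us-east-2
```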
Strip away the demo and the toolkit changed two things: it handed the agent the validated setup order up front (no trial-and-error), and it caught two mistakes a model trained months ago wouldn't know, because it checks current docs and ships procedures AWS maintains. AWS reports developers see fewer iterations and errors with it; on this build, the two catches alone saved me an afternoon.
Two honest gaps. First, the toolkit's own rules recommend infrastructure-as-code over direct CLI, and I didn't follow that. I ran CLI calls and tracked them in a tagged manifest for teardown. It works, but CDK or CloudFormation would be the reproducible artifact a reader could clone. Second, I left the IAM role's trust policy scoped to knowledge-base/* instead of the specific KB id; tightening that aws:SourceArn is the obvious hardening step before this is anything but a demo.
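The hardening step could be sketched as a trust-policy generator that takes the KB id as an optional argument. The function name, account id, and KB id below are hypothetical placeholders. The policy shape follows the usual confused-deputy pattern of aws:SourceAccount plus aws:SourceArn, and is not copied from the article's own role.

```shell
#!/bin/sh
# Hypothetical helper: print a trust policy for the Knowledge Base service
# role. The third argument defaults to "*" (any KB in the account); pass a
# specific KB id to narrow aws:SourceArn.
kb_trust_policy() {
  account_id="$1"
  region="$2"
  kb_id="${3:-*}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "bedrock.amazonaws.com"},
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {"aws:SourceAccount": "${account_id}"},
      "ArnLike": {"aws:SourceArn": "arn:aws:bedrock:${region}:${account_id}:knowledge-base/${kb_id}"}
    }
  }]
}
EOF
}

# 123456789012 and KB12345678 are placeholder values, not real resources.
kb_trust_policy 123456789012 us-east-2 KB12345678
```

Once the Knowledge Base exists and its id is known, the narrowed document could be applied with aws iam update-assume-role-policy --role-name <ROLE_NAME> --policy-document "$(kb_trust_policy ...)", where <ROLE_NAME> stands for whatever the service role is called.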
This is the retrieval foundation, not the whole app. Two concrete next steps, and you could take either:
If you take one thing: the toolkit's real value isn't typing commands for you, it's making better decisions, grounded in current docs, on the things an AI agent gets wrong in ways you don't notice until an hour later.