{"slug": "amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in", "title": "Amazon Bedrock AgentCore harness is now generally available: Go from idea to production-grade agent in minutes", "summary": "Amazon Bedrock AgentCore harness is now generally available, enabling developers to build production-grade AI agents in minutes with two API calls. The managed abstraction handles sandboxed compute, memory, identity, and observability, eliminating the need to manually wire up infrastructure. The harness supports any model served on Amazon Bedrock and allows model switching mid-session without losing context.", "body_md": "[Artificial Intelligence](https://aws.amazon.com/blogs/machine-learning/)\n\n# Amazon Bedrock AgentCore harness is now generally available: Go from idea to production-grade agent in minutes\n\nA year ago, Simon Willison wrote one of the cleanest definitions of an agent that has stuck around:\n\nAn LLM agent runs tools in a loop to achieve a goal.\n\nThat definition stuck because it describes what every production agent actually does. Kiro, Amazon Q Developer, Quick Agents, Codex, Claude Code: under the hood, they all run the same shape. The agent loop is the common denominator.\n\nBut the loop was never the hard part. The hard part was everything around it.\n\nPick a framework. Wire up tools. Provision sandboxed compute. Configure storage, secrets, networking. Decide where memory lives. Bolt on observability. Get the right dependencies into the right container. Also, local prototyping tends to be the easy part: a single developer can stand up an agent on their laptop in an afternoon. **Getting it into production is where the work explodes**, and the moment it has to serve more than one user, a whole new layer of work shows up: concurrency, isolation, identity, state, scaling.\n\nWorse, that overhead multiplied with every new use case. 
Teams that wanted to experiment, try a different model, swap a tool, point the agent at a new domain, found themselves repeating the same plumbing. The bottleneck wasn’t intelligence. It was orchestration and infrastructure.\n\nWhen we launched the **AgentCore harness** in preview in April, we made a bet: the AgentCore primitives (Runtime, Memory, Gateway, Browser, Identity, Observability) already give teams everything they need to run agents in production; what they shouldn’t have to do is wire them up by hand every time. The harness handles that wiring as a managed abstraction, so it becomes something you configure rather than something you build.\n\nToday, **Amazon Bedrock AgentCore harness is generally available**. Two API calls ([CreateHarness](https://docs.aws.amazon.com/bedrock-agentcore-control/latest/APIReference/API_CreateHarness.html) to define an agent, [InvokeHarness](https://docs.aws.amazon.com/bedrock-agentcore/latest/APIReference/API_InvokeHarness.html) to run it), a quick walkthrough in the AgentCore CLI (as shown in the below gif), or a few clicks in the console, and you have an agent running in minutes. It runs in its own isolated environment with a filesystem and shell, so it can read files, run commands, and write code safely. It remembers users and conversations across sessions, picks up skills you point it at (including the AWS-curated catalog), browses the web, calls your tools through gateway or MCP, and switches model providers mid-session without losing context. Every step streams back to you in real time and is automatically traced to CloudWatch. There’s no need to write orchestration code or build a container, except if you want to.\n\n## What the harness offers you\n\nA harness is everything an agent needs to run in production, wrapped behind two API calls. You point to the model, tools, skills, and instructions you want. 
AgentCore handles the sandboxed environment, the memory, the storage, the identity, and the observability that ties it all together. Capabilities new at GA are marked with * in the diagram below.\n\n## Any model: Use the right model for the job, switch when you need to\n\nDifferent tasks need different models. Customers told us they want to plan with one model and execute with another, swap a provider for a price-performance test, or move off a model that just shipped a regression, all without losing the conversation. Pick a default model on `CreateHarness`, then override it on any single `InvokeHarness` call when you need to. The default stays in place for every other invocation. Set the matching field on `model` for the provider you want:\n\n- `bedrock` for any model served on Amazon Bedrock, including Anthropic Claude, Amazon Nova, Meta Llama, DeepSeek, Qwen, Kimi, MiniMax, Cohere, Mistral, and as of recently [OpenAI GPT-5.5 and GPT-5.4 on Bedrock](https://aws.amazon.com/blogs/aws/get-started-with-openai-gpt-5-5-gpt-5-4-models-and-codex-on-amazon-bedrock/)\n- `openAi` for direct access to OpenAI’s API (api.openai.com)\n- `gemini` for Google Gemini\n- `liteLlm` for any third-party provider supported by [LiteLLM](https://docs.litellm.ai), including Anthropic direct, Cohere, Mistral, Vertex, Azure OpenAI, and others\n\nAnd the part that customers told us mattered most: **switch providers at any point, even mid-session, and keep context.** For example, you can use Claude Opus to plan, switch to GPT-5.5 to write code, switch to Gemini to summarize. The conversation continues. The harness handles the transition seamlessly.\n\nIf you’re using API keys to access any of the underlying model providers, they’re stored securely in **AgentCore Identity’s token vault**. 
The agent never sees raw credentials.\n\n## Tools as config: Connect your agent to the world without writing glue code\n\nTools are how the agent affects anything outside its own reasoning, and wiring them is the part most teams quietly hate. Customers told us they don’t want to write per-API adapter code, manage MCP server lifecycles, or build their own browser sandbox. They want to declare what the agent can use and let the harness handle the connection, the auth, and the execution.\n\n`tools` on `CreateHarness` is a list. Each entry has a type and a config block, and the harness wires them in:\n\n- `agentcore_gateway`: you can reference an [AgentCore Gateway](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/gateway.html) by ARN. Every target the gateway exposes (OpenAPI, Smithy, Lambda, MCP) shows up as a tool, with IAM/JWT auth, per-tool authorization, and outbound credential brokering handled for you.\n- `remote_mcp`: you can connect directly to any MCP server by URL. Good when the server is already secured and you don’t need Gateway’s governance layer in front of it.\n- `agentcore_browser`: a full browser sandbox as a one-line reference. Click, type, navigate, screenshot.\n- `agentcore_code_interpreter`: sandboxed Python and Node execution, same one-line pattern.\n- `inline_function`: a tool schema the harness emits as a tool-use event in the stream and waits for you to respond on. Use it for human-in-the-loop approvals or for tools that have to run on your side.\n\nEvery session also gets built-in shell (run commands inside the microVM) and `file_operations` (read and write on the agent’s filesystem) without you listing them. They’re what make the stateful filesystem and shell story usable from the model.\n\nYou have the same options on `InvokeHarness` for per-call edits, where you can pass new tools to change the tool set for a single call, or strip the list down to a focused set for that invocation via the `allowed_tools` parameter. 
Defaults are set at create time, but you can easily override at invoke time.\n\n## Built-in memory: Your harness remembers users and conversations\n\nCustomers want their agent to recognize a returning user, pick up where the last conversation left off, and remember preferences without anyone replaying message history. In preview, you had to provision an AgentCore Memory resource separately and pass its ARN, which worked but was a second API call and an easy thing to forget on the way to production.\n\nAt GA, omitting memory on `CreateHarness` provisions a managed memory automatically, with sensible defaults: `SEMANTIC` + `SUMMARIZATION` strategies, 30-day event expiry, AWS-owned encryption, and multi-tenant isolation by default through namespace templates that key on `actorId`. It’s a real, customer-owned Memory resource, provisioned for you. Memory isn’t mandatory. If your agent is stateless, set `memory: { disabled: {} }` and the harness skips memory entirely. If you’d rather attach an AgentCore Memory resource you already own, pass `agentCoreMemoryConfiguration` with its ARN. Those three paths look like the following:\n\nSwitching to your own memory is one `UpdateHarness` call. Pass `agentCoreMemoryConfiguration` with your memory ARN and the previously managed memory disassociates immediately. It’s still a regular AgentCore Memory resource in your account, so you can keep using it anywhere, attach it to another harness, query it directly, or delete it on your own terms. When you delete the harness, the managed memory is cascade-deleted by default (`deleteManagedMemory: true`). Pass `deleteManagedMemory: false` if you want to keep it.\n\nThe managed memory is automatic but not opaque. 
It’s a real, addressable AWS resource you can query, attach to a different agent, audit, or hand to an analytics pipeline.\n\n## Skills: Give your agent the right expertise on the right task\n\nCustomers want their agent to know how to handle a specific task before it tries it. For example, how to format an Excel report, how to file a JIRA ticket the way their team files them, or how to follow AWS-recommended procedures for accessing their data on AWS. Skills are how you give the agent that knowledge on demand. They’re bundles of files, scripts, and instructions. The harness loads skill metadata and pulls full content into context only when the task actually calls for it.\n\nAt GA, [HarnessSkill](https://docs.aws.amazon.com/bedrock-agentcore-control/latest/APIReference/API_HarnessSkill.html) is a union with four sources, so you can attach skills declaratively without baking them into a container or shelling in:\n\n- `awsSkills` – turn on the AWS-curated skill bundle.\n- `git` – clone a public or private repo over HTTPS, pinned to a commit or a branch.\n- `s3` – pull a skill bundle from your own Amazon Simple Storage Service (Amazon S3) bucket.\n- `path` – reference a path that already exists in the container you brought in.\n\nThe same shape works on `InvokeHarness` for per-call layering. 
The harness materializes each skill onto the session filesystem on session start, or during a new invocation if the Skills configuration changes.\n\n**The big unlock for AWS builders:** the [AWS skills repository](https://github.com/aws/agent-toolkit-for-aws/tree/main/skills) ships curated skills covering the AWS surface area, from core skills (SDK usage, infrastructure as code (IaC), AWS Identity and Access Management (IAM), Amazon CloudWatch, and Amazon Bedrock) to service-specific deep workflows for analytics, databases, Amazon Elastic Compute Cloud (Amazon EC2), networking, security, serverless, and storage.\n\nTo make this even simpler, GA introduces a first-class `awsSkills` toggle: turn on the AWS skill bundle with zero plumbing, no URL, no network fetch (the skills are already present in the harness’s underlying runtime whenever you need them).\n\n## Environment and filesystem: Run your agent in the environment it needs\n\nMost agents run fine on the harness’s default environment, which includes Python and bash. When you need more (a private dependency, a runtime version, a CLI tool, or persistence across sessions), two knobs let you shape the agent’s runtime to match your stack: the *container image* and the *filesystem*.\n\n**Container image.** If Python and bash aren’t enough, you can package your source code, dependencies, runtimes, and tools into a custom container, push it to Amazon Elastic Container Registry (Amazon ECR), and reference it in `CreateHarness`. The agent then uses that exact environment. You can also pair it with [InvokeAgentRuntimeCommand](https://docs.aws.amazon.com/bedrock-agentcore/latest/APIReference/API_InvokeAgentRuntimeCommand.html), an API that runs a shell command directly inside the agent’s microVM session, for session-specific setup that varies per invocation (clone a particular branch, seed test data, or pull credentials). 
It’s deterministic, doesn’t go through the model, and doesn’t burn tokens.\n\n**Filesystem.** Agents often need files to outlive a single response: a shared knowledge base, a working directory across sessions, or a place to drop produced documents back into your bucket. The harness gives you three filesystem options, each with different reach and persistence characteristics.\n\n| Type | Managed | Virtual private cloud (VPC) required | Persistence |\n| --- | --- | --- | --- |\n| Managed session storage | Yes | No | Across stop/resume cycles of the same `runtimeSessionId`. |\n| Amazon Elastic File System (Amazon EFS) access point | BYO | Yes | Across all sessions, sharable across harnesses. |\n| Amazon Simple Storage Service (Amazon S3) Files access point | BYO | Yes | Across all sessions and harnesses, with full Amazon S3 durability, versioning, and history. |\n\nReach for **managed session storage** for working files that need to survive microVM restarts within a session. Reach for **EFS** when multiple harnesses or sessions need to share reference data, prompts, or skill bundles. Reach for **S3 Files** when you want the agent to read and write through standard file operations while changes are automatically synchronized with the backing S3 bucket (the agent writes a report, the report appears in your S3 bucket as it goes).\n\n## Unified observability: See what your agent did, in one place\n\nWhen something goes wrong, customers want to know in one place what the agent ran, what it called, where it slowed down, and where it failed. 
A typical harness invocation crosses runtime + memory + gateway + a built-in tool or two, and stitching that picture together used to mean opening five tabs.\n\nAt GA, every harness page in the AgentCore console shows **a single observability widget**: an aggregate row that summarizes the harness across every primitive it touched, plus per-primitive sections that appear only for the primitives the harness is configured with or has used.\n\nFor deeper analysis, **CloudWatch GenAI Observability** has a new **Harnesses tab** alongside Runtime and other primitives. Drill from a harness, into a session, into a single trace, and see exactly what the agent did, in what order, how long each step took, and where it failed. Logs from every primitive (memory, gateway, browser, code interpreter) surface inline at the right span, so you stop hopping between log groups to piece together what happened.\n\n## Evaluate and optimize: Keep improving your agent in production\n\nOnce your agent is in production, the question shifts from “does it work?” to “is it improving?” Customers want a way to score how their agent is actually doing on real traffic, get suggestions on what to change, and validate those changes before rolling them out. GA brings two pieces that close that loop:\n\n- [AgentCore Evaluations](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/evaluations.html) scores harness traces with built-in large language model (LLM)-as-a-judge evaluators (helpfulness, faithfulness, safety), or with custom evaluators you author. Run them online (scoring every session as it happens), on-demand for a single trace, in batch over historical traces, against a fixed test dataset, or as a simulation with synthetic users to stress-test before going live.\n- [AgentCore optimization](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/optimization.html) reads those evaluator scores and generates **prompt and tool-description recommendations**, then validates them by routing live traffic between two variants through AgentCore Gateway with online evaluation scoring per session and statistical significance reporting. Variants can be different versions of an optional **configuration bundle** on the same runtime, or different versions pointing at different endpoints, so you can A/B-test prompt and tool-description changes without redeploying code, just by pointing at a different endpoint.\n\nRun your harness, capture traces, get scores, get recommendations, A/B-test the recommended configuration against the current one, then ship the winner.\n\n## Version and roll back: Roll out changes safely, roll back instantly\n\nCustomers want to update prompts, swap a tool, or try a new model on a subset of traffic without putting the whole agent at risk. Versioning and endpoints on the harness mirror what AgentCore Runtime already offers: every `UpdateHarness` creates an **immutable version** capturing the full configuration (model, system prompt, tools, memory config, skills, environment, truncation, execution limits), and rollback is “point the endpoint at an earlier version.”\n\nThe `DEFAULT` endpoint auto-advances on every update. 
Named endpoints (`PROD`, `STAGING`) stay pinned until you explicitly promote.\n\n## Export to code: Graduate when configuration isn’t enough\n\nWhen a use case outgrows configuration (custom orchestration, multi-agent coordination, deep instrumentation), customers want to take the agent further without rebuilding it from scratch. One CLI command exports the harness as Strands-based code that you can host on AgentCore Runtime or anywhere else:\n\nThe exported project preserves your model, prompt, tools, memory wiring, skills, and container environment. Same compute path, same observability, same identity primitives. **The graduation is a config-to-code translation, not an architecture switch.**\n\nStrands is the first export target; Claude Agent SDK is coming soon, so customers who prefer that framework can graduate the same way.\n\nThis is the part of the harness story we care about most. When configuration stops being enough, you graduate to the same compute and the same primitives, with code you can read and modify, instead of starting over from scratch.\n\n## Other notable additions\n\nWe also added the following:\n\n**Step Functions integration.** A harness invocation is now a first-class state in [AWS Step Functions](https://docs.aws.amazon.com/step-functions/latest/dg/connect-bedrockagentcore.html). In Workflow Studio, search for **AgentCore InvokeHarness** and drag it into your workflow. Use **Quick Create Harness** to scaffold a new harness and execution role from inside Step Functions, or point at an existing harness and override per call. 
The same InvokeHarness semantics apply, with defaults at the harness and overrides at the Task state.\n\n**Web Search on AgentCore.** The new [Web Search on AgentCore](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/gateway-target-connector-web-search-tool.html) (also launched at NY Summit) is available to harness agents through AgentCore Gateway: expose Web Search as a Gateway target, reference the Gateway from the harness, and the agent has search. A first-party `agentcore_web_search` tool type is coming soon, matching the one-line pattern of `agentcore_browser` and `agentcore_code_interpreter`.\n\n## What you can do with all of this\n\nThere are countless use cases the harness can support, across industries and agent types. To give you a sense of the diversity, here are three concrete examples, each something teams told us they were piecing together by hand before.\n\n**A research and writing agent.** The agent could search the web, browse sources, draft a document, and hand you back a real xlsx or pptx file, with memory carrying across sessions so the next question doesn’t replay everything. The minimum to stand it up is one `CreateHarness` call:\n\n- `tools`: `agentcore_browser`, plus a Gateway target that exposes Web Search on AgentCore.\n- `skills`: a `git` source pointing at `anthropics/skills` for the document-skills bundle.\n\nMemory is on by default, so you don’t configure it explicitly. That’s it.\n\n**An AWS data and analytics agent for your team.** The agent could pull data from your AWS account (Amazon Athena, AWS Glue, Amazon S3, Amazon Redshift, Amazon CloudWatch), run an analysis, and hand back a summary, a chart, or a finding, while following AWS-recommended procedures for accessing each service step by step instead of improvising. 
The minimum to stand it up is one `CreateHarness` call:\n\n- `skills`: `[{\"awsSkills\": {}}]` to flip on the curated AWS catalog (analytics, database, Amazon EC2, networking, security, serverless, and storage).\n- `executionRoleArn`: an IAM role scoped to whatever AWS APIs you want the agent to read from.\n\nAdd `agentcore_code_interpreter` if you want the agent to also run Python in a sandbox to slice and visualize the data it pulls.\n\n**A coding agent.** The agent could read your code base, plan a change, write it, run the tests, and open a pull request (PR), with the ability to switch to a different model mid-session for design and implementation without losing context. The minimum to stand it up is two steps:\n\n- Push a custom container with your repo and toolchain to Amazon ECR.\n- Call `CreateHarness` with `environmentArtifact` pointing at that image, plus a Gateway target wired to GitHub (or your internal GitLab or Bitbucket equivalent) so the agent can interact with branches, PRs, and reviews.\n\nFor deterministic git operations like clone, commit, push, and open a PR (without paying the model to think through them), call `InvokeAgentRuntimeCommand` directly.\n\nThose are three different agents, with the exact same harness. The API configuration is the only thing that changes.\n\n## Pay only for what you use\n\nThere is no additional harness fee. You pay for the underlying capabilities based on actual consumption.\n\n- **Runtime compute** (where the harness session runs): active-consumption pricing per second, $0.0895 per vCPU-hour, $0.00945 per GB-hour. Agentic workloads spend significant time waiting on model and tool I/O. Runtime bills only when CPU is actually consumed.\n- **Browser** and **Code Interpreter**: same active-consumption model.\n- **Gateway**: per-1,000 invocations and per-1,000 search queries.\n- **Memory**: per-1,000 short-term events, per-1,000 long-term records per month, per-1,000 retrievals.\n- **Observability**: standard Amazon CloudWatch pricing for spans, logs, and metrics.\n- **Model inference**: charged by Amazon Bedrock or the third-party provider at their standard rates.\n\nEach is independent. Use one, use all. An agent that runs for 60 seconds and calls two tools costs accordingly. An agent that runs for an hour with heavy compute costs accordingly. **You pay proportionally to what your agent actually computes.**\n\nFor full pricing details, see [the AgentCore pricing page](https://aws.amazon.com/bedrock/agentcore/pricing/).\n\n## What some of our customers are excited about with harness\n\n**Omar Paul, VP of Product at Twilio** stated that *“Twilio’s customers are building AI agents that work across voice, messaging, and digital channels – with real-time intelligence and persistent memory that make every interaction feel like a conversation. By combining AgentCore harness with Twilio Conversations, developers can go from idea to live agent without rewiring infrastructure. The best customer experiences happen when great AI and great communications infrastructure are built together.”*\n\n**Dr. Lukas Schack, Principal Machine Learning Engineer at TUI GROUP** told us that *“Amazon Bedrock AgentCore has become a core building block at TUI: we use Runtime to host agents across frameworks and Memory to share context between them, in production and in workshops with over 500 employees, sometimes with more than 130 people building at the same time. 
With AgentCore harness what used to take weeks from idea to working product now takes minutes, and customer-facing use cases are next.”*\n\n**Rodrigo Moreira, VP of Engineering, VTEX** said *“We’re building AI agents that will revolutionize ecommerce. Previously, prototyping each new agent required days of orchestration code and infrastructure setup before we could validate an idea. AgentCore harness has changed that: swapping a model, adding a tool, replacing a skill, or refining an agent’s instructions is now a configuration change, not a rebuild. We can now validate agent ideas in minutes instead of days, and we’re looking forward to accelerating agent development further with these new capabilities.”*\n\n**Kazumi Matsuda, Senior Manager, AI Promotion Department at FUJISOFT** noted that *“At Fujisoft, we’re building AI agents to accelerate software development and operations across our teams. Our framework, Character Capsule, packages agent roles, skills, and execution procedures as reusable capsules that scale to multi-agent orchestration on AgentCore. With AgentCore harness, we deploy new agents in minutes and version each change. Once in production, evaluations scores how our agents perform using execution logs, and AgentCore’s optimization capabilities generate prompt and tool suggestions based on those scores. We A/B test those recommendations on live traffic before rolling out, so improvement is continuous, not guesswork. 
Together, these capabilities let us stand up new agents quickly and keep improving them with confidence, catching quality regressions before they reach production and rolling out only the changes we’ve validated across our multi-agent patterns.”*\n\n## Get started\n\nAmazon Bedrock AgentCore harness is available today in [all AWS Regions where AgentCore is generally available](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agentcore-regions.html).\n\nThe faster a team can get from idea to working agent, the more ideas they can afford to test. The harness collapses that loop from days to minutes. We’re excited to see what you build.\n\n### Additional resources\n\nFor more information, see the following:\n\n- [Get started](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/harness-get-started.html)\n- [GitHub examples](https://github.com/awslabs/agentcore-samples/tree/main/01-features/01-harness)\n- Documentation", "url": "https://wpnews.pro/news/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in", "canonical_source": "https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-production-grade-agent-in-minutes/", "published_at": "2026-06-18 17:32:22+00:00", "updated_at": "2026-06-18 18:00:20.444235+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-tools", "large-language-models", "ai-products"], "entities": ["Amazon Bedrock", "AgentCore", "Simon Willison", "Kiro", "Amazon Q Developer", "Codex", "Claude Code", "CloudWatch"], "alternates": {"html": "https://wpnews.pro/news/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in", "markdown": "https://wpnews.pro/news/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in.md", "text": "https://wpnews.pro/news/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in.txt", "jsonld": 
"https://wpnews.pro/news/amazon-bedrock-agentcore-harness-is-now-generally-available-go-from-idea-to-in.jsonld"}}