AWS published its June 15, 2026 weekly roundup recapping recent infrastructure and AI tooling moves. The post, authored by Esra Kayabali, highlights new compute capacity, model availability on Bedrock, agentic tooling for cost and observability, and a maintenance mode shift for legacy CLI users. It also surfaces internal Amazon productivity data from agent adoption experiments.
Graviton5-Powered EC2 M9g and M9gd Instances Reach General Availability
M9g and M9gd instances, powered by AWS Graviton5 processors on the sixth-generation Nitro System, are now generally available.
Key specs and performance data from the announcement:
- Up to 25% better overall compute performance vs. Graviton4-based instances.
- Up to 35% faster performance for web applications and machine learning inference.
- Up to 30% faster for databases.
- First AWS processor to support PCIe Gen6 and DDR5-8800 memory, with 5x larger L3 cache than prior generation.
- Up to 15% higher network bandwidth and 20% higher Amazon EBS bandwidth (average across sizes) vs. M8g instances.
- M9gd variant adds up to 11.4 TB of NVMe SSD local storage with 30% higher IOPS vs. M8gd.
- Introduces Nitro Isolation Engine using formal verification for mathematically proven isolation between VMs.
- Supports Instance Bandwidth Configuration (IBC) to adjust allocation between EBS and VPC networking by up to 25%.
Actions to take: Teams running general-purpose, web, ML inference, or database workloads should benchmark M9g/M9gd instances against current Graviton4 or x86 setups for potential cost/performance gains, especially where network or storage IOPS matter. Review Nitro System formal verification details for compliance-sensitive deployments.
Gemma 4 Family Now Available on Amazon Bedrock
Google DeepMind’s Gemma 4 models launched on Bedrock in three variants:
- Gemma 4 31B (dense architecture, 256K-token context window) — positioned for reasoning and coding workloads.
- Gemma 4 26B-A4B (mixture-of-experts) — targeted at cost- and latency-sensitive use cases.
- Gemma 4 E2B (smallest variant) — designed for low-latency interactive applications.
All variants support native function calling, structured output, reasoning, response streaming, multimodal input (text, image, video, audio), and more than 35 languages.
Actions to take: Developers and AI teams should test the 31B variant for complex reasoning/coding tasks and the MoE/smaller variants for production latency or cost optimization. Compare against current Bedrock models on specific workloads before broad rollout.
Anthropic Claude Fable 5 Launched Then Revoked on Bedrock
Claude Fable 5 became available on Bedrock June 9 with extended asynchronous task execution, advanced vision capabilities (diagrams, charts, PDFs), and proactive self-verification. Access requires opting into the Data Retention API (Anthropic mandates 30-day retention of inputs/outputs for Mythos-class models).
On June 12, Anthropic requested AWS revoke access to Claude Fable 5 and Claude Mythos 5 for all users to comply with a US Government export control directive. All other models, including Opus 4.8, remain unaffected.
Actions to take: Any teams that began using Fable 5 should immediately check access status and migrate workloads to unaffected models. Review Data Retention API opt-in requirements and Anthropic’s compliance statement for any Mythos-class usage. Monitor export control developments as a vendor risk factor.
AWS FinOps Agent Enters Preview
A new FinOps Agent is available in preview for practitioners and engineering teams. Capabilities include:
- Answering cost questions and generating reports for finance/engineering stakeholders.
- Surfacing rightsizing, idle resource, and Savings Plans recommendations pulled from AWS Cost Optimization Hub and Compute Optimizer.
- Automatically investigating cost anomaly root causes and posting findings to Slack channels.
- Running recurring FinOps workflows on user-defined schedules.
- Opening Jira tickets on behalf of users based on recommendations.
Available at no additional charge during preview (US East N. Virginia region noted in related materials).
Actions to take: FinOps and platform teams should enable the preview in supported regions, connect it to existing Cost Optimization Hub/Compute Optimizer data, and test automated anomaly investigation + Slack/Jira workflows. Use early results to quantify potential savings before general availability.
Additional Tooling and Platform Updates
Kiro Pro Max tier now available: Higher usage limits, access to latest frontier models, and expanded agentic capabilities for sustained high-volume coding, specification generation, and agent-driven development work.Amazon OpenSearch Service MCP Apps for agentic observability: AI agents in compatible IDEs (e.g., Claude Desktop, VS Code) can now investigate incidents using logs, traces, metrics, and alerts from OpenSearch domains/collections and Amazon Managed Service for Prometheus. Each tool call returns both a text summary for agent reasoning and an interactive visualization in the same thread. Tools cover log/metrics/trace investigation, service performance, topology, dynamic visualizations, agent/cluster health, and instrumentation scoring.AWS Workload Credentials Provider now available: Allows workloads outside AWS (third-party or on-premises) to obtain short-term credentials without long-term access keys, supporting least-privilege patterns.AWS CLI v1 enters maintenance mode: New releases limited to critical bug fixes and security issues. botocore and s3transfer dependencies are now vendored directly into the CLI v1 codebase (standalone package updates no longer affect it). Environments running both CLI v1 and boto3 will maintain separate library copies.
Actions to take: Development teams using Kiro should evaluate the Pro Max tier for higher-volume agentic workflows. Observability and SRE teams should test OpenSearch MCP Apps inside supported IDEs for faster incident investigation. Workloads running outside AWS should adopt the new Credentials Provider to eliminate long-term keys. All CLI v1 users should begin migration planning to v2 ahead of full maintenance restrictions.
Frontier Team Productivity Data Shared
The roundup also spotlights a detailed post from Swami on AI-native development experiments across hundreds of Amazon engineering teams. Key measured outcomes:
- A six-engineer team rebuilt the Amazon Bedrock inference engine in 76 days (originally scoped for 30 developers over 12–18 months).
- Median productivity gain of 4.5x in normalized deployment velocity across structured pilots with Amazon Stores teams; some teams exceeded 10x.
- Specific examples: Perfect Order Experience feature cycle reduced from two weeks to an afternoon; WW Grocery design document creation cut from five days to a few hours.
Five practices highlighted for teams adopting agents: invest in agent context (steering files, standards, repositories) before production code; expect and push through an initial slowdown; maintain a steady backlog of well-scoped tasks; make intent explicit via structured specifications; shift testing left for agent self-correction.
Actions to take: Engineering leaders should review the full Swami post and pilot the five practices on a contained project, tracking deployment velocity and cycle time before scaling. Treat the 4.5x–10x+ gains as directional benchmarks rather than guaranteed outcomes.
Source: https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-finops-agent-in-preview-gemma-4-on-bedrock-kiro-pro-max-and-more-june-15-2026/ Teams evaluating new compute capacity, Bedrock model options, or cost/observability automation should prioritize the GA instances, Gemma 4 variants, and FinOps Agent preview for near-term testing. The CLI v1 maintenance mode and Fable 5 revocation serve as near-term migration and risk signals.