{"slug": "stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai", "title": "Stratoclave: a tenant-aware credit gateway for Amazon Bedrock — now with OpenAI codex support", "summary": "Stratoclave, an open-source tenant-aware credit gateway for Amazon Bedrock, now supports OpenAI codex and GPT-5.x models through Bedrock's `bedrock-mantle` endpoint. The single FastAPI service running on ECS Fargate provides per-user credit tracking and RBAC controls across Anthropic Messages API and OpenAI Responses API routes, using DynamoDB for credit reservation and audit logging without requiring Postgres, Redis, or a SaaS control plane.", "body_md": "If you let a team share a single AWS account for Amazon Bedrock, you quickly run into questions Bedrock alone does not answer: who called which model, under whose budget, through which identity. **Stratoclave** is a small OSS gateway that puts those answers in front of Bedrock without dragging in Postgres, Redis, or a SaaS control plane.\n\nI originally wrote it for myself: I just wanted per-user credits in front of Bedrock for personal use of Claude Code. It grew into something that now also covers OpenAI codex / GPT-5.x via Bedrock's `bedrock-mantle` endpoint.\n\nRepo: [`littlemex/stratoclave`](https://github.com/littlemex/stratoclave) (Apache 2.0, alpha)\n\nStratoclave is a single FastAPI service on ECS Fargate that exposes two inference routes:\n\n| Route | Wire format | Backend |\n|---|---|---|\n| `POST /v1/messages` | Anthropic Messages API | `bedrock:Converse` in us-east-1 |\n| `POST /openai/v1/responses` | OpenAI Responses API | `bedrock-mantle` in us-east-2 / us-west-2 |\n\nBoth routes share the same DynamoDB-backed credit reservation, the same `messages:send` / `responses:send` RBAC scopes, the same audit log, and the same three identity paths (Cognito password, AWS SSO via Vouch-by-STS, long-lived `sk-stratoclave-*` keys).\n\nThe control plane is one AWS region (us-east-1) and one Fargate task. 
Bedrock for OpenAI is cross-region, but no second control-plane region is deployed.\n\nThe web console login screen redirects to the Cognito Hosted UI for password / SSO sign-in; CLI users instead run `stratoclave auth login` and then bring this tab into focus with `stratoclave ui open`.\n\nThe reason this exists. Every inference call atomically reserves `max_tokens + input_estimate` from the caller's budget with a conditional `UpdateItem`, invokes the upstream, then refunds the difference once the real token counts come back. `UsageLogs` always records the actual spend, not the reservation. Concurrent requests cannot race past the quota: the conditional write either commits or fails.\n\nThe pipeline lives in one file (`backend/mvp/_pipeline.py`) and is shared between both routes. The OpenAI Responses route applies an extra reasoning-effort multiplier (1× / 2× / 4× / 8× for `low` / `medium` / `high` / `xhigh`) on the upfront reservation because reasoning traces can inflate output by an order of magnitude. The minimum reservation is 8,192 tokens regardless of multiplier.\n\nPersonal usage history shows per-call token counts, model names, and credit spend drawn from the same `UsageLogs` table.\n\nThe single behaviour I am proudest of. The CLI signs an `sts:GetCallerIdentity` request locally with SigV4, the backend forwards the signed payload to STS verbatim, and the backend trusts only the `Arn` / `UserId` / `Account` that STS returns. No IdP refresh token ever touches the backend.\n\nThe pattern is the same one [HashiCorp Vault has used for a decade](https://developer.hashicorp.com/vault/docs/auth/aws) in its AWS `iam` auth method. Anything that populates `~/.aws/credentials` works the same way: `aws sso login`, `saml2aws`, Entra ID / Okta / ADFS SAML federation, even a regular IAM user with long-lived keys (default DENY unless explicitly allowed per trusted account). 
EC2 instance profiles are rejected by default because they cannot be attributed to a single human.\n\nA full backend compromise cannot pivot into the customer's IAM Identity Center or SAML IdP. The worst-case blast radius is bounded to Stratoclave's own resources: Bedrock overspend, DynamoDB tampering, impersonation within this deployment.\n\nThe trusted-accounts admin page is where AWS account IDs and `allowed_role_patterns` (fnmatch) are managed; this is the allowlist that gates SSO logins from outside accounts.\n\n`stratoclave codex -- \"...\"` (and `stratoclave claude -- \"...\"`)\n\nA wrapper subcommand that mints a 30-minute ephemeral `responses:send` (or `messages:send`) key, hands it to the child process via env, and revokes the key on exit:\n\n``` bash\n$ stratoclave codex -- \"Write a hello-world Python function\"\n[INFO] Launching codex via Stratoclave proxy (model=openai.gpt-5.4, key=sk-stratoclave-...)\n[INFO] Child process uses an ephemeral responses-only API key;\n       the Cognito bearer is not exported and the user's\n       ~/.codex/config.toml is untouched.\n```\n\nThe child gets a key scoped to exactly one route; the user's Cognito bearer never leaves the parent process. MCP servers and tool subprocesses started by codex cannot pivot back into the user's Stratoclave admin endpoints because the env they inherit doesn't carry the right credentials.\n\nThe same wrapper exists for Claude Code (`stratoclave claude`). 
They share the env-scrub list and the revoke-on-exit lifecycle through one Rust struct (`ChildLauncher`) so a fix to one applies to both.\n\n`/.well-known/stratoclave-config`\n\nOne unauthenticated discovery endpoint that drives the entire CLI bootstrap:\n\n``` bash\n$ stratoclave setup https://<your>.cloudfront.net\n$ stratoclave auth sso --profile your-aws-sso-profile     # or `auth login --email`\n$ stratoclave codex -- \"...\"\n```\n\nThe endpoint returns Cognito IDs, default model names, and OpenAI base path / supported regions when `CODEX_ENABLED=true`. Old CLI binaries hitting a new backend deserialize cleanly because every new field is `Optional`.\n\nAnything that speaks Anthropic Messages or OpenAI Responses with a custom `base_url` works. The CLI is just a quality-of-life wrapper. If you prefer using the OpenAI SDK directly:\n\n``` python\nimport openai\n\nclient = openai.OpenAI(\n    base_url=\"https://<your>.cloudfront.net/openai/v1\",\n    api_key=\"sk-stratoclave-xxxxxxxx...\",   # mint via web console or CLI\n)\nresp = client.responses.create(\n    model=\"openai.gpt-5.4\",\n    input=\"Hello\",\n)\nprint(resp.output_text)\n```\n\nSame for Anthropic:\n\n``` python\nfrom anthropic import Anthropic\n\nclient = Anthropic(\n    base_url=\"https://<your>.cloudfront.net\",\n    api_key=\"sk-stratoclave-xxxxxxxx...\",\n)\nprint(client.messages.create(\n    model=\"claude-opus-4-7\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n).content[0].text)\n```\n\nClaude Desktop's Cowork (Gateway mode) and Cline / Continue / Aider with `ANTHROPIC_BASE_URL` work the same way.\n\nThe personal API-keys page is where you manage long-lived `sk-stratoclave-*` keys yourself, the same key shape the wrapper subcommands mint ephemerally and revoke on exit.\n\nThe admin flow from dashboard to a provisioned tenant member, top to bottom.\n\nThe dashboard summarises tenants, users, recent activity, and credit 
consumption.\n\nThe new-user form collects email, role (`admin` / `team_lead` / `user`), tenant assignment, and an initial credit budget.\n\nThe user detail view shows assigned role, tenant, remaining vs total credit, and the user's own API keys.\n\nThe tenant detail view lists members, their credit balances, and the tenant-wide monthly cap.\n\nThe admin usage-logs page is the audit trail. Filter by `tenant_id`, `user_id`, and ISO-8601 `since` / `until`; the listing is backed by a PK Query when `tenant_id` is set, a GSI Query when `user_id` is set, and a Scan otherwise (truncated at 100 rows).\n\n`bedrock-mantle` calls for OpenAI are cross-region (us-east-2 / us-west-2).\n\n| Dimension | Stratoclave | LiteLLM Proxy |\n|---|---|---|\n| Providers | Amazon Bedrock (Claude family + OpenAI GPT-5.x) | 100+ (OpenAI, Anthropic, Bedrock, Vertex, Azure, Gemini, Ollama, …) |\n| State | DynamoDB only (serverless) | Postgres required, Redis recommended |\n| RBAC | admin / team_lead / user, tenant-scoped | Proxy / Internal User / Team, global / team / user / key / model budgets |\n| API keys | `sk-stratoclave-*`, scope narrowing, cap of 5 active, immediate revoke | Virtual keys with `expires / max_budget / rpm_limit / tpm_limit / models` |\n| SSO / STS | Built-in (Vouch by STS, covers `aws sso`, `saml2aws`, IAM users) | Enterprise tier (Okta / Entra ID / OIDC / SAML) |\n| Deploy | AWS CDK v2, Fargate from 256 CPU / 512 MiB | Docker / Helm / ECS / EKS / Cloud Run |\n| License | Apache 2.0 (everything OSS) | Dual license (MIT + Commercial); SSO / audit are commercial |\n| CLI integration | `stratoclave claude --` / `stratoclave codex --` ephemeral wrappers | `ANTHROPIC_BASE_URL` / `OPENAI_BASE_URL` env override |\n\nYou're an AWS-native team that already has IAM Identity Center / `saml2aws`, you only call Bedrock, and you do not want to run an RDBMS for a proxy. 
You want per-tenant credit, per-user override, an audit trail, and the option to mint short-lived keys for CI without touching Postgres.\n\nYou're not? Pick LiteLLM. Stratoclave is opinionated and small on purpose.\n\n``` bash\n# Deploy to your AWS account\ngit clone https://github.com/littlemex/stratoclave.git\ncd stratoclave\nexport AWS_PROFILE=your-admin-profile\nexport AWS_REGION=us-east-1 CDK_DEFAULT_REGION=us-east-1\nexport CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query Account --output text)\ncd iac && npm install && ./scripts/deploy-all.sh\n\n# Build the CLI (pre-built binaries TBD)\ncd ../cli && cargo build --release\nexport PATH=\"$PWD/target/release:$PATH\"\n\n# Bootstrap and use\nstratoclave setup https://<your>.cloudfront.net\nstratoclave auth sso --profile <your-aws-sso-profile>\nstratoclave codex -- \"Hello, who are you?\"\nstratoclave claude -- \"Summarise this repository\"\n```\n\nAlpha. Public HTTP surfaces, DynamoDB schemas, and CDK construct props may change without notice until `v0.1.0` is cut. 
Issues and pull requests welcome.\n\nIf you read this far and the tradeoffs match your situation, I would be very glad to hear how the deploy goes.", "url": "https://wpnews.pro/news/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai", "canonical_source": "https://dev.to/littlemex63454/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai-codex-support-266", "published_at": "2026-06-03 11:21:13+00:00", "updated_at": "2026-06-03 11:43:07.590866+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-tools", "ai-products", "artificial-intelligence", "large-language-models"], "entities": ["Stratoclave", "Amazon Bedrock", "OpenAI", "FastAPI", "ECS Fargate", "DynamoDB", "Cognito", "AWS SSO"], "alternates": {"html": "https://wpnews.pro/news/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai", "markdown": "https://wpnews.pro/news/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai.md", "text": "https://wpnews.pro/news/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai.txt", "jsonld": "https://wpnews.pro/news/stratoclave-a-tenant-aware-credit-gateway-for-amazon-bedrock-now-with-openai.jsonld"}}