Lago Open-source SDK: Bill on top of your LLM token cost with no middleware

wpnews.pro

[ARTICLE · art-14738] src=github.com ↗ pub=2026-05-27T00:17Z topic=large-language-models verified=true sentiment=↑ positive

Lago released an open-source SDK that wraps existing LLM clients to automatically extract token usage data and send it to Lago's billing platform without requiring middleware or API changes. The SDK supports AWS Bedrock and Mistral providers with p99 overhead under 5 milliseconds, buffering usage events in memory and flushing them in batches while surviving provider or Lago outages through exponential backoff. The tool enables developers to bill customers based on LLM token consumption by attaching subscription IDs per call, per context, or as a default fallback.

3 min read · 10 views · published May 27, 2026

Instrument LLM clients and emit usage events to Lago for billing.

                  ┌──────────────┐
your code ──────► │ wrapped client│ ──► provider (Bedrock / Mistral / …)
                  └──────┬───────┘
                         │ (extract usage)
                         ▼
                  ┌──────────────┐
                  │  Lago events │ ──► api.getlago.com
                  └──────────────┘

- Wraps your existing LLM client in place: no API surface change for your application code.
- Extracts usage from each response into a normalized shape (CanonicalUsage).
- Buffers events in memory and flushes them in batches to Lago's /events/batch endpoint.
- Survives provider/Lago outages with exponential backoff and a bounded buffer.
- p99 wrap-overhead under 5 ms: your call is never blocked on Lago.
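The buffering behaviour in the list above can be illustrated with a minimal sketch. This is not the SDK's implementation: the class name `BoundedEventBuffer`, its parameters, and the drop-oldest policy are all assumptions made for illustration, and the `send` callable stands in for whatever posts a batch to Lago.

```python
import time
from collections import deque
from typing import Callable


class BoundedEventBuffer:
    """Hypothetical in-memory buffer; when full, the oldest event is dropped."""

    def __init__(self, max_events: int = 1000, batch_size: int = 100) -> None:
        # deque(maxlen=...) keeps memory bounded during a long outage.
        self._events: deque = deque(maxlen=max_events)
        self._batch_size = batch_size

    def add(self, event: dict) -> None:
        self._events.append(event)

    def __len__(self) -> int:
        return len(self._events)

    def flush(
        self,
        send: Callable[[list], None],
        max_attempts: int = 5,
        base_delay: float = 0.1,
        sleep: Callable[[float], None] = time.sleep,
    ) -> int:
        """Send buffered events in batches; return how many were delivered."""
        delivered = 0
        while self._events:
            size = min(self._batch_size, len(self._events))
            batch = [self._events.popleft() for _ in range(size)]
            for attempt in range(max_attempts):
                try:
                    send(batch)
                    delivered += len(batch)
                    break
                except Exception:
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** attempt)
            else:
                # Every attempt failed: put the batch back and stop for now.
                self._events.extendleft(reversed(batch))
                break
        return delivered
```

A production version would typically run `flush` on a background thread and add jitter to the delay; both are left out here to keep the sketch deterministic.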

pip install lago-agent-sdk

For Bedrock support: pip install 'lago-agent-sdk[bedrock]' (adds boto3).

For Mistral support: pip install 'lago-agent-sdk[mistral]' (adds mistralai).

import boto3
from lago_agent_sdk import LagoSDK

sdk = LagoSDK(
    api_key="<YOUR_LAGO_API_KEY>",
    api_url="https://api.getlago.com/api/v1/",
    default_subscription_id="sub_acme",
)
client = sdk.wrap(boto3.client("bedrock-runtime", region_name="eu-west-1"))

resp = client.converse(
    modelId="eu.amazon.nova-lite-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
sdk.flush()

The wrapped client behaves identically to the original — same arguments, same return shape, same exceptions. The SDK adds an in-memory queue that batches events to Lago in the background.
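One way such a transparent wrapper could work is a proxy object that forwards every attribute to the real client and only intercepts the calls it wants to meter. The sketch below is an assumption about the general technique, not the SDK's code: `UsageRecordingProxy` and `FakeBedrock` are hypothetical names, and the fake client only mimics the `usage` block (`inputTokens`, `outputTokens`, `totalTokens`) that Bedrock's Converse API returns.

```python
from typing import Any, Callable


class UsageRecordingProxy:
    """Hypothetical proxy: same surface as the wrapped client, plus usage capture."""

    def __init__(self, client: Any, record: Callable[[dict], None]) -> None:
        self._client = client
        self._record = record

    def __getattr__(self, name: str) -> Any:
        # Called only for attributes the proxy does not define itself,
        # so every other method and property passes straight through.
        return getattr(self._client, name)

    def converse(self, **kwargs: Any) -> Any:
        # Provider exceptions propagate to the caller untouched.
        response = self._client.converse(**kwargs)
        try:
            usage = response.get("usage", {})
            self._record({
                "input": usage.get("inputTokens", 0),
                "output": usage.get("outputTokens", 0),
            })
        except Exception:
            pass  # instrumentation must never change the caller's result
        return response


class FakeBedrock:
    """Stand-in client for the example; not a real boto3 client."""

    region = "eu-west-1"

    def converse(self, **kwargs: Any) -> dict:
        return {"usage": {"inputTokens": 3, "outputTokens": 5, "totalTokens": 8}}


recorded: list = []
proxy = UsageRecordingProxy(FakeBedrock(), recorded.append)
response = proxy.converse(modelId="example-model", messages=[])
print(recorded)
```

The response object is returned as-is, which is what keeps the return shape identical for application code.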

from mistralai.client import Mistral
from lago_agent_sdk import LagoSDK

sdk = LagoSDK(api_key="...", default_subscription_id="sub_acme")
client = sdk.wrap(Mistral(api_key="..."))

resp = client.chat.complete(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "Hello"}],
)
sdk.flush()

Three ways to set the external_subscription_id, in priority order:

# 1. Per call
client.converse(..., extra_lago={"subscription": "sub_acme", "dimensions": {"feature": "summarize"}})

# 2. Per context
sdk.set_subscription("sub_acme")

# 3. Default fallback
sdk = LagoSDK(api_key="...", default_subscription_id="sub_default")

The per-context setting is backed by contextvars for safe propagation across asyncio tasks.
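The priority order and the role of contextvars can be shown with a small standalone sketch. The function names `set_subscription` and `resolve_subscription` below are illustrative module-level helpers, not the SDK's internals; the point is only that each asyncio task gets its own copy of the context, so concurrent requests do not overwrite each other's subscription.

```python
import asyncio
import contextvars
from typing import Optional

# One context variable holds the "current" subscription for the running task.
_current_subscription = contextvars.ContextVar("lago_subscription", default=None)


def set_subscription(subscription_id: str) -> None:
    _current_subscription.set(subscription_id)


def resolve_subscription(per_call: Optional[str], default: Optional[str]) -> Optional[str]:
    # Priority: explicit per-call value, then context, then the default.
    return per_call or _current_subscription.get() or default


async def handle_request(subscription_id: str) -> Optional[str]:
    set_subscription(subscription_id)
    await asyncio.sleep(0)  # yield so the two tasks interleave
    return resolve_subscription(per_call=None, default="sub_default")


async def main() -> list:
    # gather() runs each coroutine as its own task with a copied context.
    return await asyncio.gather(handle_request("sub_a"), handle_request("sub_b"))


results = asyncio.run(main())
print(results)
```

A plain module-level global in place of the context variable would let the second task overwrite the first task's value before it is read.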

Provider	Access	Status
AWS Bedrock	`Converse` (sync + stream)	✓
AWS Bedrock	`InvokeModel` (sync + stream), 7 model families	✓
Mistral	native SDK (`chat.complete` + `chat.stream`)	✓
OpenAI	native SDK	Phase 2
Anthropic	native SDK	Phase 2
Google Gemini	native SDK	Phase 2
LiteLLM	callback bridge	Phase 4

CanonicalUsage carries 10 numeric fields. Which ones populate depends on the provider:

Field	Lago metric code	Bedrock	Mistral native
input	`llm_input_tokens`	✓	✓
output	`llm_output_tokens`	✓	✓
cache_read	`llm_cached_input_tokens`	✓ (Anthropic)	✓ (when cache hits)
cache_write	`llm_cache_creation_tokens`	✓ (Anthropic)	✗
cache_write_5m / 1h	`llm_cache_write_5m/1h_tokens`	✓ (Anthropic InvokeModel)	✗
reasoning	`llm_reasoning_tokens`	✗ (folded into output)	✗ (folded into output)
tool_calls	`llm_tool_calls`	✓	✓
image_input / audio_input	`llm_image/audio_input_tokens`	✗	✗

Reasoning, image, and audio fields will populate when Phase 2 native OpenAI ships.
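To make the field-to-metric mapping concrete, here is a minimal sketch of how a normalized usage record could be turned into Lago events. Several things are assumptions: the dataclass `UsageSketch` is a stand-in for `CanonicalUsage` (whose real definition is not shown in this article), the abbreviated codes in the table are expanded by reading `5m/1h` and `image/audio` as two codes each, emitting one event per non-zero field is a guess, and the property name `value` is hypothetical. The top-level event keys (`transaction_id`, `external_subscription_id`, `code`, `timestamp`, `properties`) follow Lago's event API.

```python
import time
import uuid
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class UsageSketch:
    """Illustrative stand-in for CanonicalUsage: ten numeric fields."""
    input: int = 0
    output: int = 0
    cache_read: int = 0
    cache_write: int = 0
    cache_write_5m: int = 0
    cache_write_1h: int = 0
    reasoning: int = 0
    tool_calls: int = 0
    image_input: int = 0
    audio_input: int = 0


# Field name -> metric code, expanded from the table above (assumed spelling).
METRIC_CODES = {
    "input": "llm_input_tokens",
    "output": "llm_output_tokens",
    "cache_read": "llm_cached_input_tokens",
    "cache_write": "llm_cache_creation_tokens",
    "cache_write_5m": "llm_cache_write_5m_tokens",
    "cache_write_1h": "llm_cache_write_1h_tokens",
    "reasoning": "llm_reasoning_tokens",
    "tool_calls": "llm_tool_calls",
    "image_input": "llm_image_input_tokens",
    "audio_input": "llm_audio_input_tokens",
}


def to_events(usage: UsageSketch, subscription_id: str, now: Optional[float] = None) -> list:
    """Build one event per non-zero field (an assumed strategy)."""
    timestamp = int(now if now is not None else time.time())
    events = []
    for field in fields(usage):
        amount = getattr(usage, field.name)
        if amount:
            events.append({
                "transaction_id": str(uuid.uuid4()),  # unique id per event
                "external_subscription_id": subscription_id,
                "code": METRIC_CODES[field.name],
                "timestamp": timestamp,
                "properties": {"value": amount},  # hypothetical property name
            })
    return events


events = to_events(UsageSketch(input=12, output=30), "sub_acme", now=1_700_000_000)
print([event["code"] for event in events])
```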

The SDK never breaks your LLM call. If anything in instrumentation fails (adapter bug, Lago down, network error), the SDK swallows it, logs a warning, and your call returns normally.

Configurable via the LagoConfig.on_error callback to integrate with Sentry, Datadog, etc.:

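import sentry_sdk
from lago_agent_sdk import LagoConfig, LagoSDK

def on_error(exc: Exception, where: str) -> None:
    # Forward instrumentation failures to Sentry, tagged with the SDK phase that failed.
    sentry_sdk.capture_exception(exc, tags={"sdk_phase": where})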

sdk = LagoSDK(
    api_key="...",
    config=LagoConfig(api_key="...", on_error=on_error),
)
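The fail-open behaviour described in this section can be sketched as follows. This is an illustration of the pattern, not the SDK's code: `run_instrumentation` and `instrumented_call` are hypothetical helpers, and the `where` label `"extract"` is an invented example value.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("lago_sketch")

ErrorCallback = Callable[[Exception, str], None]


def run_instrumentation(
    step: Callable[[], None], where: str, on_error: Optional[ErrorCallback] = None
) -> None:
    """Run one instrumentation step; swallow and report any failure."""
    try:
        step()
    except Exception as exc:
        logger.warning("instrumentation failed in %s: %s", where, exc)
        if on_error is not None:
            try:
                on_error(exc, where)
            except Exception:
                # A broken callback must not break the LLM call either.
                logger.warning("on_error callback failed", exc_info=True)


def instrumented_call(
    call: Callable[[], Any],
    extract: Callable[[Any], None],
    on_error: Optional[ErrorCallback] = None,
) -> Any:
    response = call()  # provider errors still reach the caller unchanged
    run_instrumentation(lambda: extract(response), "extract", on_error)
    return response


def broken_extract(response: Any) -> None:
    raise ValueError("adapter bug")


seen: list = []
result = instrumented_call(
    lambda: {"ok": True},
    broken_extract,
    on_error=lambda exc, where: seen.append((type(exc).__name__, where)),
)
print(result, seen)
```

Only the instrumentation step is guarded; the provider call sits outside the `try`, so its exceptions reach the caller as they would without the wrapper.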

The SDK ships with default metric codes (llm_input_tokens, llm_output_tokens, etc.). You need to register matching billable metrics in your Lago tenant before events count toward charges. See the Lago docs on Billable Metrics.
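As a rough sketch of what registering those metrics might involve, the snippet below builds request bodies in the shape Lago's billable-metrics endpoint expects (a `billable_metric` object with `name`, `code`, `aggregation_type`, and `field_name`). It only constructs the payloads and sends nothing. The choice of `sum_agg` and the property name `value` are assumptions: check which event property the SDK actually writes before creating real metrics.

```python
import json

# Two of the default codes named above; extend the list as needed.
DEFAULT_CODES = ["llm_input_tokens", "llm_output_tokens"]


def billable_metric_payload(code: str, field_name: str = "value") -> dict:
    """Build a request body for creating one billable metric.

    field_name is hypothetical: it must match the event property
    that carries the amount.
    """
    return {
        "billable_metric": {
            "name": code.replace("_", " "),
            "code": code,
            "aggregation_type": "sum_agg",  # assumed: sum the amounts per period
            "field_name": field_name,
        }
    }


payloads = [billable_metric_payload(code) for code in DEFAULT_CODES]
print(json.dumps(payloads[0], indent=2))
```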

git clone https://github.com/getlago/lago-agent-sdk-python
cd lago-agent-sdk-python
python -m venv venv && source venv/bin/activate
pip install -e '.[dev]'
pytest

Run live integration tests (requires real credentials):

AWS_BEARER_TOKEN_BEDROCK="..." \
MISTRAL_API_KEY="..." \
LAGO_API_URL="https://api.getlago.com/api/v1/" \
LAGO_API_KEY="..." \
LAGO_EXTERNAL_SUBSCRIPTION_ID="sub_..." \
pytest tests/integration

Found a vulnerability? See SECURITY.md.

Read original on github.com → github.com/getlago/lago-agent-sdk-python
