# tracesage: See Inside Your LangGraph Agents

> Source: <https://dev.to/kjgpta/tracesage-see-inside-your-langgraph-agents-55ek>
> Published: 2026-06-16 14:08:03+00:00

*Open-source LangChain/LangGraph tracing — drop in two lines, watch your agents run live in your browser.*

If you've built anything non-trivial with **LangChain** or **LangGraph** — a multi-agent supervisor, a RAG pipeline, a tool-using ReAct loop — you know the feeling. It works on the happy path, then a real query comes in and… something goes wrong. But *what*?

Which agent actually ran? In what order?

Did the model call the tool you expected, or hallucinate a different one?

How many tokens did that one request burn?

Where did the error come from — your tool, the model, or the orchestration?

The usual answer is a wall of `print()` statements and `verbose=True` logs you scroll through at 2 a.m. There are great hosted tracing platforms, but they mean signing up, shipping your prompts to a third party, and wiring up an SDK.

I wanted something I could `pip install` and have running in ten seconds, entirely on my laptop. So I built **tracesage**.

**tracesage** is a local-first observability tool for LangChain & LangGraph agents. It hooks into LangChain's callback stream, captures every chain / tool / LLM / retriever event, stores it locally (SQLite + gzipped blobs), and renders it as an **interactive graph + timeline UI** in your browser — in real time.
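To make the "SQLite + gzipped blobs" idea concrete, here is a minimal standard-library sketch of that storage pattern. The table layout, the function names (`open_store`, `store_event`, `load_events`), and the payload shape are assumptions for illustration only, not tracesage's actual schema or API.

``` python
import gzip
import json
import sqlite3
import time


def open_store(path=":memory:"):
    """Create a tiny event store (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " run_id TEXT NOT NULL,"
        " kind TEXT NOT NULL,"
        " ts REAL NOT NULL,"
        " payload BLOB NOT NULL)"
    )
    return conn


def store_event(conn, run_id, kind, payload):
    """Serialize the payload to JSON, gzip it, and store it as a BLOB."""
    blob = gzip.compress(json.dumps(payload).encode("utf-8"))
    conn.execute(
        "INSERT INTO events (run_id, kind, ts, payload) VALUES (?, ?, ?, ?)",
        (run_id, kind, time.time(), blob),
    )
    conn.commit()


def load_events(conn, run_id):
    """Return (kind, payload) pairs for one run, in insertion order."""
    rows = conn.execute(
        "SELECT kind, payload FROM events WHERE run_id = ? ORDER BY id",
        (run_id,),
    ).fetchall()
    return [
        (kind, json.loads(gzip.decompress(blob).decode("utf-8")))
        for kind, blob in rows
    ]
```

Keeping the small, queryable fields (run id, kind, timestamp) as plain columns and compressing only the bulky payload is one reasonable way to keep lookups fast while prompts and tool outputs stay small on disk.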

🚀 **Two-line integration.** One callback added to your existing `invoke`/`ainvoke`.

🧰 **Zero infrastructure.** No Docker, no Postgres, no external service. Just `pip install`.

🔒 **Never crashes your app.** The callback handler is wrapped to never raise — tracing can fail, your agent keeps running.

🗺️ **MCP-aware.** Tools loaded from MCP servers are attributed back to their server, so you can see which tools came from where.

🧪 **Testable.** A `pytest` fixture lets you assert "did my agent call `search`?" in CI.

📦 **MIT licensed**, runs in a single Python process.
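The "never crashes your app" guarantee in the list above is a common pattern for instrumentation code. Here is a minimal sketch of one way to get it: a decorator that logs and swallows any exception from a callback. The `never_raise` decorator and `SafeHandler` class are hypothetical names for illustration; only the `on_tool_start` method name mirrors LangChain's callback hook, and this is not tracesage's actual implementation.

``` python
import functools
import logging

logger = logging.getLogger("tracing_sketch")


def never_raise(fn):
    """Wrap a callback so any exception is logged and swallowed."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:  # deliberately broad: tracing must not break the app
            logger.exception("tracing callback %s failed; ignoring", fn.__name__)
            return None
    return wrapper


class SafeHandler:
    """Hypothetical handler that records tool starts without ever raising."""

    def __init__(self):
        self.events = []

    @never_raise
    def on_tool_start(self, serialized, input_str, **kwargs):
        # A bug here (for example, serialized being None) stays contained.
        self.events.append((serialized["name"], input_str))
```

The trade-off is that a broken tracer fails silently apart from the log line, which is usually what you want from an observability tool sitting inside a production agent.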

Links:

**Examples gallery (30 before/after apps):** in the repo under `examples/showcase/`

Before we write any code, see what we're aiming for:

```
pip install "tracesage[langchain]"
tracesage demo            # seeds a sample trace and opens the UI
```

Your browser opens to `http://localhost:7842/ui` and you're looking at a live agent topology.

Let's build a tiny but real LangGraph agent and wire tracesage into it. You'll need Python 3.11+ and an LLM provider key (we'll use OpenAI; Anthropic or any LangChain model works identically — tracesage is provider-agnostic).

```
pip install "tracesage[langchain]" langgraph langchain-openai
export OPENAI_API_KEY=sk-...
```

`before.py` is a standard LangGraph ReAct agent with two tools:

``` python
import asyncio
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's 22°C and sunny in {city}."

@tool
def to_fahrenheit(celsius: float) -> float:
    """Convert a temperature in Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini", temperature=0),
    tools=[get_weather, to_fahrenheit],
)

async def main() -> None:
    result = await agent.ainvoke(
        {"messages": [{"role": "user",
                       "content": "What's the weather in Paris, in Fahrenheit?"}]}
    )
    print(result["messages"][-1].content)

asyncio.run(main())
```

Run it and you get an answer. But you have **no idea** how it got there: did it call `get_weather` then `to_fahrenheit`? Did it loop? How many model calls?

In `after.py`, the *only* difference is creating a tracer and passing its handler:

``` python
import asyncio
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from tracesage import TraceSage          # 1️⃣ import

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's 22°C and sunny in {city}."

@tool
def to_fahrenheit(celsius: float) -> float:
    """Convert a temperature in Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini", temperature=0),
    tools=[get_weather, to_fahrenheit],
)

async def main() -> None:
    tracer = await TraceSage.create()    # 2️⃣ start tracesage (UI on :7842)

    result = await agent.ainvoke(
        {"messages": [{"role": "user",
                       "content": "What's the weather in Paris, in Fahrenheit?"}]},
        config={"callbacks": [tracer.handler]},   # 3️⃣ the one line you add
    )
    print(result["messages"][-1].content)

    # Keep the process alive so you can explore the UI.
    input("Trace ready at http://localhost:7842/ui — press Enter to exit.")
    await tracer.stop()

asyncio.run(main())
```

That's it. Run `python after.py`, open **http://localhost:7842/ui**, and your run is there.

For scripts and notebooks, there's also a `with` block: a context manager that starts the UI *and* installs a global handler, so you don't even pass `callbacks=`:

``` python
import tracesage

with tracesage.trace() as tl:                 # starts UI + global capture
    result = agent.invoke({"messages": [...]})    # 🔍 tracesage: http://127.0.0.1:7842/ui/#run=...
    input("Trace ready — open the printed link, then Enter to exit.")
```

Every new run prints a clickable deep link to that exact trace.
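If you're curious how a `with` block can "install a global handler", here is a minimal sketch of the general pattern using `contextvars`. The `trace` and `emit` functions below are hypothetical stand-ins written for this example; they say nothing about how tracesage or LangChain wire this up internally.

``` python
import contextlib
import contextvars

# Handlers active in the current context (empty by default).
_active_handlers = contextvars.ContextVar("active_handlers", default=())


@contextlib.contextmanager
def trace(handler):
    """Activate `handler` for everything that runs inside the with-block."""
    token = _active_handlers.set(_active_handlers.get() + (handler,))
    try:
        yield handler
    finally:
        # Restore the previous handler set, even if the block raised.
        _active_handlers.reset(token)


def emit(event):
    """Deliver an event to every handler active in the current context."""
    for handler in _active_handlers.get():
        handler(event)
```

With this shape, code inside the block never has to pass the handler around explicitly, and leaving the block cleanly switches capture off again.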

Here's where tracesage earns its keep. Open a run and you get a **topology graph** of everything that happened.

Every node is one of six kinds, colour-coded in the legend (bottom-left):

| Kind | What it is |
|---|---|
| `agent` | a function you registered as a node, that calls other things |
| `tool` | a `@tool` side-effect function (DB, API, calculation) |
| `llm` | a language-model call (what you count, cost, and cache) |
| `retriever` | a `BaseRetriever`, the "R" in RAG |
| `chain` | plumbing: LCEL pipes, the LangGraph state machine, routing functions |
| `mcp` | a synthesized node grouping the tools loaded from one MCP server |
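One way to picture the table above is as a mapping from LangChain callback hooks to node kinds. The sketch below is an assumption about how such a classifier could look, not tracesage's actual logic: the hook names are real LangChain callback methods, but the `classify` function and the rule "a chain whose name matches a registered graph node counts as an agent" are made up for illustration.

``` python
from enum import Enum


class Kind(str, Enum):
    AGENT = "agent"
    TOOL = "tool"
    LLM = "llm"
    RETRIEVER = "retriever"
    CHAIN = "chain"
    MCP = "mcp"  # synthesized from provenance data, never from a callback hook


# LangChain callback hook -> default node kind.
HOOK_TO_KIND = {
    "on_tool_start": Kind.TOOL,
    "on_llm_start": Kind.LLM,
    "on_chat_model_start": Kind.LLM,
    "on_retriever_start": Kind.RETRIEVER,
    "on_chain_start": Kind.CHAIN,
}


def classify(hook, name, graph_nodes=frozenset()):
    """Pick a node kind; promote chains that are registered graph nodes."""
    kind = HOOK_TO_KIND[hook]
    if kind is Kind.CHAIN and name in graph_nodes:
        return Kind.AGENT  # assumption: registered nodes are shown as agents
    return kind
```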

**Click any node** to open its inspector, which shows call counts, durations, errors, and the tools it provides or uses.

The timeline on the right replays the run step-by-step; click a step to expand the full payload (prompts, tool inputs/outputs, token usage, and — on errors — the exception type and traceback).

If your agent loads tools from **MCP servers** (via `langchain-mcp-adapters`), you usually lose track of *where* each tool came from, because they all look like generic LangChain tools at runtime. tracesage fixes that.

Install the extra and register your MCP client:

```
pip install "tracesage[mcp]"
```

``` python
# Uses top-level `await`: run this inside an async function or a notebook.
from langchain_mcp_adapters.client import MultiServerMCPClient
from tracesage import TraceSage
from tracesage.adapters.mcp import register_mcp_client

tracer = await TraceSage.create()

client = MultiServerMCPClient({
    "weather": {"command": "python", "args": ["weather_server.py"], "transport": "stdio"},
    "math":    {"command": "python", "args": ["math_server.py"],    "transport": "stdio"},
})

# Loads every server's tools AND records tool → server provenance.
tools = await register_mcp_client(tracer, client)
# Your own @tool functions stay "local" (unattributed) automatically.
```

Now the UI shows a **"Tools by source"** panel and dedicated `mcp:` nodes, with every tool grouped by where it came from.

Click an MCP server node and you see exactly what it provides, how often it was called, and which agents used it.
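The grouping behind a "Tools by source" view boils down to a tool → server lookup plus a count. Here is a minimal sketch of that idea; the `tools_by_source` function, the tool names, and the data shapes are hypothetical and only illustrate the provenance concept, not tracesage's internals.

``` python
from collections import defaultdict


def tools_by_source(provenance, calls):
    """Count tool calls per originating server.

    provenance: dict mapping tool name -> MCP server name
    calls:      list of tool names, one entry per call
    Tools without a provenance entry are grouped under "local".
    """
    grouped = defaultdict(lambda: defaultdict(int))
    for tool_name in calls:
        source = provenance.get(tool_name, "local")
        grouped[source][tool_name] += 1
    return {source: dict(counts) for source, counts in grouped.items()}
```

For example, with `{"get_forecast": "weather", "add": "math"}` as provenance, a call to your own `to_fahrenheit` tool would land in the `"local"` group, matching the "unattributed" behaviour described above.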

A complete, runnable MCP example (two local stdio servers + hardcoded tools, **no API key needed**) lives in the repo at `examples/mcp/`:

```
pip install "tracesage[mcp]"
python examples/mcp/main.py     # then open http://localhost:7842/ui
```

Tracing isn't just for eyeballing. tracesage ships a `pytest` fixture (`tracesage_capture`, auto-registered) so you can assert behaviour:

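``` python
def test_agent_calls_get_weather(tracesage_capture):
    # `agent` is the ReAct agent built in after.py, which expects a messages dict.
    agent.invoke(
        {"messages": [{"role": "user", "content": "What's the weather in Paris?"}]}
    )
    tracesage_capture.assert_tool_called("get_weather")
    tracesage_capture.assert_no_errors()
    assert tracesage_capture.total_tokens()[0] < 5000   # input-token budget
```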

No setup, no server: the fixture captures the run in-process and gives you assertions like `assert_tool_called`, `assert_no_errors`, and `total_tokens`.
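To show what assertions like these can look like under the hood, here is a small in-memory stand-in. The `Capture` class, its `record` method, and the event shape are hypothetical; the assumption that `total_tokens()` returns an `(input, output)` tuple is inferred from the `[0]` input-token index in the test above, not confirmed by the docs.

``` python
from dataclasses import dataclass, field


@dataclass
class Capture:
    """Hypothetical in-memory capture, loosely modelled on the fixture's API."""

    events: list = field(default_factory=list)

    def record(self, kind, name, *, error=None, input_tokens=0, output_tokens=0):
        self.events.append({
            "kind": kind,
            "name": name,
            "error": error,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
        })

    def assert_tool_called(self, name):
        called = [e["name"] for e in self.events if e["kind"] == "tool"]
        assert name in called, f"tool {name!r} was not called; saw {called}"

    def assert_no_errors(self):
        errors = [e for e in self.events if e["error"] is not None]
        assert not errors, f"{len(errors)} event(s) recorded an error"

    def total_tokens(self):
        """Return (input_tokens, output_tokens) summed over all events."""
        return (
            sum(e["input_tokens"] for e in self.events),
            sum(e["output_tokens"] for e in self.events),
        )
```

Because the assertions raise plain `AssertionError` with a readable message, a failing CI run tells you which tools the agent actually called instead of just "test failed".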

tracesage is built so you can wire it in *once* and control it per-environment:

**Kill switch:** set `TRACESAGE_ENABLED=false` (or `enabled=False`) and `TraceSage` returns an inert tracer: no server, no DB, a no-op handler, near-zero overhead. Same code ships to prod; tracing just turns off.

**Capture without the UI:** `TRACESAGE_START_SERVER=false` records traces to disk in prod without binding the in-process UI; view them later with `tracesage serve`.

**Safety rails:** bearer-token auth, root-level sampling (`sample_rate`), a per-run event cap, and a hard fail-stop if you bind a non-loopback address without an auth token.

```
# In prod: capture is off, zero overhead, no code change.
#   TRACESAGE_ENABLED=false
# Or capture quietly, no UI:
#   TRACESAGE_START_SERVER=false   → then `tracesage serve` to look later
pip install "tracesage[langchain]"
tracesage demo
```
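The switches and safety rails described above follow a familiar pattern, sketched here with the standard library. The environment variable names come from this article; everything else (the helper names, which spellings count as "false", and hashing the run id for sampling) is an assumption for illustration, not tracesage's actual behaviour.

``` python
import ipaddress
import os
import zlib
from typing import Optional


def env_flag(name, default=True):
    """Read a boolean env var; unset means `default` (assumed spellings)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() not in {"0", "false", "no", "off"}


def is_loopback(host):
    """True for localhost and loopback IPs; unknown hostnames count as remote."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def check_bind(host, auth_token: Optional[str]):
    """Fail-stop: refuse a non-loopback bind without an auth token."""
    if not is_loopback(host) and not auth_token:
        raise RuntimeError(f"refusing to bind {host!r} without an auth token")


def should_sample(run_id, sample_rate):
    """Deterministic per-root-run decision: the same run id always agrees."""
    bucket = zlib.crc32(run_id.encode("utf-8")) / 2**32  # value in [0, 1)
    return bucket < sample_rate
```

Deciding once per root run, rather than per event, keeps sampled traces complete: a run is either captured in full or skipped in full.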

**Docs & full quickstart:** [https://kjgpta.github.io/tracesage/](https://kjgpta.github.io/tracesage/)

**Concepts (what each node kind means):** [https://kjgpta.github.io/tracesage/concepts/](https://kjgpta.github.io/tracesage/concepts/)

**MCP attribution guide:** [https://kjgpta.github.io/tracesage/mcp/](https://kjgpta.github.io/tracesage/mcp/)

**30 before/after example apps:** `examples/showcase/` in the repo

If you build agents with LangChain or LangGraph and you're tired of `print`-debugging your way through a run, give it a spin. Two lines, one browser tab, and your agent stops being a black box.

*tracesage is MIT-licensed and open source.*