{"slug": "show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan", "title": "Show HN: Lookspan – local-first observability for AI agents (npx lookspan)", "summary": "Lookspan launched a local-first observability dashboard for AI agents that runs entirely on the user's machine with zero cloud dependency. The open-source tool ingests trace data via HTTP, MCP, OpenTelemetry, or SDK adapters for LangGraph, CrewAI, and other frameworks, storing everything in local SQLite. By eliminating accounts, API keys, and data shipping to external servers, Lookspan gives developers real-time visibility into agent failures, token usage, and costs while keeping all data private.", "body_md": "**Local-first observability dashboard for AI agents. MCP-native. See every span your agents emit.**\n\n```\nnpx lookspan          # → http://127.0.0.1:3100\nAgent (MCP · LangGraph · CrewAI · OpenTelemetry · HTTP)  →  POST /api/ingest  →  SQLite  →  real-time dashboard\n```\n\n🇪🇸 Prefer Spanish? Read the [README in Spanish].\n\nWhen an AI agent misbehaves (fails, stalls, or quietly burns more tokens than expected), there's no native way to see what happened step by step. 
Existing observability tools are cloud-first: they require accounts and API keys, and they ship your production data to someone else's servers.\n\nLookspan takes the opposite approach: **everything runs on your machine, data\nnever leaves it, and infra cost is zero.** Instrument your agent with an adapter\n(or just POST JSON) and open the dashboard in your browser.\n\n```\nnpx lookspan              # → http://127.0.0.1:3100, no install, no cloud\n```\n\nSend your first span from any language:\n\n```\ncurl -X POST http://127.0.0.1:3100/api/ingest \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"spans\":[{\"traceId\":\"t1\",\"spanId\":\"s1\",\"parentSpanId\":null,\"type\":\"llm_call\",\"name\":\"agent.run\",\"startedAt\":\"2026-06-02T10:00:00Z\",\"endedAt\":\"2026-06-02T10:00:01Z\",\"status\":\"ok\",\"framework\":\"custom\",\"model\":\"gpt-4o\",\"provider\":\"openai\",\"usage\":{\"inputTokens\":1000,\"outputTokens\":500,\"costUsd\":0}}]}'\n```\n\nOpen `http://127.0.0.1:3100` and watch the trace appear, with its cost computed server-side.\n\n- **HTTP span ingest**: `POST /api/ingest` accepts JSON batches of spans. Works with any agent that can make an HTTP request.\n- **MCP-native**: the `@lookspan/mcp` TypeScript SDK wraps any `McpClient` and emits a span per MCP tool call, without changing your agent code.\n- **Python SDKs**: `lookspan` (generic client) plus adapters for LangGraph/LangChain (`lookspan-langgraph`) and CrewAI (`lookspan-crewai`).\n- **OpenTelemetry**: an OTLP/HTTP receiver at `POST /v1/traces`; point any OTel exporter at it with no Lookspan SDK. `gen_ai.*` attributes map to provider/model/tokens.\n- **Real-time streaming**: SSE endpoint `GET /api/stream` pushes `span.ingested`, `trace.updated` and `alert.triggered` to the dashboard, no polling.\n- **React dashboard**: recent traces with a health strip + per-row latency/cost mini-bars; trace detail with a **timeline (waterfall)** or tree view and a **conversation transcript** of the prompt/response; replay diffs and A/B run comparison; costs & overview (error rate, latency p50/p95/p99, cost per day); alerts history.\n- **Cost tracking**: aggregates input/output/cached/reasoning tokens and computes `cost_usd` per span and per trace from a model pricing table, overridable with `--pricing`.\n- **Alerts**: get notified when a trace fails or exceeds a cost/token/duration threshold (toast + desktop notification + CLI + persisted history).\n- **Evaluation scores**: attach metrics to a trace (`POST /api/traces/:id/scores`) from an LLM judge, an assertion, or by hand.\n- **Replay & LLM-as-judge**: re-run a trace's captured prompt against the same or a different model and diff cost/latency/output, or have a judge model score the response 0–1. Needs a provider key (env, in-memory only).\n- **Datasets & experiments**: collect prompts into a test set (seed from a trace or add by hand), run the whole set against a model in batch and score each output with the judge, with aggregate cost/latency/score per run.\n- **Local SQLite**: versioned migrations. Database at `~/.lookspan/lookspan.db` by default; configurable via flag or env var. Optional retention with `--retention`.\n- **Security**: binds to `127.0.0.1` by default; optional `--token` auth; server-side redaction of credential-looking attributes before storage.\n- **One-line CLI**: `npx lookspan` starts the server and the dashboard with no global install.\n\nWrap your client in one line and every model call is traced (no OTel, no proxy):\n\n```\nnpm install @lookspan/openai\n```\n\n```js\nimport OpenAI from 'openai';\nimport { observeOpenAI } from '@lookspan/openai';\n\nconst openai = observeOpenAI(new OpenAI());\nawait openai.chat.completions.create({ model: 'gpt-4o', messages });\n```\n\n```\nnpm install @lookspan/anthropic\n```\n\n```js\nimport Anthropic from '@anthropic-ai/sdk';\nimport { observeAnthropic } from '@lookspan/anthropic';\n\nconst anthropic = observeAnthropic(new Anthropic());\nawait anthropic.messages.create({ model: 'claude-sonnet-4-6', max_tokens: 1024, messages });\n```\n\n```\nnpm install @lookspan/mcp\n```\n\n```js\nimport { wrapMcpClient, HttpSpanExporter } from '@lookspan/mcp';\n\nconst exporter = new HttpSpanExporter({ endpoint: 'http://127.0.0.1:3100/api/ingest' });\nconst { client } = wrapMcpClient(mcpClient, { exporter, agentId: 'my-agent' });\n\n// Use it exactly as before: every callTool emits a tool_call span.\nawait client.callTool({ name: 'read_file', arguments: { path: '/tmp/foo.txt' } });\nawait exporter.flush();\n```\n\n```\npip install lookspan            # + lookspan-langgraph / lookspan-crewai\n```\n\n```python\nfrom lookspan import LookspanClient\nfrom lookspan_langgraph import LookspanCallbackHandler\n\nclient = LookspanClient(endpoint=\"http://127.0.0.1:3100/api/ingest\")\nhandler = LookspanCallbackHandler(client=client, agent_id=\"my-agent\")\n\nresult = graph.invoke({\"messages\": []}, config={\"callbacks\": [handler]})\nclient.flush()\n```\n\nPoint any OTel exporter at the standard OTLP endpoint:\n\n```\nexport OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://127.0.0.1:3100/v1/traces\n# protobuf (the OTel default) and JSON are both accepted\n```\n\nMore runnable 
examples in [examples/](/JoniMartin27/lookspan/blob/main/examples).\n\nThe drop-in SDKs capture each call's prompt and reply (`captureContent`, on by\ndefault; secrets are scrubbed server-side). With that, Lookspan can close the\nloop from *observe* to *improve*: open a trace and use the **Replay & judge**\npanel, or call the API directly:\n\n```\n# Provider keys live in memory only; they are never written to the DB or logged.\nLOOKSPAN_OPENAI_API_KEY=sk-... npx lookspan\n#   ...or LOOKSPAN_ANTHROPIC_API_KEY / --openai-key / --anthropic-key\n\n# Replay the captured prompt against another model and diff cost/latency/output\ncurl -X POST localhost:3100/api/traces/<id>/replay -H 'content-type: application/json' \\\n  -d '{\"model\":\"gpt-4o-mini\"}'   # omit \"model\" to re-run the original\n\n# Score the response 0–1 with an LLM judge (stored as an \"llm-judge\" score)\ncurl -X POST localhost:3100/api/traces/<id>/judge -H 'content-type: application/json' \\\n  -d '{\"metric\":\"correctness\"}'\n```\n\nTo keep prompts/outputs out of Lookspan entirely, pass `{ captureContent: false }` to `observeOpenAI` / `observeAnthropic`; replay & judge then stay disabled.\n\nScale evaluation from one trace to a whole test set. Build a **dataset** (seed\nitems from real traces or add them by hand), then **run** it against a model:\neach item is replayed and, optionally, scored by the judge, with aggregate\ncost/latency/score per run. 
Manage it all under **Datasets** in the dashboard, or:\n\n```\n# Create a dataset and add the captured prompt of a trace as an item\nDS=$(curl -s -X POST localhost:3100/api/datasets -d '{\"name\":\"regressions\"}' -H 'content-type: application/json' | jq -r .dataset.id)\ncurl -X POST localhost:3100/api/datasets/$DS/items/from-trace -H 'content-type: application/json' -d '{\"traceId\":\"<id>\"}'\n\n# Run the whole set against a model, judging each output\ncurl -X POST localhost:3100/api/datasets/$DS/run -H 'content-type: application/json' \\\n  -d '{\"model\":\"gpt-4o-mini\",\"judge\":true,\"metric\":\"correctness\"}'\n```\n\n| Method | Path | Description |\n|---|---|---|\n| `GET` | `/api/health` | Service status |\n| `POST` | `/api/ingest` | Ingest spans (body: `IngestPayload`) |\n| `GET` | `/api/traces` | List traces (paginated; filter by `framework`, `status`, `sessionId`) |\n| `GET` | `/api/traces/:id` | Trace detail with all its spans and scores |\n| `POST` | `/api/traces/:id/scores` | Attach an evaluation score (`{name, value, comment?, source?}`) |\n| `POST` | `/api/traces/:id/replay` | Re-run the captured prompt (`{model?, provider?, spanId?}`); needs a provider key |\n| `GET` | `/api/traces/:id/replays` | List past replays for the trace |\n| `POST` | `/api/traces/:id/judge` | LLM-as-judge: score the prompt/response (`{metric?, model?, provider?, rubric?}`) |\n| `GET` `POST` | `/api/datasets` | List / create datasets |\n| `GET` | `/api/datasets/:id` | Dataset detail (items + runs) |\n| `POST` | `/api/datasets/:id/items` | Add item(s) (`{input, expected?}` or `{items:[…]}`) |\n| `POST` | `/api/datasets/:id/items/from-trace` | Seed an item from a trace's captured prompt |\n| `POST` | `/api/datasets/:id/run` | Run the set against a model (`{model, judge?, metric?}`); needs a provider key |\n| `GET` | `/api/runs/:id` | Run summary + per-item results |\n| `GET` | `/api/sessions` | List sessions (agents, traces, cost, errors, time range) |\n| `GET` | `/api/sessions/:id` | Session summary + its traces (multi-agent timeline) |\n| `GET` | `/api/costs/summary` | Cost breakdown (total, by model, provider, agent) |\n| `GET` | `/api/stats` | Stats summary (totals, error rate, latency p50/p95/p99, cost per day) |\n| `GET` | `/api/alerts` | History of triggered alerts |\n| `GET` | `/api/stream` | Real-time SSE event stream |\n| `POST` | `/v1/traces` | OpenTelemetry OTLP/HTTP trace receiver (JSON `ExportTraceServiceRequest`) |\n\n```\nnpx lookspan [options]\n  -p, --port <port>        Port to listen on            (default: 3100)\n      --host <host>        Host to bind to              (default: 127.0.0.1)\n      --db <path>          SQLite database path         (default: ~/.lookspan/lookspan.db)\n      --retention <dur>    Prune traces older than e.g. 7d, 24h, 30m\n      --token <token>      Require Authorization: Bearer <token> on the API\n      --pricing <file>     Custom model pricing table (JSON)\n      --alert-error                Alert when a trace fails\n      --alert-cost <usd>           Alert when a trace costs more than <usd>\n      --alert-tokens <n>           Alert when a trace exceeds <n> tokens\n      --alert-duration <ms>        Alert when a trace takes longer than <ms>\n      --open               Open the dashboard in your browser\n  -h, --help               Show help\n  -v, --version            Show version\n```\n\nEvery flag has a `LOOKSPAN_*` environment-variable equivalent (`LOOKSPAN_PORT`, `LOOKSPAN_TOKEN`, `LOOKSPAN_PRICING`, `LOOKSPAN_ALERT_*`, …). 
Replay & LLM-as-judge read `LOOKSPAN_OPENAI_API_KEY` / `LOOKSPAN_ANTHROPIC_API_KEY` (or `--openai-key` / `--anthropic-key`); these stay in memory and are never persisted.\n\n| | Lookspan | Langfuse | Phoenix (Arize) |\n|---|---|---|---|\n| Startup | `npx lookspan` (zero infra) | Docker + Postgres + ClickHouse | `pip install` (Python) |\n| Storage | local SQLite | Postgres + ClickHouse | local / in-memory |\n| Focus | TS/JS + MCP stack | full platform (evals, prompts) | evals / RAG (Python) |\n| Your data | never leaves your machine | self-host or cloud | local or cloud |\n| OpenTelemetry | native OTLP receiver | yes | yes (OTel-native) |\n\nLookspan isn't trying to be a full platform. It bets on being **the zero-setup\nobservability layer for the TypeScript/MCP agent stack**, with the best\nfirst-five-minutes experience. See the [ROADMAP](/JoniMartin27/lookspan/blob/main/docs/ROADMAP.md).\n\nLookspan binds to `127.0.0.1` (loopback) and requires no auth by default, which is right\nfor local use. If you expose it (`--host 0.0.0.0`), protect it with a token:\n\n```\nLOOKSPAN_TOKEN=my-token npx lookspan --host 0.0.0.0\n# /api/* and /v1/* then require Authorization: Bearer my-token (/api/health is exempt).\n```\n\nThe collector also **redacts** values of credential-looking keys\n(`authorization`, `api_key`, `token`, `secret`, `password`, `cookie`…) from\n`input`/`attributes` before persisting, so telemetry never drags secrets into\nthe database.\n\nThis is an npm-workspaces monorepo. 
`packages/` holds internal libraries, `apps/` the dashboard, `python/` the standalone Python SDKs.\n\n```\ngit clone https://github.com/JoniMartin27/lookspan.git\ncd lookspan\nnpm install\nnpm run dev        # API on :3100, dashboard with hot-reload on :5173\nnpm run ci         # typecheck + lint + test + build\n```\n\nContributions welcome: see [.github/CONTRIBUTING.md](/JoniMartin27/lookspan/blob/main/.github/CONTRIBUTING.md).\nRelease process in [docs/PUBLISHING.md](/JoniMartin27/lookspan/blob/main/docs/PUBLISHING.md). Security policy: [SECURITY.md](/JoniMartin27/lookspan/blob/main/SECURITY.md).\n\nMIT. Copyright (c) 2026 Jonathan Martin. See [LICENSE](/JoniMartin27/lookspan/blob/main/LICENSE).", "url": "https://wpnews.pro/news/show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan", "canonical_source": "https://github.com/JoniMartin27/lookspan", "published_at": "2026-06-03 23:05:30+00:00", "updated_at": "2026-06-03 23:46:51.583376+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-infrastructure", "mlops", "ai-products"], "entities": ["Lookspan", "OpenTelemetry", "LangGraph", "CrewAI", "OpenAI", "GPT-4o", "SQLite"], "alternates": {"html": "https://wpnews.pro/news/show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan", "markdown": "https://wpnews.pro/news/show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan.md", "text": "https://wpnews.pro/news/show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan.txt", "jsonld": "https://wpnews.pro/news/show-hn-lookspan-local-first-observability-for-ai-agents-npx-lookspan.jsonld"}}