
How to add Honeycomb traces to your AI Slack bot

Lunch Pail Labs has integrated Honeycomb traces into its AI Slack agent Pipa to solve the black-box problem when the bot fails. The agent, which runs in E2B sandboxes using OpenCode, now sends OpenTelemetry data to Honeycomb, creating a single trace per run that includes spans for sandbox preparation, OpenCode commands, and Slack delivery. To capture the agent's actual behavior beyond infrastructure metrics, the team emits custom events for user messages, assistant responses, tool calls, and retries, enabling Honeycomb's Agent Timeline feature to visualize the full session.

4 min read · Published Jun 2, 2026

Pipa is our agent for studio operations at Lunch Pail Labs. She lives in Slack, is powered by E2B sandboxes, and uses OpenCode for the harness.

When it works, it’s awesome. You ask for help. Pipa goes off, runs the tools, does the work, and comes back with the answer.

When it goes wrong, it’s a complete black box. In the terminal, I can see the mess: tool calls, permission prompts, stalls, and weird little “I can’t do that” moments. In Slack, most of that disappears behind a typing indicator and one final message.

If you’re building an AI agent that lives in Slack or runs in the background, this pain may feel familiar. You need traces. This is the setup I used to send mine to Honeycomb.

A trace shows one request moving through your system. For a Slack agent, that usually means one Slack message, one agent run, and one Slack reply.

The shape depends on your architecture. In Pipa, a chat gateway pings an E2B sandbox, loads the right skills and templates, and runs OpenCode.

For Pipa, I put the telemetry in the chat gateway because that layer sees the whole run: the Slack prompt, the sandbox lifecycle, the OpenCode event stream, stdout and stderr, retries, run status, and Slack delivery status.

The gateway creates one trace per run. That trace is made of spans. A span is one timed step inside the run, like preparing the sandbox, starting OpenCode, watching a tool call, or sending the final Slack reply.

For my bot, the trace includes spans like:

pipa.run.execute

pipa.e2b.sandbox.prepare

pipa.opencode.command

invoke_agent pipa.standard_opencode

The first few spans explain the backend path. The invoke_agent span explains the agent session. Inside that span, Pipa attaches events for the user message, assistant responses, tool calls, retries, and run summary.

Now the boring infrastructure stuff and the agent’s actual behavior live in the same trace.
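To make that shape concrete, here is a minimal sketch of one run as a tree of spans with events attached to the agent span. It is a toy in-memory recorder built on the standard library, not the OpenTelemetry SDK and not Pipa's actual gateway code. TraceRecorder and run_once are hypothetical names, and nesting invoke_agent under pipa.opencode.command is an assumption; the article does not spell out the parent-child layout.

```python
import time
import uuid
from contextlib import contextmanager


class TraceRecorder:
    """Toy recorder: one trace ID, many spans, events attached to spans."""

    def __init__(self):
        self.trace_id = uuid.uuid4().hex
        self.spans = []    # every span, in start order
        self._stack = []   # currently open spans, innermost last

    @contextmanager
    def span(self, name):
        # The parent is whichever span is open when this one starts.
        parent = self._stack[-1]["name"] if self._stack else None
        record = {"name": name, "parent": parent, "events": [],
                  "start": time.monotonic(), "duration_s": None}
        self.spans.append(record)
        self._stack.append(record)
        try:
            yield record
        finally:
            record["duration_s"] = time.monotonic() - record["start"]
            self._stack.pop()

    def event(self, name, **attributes):
        # Events belong to the innermost open span.
        self._stack[-1]["events"].append({"name": name, **attributes})


def run_once(recorder, prompt):
    """Hypothetical run: backend spans plus agent events in one trace."""
    with recorder.span("pipa.run.execute"):
        with recorder.span("pipa.e2b.sandbox.prepare"):
            pass  # start or resume the sandbox here
        with recorder.span("pipa.opencode.command"):
            # Assumed nesting: the agent session sits under the command span.
            with recorder.span("invoke_agent pipa.standard_opencode"):
                recorder.event("opencode.user_message", text=prompt)
                recorder.event("opencode.run_summary", status="ok")


recorder = TraceRecorder()
run_once(recorder, "what is on the studio calendar today?")
for s in recorder.spans:
    print(s["name"], "<-", s["parent"], [e["name"] for e in s["events"]])
```

In a real gateway the same structure would come from OpenTelemetry: each `with recorder.span(...)` corresponds to starting a span on a tracer, and each `recorder.event(...)` to adding an event on the current span.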

OpenTelemetry creates and sends the trace data. Honeycomb is where I inspect it. In Pipa, the gateway sends traces to Honeycomb when HONEYCOMB_API_KEY is present.
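One way to express "only export when the key is present" is a small settings function, sketched below. The function name and return shape are hypothetical. The endpoint and the x-honeycomb-team header follow Honeycomb's documented OTLP ingest for the US region as far as I know; check Honeycomb's docs for your region before relying on them.

```python
import os

# Assumption: Honeycomb's US OTLP/HTTP ingest host. EU teams use a different host.
HONEYCOMB_OTLP_ENDPOINT = "https://api.honeycomb.io"


def honeycomb_exporter_settings(env=None):
    """Return exporter settings, or None when tracing should stay off."""
    env = os.environ if env is None else env
    api_key = env.get("HONEYCOMB_API_KEY", "").strip()
    if not api_key:
        # No key: the gateway runs normally and exports nothing.
        return None
    return {
        "endpoint": f"{HONEYCOMB_OTLP_ENDPOINT}/v1/traces",
        "headers": {"x-honeycomb-team": api_key},
    }


print(honeycomb_exporter_settings({}))
print(honeycomb_exporter_settings({"HONEYCOMB_API_KEY": "demo-key"}))
```

The returned endpoint and headers are the two values an OTLP HTTP span exporter needs, so the gateway can skip building the exporter entirely when the function returns None.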

Slack event -> gateway run -> sandbox prepare -> OpenCode command -> Slack reply

For Pipa, the top-level run span is pipa.run.execute. This is the span I search for when someone says, “the bot got stuck” or “Slack never got a good answer.”

These spans tell me whether the basic plumbing worked:

pipa.e2b.sandbox.prepare

pipa.opencode.command

They answer the early questions. Did the sandbox start? Did OpenCode run? How long did each step take? Did the failure happen before the agent really got going?
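Those questions can be asked of exported span data in a few lines. The sketch below assumes each span has already been flattened into a dict with name, duration_ms, and status fields; those field names and the sample numbers are made up for illustration and are not Honeycomb's export format.

```python
def first_failure(spans):
    """Return the earliest span whose status is not 'ok', or None."""
    for span in spans:
        if span["status"] != "ok":
            return span
    return None


def slowest(spans):
    """Return the span that took the longest."""
    return max(spans, key=lambda s: s["duration_ms"])


# Illustrative data only: a run where the sandbox started but OpenCode failed.
sample_run = [
    {"name": "pipa.e2b.sandbox.prepare", "duration_ms": 1800, "status": "ok"},
    {"name": "pipa.opencode.command", "duration_ms": 42000, "status": "error"},
]

failed = first_failure(sample_run)
print("first failure:", failed["name"] if failed else "none")
print("slowest step:", slowest(sample_run)["name"])
```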

One thing I missed at first: traces alone do not give you an agent timeline.

You can emit a bunch of spans and Honeycomb will show you a normal backend trace. Useful, yes, but it still reads like infrastructure: Slack event received, sandbox started, command ran, response sent. You can see which services ran and how long they took. You still cannot see what the agent did.

For that, you need Honeycomb’s Agent Timeline.

The way I understand it, Agent Timeline works when Honeycomb can recognize an agent invocation span and read the agent’s activity as events inside that span. Instead of showing only backend work, the trace can show the session itself: the user prompt, assistant messages, tool calls, tool results, retries, and final output.

A normal trace might look like:

slack.event.received -> sandbox.prepare -> opencode.command -> slack.reply.sent

An agent timeline is more like:

invoke_agent -> user_message -> assistant_response -> tool_call -> tool_result -> assistant_response -> run_summary

That distinction matters. Honeycomb can show normal service spans automatically, but it will not magically know what happened inside your agent. You have to emit those events yourself.

As far as I understand it, “Agent Timeline” is Honeycomb’s product view for this. Other observability tools may support GenAI tracing, span events, or custom trace views. Honeycomb’s Agent Timeline is the feature I was targeting here.

Pipa emits a Honeycomb Agent Timeline-compatible span:

invoke_agent pipa.standard_opencode

That span gets the attributes Honeycomb needs to group and display the run:

gen_ai.conversation.id

gen_ai.agent.name

gen_ai.operation.name

gen_ai.request.model

app.run_id

This is what lets Honeycomb show the agent session as a timeline instead of a pile of unrelated events.
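As a rough sketch, the span name and attributes could be assembled like this. The function name, the default model value, and the choice to reuse the Slack thread ID as the conversation ID are assumptions for illustration. The "invoke_agent" operation name and the "invoke_agent {agent name}" span naming follow the OpenTelemetry GenAI semantic conventions.

```python
def invoke_agent_span(conversation_id, run_id,
                      agent_name="pipa.standard_opencode",
                      model="example-model"):
    """Build the span name and attributes for one agent session."""
    name = f"invoke_agent {agent_name}"
    attributes = {
        "gen_ai.conversation.id": conversation_id,  # groups runs from one thread
        "gen_ai.agent.name": agent_name,
        "gen_ai.operation.name": "invoke_agent",
        "gen_ai.request.model": model,              # placeholder, not Pipa's model
        "app.run_id": run_id,                       # the gateway's own run ID
    }
    return name, attributes


# Hypothetical IDs: a Slack thread timestamp and a gateway run ID.
span_name, span_attributes = invoke_agent_span("1717312000.000100", "run-123")
print(span_name)
for key, value in span_attributes.items():
    print(f"  {key} = {value}")
```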

Honeycomb can show service calls, but it cannot know what your agent did unless you tell it. The gateway sends the moments I actually care about: the user message, the tool call, the agent response, and the run summary.

For Pipa, the gateway parses structured opencode run --format json output and emits events such as:

opencode.user_message

opencode.tool_call

opencode.agent_response

opencode.run_summary

opencode.parser_skipped

That makes the hidden middle visible: the prompt, assistant-visible text, tool names, tool status, token and cost data, retry markers, and failure summaries.
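The parsing step might look something like the sketch below. The JSON shape is invented: the "type" field and its values are hypothetical stand-ins, not OpenCode's real output schema, so the mapping has to be adapted to what your opencode version actually prints. Treating unparseable or unrecognized lines as opencode.parser_skipped is one reading of that event name.

```python
import json

# Hypothetical mapping from an assumed "type" field to the event names above.
EVENT_NAMES = {
    "user": "opencode.user_message",
    "tool": "opencode.tool_call",
    "assistant": "opencode.agent_response",
    "summary": "opencode.run_summary",
}


def to_span_events(lines):
    """Turn JSON lines into (event_name, attributes) pairs."""
    events = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            payload = None
        kind = payload.get("type") if isinstance(payload, dict) else None
        name = EVENT_NAMES.get(kind) if isinstance(kind, str) else None
        if name is None:
            # Record that something was dropped instead of losing it silently.
            events.append(("opencode.parser_skipped", {"raw_length": len(raw)}))
            continue
        # Keep only flat scalar values, which are safe as span event attributes.
        attributes = {k: v for k, v in payload.items()
                      if k != "type" and isinstance(v, (str, int, float, bool))}
        events.append((name, attributes))
    return events


sample_output = [
    '{"type": "user", "text": "summarize open invoices"}',
    '{"type": "tool", "tool": "bash", "status": "completed"}',
    'not json at all',
    '{"type": "summary", "status": "ok", "tokens": 1234}',
]
for event_name, attrs in to_span_events(sample_output):
    print(event_name, attrs)
```

Each pair can then be attached to the invoke_agent span as a span event, which keeps the agent's activity inside the same trace as the backend spans.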

After adding traces, run the bot from Slack and inspect the trace in Honeycomb. The trace should answer these questions:

If the trace cannot answer them, add another span or event at the layer that can see the missing context.

If your AI bot runs inside Slack, traces are how you see what happened between the prompt and the final reply.

Start with the questions you ask when the bot responds badly. Then emit spans and events around those moments. One Slack request should become one run you can actually inspect in Honeycomb.

Originally published on Lunch Pail Labs.
