{"slug": "pi-agent-integration-message-parsing-retry-and-cancellation", "title": "Pi Agent Integration: Message Parsing, Retry, and Cancellation", "summary": "A developer integrating the pi CLI-based AI coding agent into an AI coding assistant project encountered three major challenges: parsing a private JSON event stream, handling ambiguous failure semantics with retry logic, and managing long-running interruptible processes. The solution involved splitting integration into two layers (a provider layer to normalize pi's output into a shared message stream and a core layer to translate business requests), with specific components for message parsing, retry, and cancellation.", "body_md": "When integrating a CLI-based AI agent, three questions are unavoidable: how to translate its private event stream into stable messages, who is responsible for retrying after a failure, and how to stop the process cleanly when the user clicks cancel. All three boil down to \"clarifying responsibilities\". That sounds simple in theory, but you only realize how deep the water is once you actually do it.\n\nRecently I've been working on an AI coding assistant project, and one of the agents to integrate is [pi](https://github.com/earendil-works/pi-coding-agent). It's a TUI/CLI coding agent that writes JSON events to stdout line by line while it runs. It sounds simple (spawn the process, read the output, parse it), but once you start you realize that \"integrating an agent CLI\" is completely different from \"integrating a regular CLI.\"\n\nWith a regular CLI, you read stdout, get an exit code, and that's it. Agent CLIs have three particularly headache-inducing characteristics:\n\nFirst, its event stream is a **private protocol**. `turn_start`, `session`, `message_update`, `message_end`, `turn_end`, and `agent_end` are defined by pi itself, not by any industry standard. 
Every upper layer that wants to consume the stream has to handle these events itself, which leaks pi's internal details everywhere. It's like looking at someone from a distance: you think you see them clearly, but you're only seeing the side they want to show you.\n\nSecond, its failure semantics are **particularly ambiguous**. The agent might hit network jitter, model rate limiting, or a process crash while running. Should it retry? Where should the retry happen? Will a retry corrupt session state that has already been half-emitted? This is an architectural decision, not something you solve by casually writing a `for` loop.\n\nThird, it's **long-running and interruptible**. A single turn might run for tens of seconds or even minutes, and users might cancel at any time. On cancel, the process must not become an orphan, tool calls must not be left half-finished, and content that was already output must not be lost. The water here is much deeper than it looks.\n\nTo solve these pain points, we spent some time straightening out the integration path. I'll get into specifics later, but here's a spoiler: the real difficulty isn't \"spawning the process\" but \"clarifying responsibilities.\"\n\nThe solution shared in this article comes from the HagiCode project, an AI coding assistant that supports multiple models and multiple agent CLI backends. GitHub repository: HagiCode-org/site; feel free to star it. All the code and all the pitfalls mentioned below are actually running in this project. 
Writing this down is mostly a note to my future self.\n\nHagiCode splits AI capability integration into two layers:\n\n- `Hagicode.Libs` provides the reusable provider primitive `ICliProvider<TOptions>`, which is responsible for \"spawning a CLI agent and normalizing its output into a shared message stream.\"\n- `hagicode-core` provides the project-level thin adapter `IAIProvider`, which is responsible for \"translating business requests into provider parameters, consuming shared message streams, and exposing unified streaming chunks externally.\"\n\nPi's integration follows this path. The bottom layer, `PiProvider`, spawns the pi process, reads the JSON event stream, and normalizes it into shared messages; the upper layer, `PiCliProvider`, translates `AIRequest` into `PiOptions`, consumes `CliMessage`, and outputs `AIStreamingChunk`.\n\nThe three concerns (message parsing, retry, cancellation) land in three different places: `PiJsonEventMapper`, a seemingly strange archived proposal, and `CliProcessManager`. Let's go through them one by one.\n\npi outputs JSON events line by line under `--mode json --print`. This event set is private to pi and must not leak directly to upper layers; otherwise every consumer would couple to pi's internal details, and the whole project would have to change every time a pi upgrade changes its event structure. This kind of leakage is like wearing your thoughts on your face: others find it tiring to look at, and you're not necessarily comfortable either.\n\nWe use `PiJsonEventMapper` as a translation layer that normalizes pi's events into the shared `CliMessage`. `CliMessage` is defined in `HagiCode.Libs.Core/Transport/CliMessage.cs` with a very simple structure: just a `(Type, Content)` record. 
The mapping is roughly as follows:\n\n| pi event | shared message | purpose |\n|---|---|---|\n| `session` | `session.started` / `session.resumed` | session lifecycle |\n| `message_update` (text type) | `assistant` | streaming body increments |\n| `message_update` (thinking type) | `assistant.thought` | thought chain |\n| `message_update` (tool type) | `tool.call` / `tool.update` | tool call initiation |\n| `message_end` / `turn_end` (toolResult) | `tool.completed` / `tool.failed` | tool result |\n| `turn_end` / `agent_end` | `terminal.completed` | current turn end |\n| non-zero exit / parse failure | `terminal.failed` | terminal state failure |\n\nThis table is just a quick reference, but it hides a couple of key techniques that we only figured out after stumbling into the pitfalls, so they're worth expanding on.\n\nThe first, and the easiest place to trip up, is how assistant text arrives. pi's `message_update` event doesn't send increments; it sends the **cumulative full text**. Every time a token arrives, it resends the \"complete text so far.\"\n\nIf you forward the received content directly to the frontend, users will see repeated content: the first event is \"hel\", the second is \"hello\", the third is \"hello,\", the fourth is \"hello, wor\"... and the frontend treats them as four independent outputs. Repetition is fresh the first time you see it, but boring the tenth time.\n\nThe solution is a prefix comparison that computes the true delta:\n\n```\n// Key point: pi sends a cumulative snapshot, not an increment.\n// Use a prefix comparison to extract the delta; otherwise the frontend sees repeated content.\nif (text.StartsWith(_lastAssistantTextSnapshot, StringComparison.Ordinal))\n{\n    var delta = text[_lastAssistantTextSnapshot.Length..];\n    _lastAssistantTextSnapshot = text;\n    return delta.Length == 0 ? null : delta;\n}\n```\n\nThere's another hidden pitfall here: **cross-turn prefix replay**. After a tool call ends and the assistant continues speaking, pi replays the earlier text from the beginning. 
If you only keep a single global snapshot, you'll treat the replayed content as new increments, and the text repeats after tool calls. `PiProviderTests` has a dedicated test case, `ExecuteAsync_deduplicates_replayed_assistant_prefix_after_tool_turns`, covering this scenario. In other words, the snapshots before and after a tool call need to be aligned with each other, not handled independently.\n\nThinking output (the thought chain) can't be forwarded as soon as each token is received. pi emits a bunch of thinking fragments in the middle of tool calls. If they are forwarded in real time, the stream order becomes a mess: one moment it's assistant body text, the next it's a thinking fragment, then a `tool.call`. Does that help anyone? Not really; it just adds confusion.\n\nOur approach: when thinking events arrive, first store them temporarily via `BufferThinkingSnapshot`, wait until a `message_end` or `turn_end` with `stopReason != \"toolUse\"`, and then flush them all at once with `DrainBufferedThinkingMessages`. This way, thinking fragments in the middle of tool calls don't pollute the main stream, and the complete thought process is delivered in one piece when the turn ends.\n\nAn agent CLI isn't the ideal system from the textbooks. It occasionally spits out a non-JSON line, or a JSON object without a `type` field. If you throw an exception here, the entire stream dies and users see nothing. The real world isn't always perfect; who can guarantee every line is well-behaved?\n\nOur strategy: a line that fails to parse doesn't interrupt the stream; it is collected into `_invalidOutputLines`. After the process ends, `Complete()` splices these \"bad lines\" into the diagnostic text of `terminal.failed`. This way, when users see an error, they can see exactly what garbage pi actually output, not a dry \"parse error.\"\n\nRetry is the easiest pitfall to fall into in the entire integration. Intuitively, \"integrating a CLI should include retry,\" but in an archived proposal HagiCode **actively removed** all automatic retry from the provider layer. 
The proposal is called `remove-provider-auto-retry-support`.\n\nThe proposal's background section is very blunt. Retry logic used to be scattered across two places: one copy in `Hagicode.Libs` (OpenCode-style fresh-runtime replay) and another in `hagicode-core` (`ProviderErrorAutoRetryCoordinator`). Each side did its own thing, which turned \"whether to retry\" into a hidden, implicit behavior inside the provider that silently changed failure timing, how sessions were continued, and the chat state flow.\n\nJust thinking about it makes your head hurt: the user sends a message, the provider retries three times internally, the first two attempts fail, and the third succeeds. The upper layer has no idea what happened in between, so session state, token counting, and UI progress all stop matching. This kind of implicit behavior is slow poison for an architecture.\n\nSo the boundary was condensed into one sentence:\n\nThe provider converges to single-attempt semantics, and the caller must treat the non-retried outcome as a normal single-execution result.\n\nIn code, that means three things:\n\n- `PiOptions` has no `maxAttempts`, no `retryDelay`, and no `retryClassifier`.\n- `ExecuteAsync` ends after a single pi process run completes; a failure directly yields `terminal.failed`.\n- The retryable-failure classifiers (`ClaudeCodeRetryableTerminalFailureClassifier`, `CodexRetryableTerminalFailureClassifier`, etc.), where they purely served automatic retry, are all removed from the active path.\n\nBut please note: **retry capability hasn't disappeared, it has just moved up**. The proposal explicitly talks about \"leaving a stable boundary for higher layers to uniformly take over retry later.\" The DTO for the `providerErrorAutoRetry` configuration item, its normalization and serialization, and the frontend settings page round-trip are all preserved; the setting just no longer drives provider execution. 
Some things aren't really discarded, just kept in a different way.\n\nIf you want to add retry on top of pi, the correct place is the caller of `PiCliProvider`, for example your session orchestration layer (in HagiCode that's the Orleans `SessionGrain`; on the frontend it might be a chat orchestration layer). After receiving `terminal.failed`, decide for yourself whether the failure is retryable, choose the delay and attempt count yourself, and then call `ExecuteAsync` again.\n\nA minimal viable pattern looks like this:\n\n```\n// Keep the retry logic at the caller; don't stuff it back into PiProvider,\n// otherwise you break the \"single-attempt\" boundary the provider just established.\nasync Task<AIResponse> ExecuteWithRetryAsync(AIRequest req, int maxAttempts, CancellationToken ct)\n{\n    for (var attempt = 1; ; attempt++)\n    {\n        var response = await provider.ExecuteAsync(req, ct);\n\n        // Return on success or when the attempt limit is reached\n        if (response.FinishReason != FinishReason.Unknown || attempt >= maxAttempts)\n            return response;\n\n        // Only retry retryable terminal failures (network, 5xx, process crash).\n        // Retrying a model rejection or an auth failure is pointless, so return those as-is.\n        // IsRetryable is a caller-defined classifier (hypothetical name, not part of the provider).\n        if (!IsRetryable(response))\n            return response;\n\n        // Exponential backoff before the next attempt\n        await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)), ct);\n    }\n}\n```\n\nThe classification logic that decides what is \"retryable\" no longer lives in the provider; the caller defines it. The `providerErrorAutoRetry` configuration (maxAttempts, retryDelay, enabled) can still be read from the frontend settings page, but what actually drives retry is your orchestration layer, not `PiProvider`. This is worth saying three times.\n\nFor cancellation, PiProvider implements almost nothing itself and delegates fully to `CliProcessManager`. 
PiProvider only handles two things: passing the `CancellationToken` down, and cleaning up on exception.\n\nThe chain looks like this, with the token passed all the way down:\n\n```\ncaller's CancellationToken\n  → PiCliProvider.StreamCoreAsync(cancellationToken)\n  → PiProvider.ExecuteProcessAsync([EnumeratorCancellation] cancellationToken)\n  → ReadLineAsync(cancellationToken) / WaitForExitAsync(cancellationToken)\n  → on exception: _processManager.StopAsync(handle, CancellationToken.None)\n```\n\nNote the last line: cleanup uses `CancellationToken.None`, not the token the user passed in. It's a detail, but an extremely important one.\n\nThe reason: the user's token **is already canceled**. If you use that already-canceled token for cleanup, the cleanup task is canceled immediately and the process becomes an orphan, with pi still running in the background, nobody reaping it, and CPU and memory occupied for nothing. So cleanup must use `CancellationToken.None` to ensure the cleanup actions actually complete. It's like with people: some things need proper cleanup after they have completely stopped, otherwise you're just leaving a mess.\n\n`CliProcessManager.StopProcessAsync` is a three-stage progressive shutdown, with its time constants defined at the top of the file:\n\n```\n// Patience for the graceful stop: first give the process time to wrap up on its own\nprivate static readonly TimeSpan GracefulStopTimeout = TimeSpan.FromSeconds(2);\n// Patience for waiting until the process has really exited after a forced kill\nprivate static readonly TimeSpan StopWaitTimeout = TimeSpan.FromSeconds(5);\n```\n\nThe three stages progress like this:\n\n1. Interrupt: `TryInterruptAsync` first writes a `\u0003` (the Ctrl+C character) to stdin, and on Unix additionally runs `kill -INT <pid>`. This step lets pi end itself gracefully: it can notice the interrupt and wrap up whatever it is writing.\n2. Graceful wait: give the process up to `GracefulStopTimeout` (2 seconds) to finish up and exit on its own.\n3. Force kill: if it is still running, call `Process.Kill(entireProcessTree: true)` to kill the entire process tree together, then wait at most `StopWaitTimeout` (5 seconds) to confirm it's really dead.\n\nWhy `entireProcessTree: true`? 
Because pi spawns child processes when it runs tools, for example local model processes routed by the provider, or bash subprocesses. Killing only the parent leaves those children running as orphans. Killing the whole tree in one go is clean.\n\nWindows has no SIGINT, so there you can only rely on the Ctrl+C character; cross-platform behavior will differ, so keep this in mind.\n\nWhen `ReadLineAsync` throws inside PiProvider's `ExecuteProcessAsync`, the code uses `ExceptionDispatchInfo.Capture` to store the exception temporarily, breaks out of the loop, calls `StopAsync` to clean up the process, and then calls `pendingException.Throw()` to rethrow the original exception to the upper layer.\n\nWhy store first and throw later? Because if you throw right away, the process hasn't been reclaimed yet and becomes an orphan; if you throw before `StopAsync`, the cleanup logic doesn't run at all. Storing the exception first guarantees the process is reclaimed, and then the original `OperationCanceledException` semantics are preserved intact for the caller, who can look at the exception and conclude \"oh, the user actively canceled\" rather than \"something went wrong.\"\n\nThere's another detail worth mentioning separately. When process startup fails (for example, the pi executable doesn't exist, or permissions are wrong), PiProvider **doesn't throw an exception**; it synthesizes a `terminal.failed` message and then does a `yield break`.\n\nWhy do this? Because if you throw, upper-layer consumers have to handle two completely different semantics: \"normal messages during streaming consumption\" and \"exceptions thrown before streaming starts.\" That makes the consumer's `await foreach` particularly hard to write.\n\nOnce everything is unified as \"always give you messages first, then end the stream,\" consumer logic is consistent: receiving `terminal.failed` counts as failure, receiving `terminal.completed` counts as success, and no try/catch branching is needed. 
This is a small but important design decision that stabilizes the contract.\n\nBased on `PiScenarioMessageReader` (the libs console test scenario) and `PiCliProvider.StreamCoreAsync` (the core thin adapter) in HagiCode, a consumer looks roughly like this:\n\n```csharp\nawait foreach (var message in provider.ExecuteAsync(options, prompt, cancellationToken))\n{\n    // 1. Short-circuit on failure; don't process any further messages\n    if (NormalizedAcpCliAdapter.TryGetFailureMessage(message.Content, out var failure))\n    {\n        yield return new AIStreamingChunk { Type = StreamingChunkType.Error, ErrorMessage = failure };\n        yield break;   // stream ends after terminal.failed\n    }\n\n    // 2. assistant text is a cumulative snapshot, so compute the delta once more yourself\n    if (message.Type == \"assistant\" && TryGetText(message.Content, out var text))\n    {\n        var delta = ReconcileSnapshot(text);  // prefix comparison\n        if (!string.IsNullOrEmpty(delta)) yield return Chunk(delta);\n    }\n\n    // 3. terminal.completed is the only reliable \"end\" signal\n    if (message.Type == \"terminal.completed\") break;\n}\n```\n\nHere are all the pitfalls encountered along the way, organized into a table for future reference:\n\n| Symptom | Cause | Fix |\n|---|---|---|\n| Frontend shows repeated assistant text | Cumulative text was not converted to deltas | Use `ReconcileAssistantTextSnapshot` for prefix comparison |\n| Process keeps running after cancel | Cleanup used the already-canceled token | Use `CancellationToken.None` for cleanup |\n| Retry doesn't take effect | Retry was written into PiProvider, but the provider has single-attempt semantics | Move it up to the caller's orchestration layer |\n| pi's error information is lost | The diagnostic fields of `terminal.failed` were not read | Pass `text` / `invalid_output_lines` / `stderr` through in full |\n| Thinking fragments arrive in the middle of tool calls | Thinking events were forwarded directly | Buffer until the turn ends, then `DrainBufferedThinkingMessages` |\n\nThe libs layer uses `StubCliProcessManager` to mock processes, with unit tests covering pure logic such as parameter construction, event normalization, incremental deduplication, and failure pass-through. The real CLI path is opt-in via the `HAGICODE_REAL_CLI_TESTS` environment variable and runs round-trip scenarios against real models. 
The core layer's `PiCliProviderTests` verifies the thin adapter's `AIStreamingChunk` projection and session binding.\n\n```\n# Run the Pi-related unit tests in the Hagicode.Libs repository\ndotnet test --filter \"FullyQualifiedName~PiProviderTests\"\n\n# Run the real CLI integration tests (requires pi to be installed locally)\nHAGICODE_REAL_CLI_TESTS=1 dotnet test --filter \"FullyQualifiedName~PiProviderTests.RealCli\"\n```\n\nPutting these three things together, the mental model for integrating pi boils down to one sentence: **let each layer do only its own thing.**\n\n- `PiJsonEventMapper`: private events are normalized into the shared `CliMessage`, cumulative snapshots are converted to deltas, and thinking is buffered until the turn ends.\n- Retry: the provider keeps single-attempt semantics, and any retry lives in the caller's orchestration layer.\n- `CliProcessManager`: the `CancellationToken` is passed through the full chain, cleanup uses `CancellationToken.None`, and shutdown is a three-stage progression (interrupt signal → graceful wait → force kill of the entire process tree).\n\nOnce these boundaries are clearly drawn, integrating a new agent CLI becomes almost assembly-line work: you just write a new `XxxProvider` and `XxxJsonEventMapper`, and the cross-cutting logic (retry, cancellation, message contracts, error handling) is all reused. This is also the fundamental reason HagiCode can support multiple agent CLI backends at the same time (claude code, codex, pi, gemini cli, etc.) without getting messy.\n\nLet me repeat the most important boundary one more time: **don't add retry at the provider layer**. Once you understand this, integrating an agent CLI is more than halfway done.\n\nComing back to the theme of message parsing, retry, and cancellation: what's really worth confirming again and again isn't the scattered techniques, but whether the constraints, implementation boundaries, and engineering trade-offs have been seen clearly.\n\nIf the judgment calls in this article are turned into a stable checklist, you can make reliable decisions faster the next time you face a similar problem.\n\n
This article was created with AI assistance and reviewed by the author before publication.", "url": "https://wpnews.pro/news/pi-agent-integration-message-parsing-retry-and-cancellation", "canonical_source": "https://dev.to/newbe36524/pi-agent-integration-message-parsing-retry-and-cancellation-nl0", "published_at": "2026-06-19 01:12:00+00:00", "updated_at": "2026-06-19 02:00:19.213956+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "large-language-models", "ai-tools", "ai-infrastructure"], "entities": ["pi", "HagiCode", "HagiCode-org/site", "PiProvider", "PiCliProvider", "PiJsonEventMapper", "CliProcessManager", "ICliProvider"], "alternates": {"html": "https://wpnews.pro/news/pi-agent-integration-message-parsing-retry-and-cancellation", "markdown": "https://wpnews.pro/news/pi-agent-integration-message-parsing-retry-and-cancellation.md", "text": "https://wpnews.pro/news/pi-agent-integration-message-parsing-retry-and-cancellation.txt", "jsonld": "https://wpnews.pro/news/pi-agent-integration-message-parsing-retry-and-cancellation.jsonld"}}