Text-to-Speech for Claude Code — Hear What the Agent Is Doing

A developer has added text-to-speech functionality to Claude Code, allowing the AI coding agent to read its responses aloud. The hack uses Claude Code's lifecycle hooks to pipe the agent's messages through the operating system's speech command, enabling users to hear status updates like "done, tests pass" or "I need a decision here" from another room. The developer also integrated the feature into their personal tooling project called teatree, which can play spoken responses through local speakers or attach audio files to Slack DMs.

Claude Code can already listen to you. Run /voice and you get push-to-talk dictation: you speak, it transcribes into the prompt (docs: https://code.claude.com/docs/en/voice-dictation). What it does not do is talk back. When I leave a long task running, I either babysit the terminal or miss the moment it finishes or asks a question.

So I added the other half: text-to-speech. A hook reads the agent's replies aloud. I can be in another room and still hear "done, tests pass" or "I need a decision here".

This post has two parts: a small recipe anyone can paste into their config, and how I wired the same idea into my own tooling for the times I'm not at my desk.

This is a personal hack, not a Claude Code feature. It reads short text aloud after the agent stops. That's it. No wake words, no conversation, no reading code blocks (you don't want that).

Claude Code hooks (https://code.claude.com/docs/en/hooks) run a shell command on lifecycle events. The two that matter here:

- Notification: fires when the agent needs your attention. The payload carries the text in a message field.
- Stop: fires when the agent finishes its reply. The payload carries the path to the transcript, not the reply text.

Notification is the simplest win, so start there. Every OS ships a speech command: say on macOS, spd-say or espeak-ng on Linux, and a one-line PowerShell call on Windows.

Here is a Notification hook that speaks the message. Put it in ~/.claude/settings.json:

    {
      "hooks": {
        "Notification": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "jq -r '.message // empty' | say"
              }
            ]
          }
        ]
      }
    }

jq reads the message field from the JSON on stdin, and say (macOS) reads piped text aloud. On Linux, swap say for spd-say -e or espeak-ng, both of which also read stdin. On Windows, point the command at PowerShell:

    "command": "jq -r '.message // empty' | powershell -Command \"Add-Type -AssemblyName System.Speech; (New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak([Console]::In.ReadToEnd())\""

That covers the "needs your attention" case. If you also want the agent to read its actual reply, add a Stop hook. The wrinkle: Stop gives you the transcript path, not the text.
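To make that wrinkle concrete, here is a rough Python sketch of what a hook command can do with the JSON it receives on stdin. Only the message and transcript_path field names come from the recipe in this post; the function name, the sample values and the decision logic are illustrative assumptions, and the sample payloads are trimmed to the fields used here rather than copied from a real session.

```python
import json


def describe_payload(raw: str) -> str:
    """Report what a hook payload offers: ready-to-speak text, or only a transcript path."""
    payload = json.loads(raw)
    if payload.get("message"):
        # Notification case: the text to speak is right there.
        return "speak: " + payload["message"]
    if payload.get("transcript_path"):
        # Stop case: only a pointer to the transcript; the reply has to be dug out of that file.
        return "read transcript: " + payload["transcript_path"]
    return "nothing to speak"


# Made-up sample payloads (assumptions, not real Claude Code output).
notification = json.dumps({"message": "Claude needs your permission"})
stop = json.dumps({"transcript_path": "/tmp/example-session.jsonl"})

print(describe_payload(notification))
print(describe_payload(stop))
```

The Notification branch is the whole job of the jq one-liner above. The Stop branch is where the extra work starts.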
The transcript is JSONL (one JSON object per line), so you pull the last assistant text block out of it:

    {
      "hooks": {
        "Stop": [
          {
            "hooks": [
              {
                "type": "command",
                "command": "jq -rs 'map(select(.type==\"assistant\")) | last | .message.content[]? | select(.type==\"text\") | .text' \"$(jq -r .transcript_path)\" 2>/dev/null | head -c 600 | say"
              }
            ]
          }
        ]
      }
    }

A few honest caveats, because this is where it gets rough:

- head -c 600 stops say droning through a 4 KB status report. Pick your own limit.
- The jq filter above matches the current JSONL layout. If Claude Code changes it, the filter breaks. Treat it as a hack, not an API.

For most people the Notification hook alone is enough, and it's the part least likely to break.

I keep my Claude Code automation in a project called teatree (https://github.com/souliane/teatree). It has a t3 speak command driven by one [teatree.speak] table:

    [teatree.speak]
    local = "dm"   # what plays on this machine's speakers: "dm" | "all" | "off"
    slack = true   # attach a spoken audio file to each bot→user Slack DM

local controls the speakers in front of you: dm reads only the bot's DMs to you, all also reads every agent turn aloud, and off is silent. slack attaches a spoken audio file to each bot→user DM. The two are independent, and both default to off, so nothing is spoken until you configure it.

Two destinations because there are two places I am. At the desk, local plays through the speakers the moment a DM lands, no clicking. Away from it, slack is what I reach for: the spoken text arrives as an audio file attached to the DM, and on the phone I press play. Not hands-free, but I can listen while moving instead of stopping to read.

Two operational notes. The voice comes from macOS say. And slack needs the bot's file-upload permission, so an existing bot has to be reinstalled once to grant it.

The hook recipe is the part I'd actually recommend trying: it's a few lines and it degrades gracefully.
The teatree side is tied to my own setup, so take it as one way to structure the same idea rather than something to copy verbatim.

I'm still figuring out how much to read aloud. local = "all" gets chatty fast. dm is calmer but misses things. If you try this, I'd be curious what threshold works for you.
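If the jq one-liner in the Stop hook feels too fragile, the same extraction can be sketched in Python. This is a minimal sketch, not part of the recipe or of teatree: it assumes the transcript layout the jq filter relies on (entries with type "assistant" whose message.content is a list of blocks, some with type "text"), and the function names and sample transcript are made up for illustration. It differs from the jq filter in two small ways: it falls back to the most recent assistant entry that actually contains text, and it drops fenced code blocks before truncating.

```python
import json
import re

MAX_CHARS = 600       # same budget as `head -c 600`, but counted in characters, not bytes
FENCE = "`" * 3       # a Markdown code fence, built this way to keep the example readable


def last_assistant_text(jsonl: str) -> str:
    """Return the text blocks of the most recent assistant entry that has any."""
    last = ""
    for line in jsonl.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or partially written lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        content = (entry.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue
        texts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        if texts:
            last = " ".join(texts)
    return last


def speakable(text: str, limit: int = MAX_CHARS) -> str:
    """Drop fenced code blocks, collapse whitespace, and cut to the limit."""
    without_code = re.sub(FENCE + r".*?" + FENCE, " ", text, flags=re.DOTALL)
    return " ".join(without_code.split())[:limit]


# Made-up transcript: a user turn, an assistant turn with a code block, and a broken line.
sample = "\n".join([
    json.dumps({"type": "user", "message": {"content": "run the tests"}}),
    json.dumps({"type": "assistant", "message": {"content": [
        {"type": "text", "text": "Done, tests pass.\n" + FENCE + "\npytest -q\n" + FENCE},
    ]}}),
    "{not json",
])

print(speakable(last_assistant_text(sample)))  # prints: Done, tests pass.
```

Wired into a Stop hook, a script along these lines would read the payload from stdin, open the file at transcript_path, and pipe the result to say or its Linux or Windows equivalent. It carries the same caveat as the jq filter: if the transcript layout changes, it quietly speaks nothing.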