[ARTICLE · art-23568] src=dev.to pub= topic=ai-tools verified=true sentiment=↑ positive

Text-to-Speech for Claude Code — Hear What the Agent Is Doing

A developer has added text-to-speech functionality to Claude Code, allowing the AI coding agent to read its responses aloud. The hack uses Claude Code's lifecycle hooks to pipe the agent's messages through the operating system's speech command, enabling users to hear status updates like "done, tests pass" or "I need a decision here" from another room. The developer also integrated the feature into their personal tooling project called teatree, which can play spoken responses through local speakers or attach audio files to Slack DMs.

4 min read · published Jun 6, 2026

Claude Code can already listen to you. Run /voice and you get push-to-talk dictation: you speak, it transcribes into the prompt (docs). What it does not do is talk back. When I leave a long task running, I either babysit the terminal or miss the moment it finishes or asks a question.

So I added the other half: text-to-speech. A hook reads the agent's replies aloud. I can be in another room and still hear "done, tests pass" or "I need a decision here". This post has two parts — a small recipe anyone can paste into their config, and how I wired the same idea into my own tooling for the times I'm not at my desk.

This is a personal hack, not a Claude Code feature. It reads short text aloud after the agent stops. That's it. No wake words, no conversation, no reading code blocks (you don't want that).

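Claude Code hooks run a shell command on lifecycle events. The two that matter here:

Notification fires when the agent needs your attention. The JSON it passes on stdin carries the text in a message field.

Stop fires when the agent finishes a reply. It gives you the transcript path, not the text.

Notification is the simplest win, so start there. Every major OS has a speech command: say on macOS, spd-say or espeak-ng on Linux, and a one-line PowerShell call on Windows.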

Here is a Notification hook that speaks the message. Put it in ~/.claude/settings.json:

{
  "hooks": {
    "Notification": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.message // empty' | say"
          }
        ]
      }
    ]
  }
}
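The same hook can also live in a small script instead of a one-line pipeline, which makes the per-OS choice of speech command explicit. The following is a minimal sketch, not part of the original recipe. It assumes the hook payload arrives as JSON on stdin with a message field, as the jq command above does, and that each candidate command reads the text to speak from stdin. The function names are illustrative, and Windows is left out.

```python
import json
import shutil
import subprocess
import sys

# Candidate speech commands in the order the article names them.
# Assumption: each one speaks whatever text it receives on stdin.
CANDIDATES = [
    ["say"],            # macOS
    ["spd-say", "-e"],  # Linux (speech-dispatcher, pipe mode)
    ["espeak-ng"],      # Linux fallback
]


def extract_message(raw: str) -> str:
    """Return the payload's `message` field, or "" if it is missing or invalid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ""
    if not isinstance(payload, dict):
        return ""
    message = payload.get("message")
    return message if isinstance(message, str) else ""


def pick_command(which=shutil.which):
    """Return the first candidate found on PATH, or None if there is none."""
    for argv in CANDIDATES:
        if which(argv[0]):
            return argv
    return None


def main() -> None:
    """Read the hook payload from stdin and speak its message, if any."""
    text = extract_message(sys.stdin.read())
    argv = pick_command()
    if text and argv:
        # check=False: a failing speech command should never break the hook.
        subprocess.run(argv, input=text, text=True, check=False)

# In a real hook script, end the file with a call to main().
```

Saved under a path of your choosing with a final main() call, the script could stand in for the jq pipeline in the hook's command field. If no speech command is installed, it stays silent rather than failing.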

jq reads the message field from the JSON on stdin, and say (macOS) reads piped text aloud. On Linux, swap say for spd-say -e or espeak-ng, both of which also read stdin. On Windows, point the command at PowerShell:

"command": "jq -r '.message // empty' | powershell -Command \"Add-Type -AssemblyName System.Speech; (New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak([Console]::In.ReadToEnd())\""

That covers the "needs your attention" case. If you also want the agent to read its actual reply, add a Stop hook. The wrinkle: Stop gives you the transcript path, not the text. The transcript is JSONL (one JSON object per line), so you pull the last assistant text block out of it:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -rs 'map(select(.type==\"assistant\")) | last | .message.content[]? | select(.type==\"text\") | .text' \"$(jq -r .transcript_path)\" 2>/dev/null | head -c 600 | say"
          }
        ]
      }
    ]
  }
}
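For readers who find the jq filter hard to follow, here is a rough Python equivalent of what it does. It is a sketch under the same assumption as the filter: each transcript line is a JSON object, assistant entries have type "assistant", and their message.content is a list of blocks where text blocks have type "text". The function name is illustrative. One difference: head -c 600 cuts at 600 bytes, while this cuts at 600 characters, so it never splits a multi-byte character.

```python
import json


def last_assistant_text(transcript_path: str, limit: int = 600) -> str:
    """Return the text blocks of the last assistant entry, cut to `limit` characters."""
    last = None
    with open(transcript_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a malformed or half-written line
            if isinstance(entry, dict) and entry.get("type") == "assistant":
                last = entry  # keep overwriting so the final match wins
    if last is None:
        return ""
    message = last.get("message")
    content = message.get("content") if isinstance(message, dict) else None
    if not isinstance(content, list):
        return ""
    texts = [
        block.get("text", "")
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    ]
    return "\n".join(texts)[:limit]
```

Like the jq version, it returns nothing when the last assistant entry contains only tool calls, so the hook stays silent in that case.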

A few honest caveats, because this is where it gets rough:

The head -c 600 stops say droning through a 4 KB status report. Pick your own limit.

The jq filter above matches the current JSONL layout. If Claude Code changes it, the filter breaks. Treat it as a hack, not an API.

For most people the Notification hook alone is enough, and it's the part least likely to break.
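The length cut can be made a little gentler on the ear. The sketch below is an optional refinement, not part of the recipe above: it drops fenced code blocks (the post says you don't want those read aloud), collapses whitespace, and cuts at the last sentence end inside the limit instead of mid-word. The function name and the sentence heuristic are assumptions made for illustration.

```python
import re

# Built from single backticks so the marker can be shown inside a code sample.
FENCE_MARK = "`" * 3
FENCED_BLOCK = re.compile(FENCE_MARK + r".*?" + FENCE_MARK, re.DOTALL)


def speakable(text: str, limit: int = 600) -> str:
    """Strip fenced code, collapse whitespace, and cut at a sentence end."""
    text = FENCED_BLOCK.sub(" ", text)
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Prefer the last sentence boundary that fits inside the limit.
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    return cut[: end + 1] if end > 0 else cut
```

The heuristic is crude: abbreviations such as "e.g. " count as sentence ends, and an unclosed fence is left in place. For spoken status updates that is usually an acceptable trade.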

I keep my Claude Code automation in a project called teatree. It has a t3 speak command driven by one [teatree.speak] table:

[teatree.speak]
local = "dm"   # what plays on this machine's speakers: "dm" | "all" | "off"
slack = true   # attach a spoken audio file to each bot→user Slack DM

local controls the speakers in front of you: dm reads only the bot's DMs to you, all also reads every agent turn aloud, off is silent. slack attaches a spoken audio file to each bot→user DM. The two are independent, and both default off, so it does nothing until you configure it.

Two destinations because there are two places I am. At the desk, local plays through the speakers the moment a DM lands, no clicking. Away from it, slack is what I reach for: the spoken text arrives as an audio file attached to the DM, and on the phone I press play. Not hands-free, but I can listen while moving instead of stopping to read.

Two operational notes. The voice comes from macOS say. And slack needs the bot's file-upload permission, so an existing bot has to be reinstalled once to grant it.

The hook recipe is the part I'd actually recommend trying — it's a few lines and it degrades gracefully. The teatree side is tied to my own setup, so take it as one way to structure the same idea rather than something to copy verbatim.

I'm still figuring out how much to read aloud. local = "all" gets chatty fast. dm is calmer but misses things. If you try this, I'd be curious what threshold works for you.
