# Text-to-Speech for Claude Code — Hear What the Agent Is Doing

> Source: <https://dev.to/souliane/text-to-speech-for-claude-code-hear-what-the-agent-is-doing-3mom>
> Published: 2026-06-06 20:35:46+00:00

Claude Code can already listen to you. Run `/voice` and you get push-to-talk dictation: you speak, it transcribes into the prompt ([docs](https://code.claude.com/docs/en/voice-dictation)). What it does not do is talk back. When I leave a long task running, I either babysit the terminal or miss the moment it finishes or asks a question.

So I added the other half: text-to-speech. A hook reads the agent's replies aloud. I can be in another room and still hear "done, tests pass" or "I need a decision here". This post has two parts — a small recipe anyone can paste into their config, and how I wired the same idea into my own tooling for the times I'm not at my desk.

This is a personal hack, not a Claude Code feature. It reads short text aloud after the agent stops. That's it. No wake words, no conversation, no reading code blocks (you don't want that).

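Claude Code [hooks](https://code.claude.com/docs/en/hooks) run a shell command on lifecycle events. The two that matter here:

- `Stop` fires when the agent finishes responding. Its payload carries the transcript path, not the reply text.
- `Notification` fires when the agent needs your attention. The text to speak arrives in the payload's `message` field.

Notification is the simplest win, so start there. Every OS ships a speech command: `say` on macOS, `spd-say` or `espeak-ng` on Linux, and a one-line PowerShell call on Windows.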

Here is a Notification hook that speaks the message. Put it in `~/.claude/settings.json`:

```
{
  "hooks": {
    "Notification": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.message // empty' | say"
          }
        ]
      }
    ]
  }
}
```

`jq` reads the `message` field from the JSON on stdin, and `say` (macOS) reads piped text aloud. On Linux swap `say` for `spd-say -e` or `espeak-ng`, both of which also read stdin. On Windows, point the command at PowerShell:

```
"command": "jq -r '.message // empty' | powershell -Command \"Add-Type -AssemblyName System.Speech; (New-Object System.Speech.Synthesis.SpeechSynthesizer).Speak([Console]::In.ReadToEnd())\""
```
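If the one-liner gets hard to maintain across machines, the same Notification hook could be sketched as a small Python script. This is a rough sketch, not part of the original recipe: the function names are made up for illustration, it only assumes the payload is JSON with a `message` field as described above, and it covers macOS and Linux only.

```python
"""Speak the `message` field of a Notification hook payload (illustrative sketch)."""
from __future__ import annotations

import json
import platform
import shutil
import subprocess
import sys


def extract_message(raw: str) -> str:
    """Return the payload's `message` field, or "" if it is missing or malformed."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ""
    if not isinstance(payload, dict):
        return ""
    message = payload.get("message")
    return message.strip() if isinstance(message, str) else ""


def speech_command(system: str) -> list[str] | None:
    """Pick an installed speech command that reads stdin, or None if there is none."""
    candidates = {
        "Darwin": [["say"]],
        "Linux": [["spd-say", "-e"], ["espeak-ng"]],
    }
    for cmd in candidates.get(system, []):
        if shutil.which(cmd[0]):
            return cmd
    return None


def main() -> None:
    """Read the hook payload from stdin and speak it if possible."""
    message = extract_message(sys.stdin.read())
    cmd = speech_command(platform.system())
    if message and cmd:
        # check=False: a failing speech command should not fail the hook.
        subprocess.run(cmd, input=message, text=True, check=False)


# When saved as a hook script, call main() as the last line of the file.
```

To try it, save the file somewhere stable (for example a hypothetical `~/.claude/hooks/speak_notification.py`), add the `main()` call, and set the hook's `"command"` to `python3` followed by that path. A missing speech command or an empty message simply results in silence.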

That covers the "needs your attention" case. If you also want the agent to read its actual reply, add a Stop hook. The wrinkle: Stop gives you the transcript path, not the text. The transcript is JSONL (one JSON object per line), so you pull the last assistant text block out of it:

```
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -rs 'map(select(.type==\"assistant\")) | last | .message.content[]? | select(.type==\"text\") | .text' \"$(jq -r .transcript_path)\" 2>/dev/null | head -c 600 | say"
          }
        ]
      }
    ]
  }
}
```

A few honest caveats, because this is where it gets rough:

- `head -c 600` stops `say` droning through a 4 KB status report. Pick your own limit.
- The `jq` filter above matches the current JSONL layout. If Claude Code changes it, the filter breaks. Treat it as a hack, not an API.

For most people the Notification hook alone is enough, and it's the part least likely to break.
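For anyone who does want the Stop hook, the extraction step could also be written in Python, which makes the caveats easier to handle. The sketch below is an illustration, not the original recipe: it assumes the same JSONL layout the `jq` filter relies on (entries with `type` equal to `"assistant"` and text blocks under `message.content`), and the function names are invented. It differs from the one-liner in three ways: it picks the last assistant entry that actually contains text, it drops fenced code blocks, and it cuts at a sentence boundary, counting characters rather than bytes.

```python
import json
import re

# Matches fenced code blocks so they can be dropped before speaking.
FENCE = re.compile(r"`{3}.*?`{3}", re.DOTALL)


def last_assistant_text(jsonl: str) -> str:
    """Return the text of the most recent assistant entry that has any text."""
    latest = ""
    for line in jsonl.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or partially written lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        texts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        if any(t.strip() for t in texts):
            latest = "\n".join(texts)
    return latest


def speakable(text: str, limit: int = 600) -> str:
    """Drop fenced code, collapse whitespace, and cap the length at `limit` characters."""
    text = " ".join(FENCE.sub(" ", text).split())
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Prefer to stop at the last sentence end inside the limit.
    end = max(cut.rfind(". "), cut.rfind("? "), cut.rfind("! "))
    return cut[: end + 1] if end > 0 else cut
```

A hook script would read the payload from stdin, open the file named by `transcript_path`, and pipe `speakable(last_assistant_text(...))` to the speech command. The layout assumption still holds the whole thing together, so the "hack, not an API" warning applies here too.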

I keep my Claude Code automation in a project called [teatree](https://github.com/souliane/teatree). It has a `t3 speak` command driven by one `[teatree.speak]` table:

```
[teatree.speak]
local = "dm"   # what plays on this machine's speakers: "dm" | "all" | "off"
slack = true   # attach a spoken audio file to each bot→user Slack DM
```

`local` controls the speakers in front of you: `dm` reads only the bot's DMs to you, `all` also reads every agent turn aloud, `off` is silent. `slack` attaches a spoken audio file to each bot→user DM. The two are independent, and both default off, so it does nothing until you configure it.
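To make the two switches concrete, here is one way the decision logic could be modeled. This is not teatree's code: `SpeakConfig`, `plays_locally`, `attaches_audio`, and the event kinds `"dm"` and `"turn"` are all hypothetical names chosen to mirror the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakConfig:
    """Hypothetical mirror of the [teatree.speak] table; both settings default to off."""
    local: str = "off"   # "dm" | "all" | "off"
    slack: bool = False


def plays_locally(cfg: SpeakConfig, kind: str) -> bool:
    """Decide whether an event is read aloud on this machine.

    `kind` is "dm" for a bot DM to the user, "turn" for an ordinary agent turn.
    """
    if cfg.local == "all":
        return kind in ("dm", "turn")
    if cfg.local == "dm":
        return kind == "dm"
    return False


def attaches_audio(cfg: SpeakConfig, kind: str) -> bool:
    """Decide whether a spoken audio file is attached; only DMs qualify."""
    return cfg.slack and kind == "dm"
```

Because the two functions never consult each other's setting, `local = "off"` with `slack = true` is a valid combination: silent at the desk, audio on the phone.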

Two destinations because there are two places I am. At the desk, `local` plays through the speakers the moment a DM lands, no clicking. Away from it, `slack` is what I reach for: the spoken text arrives as an audio file attached to the DM, and on the phone I press play. Not hands-free, but I can listen while moving instead of stopping to read.

Two operational notes. The voice comes from macOS `say`. And `slack` needs the bot's file-upload permission, so an existing bot has to be reinstalled once to grant it.

The hook recipe is the part I'd actually recommend trying — it's a few lines and it degrades gracefully. The teatree side is tied to my own setup, so take it as one way to structure the same idea rather than something to copy verbatim.

I'm still figuring out how much to read aloud. `local = "all"` gets chatty fast. `dm` is calmer but misses things. If you try this, I'd be curious what threshold works for you.
