# An Offline Meeting Transcriber

A developer used Claude Code to build an offline meeting transcriber that processes audio into Granola-style markdown notes entirely on a MacBook, ensuring sensitive meeting recordings never leave the device. The tool runs three local stages (transcription via mlx-whisper, speaker diarization via pyannote, and summarization via Ollama) and was migrated to the Swamp platform for composability after the initial bash version proved difficult to reuse. The project highlights how AI-assisted coding can rapidly produce functional tools while underscoring the importance of modular architecture for long-term maintainability.

I take "handwritten" meeting notes in Obsidian when I can. Some of these meetings I'd want transcribed as well, but they are high trust, private, and sensitive: the exact kinds of meetings you don't invite strangers to. So the notes and recordings shouldn't leave my laptop either, to be processed or stored by strangers.

On a whim, I had Claude Code build `offline-meeting-transcriber`: audio in, Granola-style markdown out, nothing leaves the MacBook. To be clear, I didn't write any of this by hand. Claude Code did the bash, the Python, and later the swamp models, while I steered, reviewed diffs, and made the design calls.

The first version was a bash wrapper around four Python files. It worked. It was also a dead end the moment I wanted to use any piece of it for something else.

I had it migrated to [swamp](https://swamp.club) for the same reason I [migrated ADW](https://matgreten.dev/posts/visibility-into-the-black-box/), but this time the lever was composability, not observability. The pieces wanted to be reused. Bash had them welded together.

## What it Does

`bin/meeting-process samples/standup.m4a "Engineering Standup"` runs three stages locally:

- `mlx-whisper` transcribes the audio to segment-level JSON. `--no-speech-threshold 0.6` is in there because without it Whisper hallucinates "Thanks for watching" on every silent stretch.
- `pyannote/speaker-diarization-3.1` tags each Whisper segment with the speaker whose timeline overlaps it the most. Segments reach the LLM as `SPEAKER_00: …`, `SPEAKER_01: …`. Label stability inside a meeting matters more than getting names right; renaming `SPEAKER_00` → `Alice` is a one-line sed after the fact.
- `qwen3.6:35b-a3b-nvfp4` on local Ollama summarizes the transcript into a Granola-style note. Chunked at ~2500 tokens with 200-token overlap. The token estimator is `len(text.split()) * 1.3`: no tiktoken, no extra dep, accurate enough for chunk boundaries.

Final markdown lands in `~/Obsidian/Meetings/Unsorted/` for now.

## Why Swamp

Three things in the bash version were obviously reusable and obviously stuck:

- Transcription. `mlx_whisper` doesn't care that it's transcribing a meeting. It would transcribe a podcast, an interview, a voice memo I left myself in the car. The bash script only knew about meetings.
- Diarization. Same model, same merge logic, same `SPEAKER_xx: …` output. Useful anywhere I have multi-speaker audio.
- Summarization with a Granola-style prompt. This one is meeting-shaped, but the chunking + merge infrastructure underneath it isn't. Different prompt, different downstream consumer, same machinery.

In bash, none of those were components. They were lines in a single script, with implicit data passing through filenames in `out/`, and the only way to "reuse" any of them was to copy-paste the script and edit it. That's the path I've been on for years and the path I keep regretting.

So I had the pipeline broken out into three swamp models under `models/@mgreten/`, all published on swamp.club:

- `@mgreten/mlx-whisper`: wraps the binary, exposes `transcribe(audioPath)`, output is a typed transcript artifact.
- `@mgreten/pyannote-diarizer`: takes the audio and the transcript artifact, returns a diarized artifact. Soft-fails when the HF token is missing; the workflow continues on the undiarized transcript. A bad diarization never blocks the note.
- `@mgreten/meeting-summarizer`: takes the transcript artifact and a model tag, returns markdown plus a separate `write_note` method that lands the file in the vault. The `combine_notes` method (handwritten + analysis merge) lives here too.

Each one has a typed input, a typed output, and exactly one job. The output of each step is a data artifact the next step pulls by name, not a file path I have to remember to clean up:

```yaml
- name: summarize
  steps:
    - name: run-summarize
      task:
        type: model_method
        modelIdOrName: meeting-summarizer
        methodName: summarize
        inputs:
          transcriptJson: ${{ data.latest("pyannote-diarizer", inputs.noteName).attributes.transcriptJson }}
          instanceName: ${{ inputs.noteName }}
  dependsOn:
    - job: diarize
      condition:
        type: succeeded
```

The workflow YAML is one way of wiring those three models. It's not the only way. That's the whole point.

## What Composability Actually Buys Me

Hermes is the obvious next caller. When I want my agent to be able to transcribe a recording I just dropped into it, I don't expose `bin/meeting-process` to it as a shell command. I expose the model. Same typed inputs, same typed outputs, no shell quoting, no parsing stdout. The bash wrapper still exists for me at the terminal (it shells out to swamp now and gives me a progress counter), but it's no longer the only entry point.

The watch-folder daemon is the next one after that. v2 territory, not built. But when I build it, it's calling the workflow, not re-implementing it.

This is the part bash couldn't give me. Not "the pipeline is observable." The pipeline is *separable*. The day I want to reuse the diarizer in something that has nothing to do with meetings, I'm not copy-pasting anything.

## The `--notes` Flag is the Real Design Insight

The agents almost shipped a tool that overwrote my handwritten meeting notes.
The first version they built generated the meeting note. Beautiful, clean, full of action items.

The problem: I take "handwritten" notes during meetings, and those notes have the context the audio doesn't: the side conversation, the link I scribbled, the thing I almost said. The generated note has different value. It's complete where mine is partial, but it's also confidently wrong in places my human note isn't. That's the kind of call no agent was going to make for me; I had to catch it in review and redirect.

So `--notes` was added. When you pass it an existing note, the pipeline:

- Leaves the handwritten note untouched.
- Writes the LLM summary as