{"slug": "show-hn-stable-audio-3-one-shot-sample-generator-110gb-download", "title": "Show HN: Stable Audio 3 – one-shot sample generator (110gb download)", "summary": "Stability AI released Stable Audio 3, a one-shot sample generator that requires a 110GB download, with the v3 libraries hosted publicly on Google Cloud Platform without authentication. The tool, part of the Signals & Sorcery app, allows users to generate large batches of drum and pitched instrument audio samples using rented GPU power from RunPod, with a full 50-category campaign costing under $50 and taking up to a day to complete.", "body_md": "The latest **v3** libraries are hosted publicly on GCP (no auth required). The\n[Signals & Sorcery](https://signalsandsorcery.com) app installs these\nautomatically, but you can grab them directly here:\n\n| Pack | Contents | Download |\n|---|---|---|\n| Drums (v3 large) | 24 roles · 10,359 one-shots (+ prompt sidecars) · ~1.4 GB | |\n| **Instruments** (v3 large) | | [sas-instrument-pack-v3-large.zip](https://storage.googleapis.com/docs-assets/sas-instrument-pack-v3-large.zip) |\n\nEach zip contains a `_pack-version.json` marker plus the payload tree (drums: `<role>/*.wav`; instruments: `<category>/<id>/manifest.json` + `zones/`). The instrument pack is zones-only (the 24-bit generation `sources/` are omitted — they aren't used at playback).\n\nGenerate large batches of audio samples with\n[Stable Audio 3](https://huggingface.co/stabilityai/stable-audio-3-medium)\non a rented [RunPod](https://www.runpod.io) GPU. Two pipelines ship side by side:\n\n- **Drums / one-shots** (`run_all.sh`) — 24 unpitched categories (kicks, snares, hats, claps, 808s, risers, impacts, textures…). Generate → quality-gate → trim/normalize → flat `processed/<role>/` folders.\n- **Pitched instruments** (`run_pitched.sh`) — 28 tuned categories (pianos, basses, pads, leads, strings…). 
Generate → pitch/quality-gate → multi-source pitch-correct + pre-render playable zones →`instruments/<cat>/<id>/manifest.json`\n\n.\n\nBoth run the same **retry-to-target** loop (re-roll failed prompts until ~150\nsamples per category survive the gate) and **batched** generation on one model load.\n\n**Read top to bottom, copy-paste each command block.** A single-category test\nslice is ~5 minutes and a few cents; the full ~50-category v3 campaign is up to\n~a day of big-GPU pod time (still well under $50 — see [Cost](#cost-recap)).\n\nAssumes an **Apple Silicon Mac** as the control machine.\n\nFor the rationale (why these settings, prompt-design tips, deep cost math),\nsee [ stable_audio_open_batch_oneshot_guide.md](/shiehn/sas-sample-generator/blob/main/stable_audio_open_batch_oneshot_guide.md).\n\nPart of the\n\n[Signals & Sorcery]family. See[Related repos]at the bottom of this README.\n\nPlain-text prompt files — **one description per line** — under `prompts/`\n\n(drums) and `prompts/pitched/`\n\n(instruments). The repo ships **52 categories,\n200 prompts each** (~10,400 prompts), pre-generated by\n[ scripts/gen_prompts.py](/shiehn/sas-sample-generator/blob/main/scripts/gen_prompts.py) in a combinatorial house\nstyle (~58% EDM / ~25% hip-hop & urban / ~17% acoustic-orchestral-world). Run\nthem as-is, edit them, or subset which categories generate.\n\nEach non-comment line becomes one generation job. Example from\n[ prompts/kick.txt](/shiehn/sas-sample-generator/blob/main/prompts/kick.txt):\n\n```\n# 909-style\ntight 909-style kick drum one shot, hard click transient, short punchy body, dry\npunchy 909 kick drum one shot, sharp transient, controlled low end, clean studio sample\n\n# 808-style\ndeep 808 kick one shot, long sub bass decay, smooth sine low end, dry\nwarm 808 kick one shot, saturated low end, medium decay, dry, no melody, no loop\n```\n\nBlank lines and lines starting with `#`\n\nare ignored (handy for grouping).\nAim for ~10 words per line. 
For drums, always include phrases like\n`one shot, no loop`\n\nso the model doesn't render a rhythmic loop.\n\nEditing prompts changes content-hashes.Output filenames are content-addressed (`{category}-{hash}.wav`\n\n), so re-wording a line orphans the WAV it used to produce. Finalize wordingbeforea GPU run. Re-running with`--skip-existing`\n\nthen only generates the new/changed lines.\n\nFlat `processed/<role>/`\n\noutput; the folder name *is* the drum role.\n\n| Core kit | Dur | EDM / electronic one-shots | Dur | |\n|---|---|---|---|---|\n`kick` |\n1.5s | `clap` |\n0.75s | |\n`snare-standard` |\n1.0s | `808` (tuned sub one-shot) |\n2.0s | |\n`snare-rim` |\n0.75s | `riser` |\n4.0s | |\n`hat-closed` |\n0.5s | `downlifter` |\n3.0s | |\n`hat-open` |\n1.5s | `impact` |\n2.0s | |\n`cymbal-ride` |\n2.5s | `sub-drop` |\n2.0s | |\n`cymbal-crash` |\n3.0s | `sweep` |\n2.5s | |\n`cymbal-splash` |\n1.5s | `texture` (vinyl/foley/glitch) |\n3.0s | |\n`tamborine` |\n1.0s | `zap` |\n0.75s | |\n`shaker` |\n0.75s | `foley-perc` |\n0.75s | |\n`tom-hi` / `tom-mid` / `tom-low` |\n1.0–1.5s | |||\n`hit` (generic stab) |\n1.5s |\n\n`instruments/<cat>/<id>/`\n\nwith a `manifest.json`\n\n. Source pitches, durations, and\nvariant counts are in the\n[Pitched-instrument pipeline](#pitched-instrument-pipeline) table below.\n\n```\nsynths  lead-supersaw  lead-fm  lead-acid  pluck-synth  plucks  keys  pianos\norgans  basses  808-bass  reese-bass  pads  strings  brass  winds  accordion\nbells  mallets  percussion  timpani  guitars  banjos  mandolin  harp  sitar\nvocals  choir\n```\n\nOutput filenames are content-addressed: `{category}-{hash}.wav`\n\n. 
Same prompt +\nseed → same filename → safely re-runnable with `--skip-existing`.\n\nTo **subset** what generates, edit\n[scripts/categories.txt](/shiehn/sas-sample-generator/blob/main/scripts/categories.txt) (drums) or\n[scripts/pitched_categories.txt](/shiehn/sas-sample-generator/blob/main/scripts/pitched_categories.txt) (pitched) — comment out any line to skip that category.\n\n- Create / sign in at [huggingface.co](https://huggingface.co).\n- Visit [stabilityai/stable-audio-3-medium](https://huggingface.co/stabilityai/stable-audio-3-medium) and click **Agree and access repository**. SA3 also requires accepting the Gemma Terms of Use (linked from the same page). If you switch models via `--model`, accept the license on that model's page too — [stable-audio-3-small-sfx](https://huggingface.co/stabilityai/stable-audio-3-small-sfx) is the lighter 0.6B SFX-tuned alternative.\n- Create a read-only access token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens). Save it in your password manager — you'll paste it once per pod.\n\n- Create / sign in at [runpod.io](https://runpod.io). Add a payment method.\n- Add your Mac SSH public key under [Settings → SSH Keys](https://www.runpod.io/console/user/settings):\n\n```\npbcopy < ~/.ssh/id_ed25519.pub        # copies key to clipboard\n```\n\nPaste it into the form. (If `~/.ssh/id_ed25519.pub` doesn't exist: `ssh-keygen -t ed25519` first, accept defaults.)\n\n[runpod.io/console/pods](https://www.runpod.io/console/pods) → **Deploy → GPU Pod**:\n\n| Setting | Value |\n|---|---|\n| GPU | RTX A6000 (48 GB VRAM, ~$0.49/hr) |\n| Template | most recent RunPod PyTorch with CUDA 12.x |\n| Container Disk | 50 GB (default) |\n| Volume Disk | 100 GB at `/workspace` |\n| Expose | SSH (port 22) — default |\n\nClick **Deploy On-Demand**. Wait ~30 sec until status is `RUNNING`.\n\nOn the pod's card click **Connect → SSH over exposed TCP** and copy the SSH\ncommand. 
It looks like:\n\n```\nssh root@<POD_IP> -p <POD_PORT> -i ~/.ssh/id_ed25519\n```\n\nFrom your Mac terminal, paste the SSH command from step 1. Type `yes` to\naccept the host key on first connect.\n\nIf you get `Permission denied (publickey)`:\n\n```\nssh-add ~/.ssh/id_ed25519\n```\n\n…then retry.\n\nOn the pod:\n\n```\ncd /workspace && \\\ngit clone https://github.com/shiehn/sas-sample-generator.git && \\\ncd /workspace/sas-sample-generator && \\\n./scripts/setup.sh 2>&1 | tee /workspace/setup.log\n```\n\n**Why these paths matter** (and the reason this used to be slow): `/workspace`\nis a network filesystem (MooseFS) — fine for big sequential reads/writes\n(model weights, generated audio) but painfully slow for many-tiny-files (a\nPython venv). The script installs the venv at **/root/.venv**, which is on\nthe pod's container-local SSD, and only keeps the HuggingFace cache and\noutputs on `/workspace`. Roughly:\n\n```\n/root/.venv                   ← Python venv          (fast SSD; ~5 min install)\n/workspace/sas-sample-generator   ← cloned repo\n/workspace/.cache/huggingface ← model weights        (downloaded once)\n/workspace/outputs            ← generated WAVs\n```\n\nYou're done when you see:\n\n```\n[setup] cuda available: True\n[setup] device:         NVIDIA RTX A6000\n[setup] done.\n[setup] next: source /root/.venv/bin/activate\nsource /root/.venv/bin/activate\nhf auth login\n```\n\nPaste your HF token (One-Time Setup A). Answer `n` to \"Add token as git\ncredential\".\n\nThe 24 drum prompt files are already in `prompts/<category>.txt`. To run them\nas-is, **skip to step 6**.\n\nTo customize:\n\n- **Edit content**: `nano prompts/kick.txt` (or scp over your own version, or edit on Mac → `git push` → `git pull` on the pod).\n- **Subset which categories run**: edit `scripts/categories.txt` and comment out the lines you want to skip. 
Useful for prompt iteration on a single category.`scripts/categories.txt`\n\nWrap the run in `tmux`\n\nfirst so an SSH drop doesn't kill the job:\n\n```\ntmux new -s sas\n./scripts/run_all.sh 2>&1 | tee /workspace/run.log\n```\n\n(Detach with `Ctrl-b d`\n\n; reattach later with `tmux attach -t sas`\n\n.)\n\nThe wrapper runs, in order:\n\n**Build JSONLs**—`list_to_jsonl.py`\n\nturns each`prompts/<cat>.txt`\n\ninto`prompts/<cat>.jsonl`\n\n, stamping a per-category`variants`\n\ncount from(5–6 candidates per prompt; more for high-value kick/snare/clap/808).`scripts/drum_gate_config.py`\n\n**Generate + gate, with retry-to-target**—`run_retry.py`\n\ndrives`batch_generate.py`\n\n(one model load,**batched**—`BATCH_SIZE`\n\ngenerations per call) then`gate_drums.py`\n\n, which best-of-N selects one winner per prompt and rejects clipped / silent / off-band / multi-hit / wrong-decay samples (per-category profiles in`drum_gate_config.py`\n\n). Any prompt whose candidates*all*fail is re-rolled with fresh seeds (up to`MAX_RETRIES=2`\n\n); if a category still has fewer than`TARGET=150`\n\nsurvivors, more prompt lines are topped up. Winners land in`gated_drums/<cat>/`\n\n.**Post-process**—`postprocess_oneshots.py`\n\ntrims, LUFS-normalizes to**-16 LUFS**(-1 dBFS peak ceiling), and mono-downmixes the gated winners into`processed/<cat>/`\n\n. Each WAV ships with a sibling`<id>.txt`\n\nholding its exact prompt; the prompt is also embedded in the WAV's RIFF INFO comment chunk so Logic / Ableton / Audacity /`ffprobe`\n\n/ macOS Get-Info show it. Merging runs is a single`rsync`\n\n— no manifest to reconcile.\n\nFirst call downloads Stable Audio 3 (~5–8 GB for medium, ~3 min, one-time). SA3\nconverges in ~8 diffusion steps (vs 120 for SAO 1.0), so generation is fast — with\nthe gate + retry, the **bottleneck is now the CPU gate**, not generation.\n\nLUFS normalization keeps kicks, hats, and splashes at the same perceived volume\nin your sampler. 
Revert to peak-only with `--normalize peak`\n\n, or override with\n`--target-lufs -14`\n\n(streaming-hot) / `-10`\n\n(commercial-hot).\n\n**Useful env knobs** (all optional):\n\n```\nONLY=\"kick clap\" LIMIT=10 ./scripts/run_all.sh   # tiny test slice (TARGET auto-0)\nGATE=0          ./scripts/run_all.sh             # legacy: keep all raw, no gate\nBATCH_SIZE=32   ./scripts/run_all.sh             # bigger batches on an 80GB GPU\nTARGET=200 MAX_RETRIES=3 ./scripts/run_all.sh    # chase a higher survivor count\nSTEPS=4         ./scripts/run_all.sh             # cheaper/faster iteration\n```\n\n**Single-category iteration** (tuning prompts):\n\n```\n# Edit prompts/kick.txt, then run just that category. --skip-existing means\n# only new/changed prompt lines hit the GPU.\nONLY=kick ./scripts/run_all.sh\nls /workspace/outputs/processed/\n# Should show one subdir per enabled drum role (24 by default)\nfind /workspace/outputs/processed -name \"*.wav\" | wc -l\n# ~150 per category survive the gate (TARGET) → ~3,600 for all 24\n```\n\nEvery `<id>.wav`\n\nships with a sibling `<id>.txt`\n\ncontaining its\ngeneration prompt. Spot-check the pairing:\n\n```\ntest \"$(find /workspace/outputs/processed -name '*.wav' | wc -l)\" \\\n   = \"$(find /workspace/outputs/processed -name '*.txt' | wc -l)\" \\\n   && echo \"wav/txt pairing OK\"\n```\n\nOn the pod:\n\n```\ncd /workspace\ntar czf run.tar.gz outputs/processed\nls -lh run.tar.gz\n```\n\n(We use `tar`\n\nrather than `zip`\n\nbecause the stock RunPod PyTorch image\ndoesn't ship `zip`\n\n. `tar`\n\nis preinstalled everywhere. 
`tar` also recurses\ninto the per-category subdirs automatically.)\n\nIn a **second** Mac terminal (don't close the SSH session yet — you still\nneed it for step 10):\n\n```\ncd ~/Downloads\nscp -P <POD_PORT> root@<POD_IP>:/workspace/run.tar.gz .\ntar xzf run.tar.gz\nopen outputs/processed                 # Finder + QuickLook to audition\n```\n\n`<POD_PORT>` and `<POD_IP>` are the same ones from your step-1 SSH command.\nThe unpacked structure is one folder per category:\n\n```\noutputs/processed/\n  kick/        kick-c1da23da.wav   kick-e5d95885.wav   ...\n  snare-standard/   snare-standard-...wav\n  hat-closed/  ...\n  ...etc\n```\n\nThis is the step you will forget. The pod bills **$0.49/hr** for as long as\nit exists, whether you're using it or not.\n\n- **Idle overnight** ≈ $12\n- **Forgotten for a week** ≈ $80\n- **Forgotten for a month** ≈ $350\n\nIn the [RunPod console](https://www.runpod.io/console/pods), click your pod's\ncard → **Terminate**. Confirm.\n\nTermination wipes `/workspace`. 
That's fine — you have the zip on your Mac.\nNext month, you start fresh from step 1.\n\n```\nsas-sample-generator/\n├── README.md                                   ← you are here\n├── stable_audio_open_batch_oneshot_guide.md    ← long-form background\n├── requirements.txt\n├── docs/SAMPLE_GEN_V3_PLAN.md                  ← v3 design rationale & locked decisions\n├── prompts/\n│   ├── <drum-cat>.txt        (24 files, 200 prompts each)  kick.txt, clap.txt, 808.txt, …\n│   └── pitched/<cat>.txt     (28 files, 200 prompts each)  pianos.txt, basses.txt, pads.txt, …\n├── scripts/\n│   ├── setup.sh                       ← bootstrap a pod (venv + deps + SA3 tools)\n│   ├── gen_prompts.py                 ← (re)generate the 200-prompt corpora\n│   │\n│   ├── run_all.sh                     ← DRUM pipeline driver\n│   ├── categories.txt                 ← which drum roles to run\n│   ├── category_config.py             ← per-role negatives + durations\n│   ├── drum_gate_config.py            ← per-role gate profiles + variant counts\n│   ├── gate_drums.py                  ← drum quality gate + best-of-N\n│   ├── list_to_jsonl.py               ← .txt → .jsonl (drums)\n│   ├── postprocess_oneshots.py        ← trim / LUFS / mono / tag\n│   │\n│   ├── run_pitched.sh                 ← PITCHED pipeline driver\n│   ├── pitched_categories.txt         ← which instruments to run\n│   ├── pitched_category_config.py     ← target pitches, durations, zones, variants\n│   ├── gate_pitched.py                ← pitch + quality gate + best-of-N\n│   ├── list_to_jsonl_pitched.py       ← .txt → .jsonl (multi-source fan-out)\n│   ├── enrich_pitched.py              ← pitch-correct + multi-zone render + manifest\n│   ├── pitch_report.py                ← measured-vs-target pitch-accuracy report\n│   ├── repair_instrument_pitch.py     ← post-hoc single-source pitch fix\n│   │\n│   ├── run_retry.py                   ← generate+gate+retry-to-target (both pipelines)\n│   ├── batch_generate.py              ← 
batched SA3 inference\n│   ├── build_pack.py                  ← deterministic versioned pack zips\n│   ├── README-PACKS.md                ← pack build + publish runbook\n│   └── benchmark.py / sync.sh         ← optional helpers\n├── tests/                             ← CPU-only suite (see \"Tests\" below)\n└── outputs/                           ← gitignored; all generated audio lands here\n    ├── raw/<cat>/<id>_vNN.wav         ← SA3 candidates (N variants/prompt)\n    ├── raw/<cat>/_metadata/<id>_vNN.json  ← seed, model, gen params (stays on pod)\n    ├── gated_drums/<cat>/<id>.wav     ← drum gate winners\n    ├── processed/<cat>/<id>.{wav,txt} ← final drums (trimmed/LUFS/mono; ICMT = prompt)\n    ├── gated/<cat>/<id>.{wav,gate.json}   ← pitched gate winners + scores\n    ├── _reports/pitch_summary.{json,md}   ← pitch-accuracy report (report stage)\n    └── instruments/<cat>/<id>/        ← final instruments (sources/, zones/, manifest.json)\n```\n\nSibling pipeline to the drum one. Same Stable Audio 3 generator, different\ndownstream: a 5-stage quality+pitch gate (prefilter, onset, sustain-plateau,\nCREPE/spectral pitch with a sub-bass octave cross-check, BasicPitch polyphony\nwhen the TF/numpy ABI is happy), **multi-source pitch correction**, and zone\npre-rendering (RubberBand R3, formant-preserving). Emits a per-instrument\n`manifest.json`\n\nconsumed by `sas-instrument-plugin`\n\n.\n\n**Multi-source real-pitch sampling (the v3 headline).** Wide-range instruments\nare generated at **2–4 real source pitches** spanning their natural register\n(e.g. basses at E1/E2/E3). Enrich assigns each playable zone to its *nearest*\nreal source, so no zone is pitch-shifted more than ~half the inter-source gap —\nsmall, artifact-free shifts instead of stretching one root ±12 semitones. 
Toggle\nwith `SAS_MULTI_SOURCE=1`\n\n(default; `0`\n\nreproduces v1 single-source for an A/B).\n\n**28 categories ship, 200 prompts each (5,600 prompts):**\n\n| Category | Source pitch(es) | Dur | Var | Notes |\n|---|---|---|---|---|\n`synths` |\nC3 | 5.0s | 6 | analog mono, FM, wavetable, acid |\n`lead-supersaw` |\nC4·C5 | 5.0s | 6 | multi-source |\n`lead-fm` |\nC4 | 5.0s | 6 | |\n`lead-acid` |\nC3 | 4.0s | 6 | 303-style |\n`pluck-synth` |\nC4 | 3.0s | 6 | |\n`plucks` |\nC4 | 3.0s | 6 | |\n`keys` |\nC3 | 5.0s | 6 | Rhodes, Wurli, clav, DX7 |\n`pianos` |\nC2·C3·C4·C5 | 5.0s | 6 | multi-source, step 2 |\n`organs` |\nC3 | 8.0s | 8 | open-ended |\n`basses` |\nE1·E2·E3 | 6.0s | 8 | multi-source, 30 Hz floor |\n`808-bass` |\nC2·C3 | 6.0s | 8 | multi-source, 25 Hz floor |\n`reese-bass` |\nE2·E3 | 6.0s | 8 | multi-source |\n`pads` |\nC3 | 12.0s | 8 | open-ended, step 2 |\n`strings` |\nA2·A3·A4 | 8.0s | 8 | multi-source, open-ended, step 2 |\n`brass` |\nA2·A3·A4 | 6.0s | 8 | multi-source, open-ended |\n`winds` |\nD3·D4·D5 | 5.0s | 8 | multi-source, open-ended, step 2 |\n`accordion` |\nF#3 | 6.0s | 8 | open-ended |\n`bells` |\nC5 | 4.0s | 6 | glockenspiel, FM, music box |\n`mallets` |\nC4 | 3.0s | 6 | marimba, vibes, kalimba |\n`percussion` |\nC4 | 2.0s | 6 | tonal/tuned only |\n`timpani` |\nF2·F3 | 4.0s | 8 | multi-source, tuned drum |\n`guitars` |\nE2·E3·E4 | 4.0s | 6 | multi-source |\n`banjos` |\nC4·G4 | 3.0s | 6 | multi-source |\n`mandolin` |\nA4 | 3.0s | 6 | |\n`harp` |\nC3·C4·C5 | 4.0s | 6 | multi-source |\n`sitar` |\nC4 | 4.0s | 6 | |\n`vocals` |\nA3 | 5.0s | 20 | choir, chops, vocoded (SA3 vocals are weak) |\n`choir` |\nA3 | 10.0s | 18 | open-ended |\n\n\"Var\" = SA3 candidates per (prompt × source pitch); the gate keeps the best.\nPitches use the C4 = MIDI 60 convention; multi-source counts apply with\n`SAS_MULTI_SOURCE=1`\n\n(default). 
Full config in\n[ scripts/pitched_category_config.py](/shiehn/sas-sample-generator/blob/main/scripts/pitched_category_config.py).\n\nTo **subset** which categories run, edit\n[ scripts/pitched_categories.txt](/shiehn/sas-sample-generator/blob/main/scripts/pitched_categories.txt) —\ncomment any line with\n\n`#`\n\nto skip that category.`scripts/setup.sh`\n\ninstalls **everything** the pitched pipeline needs:\n\n- All Python deps from\n`requirements.txt`\n\n(librosa, torchcrepe, basic-pitch, pyloudnorm, soxr, …) - All system packages —\n`rsync`\n\n(transfer to Mac),`tmux`\n\n(long sessions survive disconnect),`rubberband-cli`\n\n(enrich shells out to it directly for formant-preserving pitch shift),`ffmpeg`\n\n(audio inspection),`zip`\n\n/`unzip`\n\n(archives) - The compiled stable-audio-tools from git main (the PyPI release doesn't support SA3-medium)\n- CUDA-12.8 torch wheels (Blackwell-compatible, also works on older Hopper/Ampere/Ada)\n\nYou should **never** need to apt-get or pip install anything on a pod\nafter running `setup.sh`\n\n. If you do, treat it as a bug in `setup.sh`\n\nand\nadd it there.\n\n```\nbrew install rubberband                          # pitch-shift backend\ncd ~/path/to/sas-sample-generator                # your local repo\npip install -r requirements.txt                  # one-time, ~5 min\n```\n\nDesigned to be safely repeatable from a cold start. 
The whole pipeline:\n**~15 min setup + ~30 min generate + ~10 min gate + ~30 min enrich + transfer**.\n\n[runpod.io/console/pods](https://www.runpod.io/console/pods) → **Deploy → GPU Pod**:\n\n| Setting | Value | Why |\n|---|---|---|\n| GPU | RTX A6000 / 4090 / 5090 / L40S / A100 (24+ GB VRAM) | SA3-medium fits in 16 GB; 24 GB gives headroom |\n| Template | most recent RunPod PyTorch with CUDA 12.x | matches our cu128 wheels |\n| Container Disk | 100 GB | persistent across pod restart; holds venv + HF model cache + outputs |\n| Network Volume | None | RunPod's \"migrate to new host\" flow has been known to attach a tiny 10 GB network volume — don't let it. We use container disk only |\n| Expose | SSH (port 22, default) | |\n\n**Critical:** the field is named \"Container Disk\" — the persistent SSD. Do NOT confuse with \"Network Volume\" or \"Volume Disk\".\n\nClick **Deploy On-Demand**. Wait ~30 sec for status `RUNNING`.\n\nCopy the SSH command from **Connect → SSH over exposed TCP**. It looks like:\n\n```\nssh root@<POD_IP> -p <POD_PORT> -i ~/.ssh/id_ed25519\n```\n\nType `yes` on first connect. If `Permission denied`: `ssh-add ~/.ssh/id_ed25519`.\n\n```\ncd /workspace && \\\ngit clone https://github.com/shiehn/sas-sample-generator.git && \\\ncd /workspace/sas-sample-generator && \\\n./scripts/setup.sh 2>&1 | tee /root/setup.log\n```\n\nLook for these \"OK\" markers near the end of setup.log:\n\n```\n[setup]   rsync:          rsync version 3.x.x ...\n[setup]   tmux:           tmux 3.x\n[setup]   rubberband:     /usr/bin/rubberband\n[setup]   ffmpeg:         ffmpeg version ...\n[setup] cuda available: True\n[setup] device:         NVIDIA RTX 4090\n[setup] done.\n```\n\nIf `cuda available: False` → you deployed onto a CPU template; terminate, redeploy with PyTorch GPU.\n\n```\nsource /root/.venv/bin/activate\nhf auth login\n```\n\nPaste your HF read token. 
Answer `n` to \"Add token as git credential\".\n\n**First time on a new HF account:** in your browser, visit\n[stabilityai/stable-audio-3-medium](https://huggingface.co/stabilityai/stable-audio-3-medium)\nand accept BOTH the SA3 community license **AND** the underlying Gemma\nterms. Without both, the model download fails with `GatedRepoError`.\nThe token is tied to your account, so accept the terms while logged into the same\nHF account the token belongs to.\n\nVerify access (under 5 seconds):\n\n```\nhf download stabilityai/stable-audio-3-medium model_config.json --local-dir /tmp/sa3-test\nls /tmp/sa3-test/\n```\n\nIf `model_config.json` is listed, access is cleared.\n\nThe default ships with **all 28** categories enabled. For a quick test\nor a focused run:\n\n```\n# Edit scripts/pitched_categories.txt — comment out any category with `#`\nnano scripts/pitched_categories.txt\n```\n\nAfter the test, restore with `git checkout scripts/pitched_categories.txt`.\n\nRun **generate + gate + report** on the GPU pod; enrich runs later on your Mac\n(it's CPU-bound).\n\n```\ntmux new -s pitched\n\n# Inside tmux:\ncd /workspace/sas-sample-generator\nsource /root/.venv/bin/activate\nsource /workspace/.bash_env\n\nSTAGES=generate,gate,report ./scripts/run_pitched.sh 2>&1 | tee outputs/run.log\n```\n\n`run_pitched.sh` builds the JSONLs (multi-source fan-out — one job per\nprompt × source pitch, with the per-category `variants` count), then\n`run_retry.py` drives **batched** generation + `gate_pitched.py` with the same\n**retry-to-target** loop as drums (re-roll all-fail prompts up to\n`MAX_RETRIES=2`, top up until `TARGET=150` instruments survive). 
The `report`\n\nstage then writes `outputs/_reports/pitch_summary.{json,md}`\n\nso you can read\nmeasured-vs-target pitch accuracy *before* transferring.\n\n**Detach with Ctrl-b d.** The run keeps going even if SSH drops.\n\n**Reattach later** (from any new SSH session — possibly a new IP/port if migrated):\n\n```\ntmux attach -t pitched\n```\n\nMonitor from outside tmux:\n\n```\ntail -f outputs/run.log\nnvidia-smi\n```\n\n**Throughput:** batched generation is ~1–3 h; the **CPU gate is the bottleneck**\n(torchcrepe + basic-pitch + librosa per variant). For all 28 categories at full\nvariant counts, budget up to ~a day of pod time. Dial variant counts down in\n`pitched_category_config.py`\n\n, or run a subset, if wall-clock matters.\n\nUseful knobs: `ONLY=pianos LIMIT=5`\n\n(tiny slice), `BATCH_SIZE=32`\n\n(80 GB GPU),\n`MAX_RETRIES=0`\n\n(one pass, no retry), `INIT_ANCHOR=1`\n\n(experimental init_audio\npitch anchoring — default off), `SAS_MULTI_SOURCE=0`\n\n(single-source A/B).\n\nWhen `STAGES=generate,gate`\n\nfinishes, before transferring:\n\n```\n# Per-category pass rates\nfor d in outputs/gated/*/; do\n  cat=$(basename \"$d\")\n  [[ \"$cat\" == \"_failures\" ]] && continue\n  passed=$(ls \"$d\"*.wav 2>/dev/null | wc -l)\n  failed=$(ls \"$d/_failures\"/*.json 2>/dev/null | wc -l)\n  total=$((passed + failed))\n  if [[ $total -gt 0 ]]; then\n    rate=$((passed * 100 / total))\n    printf \"  %-18s passed=%3d  failed=%3d  pass-rate=%d%%\\n\" \"$cat\" \"$passed\" \"$failed\" \"$rate\"\n  fi\ndone\n\necho \"Total gated: $(find outputs/gated -name '*.wav' -not -path '*_failures*' | wc -l)\"\ndu -sh outputs/gated\n```\n\nExpected (with the current thresholds, 2026-05-22): **80–100% pass rate per category**. If a category is below 50%, look in `outputs/gated/<cat>/_failures/<id>.json`\n\nto see why prompts are failing.\n\nThe pod has `rsync`\n\ninstalled by `setup.sh`\n\n. 
On your Mac:\n\n```\nmkdir -p ~/sas-pitched-out\nrsync -avzP -e \"ssh -p <POD_PORT> -i ~/.ssh/id_ed25519\" \\\n  root@<POD_IP>:/workspace/sas-sample-generator/outputs/gated/ \\\n  ~/sas-pitched-out/gated/\n```\n\nFor ~4 GB at typical RunPod / home upload speeds, expect 10–20 min.\n`rsync`\n\nresumes on interruption — just re-run the same command if SSH drops.\n\nVerify locally:\n\n```\nfind ~/sas-pitched-out/gated -name '*.wav' -not -path '*_failures*' | wc -l   # should match step 7\ndu -sh ~/sas-pitched-out/gated\ncd ~/path/to/sas-sample-generator\ngit pull                                # pick up any threshold updates\npip install -r requirements.txt         # idempotent\n\nexport SAS_OUTPUTS_DIR=~/sas-pitched-out\nSTAGES=enrich ./scripts/run_pitched.sh\n```\n\nEnrich groups the surviving source pitches of each prompt into **one\nmulti-source instrument** under `~/sas-pitched-out/instruments/<cat>/<id>/`\n\n:\n\n`sources/<midi>.wav`\n\n—**24-bit** real source samples (1 per source pitch: 2–4 for multi-source categories), pitch-corrected + normalized to -20 LUFS`zones/<midi>.wav`\n\n—**16-bit WAV** pre-rendered playable zones (every 2–3 semitones), each rendered from its nearest real source (was 24-bit FLAC pre-v3; WAV is memory-mapped by the Tracktion sampler with no decode stall)`manifest.json`\n\n—`schema_version: 1`\n\n, disjoint ordered zones`prompt.txt`\n\n— original positive prompt\n\nIt parallelizes across instruments (`ProcessPoolExecutor`\n\n) and shells out to the\n`rubberband`\n\nCLI for pitch shifts (`brew install rubberband`\n\non the Mac).\n\n[runpod.io/console/pods](https://www.runpod.io/console/pods) → pod card → **Terminate** (NOT Stop). Compute billing stops immediately. 
Volume billing (if any auto-created Network Volume snuck in) stops only on Terminate.\n\nThen [runpod.io/console/user/storage](https://www.runpod.io/console/user/storage) → **Network Volumes** → check for any `outside_*` orphan from a migration → Delete.\n\nRunPod sometimes moves your pod to a different physical host mid-run. Symptoms:\n\n- SSH connection drops mid-session\n- `Connection refused` when reconnecting on the same IP/port\n- Pod shows \"Stopped\" briefly, then \"Running\" again at a new address\n\n**The pod, the venv, the HF cache, and all outputs/ data persist on the container disk** as long as the pod isn't terminated. You just need fresh connection info.\n\n- Open RunPod console → click your pod card → check the **Connect → SSH over exposed TCP** panel for the new IP and port (both can change).\n- Clear the old SSH host key on your Mac:\n\n```\nssh-keygen -R '[<NEW_IP>]:<NEW_PORT>'\n```\n\n- SSH back in with the new details. Run `tmux attach -t pitched` — your run is still going.\n- If you were mid-rsync, just re-run the rsync command with the new `-p <NEW_PORT>` and `root@<NEW_IP>` — it picks up where it stopped.\n\nThis bit us twice this session (May 2026). Symptoms are unambiguous; recovery takes 30 seconds.\n\nSA3 doesn't reliably hit a target pitch from a text prompt — that's a known limitation of text-to-audio diffusion models. Enrich now compensates intelligently:\n\n| If measured pitch is… | Enrich does… | Result |\n|---|---|---|\n| within `max_correction_semitones` of target (default 3) | shifts all the way to the original target | Sample is at exactly the prompted MIDI note; preserves prompt semantics |\n| further away than that | snaps to the nearest integer semitone | Sample is at the closest \"logical\" MIDI note (always ≤50 cent shift, no audible artifacts) |\n\nEither way: every output sample lands on an **exact MIDI semitone** with the smallest possible pitch shift. 
The zone rendering loop centers on that effective root, so the sampler always has a clean zone at the sample's actual pitch.\n\n`max_correction_semitones` is per-category in `scripts/pitched_category_config.py`. Set to `0` to always snap to nearest semitone (never shift to target). Set to a large value (24+) to always shift to target.\n\n| Stage | What it checks | What rejection means |\n|---|---|---|\n| `prefilter` | Clipping, dead channels, all-silent buffers | Sample is broken at the file level |\n| `onset` | Time from buffer start to first transient | `slow_onset` → SA3 added a fade-in / silence preamble (>300ms) |\n| `sustain` | Longest plateau within 12 dB of peak RMS | `short_stab` → audio decays too fast or has no held region |\n| `pitch` | CREPE periodicity + measured-vs-target | `no_voiced_frames` / `unconfident` → unpitched output; `wrong_pitch` is OFF by default (tolerance 9999) so enrich's snap-to-nearest-semitone can do its job |\n| `polyphony` | BasicPitch note count after vibrato bypass | Disabled when TF/numpy ABI mismatches (common on RunPod) — gate prints one warning at start, then runs without it |\n\nThe gate scores winners by `confidence² × exp(-|cents|/50) × sus_quality`. With `wrong_pitch` disabled, the pitch term collapses to ~0 for far-off samples, so all variants of a prompt can tie at score=0.000 — the picker just grabs v00 by default in that case. 
Acceptable for now.\n\n```\noutputs/\n├── raw/<category>/                                  ← SA3 candidates (N variants per prompt × source pitch)\n│   ├── <id>_v00.wav, <id>_v01.wav, ...\n│   └── _metadata/<id>_v0N.json                      ← seed, model, generation_seconds, batch_size\n├── gated/<category>/                                ← gate winners only\n│   ├── <id>.wav                                     ← chosen variant\n│   ├── <id>.gate.json                               ← per-gate scores + measured pitch\n│   └── _failures/<id>.json                          ← prompts where ALL variants rejected\n├── _reports/pitch_summary.{json,md}                 ← measured-vs-target accuracy (report stage)\n└── instruments/<category>/<instrument-id>/          ← final library, sampler-consumable\n    ├── sources/<midi>.wav                           ← 24-bit real source pitches (1–4)\n    ├── zones/<midi>.wav                             ← 16-bit WAV pre-rendered zones\n    ├── manifest.json                                ← schema_version 1, disjoint ordered zones\n    └── prompt.txt                                   ← original positive prompt\n```\n\nThe v3 full run is much bigger than a quick slice: 28 categories × 200 prompts ×\n2–4 source pitches × 6–20 variants = tens of thousands of candidates. Generation\nis cheap and batched; the **CPU gate dominates wall-clock** (per-variant CREPE +\nbasic-pitch + librosa). On an **A100 80 GB** (~$0.89–1.89/hr) the whole pitched\ncampaign runs in roughly **half a day to a day** → **~$15–35**, with drums adding\na few dollars. The original $200 budget is never the binding constraint.\n\nTo cut cost/time: run fewer categories (`ONLY=…`), lower variant counts in\n`pitched_category_config.py`, or set `MAX_RETRIES=0`. A single-category slice\n(`ONLY=pianos LIMIT=5`) is a few minutes and a few cents — use it to dial prompts\nbefore committing the full campaign. 
Enrich is local on your Mac → **$0**.\n\n```\nprompts/pitched/<category>.txt                # one prompt per line, # comments\nscripts/pitched_categories.txt                # which categories to run (comment to skip)\nscripts/pitched_category_config.py            # per-category target pitch, duration, sustain thresholds, etc.\n```\n\nFast iteration on a single category:\n\n```\n# 1. ONLY=<cat> overrides the enable-list — no need to edit pitched_categories.txt\n# 2. Edit prompts/pitched/<cat>.txt\n# 3. Re-run generate+gate+report (LIMIT caps prompts; sets TARGET=0)\nONLY=<cat> LIMIT=10 STAGES=generate,gate,report ./scripts/run_pitched.sh\n# 4. Listen to outputs/gated/<cat>/*.wav, read outputs/_reports/pitch_summary.md, re-run\n```\n\nRe-author / extend the corpora with\n[scripts/gen_prompts.py](/shiehn/sas-sample-generator/blob/main/scripts/gen_prompts.py) (deterministic, preserves\nexisting lines so hashes stay stable, holds the EDM ratio).\n\n`--skip-existing` in `batch_generate.py` means re-running won't regenerate samples you already have — only new prompt lines hit the GPU.\n\n- `STAGES=generate,gate,enrich,report` — pitched only; comma-separated subset (default: all four)\n- `STEPS=8` — diffusion steps (SA3 converges fast)\n- `BATCH_SIZE=16` — generations per model call (32–64 on an 80 GB GPU)\n- `TARGET=150` — per-category minimum surviving samples (`0` = no top-up)\n- `MAX_RETRIES=2` — re-roll rounds for all-fail prompts before topping up (`0` = one pass)\n- `ONLY=\"a b\"` / `LIMIT=N` — run a subset of categories / cap prompts per category (test slice; auto-sets `TARGET=0`)\n- `GATE=0` — drums only; skip the quality gate (legacy keep-all)\n- `INIT_ANCHOR=1` — pitched only; experimental init_audio pitch anchoring (default off)\n- `SAS_MULTI_SOURCE=0` — pitched only; disable multi-source (single root, span 12; for an A/B)\n- `SAS_OUTPUTS_DIR=/path` — override outputs location (default `/workspace/outputs` on pod, `./outputs` local)\n\nPer-prompt 
**variant counts are per-category now** (in `drum_gate_config.py` /\n`pitched_category_config.py`), not a global env var.\n\nThe `sas-instrument-plugin` walks `outputs/instruments/<cat>/<id>/`, parses\neach `manifest.json`, and uses the `zones[]` array to call\n`host.setTrackInstrumentSampler` on the chosen track. Disjoint zones +\nper-zone `root_midi` mean the engine pitch-shifts the nearest\npre-rendered zone for any played MIDI note, with the smart-corrected\nsample as the unshifted root. Since enrich locks every sample to an\ninteger MIDI semitone, the sampler never has to deal with off-pitch\nsources.\n\nGenerated audio ships to the [Signals & Sorcery](https://signalsandsorcery.com)\napp as **versioned pack zips** built by\n[scripts/build_pack.py](/shiehn/sas-sample-generator/blob/main/scripts/build_pack.py) — deterministic (fixed mtimes +\nsorted entries → byte-identical zip + sha256 from an identical source tree):\n\n| Pack | Source dir | Zip | Approx size (v3) |\n|---|---|---|---|\n| Drums | `outputs/processed/` | `sas-drum-pack-v{N}.zip` | ~2–3 GB (24 roles) |\n| Instruments | `outputs/instruments/` | `sas-instrument-pack-v{N}.zip` | ~20–24 GB (28 cats × ~150) |\n| Loops | `outputs/loops/` | `sas-loop-library-v{N}.zip` | external loop library (not generated here) |\n\n```\npython scripts/build_pack.py --smoke-test                   # cheap determinism check\npython scripts/build_pack.py --pack drums --version 1       # real build → ./dist/\npython scripts/build_pack.py --pack instruments --version 1\n```\n\n**Ready-to-consume directories (no zip / download step).** To produce the two\nlibraries as folders you can drop straight into the app's install location —\n`<userData>/sample-packs/{drums,instruments}/` — use `--format dir`, or the\nwrapper. 
Each emits `dist/<subdir>/` with the `_pack-version.json` marker at its\nroot: exactly the tree the app expects on disk, ready out of the box.\n\n```\n# after gating + enriching (processed/ + instruments/ are populated):\nDRUM_VERSION=3 INSTRUMENT_VERSION=3 ./scripts/build_libraries.sh   # → dist/drums/, dist/instruments/\n# drop into your local app (macOS path shown):\nrsync -a dist/drums/        ~/Library/Application\\ Support/signals-and-sorcery/sample-packs/drums/\nrsync -a dist/instruments/  ~/Library/Application\\ Support/signals-and-sorcery/sample-packs/instruments/\n```\n\nThe marker `version` must equal that pack's `expectedVersion` in\n`sas-app/src/shared/constants/sample-packs.ts` (plain string match) or the app\ntreats the folder as a different version. For GCP distribution use `--format zip`\n(default) and follow the publish runbook below.\n\nThe build prints `sizeBytes` + `sha256`; paste those into\n`sas-app/src/shared/constants/sample-packs.ts` (bump `expectedVersion` + the\ndownload URL) so the app detects the new version and prompts a re-download.\n**Never overwrite a published version — always bump.** Full publish runbook\n(GCP upload, version rules, what's in/excluded, the v3 WAV-zone note):\n[scripts/README-PACKS.md](/shiehn/sas-sample-generator/blob/main/scripts/README-PACKS.md).\n\nCPU-only suite — no GPU, no Stable Audio. It covers everything *around* model\ngeneration: pitch detection, both gates, multi-source enrich, retry-to-target\nhelpers, config/prompt/enable wiring, loudness targets, `list_to_jsonl`, the\npitch report, and the deterministic pack builder.\n\n```\n./tests/run_tests.sh\n```\n\nKeep it green — run before and after any pipeline-code change. 
Deps live in the\nproject `.venv` (numpy, soundfile, pyloudnorm, librosa); the `rubberband` CLI is\noptional (the enrich test degrades gracefully without it).\n\n| Symptom | Most likely cause | Fix |\n|---|---|---|\n| `Permission denied (publickey)` on ssh | private key not loaded into agent | `ssh-add ~/.ssh/id_ed25519` |\n| `setup.sh` hangs at `Installing collected packages:` for >5 min | something redirected the venv onto `/workspace` (MooseFS); script defaults to `/root/.venv` for a reason | check `echo $VENV_DIR` — should be `/root/.venv`. If overridden, unset it and re-run |\n| `cuda available: False` after `setup.sh` | picked a CPU template | terminate; re-deploy with PyTorch GPU template |\n| `huggingface_hub.utils._errors.GatedRepoError` | didn't accept the SA3 license (or the Gemma terms it inherits) | visit the … |\n| `batch_generate.py` errors `CUDA out of memory` | … | … `--default-duration` or `--num-waveforms-per-prompt 1` |\n| … | … | … `one shot, no loop, no hi hats, no snare` to every prompt |\n| … | … | … `dry, no reverb, no ambience` to prompts |\n| … | `negative_prompt` may be excluding the target — bug in `scripts/category_config.py` | … `scripts/category_config.py`, audit the negative for that category; nothing in it should match the target sound |\n| `run_all.sh` skips a category | `prompts/<cat>.txt` is missing or has only comments | … `prompts/` has the .txt file and contains non-comment lines |\n| … `rejected/` with `reason=lufs_unmeasurable` | … | … `pyloudnorm` and check they cluster near `--target-lufs`. The pipeline log at `/workspace/run.log` (from step 6) records the as-run postprocess flags for each category. |\n| … | … | `tmux new -s sas` BEFORE running, reattach with `tmux attach -t sas` |\n\nStable Audio 3 needs only ~8 diffusion steps (vs 120 for SAO 1.0), so generation\nis fast and batched. 
Since v3 added the quality gate + retry-to-target, the\n**CPU gate — not generation — sets the wall-clock**, and the per-category\nminimum-survivor target (`TARGET=150`) re-rolls until each category fills.\n\n| Run shape | Where the time goes | Rough cost |\n|---|---|---|\n| Single-category slice (`ONLY=… LIMIT=…`) | a few minutes, gate-bound | a few cents |\n| Full drum run (24 roles, gate + retry) | ~1 h on a big GPU | ~$1–3 |\n| Full pitched run (28 cats, multi-source, gate + retry) | ½–1 day, gate-bound | ~$15–35 |\n\nKeep a pod alive between iterations and `--skip-existing` skips anything already\ngenerated — so re-running after a prompt tweak only pays for the new lines.\nTerminate as soon as generate + gate finish (enrich is local and free).\n\nThis is a **public repo**. Never commit:\n\n- Hugging Face tokens\n- RunPod API keys\n- B2 / R2 / S3 keys\n- SSH private keys\n- The generated WAVs (gitignored already)\n\n`.gitignore` covers `.env`, `*.token`, `*.secret`, `outputs/*`. If you ever\n`git add` a file containing a secret by mistake: **rotate the secret first**,\nthen `git rm` + commit + push. 
Treat anything that hit `main` as compromised.\n\n[stable_audio_open_batch_oneshot_guide.md](/shiehn/sas-sample-generator/blob/main/stable_audio_open_batch_oneshot_guide.md)\ncovers:\n\n- Why Stable Audio 3 vs alternatives (and what changed from SAO 1.0)\n- Prompt-design rules and category-specific templates\n- Optional persistent Network Volume layout (for users running multiple times per week)\n- Optional rclone push to Backblaze B2 / Cloudflare R2 instead of `scp`\n- Optional custom Docker image\n- Cost-control deep dive\n\nThis is one piece of a larger ecosystem around the\n[Signals & Sorcery](https://signalsandsorcery.com) audio app.\n\n**Plugin SDK & templates**\n\n- [sas-plugin-sdk](https://github.com/shiehn/sas-plugin-sdk) — types, components, and hooks for building generator plugins\n- [sas-plugin-template](https://github.com/shiehn/sas-plugin-template) — starter template for new plugins\n- [sas-chat-plugin](https://github.com/shiehn/sas-chat-plugin) — in-app conversational agent\n\n**Built-in plugins**\n\n- [sas-stems-plugin](https://github.com/shiehn/sas-stems-plugin) — default AI audio-from-text + stem-splitting plugin\n- [sas-loops-plugin](https://github.com/shiehn/sas-loops-plugin) — default audio loop / sample plugin\n- [sas-synth-plugin](https://github.com/shiehn/sas-synth-plugin) — default synth plugin\n- [sas-texture-plugin](https://github.com/shiehn/sas-texture-plugin) — texture/ambient plugin\n- [sas-recorder-plugin](https://github.com/shiehn/sas-recorder-plugin) — line-in recording plugin\n\n**Audio tooling**\n\n- [sas-audio-processor](https://github.com/shiehn/sas-audio-processor) — audio processing utilities\n- [Signals2Surge](https://github.com/shiehn/Signals2Surge) — synth patch transfer to Surge XT\n\n**Infrastructure**\n\n- [signals-and-sorcery-server](https://github.com/shiehn/signals-and-sorcery-server) — DAWNet API + WebSocket server\n- [signals-and-sorcery-docs](https://github.com/shiehn/signals-and-sorcery-docs) — public 
docs\n\n**Other**\n\n- [signalsandsorcery-game-ui](https://github.com/shiehn/signalsandsorcery-game-ui) — LLM-powered RPG frontend\n- [SignalsAndSorcery](https://github.com/shiehn/SignalsAndSorcery) — earlier VueJS + Web Audio sample arrangement tool\n- [Errantry](https://github.com/shiehn/Errantry) — E2E testing for agent-facing CLIs (drives this project's CLI too)", "url": "https://wpnews.pro/news/show-hn-stable-audio-3-one-shot-sample-generator-110gb-download", "canonical_source": "https://github.com/shiehn/sas-sample-generator", "published_at": "2026-05-31 13:05:24+00:00", "updated_at": "2026-05-31 13:18:46.619597+00:00", "lang": "en", "topics": ["generative-ai", "ai-tools", "ai-products"], "entities": ["Stable Audio 3", "Signals & Sorcery", "RunPod", "Stability AI", "Hugging Face", "Google Cloud Platform"], "alternates": {"html": "https://wpnews.pro/news/show-hn-stable-audio-3-one-shot-sample-generator-110gb-download", "markdown": "https://wpnews.pro/news/show-hn-stable-audio-3-one-shot-sample-generator-110gb-download.md", "text": "https://wpnews.pro/news/show-hn-stable-audio-3-one-shot-sample-generator-110gb-download.txt", "jsonld": "https://wpnews.pro/news/show-hn-stable-audio-3-one-shot-sample-generator-110gb-download.jsonld"}}