
Show HN: Ploof – The agent-native CLI for generating images, video, and audio

Ploof, an agent-native CLI for generating images, video, and audio, has been released on GitHub. The tool is designed to be operated by coding agents like Claude Code, Cursor, or Codex, which can install it and generate assets from prompts without manual command typing. It supports multiple providers including OpenAI and fal.ai, batch processing, and local authentication profiles.

Published Jun 29, 2026 · 12 min read

The agent-native CLI for generating images, video, and audio.

Hand it to Claude Code, Cursor, or Codex: they install it, read ploof learn, and create your assets for you. Works great by hand, too.

Ploof turns a prompt into a file, and it's designed to be driven by your coding agent. The usual path isn't typing ploof commands yourself; it's telling Claude Code, Cursor, or Codex what you want and letting it install ploof, read the built-in ploof learn reference, and generate the assets on your behalf. No SDK wiring, no polling loops, no glue code. It's also a sharp manual CLI when you want it.

  • 🤖 Agent-native: built to be operated by coding agents; ploof learn self-documents the installed version, output is JSON/JSONL-clean, and flags stay stable.
  • 🎨 Every modality: images, video, and audio. Generate, edit, extend, transcribe, translate.
  • 🔌 Multi-provider: OpenAI today, plus fal.ai's entire model marketplace via model run.
  • 📦 Batch + parallel: declare assets in YAML, wire up dependencies, run them concurrently with one command.
  • 🔑 Local auth profiles: multiple keys per provider in ~/.ploof, with env-var overrides for CI.
  • 🧾 Reproducible: every asset gets a <file>.json sidecar recording the prompt, params, and provider metadata.

OpenAI
  • Images: generate · edit · variations
  • Video: generate · edit · extend · library · characters
  • Audio: speech (TTS) · transcribe · translate
fal.ai
  • Any endpoint: ✓ marketplace via model run

More providers are planned — the provider registry is built to grow.

  • Use it with your coding agent
  • Install
  • Quick start
  • Authentication
  • Images
  • Video
  • Audio
  • Run any model endpoint
  • Batch manifests
  • Output and scripting
  • For AI agents
  • Configuration
  • Reference
  • Contributing

Use it with your coding agent

This is the main way to use ploof. You don't run the commands yourself: you tell your coding agent what you want, and it installs ploof, reads the built-in reference, authenticates, and generates the assets for you.

Paste this into Claude Code, Cursor, Codex, or any agent, and fill in the last line:

Use the ploof CLI to generate assets for this project.

Setup:
1. Install it if it isn't already: `bun i -g @miketromba/ploof` (or `npm i -g @miketromba/ploof`).
2. Run `ploof learn` and follow it — that's the canonical, always-current reference for the installed version.
3. If `ploof whoami openai` (or `ploof whoami fal`) shows I'm not authenticated, walk me through `ploof login`.

Task: <describe the asset you want — e.g. "a 1024x1024 hero image of a matte black water bottle on marble, saved to assets/hero.png">

Your agent takes it from ploof learn and does the rest. Working in this repo often? Have it run ploof skill install once to drop a bootstrap skill so the workflow auto-loads next time.

Why it works: ploof learn prints a complete, version-matched guide to stdout, and every command emits clean JSON/JSONL with predictable exit codes, so agents operate ploof reliably instead of guessing or relying on stale training data. More on the agent integration under "For AI agents" below.

Install

bun i -g @miketromba/ploof

Requires Node 18+ (Bun optional). Your agent normally handles this for you (see above).

npm, pnpm, yarn, or run without installing

npm  install -g @miketromba/ploof
pnpm add     -g @miketromba/ploof
yarn global add @miketromba/ploof

bunx @miketromba/ploof --help
npx  @miketromba/ploof --help

Quick start

Prefer to drive it yourself, or want to see exactly what your agent will be doing? The manual path:

bun i -g @miketromba/ploof

ploof login openai --api-key sk-...

ploof image generate \
  --prompt "Studio product photo of a matte black water bottle on marble" \
  --out hero.png

hero.png lands on disk next to hero.png.json, a sidecar recording the exact prompt and parameters used. Run ploof --help to see every command, or ploof learn for the agent-oriented tour.
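A script that wants to inspect how an asset was produced can look for the sidecar next to it. The sketch below assumes only the naming rule stated above (the asset's file name plus .json); the helper names are hypothetical, and the sidecar's internal key names are not documented here, so the code returns the parsed object without interpreting it.

```python
import json
from pathlib import Path


def sidecar_path(asset: str) -> Path:
    """Return the sidecar path for an asset: the full file name plus '.json'."""
    return Path(asset + ".json")


def load_sidecar(asset: str) -> dict:
    """Load the sidecar next to an asset, or return {} when none exists."""
    path = sidecar_path(asset)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

For example, load_sidecar("hero.png") reads hero.png.json if the quick-start command above has been run in the current directory.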

Authentication

Credentials live in ~/.ploof/credentials.json. Log in once per provider:

ploof login openai --api-key sk-...
ploof login fal    --api-key <fal-key>

ploof whoami openai      # show the active credential
ploof profiles           # list every stored profile
ploof logout fal         # remove credentials

Omit --api-key and Ploof reads the matching env var, or securely prompts (no echo) in an interactive terminal.

Multiple keys? Name them with --profile, then select per command:

ploof login openai --api-key sk-personal --profile personal
ploof login openai --api-key sk-work --profile work --no-default
ploof image generate --prompt "..." --profile work --out out.png

Env vars override stored credentials — ideal for CI:

Provider  Variables
OpenAI    PLOOF_OPENAI_API_KEY or OPENAI_API_KEY
fal.ai    PLOOF_FAL_KEY or FAL_KEY (or split PLOOF_FAL_KEY_ID + PLOOF_FAL_KEY_SECRET)
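The override rule above can be mirrored in a wrapper script that needs to know which key a run will use. This is a minimal sketch, not ploof's own code: the function name is hypothetical, the split fal.ai key pair is left out, and checking the PLOOF_-prefixed variable first is an assumption, since the order between the two names is not stated.

```python
import os
from typing import Mapping, Optional

# Variable names come from the table above. Checking the PLOOF_-prefixed
# name first is an assumption; the README does not state the order.
ENV_VARS = {
    "openai": ("PLOOF_OPENAI_API_KEY", "OPENAI_API_KEY"),
    "fal": ("PLOOF_FAL_KEY", "FAL_KEY"),
}


def resolve_key(provider: str, stored: Optional[str] = None,
                env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty env var for the provider, else the stored key."""
    env = os.environ if env is None else env
    for name in ENV_VARS[provider]:
        value = env.get(name)
        if value:
            return value
    return stored
```

In CI, this means exporting one of the listed variables is enough; no ploof login step is needed.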

OpenAI org / project / base URL can be set with --organization, --project, --base-url (or PLOOF_OPENAI_ORG, PLOOF_OPENAI_PROJECT, PLOOF_OPENAI_BASE_URL).

Images

OpenAI image generation and editing default to gpt-image-2. Image inputs accept local paths, http(s) URLs, or - for stdin.

ploof image generate \
  --prompt "Editorial portrait, dramatic side light" \
  --out assets/portrait.png \
  --size 1024x1024 --quality high

ploof image edit \
  --image product.png --image reference.png --mask mask.png \
  --prompt "Replace the background with a clean marble countertop" \
  --out assets/edited.png

ploof image variation --image product.png --out assets/variation.png

Image flags

Flag                              Description
--model                           Image model (default gpt-image-2)
--size                            e.g. 1024x1024
--quality                         e.g. low, medium, high
--format / --output-format        png, jpeg, webp, …
--n                               Number of images (--out file gets -1, -2, …)
--image (edit)                    Input/context image; repeat for multiple
--mask (edit)                     Mask for inpainting
--input-fidelity (edit)           OpenAI input fidelity
--background, --moderation, --style, --user, --stream, --output-compression, --partial-images, --response-format  Provider settings
--param key=value / --json '{…}'  Any provider-specific parameter

variation is aliased as variations and uses OpenAI's legacy endpoint, which currently supports only dall-e-2. If it returns a 404, use image edit for image-to-image instead.

Video

Video commands use OpenAI's asynchronous Videos API and default to sora-2. Pass --out (or --download) and Ploof waits for the job to finish, then downloads it.

ploof video generate \
  --prompt "Wide tracking shot of a paper city at blue hour" \
  --size 1280x720 --seconds 4 \
  --out assets/clip.mp4

ploof video extend --video-id video_abc123 --seconds 4 \
  --prompt "Camera rises over the rooftops" --out assets/extended.mp4

ploof video list --limit 20
ploof video status video_abc123
ploof video download video_abc123 --variant thumbnail --out thumb.webp
ploof video delete video_abc123

Video flags & characters

Flag                          Description
--model                       sora-2, sora-2-pro, …
--size / --seconds            Resolution / duration
--input-reference <path|url>  First-frame image reference
--character <id>              Reusable character; repeat for several
--wait / --download           Poll to completion / download after wait
--variant                     video, thumbnail, or spritesheet
--poll-interval / --timeout   Polling cadence / max wait (seconds)

video edit and video extend accept either --video-id (a completed OpenAI video) or --video (an uploaded source), where your project is eligible. Reusable characters:

ploof video character create --name Mossy --video character.mp4
ploof video character get char_abc123
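The wait-then-download behavior in this section, together with --poll-interval and --timeout, follows the usual shape of polling an asynchronous job. The sketch below illustrates that general pattern and is not ploof's implementation: the function name and defaults are hypothetical, and apart from "completed" (which appears in ploof's result objects) the status strings are assumptions.

```python
import time
from typing import Callable


def wait_for_job(get_status: Callable[[], str], poll_interval: float = 5.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        # "failed" is an assumed terminal state alongside "completed".
        if status in ("completed", "failed"):
            return status
        # Stop before sleeping past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"job still {status!r} after {timeout}s")
        sleep(poll_interval)
```

Passing --out lets ploof run this kind of loop for you, which is why no polling code is needed in the calling script.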

Audio

Speech defaults to model gpt-4o-mini-tts, voice alloy, and format mp3. Transcription defaults to gpt-4o-mini-transcribe; translation defaults to whisper-1.

ploof audio generate --text "Ploof can speak." --voice alloy --out assets/speech.mp3

ploof audio transcribe --audio assets/speech.mp3 --out assets/transcript.json

ploof audio translate --audio assets/spanish.mp3 --format text --out assets/translation.txt

Audio flags

Generate (generate, aliased speech / tts): --model, --voice, --voice-id, --instructions, --format (mp3, opus, aac, flac, wav, pcm), --speed.

Transcribe: --model, --language, --prompt, --format, --temperature, --include, --timestamp-granularity, --chunking-strategy, --known-speaker-name, --known-speaker-reference.

Translate: --model, --prompt, --format, --temperature.

Ploof writes finished files, so streaming-only transport settings (e.g. stream=true) are rejected because they don't produce a complete asset.

Run any model endpoint

model run calls a model endpoint directly through the provider's official client, defaulting to fal.ai. Ploof uploads local inputs to provider storage, submits to the queue, polls to completion, and writes the returned files or text to disk.

ploof model run \
  --provider fal --model fal-ai/flux/dev \
  --prompt "Friendly CLI mascot icon, transparent background" \
  --param image_size=square_hd \
  --out assets/icon.png

Map local assets to the endpoint's exact input fields with --input field=path (repeatable):

ploof model run --provider fal --model <endpoint-id> \
  --prompt "Animate this into a short loop" \
  --input image_url=assets/source.png --param duration=4 \
  --out assets/loop.mp4

The media commands work against fal too. Just pass --provider fal --model <endpoint-id>:

ploof image generate --provider fal --model fal-ai/flux/dev \
  --prompt "Soft clay mascot icon" --param image_size=square_hd --out assets/mascot.png

Pass endpoint settings with --param key=value or --json '{…}'. Queue controls: --start-timeout, --timeout, --poll-interval, --priority low|normal, --storage-expires-in.
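When a script drives model run, building the argument list programmatically avoids shell-quoting mistakes. The sketch below uses only flags shown in this section; the helper name is hypothetical, and sending all endpoint settings as one --json object (instead of several --param flags) is an assumption about how the two options combine.

```python
import json
from typing import Dict, List, Optional


def model_run_argv(model: str, prompt: str, out: str,
                   params: Optional[Dict[str, object]] = None,
                   inputs: Optional[Dict[str, str]] = None,
                   provider: str = "fal") -> List[str]:
    """Build an argv list for `ploof model run` without invoking it."""
    argv = ["ploof", "model", "run", "--provider", provider,
            "--model", model, "--prompt", prompt]
    # --input is repeatable: one field=path pair per local asset.
    for field, path in (inputs or {}).items():
        argv += ["--input", f"{field}={path}"]
    # Assumption: endpoint settings are passed as a single JSON object.
    if params:
        argv += ["--json", json.dumps(params, sort_keys=True)]
    argv += ["--out", out]
    return argv
```

The returned list can be handed to subprocess.run(argv, check=True) once ploof is installed and authenticated.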

Batch manifests

Describe many assets in YAML (or JSON), wire dependencies with needs, reuse one task's output as another's input, and run them in parallel:

version: 1
parallel: 4
tasks:
  - id: base
    kind: image.generate
    prompt: "Studio product photo"
    params: { model: gpt-image-2, size: 1024x1024, quality: high }
    output: assets/base.png

  - id: final
    kind: image.edit
    needs: [base]
    inputs:
      images:
        - task: base          # reuse base's output
      mask: ./mask.png
    prompt: "Add a premium background"
    output: assets/final.png

  - id: clip
    kind: video.generate
    prompt: "Slow dolly through a miniature paper city"
    params: { model: sora-2, size: 1280x720, seconds: "4" }
    wait: true
    download: true
    output: assets/clip.mp4

  - id: icon
    kind: model.run
    provider: fal
    model: fal-ai/flux/dev
    prompt: "Small mascot icon"
    params: { image_size: square_hd }
    output: assets/icon.png

ploof run assets.yaml --parallel 4
ploof run assets.yaml --dry-run --output json   # validate the plan, no API calls

Media tasks default to provider: openai; model.run defaults to provider: fal. Relative paths resolve from the manifest's location, and every CLI operation is available as a task kind (image.*, video.*, audio.*, model.run).

Task fields & input references

  • Fields: id, kind, provider, profile, needs, model, prompt, text, output, params, sidecar, inputs, videoId, characterId, name, wait, download, variants, pollIntervalMs, timeoutMs.
  • inputs.images accepts a string, { source }, or { task } (uses that task's first output).
  • inputs.video(s), inputs.mask, inputs.reference, and inputs.audio use the same shape.
  • model.run preserves exact input keys, so inputs.image_url maps to the provider field image_url.
  • Always --dry-run before an expensive batch.
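To see how needs constrains ordering, it can help to group the manifest's tasks into waves in which every task's dependencies are already finished. This is a sketch using Python's standard graphlib, not ploof's scheduler, which may well start each task as soon as its own dependencies complete; the function name is hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def execution_waves(needs: Dict[str, List[str]]) -> List[List[str]]:
    """Group task ids into waves; every task in a wave has all its needs met."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()  # raises graphlib.CycleError on circular needs
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Task ids and needs taken from the example manifest above.
manifest_needs = {"base": [], "final": ["base"], "clip": [], "icon": []}
print(execution_waves(manifest_needs))
```

For the example manifest, base, clip, and icon have no dependencies and land in the first wave, while final waits for base.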

Output and scripting

Output is human-readable in a terminal and machine-readable in a pipe, automatically:

ploof image generate --prompt "..." --output json
ploof run assets.yaml --output jsonl
ploof video list --fields id,outputs,metadata.video.status

Format          When
auto (default)  table in a TTY, compact when piped
table           Human-readable columns
compact         One line per asset, easy to grep
json / jsonl    Programmatic / streaming

Every result is a stable object:

{
  "kind": "video.generate",
  "provider": "openai",
  "outputs": ["assets/clip.mp4"],
  "metadata": { "video": { "id": "video_…", "status": "completed" } }
}

Sidecars: unless disabled, each asset gets a <output>.json beside it recording the operation, prompt, params, outputs, and provider metadata, so runs are reproducible by default. Narrow output with --fields a,b.c, and set the default format via --output, the PLOOF_OUTPUT env var, or ploof config set output ….
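The dotted --fields a,b.c syntax reads naturally as a path into the result object. The sketch below shows that interpretation on the sample result above; it is an illustration only, and both the function name and the flat shape of the returned dict (keyed by the requested path) are assumptions, not ploof's documented output.

```python
from typing import Any, Dict


def select_fields(result: Dict[str, Any], fields: str) -> Dict[str, Any]:
    """Keep only the comma-separated dotted paths, e.g. 'outputs,metadata.video.status'."""
    selected: Dict[str, Any] = {}
    for path in (f.strip() for f in fields.split(",")):
        if not path:
            continue
        value: Any = result
        # Walk one key at a time; a missing key yields None for that path.
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                value = None
                break
            value = value[key]
        selected[path] = value
    return selected
```

The same helper can post-process --output json results in scripts that need only a couple of values.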

For AI agents

The copy-paste setup above is all most agents need. Here's what's happening under the hood. Two commands carry the integration:

ploof learn          # canonical, version-matched agent reference (prints to stdout)
ploof skill install  # install a bootstrap skill into your agent

ploof learn is the source of truth: it documents every command, default, and gotcha for the exact installed version, so an agent never works from stale memory. The installed skill is intentionally tiny: it just points back at ploof learn, keeping guidance in lockstep with the package. Combined with --output json (or jsonl), --fields selection, and predictable exit codes, ploof is built for hands-off automation.
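For hands-off automation, a caller typically checks the exit code first and only then parses stdout. The sketch below is a generic wrapper around any command that prints one JSON document; it assumes ploof with --output json behaves that way, and the function name is hypothetical.

```python
import json
import subprocess
from typing import Any, List


def run_json(argv: List[str]) -> Any:
    """Run a command, fail on a non-zero exit code, and parse stdout as JSON."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} exited {proc.returncode}: {proc.stderr.strip()}"
        )
    return json.loads(proc.stdout)
```

A call such as run_json(["ploof", "video", "list", "--output", "json"]) would then return the parsed result, or raise with ploof's stderr message on failure.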

Configuration

ploof config list
ploof config set output compact
ploof config set defaultParallel 8
ploof config set sidecar false
ploof config reset

Stored at ~/.ploof/config.json, separate from credentials.

Key              Default  Meaning
output           auto     Default output format
defaultParallel  4        Default run concurrency
sidecar          true     Write <file>.json metadata
noColor          false    Disable ANSI color

Reference

Global flags

Flag                        Description
-o, --output <format>       auto, table, compact, json, jsonl
-f, --fields <list>         Comma-separated field selection
-d, --detail                Full detail view
-q, --quiet                 Data only, no hints
--no-color                  Disable color
--verbose                   Debug output to stderr
-y, --yes                   Skip confirmation prompts
-V, --version / -h, --help  Version / help

Run ploof <command> --help for any subcommand.

Environment variables

Variable                                                                  Purpose
PLOOF_OPENAI_API_KEY, OPENAI_API_KEY                                      OpenAI key
PLOOF_OPENAI_ORG, PLOOF_OPENAI_PROJECT, PLOOF_OPENAI_BASE_URL             OpenAI org / project / base URL
PLOOF_FAL_KEY, FAL_KEY                                                    fal.ai key
PLOOF_FAL_KEY_ID + PLOOF_FAL_KEY_SECRET (or FAL_KEY_ID + FAL_KEY_SECRET)  fal.ai split key
PLOOF_OUTPUT                                                              Default output format

Contributing

bun install
bun run dev -- --help     # run locally
bun test                  # unit + integration (mocked, no API spend)
bun run typecheck
bun run lint
bun run build

The default suite runs real ploof commands against a local OpenAI mock plus fal unit tests, so no credits are spent. Live tests are opt-in:

PLOOF_OPENAI_API_KEY=sk-... bun test tests/e2e
PLOOF_FAL_KEY=...           bun test tests/e2e/fal-live.test.ts

Releases publish from GitHub Actions on a v* tag via npm Trusted Publishing. See SPEC.md for the full specification and release details.

MIT © Michael Tromba
