{"slug": "show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide", "title": "Show HN: Prismag – Per-block model routing for the terminal and any IDE", "summary": "Developer Rufus SD released Prismag, an open-source CLI tool that routes different parts of a single prompt to different AI models using @@tags, enabling per-block model selection in terminals and IDEs like Cursor and Claude Code. The tool addresses the limitation of current AI coding tools that force a single model per conversation or require manual context switching across multiple chats.", "body_md": "**One prompt enters. Each block routes to the right model.**\n\nTag any block with `@@model` and PRISMAG sends it to the model you chose —\nplanning to Opus, implementation to Composer, summaries to a fast model —\nwithout switching the IDE picker or juggling chats.\n\n```\nprismag> @@opus: design the auth flow   @@composer: implement the middleware\n\n  ── @@opus → claude-4.6-opus-high-thinking ───────────────────────────\n  Use short-lived access tokens with rotating refresh tokens because…\n\n  ── @@composer → composer-2.5-fast ───────────────────────────────────\n  // middleware/auth.go\n  func RequireAuth(next http.Handler) http.Handler { … }\n\n  routed 2 blocks · chained · 1.8s\n```\n\nToday's AI coding tools force a binary choice:\n\n- Pick **one model** for the whole conversation, or\n- Open **multiple chats** and split the work by hand.\n\nNeither matches how you actually work. Planning wants depth (Opus). Implementation wants speed (Composer). 
Review wants a different lens entirely.\n\n| Without PRISMAG | With PRISMAG |\n|---|---|\n| One model per chat | A model per block, in one prompt |\n| Switch the picker between tasks | `@@opus:` … `@@composer:` … and go |\n| Manual context copy-paste between chats | Output of block N chains into block N+1 |\n| Auto-routing by cost/latency (OpenRouter) | You choose the model per block |\n| YAML/Python pipelines (LangGraph/CrewAI) | Chat-native `@@` syntax, zero config |\n\n```\nPrompt with @@tags ──▶ parser ──▶ orchestrator ──▶ model backends ──▶ sectioned result\n                                       ▲\n                                       └── ContextStore (in-memory · or maind)\n```\n\n- The trigger is `@@`, not `@` — a bare `@` collides with the IDE's mention menu. `@@` travels as plain text through every chat surface.\n- Routing is **deterministic** and owned by the CLI + `registry.yaml`.\n- Blocks run **serial + chained** by default (output N → context N+1), or `--parallel` for independent blocks.\n- Context flows through a pluggable store — in-memory by default, or [maind](https://github.com/rufus-SD/maind) for encrypted, cross-session memory.\n\n```\n# Go 1.26+\ngo install github.com/rufus-SD/prismag@latest\n\n# or clone and build\ngit clone https://github.com/rufus-SD/prismag.git\ncd prismag && make install\n\n# 1. Guided onboarding — environment, optional API keys, model discovery, registry\nprismag setup\n\n# 2. Wire routing into your editor (auto-detects the tool)\nprismag init\n\n# 3. Route a prompt\nprismag run \"@@opus: plan the cache layer\" \"@@composer: implement it\"\n```\n\nOr just run `prismag` with no args to drop into the interactive `prismag>` session.\n\nPRISMAG works in **two ways**, from the same global config:\n\n- **CLI / REPL** — runs in any terminal, on any OS. Executes each block via provider APIs using your keys. 
Universal, deterministic.\n- **In your IDE** — `prismag init` writes a rule that teaches the agent to route `@@` blocks through PRISMAG. Where the IDE supports per-task subagents, each block is dispatched to its own subagent + model using your subscription (no API keys needed).\n\n| Editor | Rule file | Dispatch |\n|---|---|---|\n| Cursor | `.cursor/rules/prismag-routing.mdc` + `.cursor/agents/` | subagents (any model) |\n| Claude Code | `CLAUDE.md` + `.claude/agents/` | subagents (Claude) + API fallback |\n| Windsurf | `.windsurf/rules/prismag-routing.md` | runs via `prismag run` |\n| GitHub Copilot | `.github/copilot-instructions.md` | runs via `prismag run` |\n| Cline | `.clinerules/prismag-routing.md` | runs via `prismag run` |\n| Roo Code | `.roo/rules/prismag-routing.md` | runs via `prismag run` |\n| Aider | `CONVENTIONS.md` | runs via `prismag run` |\n| generic | `.prismag/rules.md` | runs via `prismag run` |\n\n```\nprismag connect cursor      # or: claude, windsurf, copilot, cline, roo, aider, generic\n```\n\nSubagent dispatch gives true per-block model switching where the editor exposes\nit (Cursor, Claude Code). Everywhere else, the agent runs `prismag run` and shows\nthe sectioned output verbatim — same routing, same result.\n\n```\n@@<alias>: <task>\ncontext shared with every block goes here, before the first tag\n\n@@opus: review the security implications of this auth module\n@@composer: write the unit tests for AuthService\n@@fast: summarize the diff in 3 bullets\n```\n\n- `@@alias` is case-insensitive and maps to a model via `registry.yaml`.\n- Text before the first `@@` is shared context for all blocks. 
\n- **Serial + chained** by default; `--parallel` for independent blocks.\n- Chained runs fail fast; parallel runs tolerate partial failure.\n\n```\naliases:\n  opus:\n    model: claude-opus-4-6        # concrete id + offline fallback\n    match: claude-opus-4-6        # family resolved against the live model list\n    provider: anthropic\n    agent: opus-planner           # subagent used when routing in-IDE\n    description: Deep reasoning, architecture, security review\n  composer:\n    model: composer-2.5-fast\n    provider: cursor\n    agent: composer-implementer\n    description: Fast implementation, multi-file edits\n  fast:\n    model: gpt-5.3-codex\n    provider: openai\n    description: Cheap, quick summaries and simple transforms\n```\n\nTwo optional top-level keys remove friction for everyday use:\n\n```\ndefault: opus4.8       # untagged prompts route here, so `prismag \"do X\"` needs no @@tag\nexec:                  # CLI tool-loop defaults — set permissions once, no flags per run\n  enabled: true        # let blocks act on this machine (write files, …)\n  shell: true          # also allow run_shell\n  approve: ask         # ask = confirm each action y/N (default) · auto = no prompt\n  # root: ~/Desktop    # optional: confine file actions to one tree\n```\n\nThe same model has a different id in every context — `claude-opus-4-8` on the\nAnthropic API, `claude-opus-4-8-thinking-high` in Cursor, a local tag in Ollama.\nPinning one string breaks the moment a provider renames or bumps a model.\n\nSo PRISMAG treats an alias as a **family** and resolves it to a currently-valid id\nfrom the live model list for the active context (queried with your keys in the\nCLI, cached 12h; the agent-maintained cache in the IDE). It picks the best match\ndeterministically, self-heals across renames, and falls back to the pinned `model`\nwhen offline. Set `match:` to make the family explicit; otherwise `model` doubles\nas it. 
Inspect what's available any time with `prismag models`.\n\n| Command | What it does |\n|---|---|\n| `prismag` | Interactive `prismag>` session (or onboarding on first run) |\n| `prismag setup` | First-time setup: keys, model discovery, starter registry |\n| `prismag init [tool]` | Wire routing into this project (auto-detects the editor) |\n| `prismag connect <tool>` | Write the integration rule (+ subagents where supported) |\n| `prismag run \"@@...\"` | Route and execute a tagged prompt (untagged → `default:` alias; `--exec` / `exec:` lets blocks act) |\n| `prismag route \"@@...\"` | Show the delegation plan without executing (`--json` too) |\n| `prismag list` | List `@@aliases` with availability marks |\n| `prismag models` | Show models available right now |\n| `prismag doctor` | Diagnose keys, registry, and environment |\n| `prismag sessions` | List saved REPL session transcripts |\n| `prismag resume [id]` | Reopen a past session with its context |\n\nPRISMAG calls provider APIs **directly** — keys go straight to the vendor, never\nto a gateway. Keys are read from the environment or from `~/.config/prismag/.env`, or\nstored encrypted in [maind](https://github.com/rufus-SD/maind) when present.\n\n```\nANTHROPIC_API_KEY only:            + OPENAI_API_KEY:\n  @@opus      ✓ ready                @@opus      ✓ ready\n  @@fast      ✗ needs OPENAI_API_KEY @@fast      ✓ ready\n```\n\nInside an IDE that dispatches subagents, blocks route via your subscription — no API keys required.\n\nRoute any block to a model running on your own machine — no API key, no cloud,\n$0 per token. 
Both [Ollama](https://ollama.com) and\n[vLLM](https://github.com/vllm-project/vllm) expose an OpenAI-compatible API, so\nPRISMAG talks to them natively (streaming included).\n\n```\nollama pull qwen2.5-coder:7b        # serves on http://localhost:11434\n\naliases:\n  local:\n    model: qwen2.5-coder:7b\n    provider: ollama                # or: vllm\n    # base_url: http://localhost:11434/v1   # optional override\n    description: Local model — private, free, offline\n\nprismag run \"@@local: refactor this function\"   # runs entirely on your box\n```\n\nEndpoints default to `http://localhost:11434/v1` (Ollama) and\n`http://localhost:8000/v1` (vLLM); override per-alias with `base_url` or globally\nwith `OLLAMA_BASE_URL` / `VLLM_BASE_URL`. Mix freely — plan locally, implement in\nthe cloud: `@@local: draft` then `@@opus: review`.\n\nBy default a CLI block returns **text** — PRISMAG is a router, not an agent. Turn\non exec and a block can take real actions through a small, **permission-gated**\ntool loop: it asks before every step, so you grant rights action-by-action.\n\nSet it once in `registry.yaml` (`exec.enabled: true`) plus a `default:` alias, and\nthe everyday flow needs no tag and no flags — like an agent that asks first:\n\n```\nprismag \"create a folder on my desktop named poems\"\n  ⚠ allow run_shell: mkdir -p ~/Desktop/poems ? [y/N] y\n  ✓ run_shell: mkdir -p ~/Desktop/poems\n```\n\nPrefer per-run control instead? Skip the config and pass `--exec` (flags always\noverride config):\n\n```\nprismag run --exec \"@@opus4.8: create ~/Desktop/poem.txt with a short flower poem\"\n```\n\n- Tools: `write_file`, `read_file`, and `run_shell` (`exec.shell: true` / `--exec-shell`).\n- Every action needs approval; `approve: auto` (or `--yes`) skips the prompt (use with care), and a non-interactive shell denies by default. `root:` confines file actions to one tree. 
\n- **Destructive commands are refused by default** — `rm -rf /`, `mkfs`, `dd of=/dev/…`, fork bombs, `shutdown`, etc. are blocked *even if approved*, so a careless `y` (or `approve: auto`) can't wreck your machine. Ordinary deletes still work via the normal prompt. Override only with `exec.allow_destructive: true`.\n- The protocol is provider-agnostic (a fenced `prismag` JSON action), so it works on Anthropic, OpenAI, OpenRouter, **and local** Ollama/vLLM models alike.\n- **CLI-only by design**: inside an IDE the agent already has its own tools, so PRISMAG just emits a delegation plan there. In the `prismag>` REPL, toggle it with `:exec` (`:exec shell`, `:exec yes`, `:exec off`).\n\nPRISMAG already *is* the router, so it calls provider REST APIs directly with no\nself-hosted proxy, DB, or admin UI to trust and patch. That keeps the\ndependency/supply-chain surface tiny — direct APIs, a single static binary.\n\nPRISMAG is a routing protocol any agent can speak — no SDK required. Shell out to\n`prismag route --json` to get a deterministic plan (which model runs which block),\nthen dispatch with your own model access; or `prismag run --api` to have PRISMAG\nexecute and return the result. See [INTEGRATIONS.md](https://github.com/rufus-SD/prismag/blob/main/INTEGRATIONS.md).\n\n[maind](https://github.com/rufus-SD/maind) is the optional memory backend: an\nencrypted, local-first store the CLI and your IDE agent share. 
With both wired in,\ncontext survives across blocks, sessions, and editors.\n\nSee [CONTRIBUTING.md](https://github.com/rufus-SD/prismag/blob/main/CONTRIBUTING.md).\n\nSee [SECURITY.md](https://github.com/rufus-SD/prismag/blob/main/SECURITY.md) for credential handling and vulnerability reporting.", "url": "https://wpnews.pro/news/show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide", "canonical_source": "https://github.com/rufus-SD/prismag", "published_at": "2026-06-22 09:00:55+00:00", "updated_at": "2026-06-22 09:09:46.200191+00:00", "lang": "en", "topics": ["developer-tools", "ai-tools", "large-language-models", "ai-agents"], "entities": ["Prismag", "Rufus SD", "Cursor", "Claude Code", "Windsurf", "GitHub Copilot", "Cline", "Roo Code"], "alternates": {"html": "https://wpnews.pro/news/show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide", "markdown": "https://wpnews.pro/news/show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide.md", "text": "https://wpnews.pro/news/show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide.txt", "jsonld": "https://wpnews.pro/news/show-hn-prismag-per-block-model-routing-for-the-terminal-and-any-ide.jsonld"}}