Source: github.com · Topic: developer-tools

Show HN: Prismag – Per-block model routing for the terminal and any IDE

Developer Rufus SD released Prismag, an open-source CLI tool that routes different parts of a single prompt to different AI models using @@tags, enabling per-block model selection in terminals and IDEs like Cursor and Claude Code. The tool addresses the limitation of current AI coding tools that force a single model per conversation or require manual context switching across multiple chats.

Published Jun 22, 2026 · 8 min read

One prompt enters. Each block routes to the right model.

Tag any block with @@model and PRISMAG sends it to the model you chose: planning to Opus, implementation to Composer, summaries to a fast model, all without switching the IDE picker or juggling chats.

prismag> @@opus: design the auth flow   @@composer: implement the middleware

  ── @@opus → claude-4.6-opus-high-thinking ───────────────────────────
  Use short-lived access tokens with rotating refresh tokens because…

  ── @@composer → composer-2.5-fast ───────────────────────────────────
  // middleware/auth.go
  func RequireAuth(next http.Handler) http.Handler { … }

  routed 2 blocks · chained · 1.8s

Today's AI coding tools force a binary choice:

  • Pick one model for the whole conversation, or
  • Open multiple chats and split the work by hand.

Neither matches how you actually work. Planning wants depth (Opus). Implementation wants speed (Composer). Review wants a different lens entirely.

| Without PRISMAG                           | With PRISMAG                            |
|-------------------------------------------|-----------------------------------------|
| One model per chat                        | A model per block, in one prompt        |
| Switch the picker between tasks           | @@opus: … @@composer: … and go          |
| Manual context copy-paste between chats   | Output of block N chains into block N+1 |
| Auto-routing by cost/latency (OpenRouter) | You choose the model per block          |
| YAML/Python pipelines (LangGraph/CrewAI)  | Chat-native @@ syntax, zero config      |

Prompt with @@tags ──▶ parser ──▶ orchestrator ──▶ model backends ──▶ sectioned result
                                       ▲
                                       └── ContextStore (in-memory · or maind)
  • The trigger is @@, not @: a bare @ collides with the IDE's mention menu, while @@ travels as plain text through every chat surface.
  • Routing is deterministic and owned by the CLI + registry.yaml.
  • Blocks run serial + chained by default (output N → context N+1), or --parallel for independent blocks.
  • Context flows through a pluggable store: in-memory by default, or maind for encrypted, cross-session memory.
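
The two execution modes can be sketched as follows. This is a minimal illustration under stated assumptions, not PRISMAG's implementation: `Block`, `run_chained`, `run_parallel`, and the `call_model` callback are hypothetical names, and the way earlier output is appended to the next block's context is a guess at what "output N → context N+1" means in practice.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    alias: str  # e.g. "opus"
    task: str   # text that follows the tag


# call_model(alias, task, context) -> output text. Stands in for a real backend.
ModelFn = Callable[[str, str, str], str]


def run_chained(blocks: List[Block], call_model: ModelFn, shared: str = "") -> List[str]:
    """Run blocks in order; each output is appended to the next block's context."""
    context = shared
    outputs: List[str] = []
    for block in blocks:
        # An exception raised here propagates and stops the rest of the chain.
        out = call_model(block.alias, block.task, context)
        outputs.append(out)
        context = (context + "\n" + out).strip()
    return outputs


def run_parallel(blocks: List[Block], call_model: ModelFn, shared: str = "") -> List[str]:
    """Run independent blocks concurrently; each sees only the shared context."""
    def one(block: Block) -> str:
        try:
            return call_model(block.alias, block.task, shared)
        except Exception as exc:  # assumption: one failed block does not abort the others
            return f"error: {exc}"

    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so results line up with blocks.
        return list(pool.map(one, blocks))
```

Passing the backend in as a callback keeps the sketch independent of any provider SDK; a real orchestrator would also carry per-block metadata such as timing.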

go install github.com/rufus-SD/prismag@latest

git clone https://github.com/rufus-SD/prismag.git
cd prismag && make install
prismag setup

prismag init

prismag run "@@opus: plan the cache layer" "@@composer: implement it"

Or just run prismag with no args to drop into the interactive prismag> session.

PRISMAG works in two ways, from the same global config:

  • CLI / REPL: runs in any terminal, on any OS. Executes each block via provider APIs using your keys. Universal, deterministic.
  • In your IDE: prismag init writes a rule that teaches the agent to route @@ blocks through PRISMAG. Where the IDE supports per-task subagents, each block is dispatched to its own subagent + model using your subscription (no API keys needed).

| Editor         | Rule file                                           | Dispatch                          |
|----------------|-----------------------------------------------------|-----------------------------------|
| Cursor         | .cursor/rules/prismag-routing.mdc + .cursor/agents/ | subagents (any model)             |
| Claude Code    | CLAUDE.md + .claude/agents/                         | subagents (Claude) + API fallback |
| Windsurf       | .windsurf/rules/prismag-routing.md                  | runs via prismag run              |
| GitHub Copilot | .github/copilot-instructions.md                     | runs via prismag run              |
| Cline          | .clinerules/prismag-routing.md                      | runs via prismag run              |
| Roo Code       | .roo/rules/prismag-routing.md                       | runs via prismag run              |
| Aider          | CONVENTIONS.md                                      | runs via prismag run              |
| generic        | .prismag/rules.md                                   | runs via prismag run              |

prismag connect cursor      # or: claude, windsurf, copilot, cline, roo, aider, generic

Subagent dispatch gives true per-block model switching where the editor exposes it (Cursor, Claude Code). Everywhere else, the agent runs prismag run and shows the sectioned output verbatim: same routing, same result.

@@<alias>: <task>
context shared with every block goes here, before the first tag

@@opus: review the security implications of this auth module
@@composer: write the unit tests for AuthService
@@fast: summarize the diff in 3 bullets

  • @@alias is case-insensitive and maps to a model via registry.yaml.
  • Text before the first @@ is shared context for all blocks.
  • Serial + chained by default; --parallel for independent blocks.
  • Chained runs fail fast; parallel runs tolerate partial failure.
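
The syntax rules above are simple enough to sketch as a small parser. The following is an illustrative reading of them, not PRISMAG's actual parser: the function name `parse_prompt` and the set of characters allowed in an alias (letters, digits, `_`, `.`, `-`) are assumptions.

```python
import re
from typing import List, Tuple

# Assumed alias grammar: letters, digits, "_", "." and "-", followed by a colon.
TAG = re.compile(r"@@([A-Za-z0-9_.\-]+):")


def parse_prompt(prompt: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split a prompt into (shared_context, [(alias, task), ...])."""
    matches = list(TAG.finditer(prompt))
    if not matches:
        # No tags at all: the whole prompt is untagged text.
        return prompt.strip(), []

    # Everything before the first tag is context shared by every block.
    shared = prompt[: matches[0].start()].strip()

    blocks: List[Tuple[str, str]] = []
    for i, m in enumerate(matches):
        # A block's task runs until the next tag, or to the end of the prompt.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(prompt)
        alias = m.group(1).lower()  # aliases are case-insensitive
        blocks.append((alias, prompt[m.end():end].strip()))
    return shared, blocks
```

For example, `parse_prompt("@@opus: design the auth flow   @@composer: implement the middleware")` yields an empty shared context and two blocks, one per tag.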

aliases:
  opus:
    model: claude-opus-4-6        # concrete id + offline fallback
    match: claude-opus-4-6        # family resolved against the live model list
    provider: anthropic
    agent: opus-planner           # subagent used when routing in-IDE
    description: Deep reasoning, architecture, security review
  composer:
    model: composer-2.5-fast
    provider: cursor
    agent: composer-implementer
    description: Fast implementation, multi-file edits
  fast:
    model: gpt-5.3-codex
    provider: openai
    description: Cheap, quick summaries and simple transforms

Two optional top-level keys remove friction for everyday use:

default: opus4.8       # untagged prompts route here, so `prismag "do X"` needs no @@tag
exec:                  # CLI tool-loop defaults — set permissions once, no flags per run
  enabled: true        # let blocks act on this machine (write files, …)
  shell: true          # also allow run_shell
  approve: ask         # ask = confirm each action y/N (default) · auto = no prompt

The same model has a different id in every context: claude-opus-4-8 on the Anthropic API, claude-opus-4-8-thinking-high in Cursor, a local tag in Ollama. Pinning one string breaks the moment a provider renames or bumps a model.

So PRISMAG treats an alias as a family and resolves it to a currently-valid id from the live model list for the active context (queried with your keys in the CLI, cached 12h; the agent-maintained cache in the IDE). It picks the best match deterministically, self-heals across renames, and falls back to the pinned model when offline. Set match: to make the family explicit; otherwise model doubles as the family. Inspect what's available any time with prismag models.
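
A rough sketch of family resolution with an offline fallback is shown below. The README does not spell out how "best match" is ranked, so the rule used here (prefer an exact id, otherwise the shortest id containing the family string, with ties broken alphabetically) is purely an assumption chosen to be deterministic. `resolve_model` is a hypothetical name.

```python
from typing import Dict, List, Optional


def resolve_model(entry: Dict[str, str], live_models: Optional[List[str]]) -> str:
    """Resolve a registry entry to a concrete model id.

    entry       -- one alias from registry.yaml, e.g. {"model": ..., "match": ...}
    live_models -- ids currently offered in this context, or None when offline
    """
    pinned = entry["model"]
    family = entry.get("match", pinned)  # "model" doubles as the family if "match" is unset

    if not live_models:
        return pinned  # offline: fall back to the pinned id

    candidates = [m for m in live_models if family in m]
    if not candidates:
        return pinned  # nothing in the live list belongs to this family

    if family in candidates:
        return family  # exact id is still valid

    # Assumed ranking: shortest id first, then alphabetical, so the result
    # does not depend on the order in which the provider lists its models.
    return sorted(candidates, key=lambda m: (len(m), m))[0]
```

Because the ranking depends only on the candidate strings, the same live list always resolves to the same id, which is the property the paragraph above describes.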

| Command                | What it does                                                                                    |
|------------------------|-------------------------------------------------------------------------------------------------|
| prismag                | Interactive prismag> session (or onboarding on first run)                                       |
| prismag setup          | First-time setup: keys, model discovery, starter registry                                       |
| prismag init [tool]    | Wire routing into this project (auto-detects the editor)                                        |
| prismag connect <tool> | Write the integration rule (+ subagents where supported)                                        |
| prismag run "@@..."    | Route and execute a tagged prompt (untagged → default: alias; --exec / exec: lets blocks act)   |
| prismag route "@@..."  | Show the delegation plan without executing (--json too)                                         |
| prismag list           | List @@aliases with availability marks                                                          |
| prismag models         | Show models available right now                                                                 |
| prismag doctor         | Diagnose keys, registry, and environment                                                        |
| prismag sessions       | List saved REPL session transcripts                                                             |
| prismag resume [id]    | Reopen a past session with its context                                                          |

PRISMAG calls provider APIs directly: keys go straight to the vendor, never to a gateway. Keys are read from the environment, from a ~/.config/prismag/.env file, or from maind (stored encrypted) when present.

ANTHROPIC_API_KEY only:            + OPENAI_API_KEY:
  @@opus      ✓ ready                @@opus      ✓ ready
  @@fast      ✗ needs OPENAI_API_KEY @@fast      ✓ ready

Inside an IDE that dispatches subagents, blocks route via your subscription — no API keys required.
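
For the CLI case, the readiness marks shown above boil down to checking whether each alias's provider has a key in the environment. The sketch below illustrates that check; the provider-to-variable mapping covers only the two variables named in this section, and `availability` is a hypothetical helper, not a PRISMAG API.

```python
from typing import Dict, Mapping

# Only the two key names that appear in this document; other providers are unknown here.
KEY_FOR_PROVIDER = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}
# Local backends need no API key.
LOCAL_PROVIDERS = {"ollama", "vllm"}


def availability(aliases: Mapping[str, Mapping[str, str]],
                 env: Mapping[str, str]) -> Dict[str, str]:
    """Return a readiness mark for each alias, given the current environment."""
    marks: Dict[str, str] = {}
    for name, entry in aliases.items():
        provider = entry["provider"]
        if provider in LOCAL_PROVIDERS:
            marks[name] = "✓ ready"
            continue
        key = KEY_FOR_PROVIDER.get(provider)
        if key is None:
            marks[name] = "? no key mapping"  # provider not covered by this sketch
        elif env.get(key):
            marks[name] = "✓ ready"
        else:
            marks[name] = f"✗ needs {key}"
    return marks
```

In real use `env` would be `os.environ`; passing it as a parameter keeps the function easy to test.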

Route any block to a model running on your own machine — no API key, no cloud, $0 per token. Both Ollama and vLLM expose an OpenAI-compatible API, so PRISMAG talks to them natively (streaming included).

ollama pull qwen2.5-coder:7b        # serves on http://localhost:11434

aliases:
  local:
    model: qwen2.5-coder:7b
    provider: ollama                # or: vllm
    description: Local model — private, free, offline

prismag run "@@local: refactor this function"   # runs entirely on your box

Endpoints default to http://localhost:11434/v1 (Ollama) and http://localhost:8000/v1 (vLLM); override per-alias with base_url or globally with OLLAMA_BASE_URL / VLLM_BASE_URL. Mix freely, planning locally and implementing in the cloud: @@local: draft then @@opus: review.
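
The endpoint lookup described above can be sketched as a three-step fallback. The defaults and variable names come from this section; the precedence (per-alias `base_url` first, then the global environment variable, then the default) is an assumption, and `base_url_for` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

DEFAULTS = {
    "ollama": "http://localhost:11434/v1",
    "vllm": "http://localhost:8000/v1",
}
ENV_VARS = {
    "ollama": "OLLAMA_BASE_URL",
    "vllm": "VLLM_BASE_URL",
}


def base_url_for(entry: Mapping[str, str],
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the endpoint for a local alias (provider must be "ollama" or "vllm")."""
    env = os.environ if env is None else env
    provider = entry["provider"]
    # Assumed precedence: per-alias setting, then global env var, then default.
    return (
        entry.get("base_url")
        or env.get(ENV_VARS[provider])
        or DEFAULTS[provider]
    )
```

With no overrides, `base_url_for({"provider": "ollama"}, env={})` returns the Ollama default listed above.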

By default a CLI block returns text — PRISMAG is a router, not an agent. Turn on exec and a block can take real actions through a small, permission-gated tool loop: it asks before every step, so you grant rights action-by-action.

Set it once in registry.yaml (exec.enabled: true) plus a default: alias, and the everyday flow needs no tag and no flags, like an agent that asks first:

prismag "create a folder on my desktop named poems"
  ⚠ allow run_shell: mkdir -p ~/Desktop/poems ? [y/N] y
  ✓ run_shell: mkdir -p ~/Desktop/poems

Prefer per-run control instead? Skip the config and pass --exec (flags always override config):

prismag run --exec "@@opus4.8: create ~/Desktop/poem.txt with a short flower poem"

  • Tools: write_file, read_file, and run_shell (exec.shell: true / --exec-shell).
  • Every action needs approval; approve: auto (or --yes) skips the prompt (use with care), and a non-interactive shell denies by default. root: confines file actions to one tree.
  • Destructive commands are refused by default: rm -rf /, mkfs, dd of=/dev/…, fork bombs, shutdown, etc. are blocked even if approved, so a careless y (or approve: auto) can't wreck your machine. Ordinary deletes still work via the normal prompt. Override only with exec.allow_destructive: true.
  • The protocol is provider-agnostic (a fenced prismag JSON action), so it works on Anthropic, OpenAI, OpenRouter, and local Ollama/vLLM models alike.
  • CLI-only by design: inside an IDE the agent already has its own tools, so PRISMAG just emits a delegation plan there.
  • In the prismag> REPL, toggle it with :exec (:exec shell, :exec yes, :exec off).
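
The approval rules in the list above can be sketched as a single decision function. This is an illustration only: the regular expressions are a small, non-exhaustive stand-in for whatever denylist PRISMAG actually uses, and `decide` with its parameters is a hypothetical name.

```python
import re

# Illustrative patterns only; a real denylist would be broader and more careful.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-\w*r\w*\s+/\s*$"),  # rm -rf /  (root itself, not a subpath)
    re.compile(r"\bmkfs\b"),                 # mkfs, mkfs.ext4, ...
    re.compile(r"\bdd\b.*\bof=/dev/"),       # dd writing to a device
    re.compile(r"\bshutdown\b"),
    re.compile(r":\(\)\s*\{"),               # classic shell fork bomb prefix
]


def decide(command: str,
           approve: str = "ask",
           interactive: bool = True,
           allow_destructive: bool = False,
           answer: str = "") -> bool:
    """Return True if the shell command may run.

    approve           -- "ask" (confirm each action) or "auto" (no prompt)
    interactive       -- False when there is no terminal to prompt on
    allow_destructive -- mirrors exec.allow_destructive
    answer            -- what the user typed at the y/N prompt
    """
    # Destructive commands are refused first, so approval cannot override them.
    if not allow_destructive and any(p.search(command) for p in DESTRUCTIVE):
        return False
    if approve == "auto":
        return True
    if not interactive:
        return False  # nobody to ask: deny by default
    return answer.strip().lower() == "y"
```

Checking the denylist before the approval mode is what makes a careless `y` or `approve: auto` harmless for the listed commands, while an ordinary delete such as `rm -rf ./build` still goes through the normal prompt.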

PRISMAG already is the router, so it calls provider REST APIs directly with no self-hosted proxy, DB, or admin UI to trust and patch. That keeps the dependency/supply-chain surface tiny — direct APIs, a single static binary.

PRISMAG is a routing protocol any agent can speak, with no SDK required. Shell out to prismag route --json to get a deterministic plan (which model runs which block), then dispatch with your own model access; or prismag run --api to have PRISMAG execute and return the result. See INTEGRATIONS.md.
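
An agent written in Python could shell out for a plan roughly as follows. The JSON schema of the plan is not described here, so the sketch returns the parsed object untouched; the argument order (`route --json <prompt>`) is an assumption, and `plan_for` and `default_runner` are hypothetical names.

```python
import json
import subprocess
from typing import Any, Callable, List


def default_runner(argv: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def plan_for(prompt: str,
             runner: Callable[[List[str]], str] = default_runner) -> Any:
    """Ask PRISMAG for a delegation plan without executing anything."""
    # The structure of the returned plan is defined by PRISMAG, not by this sketch.
    return json.loads(runner(["prismag", "route", "--json", prompt]))
```

The injectable `runner` lets a caller substitute a stub in tests or route the call through its own sandbox.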

maind is the optional memory backend: an encrypted, local-first store the CLI and your IDE agent share. With both wired in, context survives across blocks, sessions, and editors.

See CONTRIBUTING.md.

See SECURITY.md for credential handling and vulnerability reporting.
