One prompt enters. Each block routes to the right model.
Tag any block with `@@model` and PRISMAG sends it to the model you chose (planning to Opus, implementation to Composer, summaries to a fast model) without switching the IDE picker or juggling chats.
prismag> @@opus: design the auth flow @@composer: implement the middleware
── @@opus → claude-4.6-opus-high-thinking ───────────────────────────
Use short-lived access tokens with rotating refresh tokens because…
── @@composer → composer-2.5-fast ───────────────────────────────────
// middleware/auth.go
func RequireAuth(next http.Handler) http.Handler { … }
routed 2 blocks · chained · 1.8s
Today's AI coding tools force a binary choice:
- Pick one model for the whole conversation, or
- Open multiple chats and split the work by hand.
Neither matches how you actually work. Planning wants depth (Opus). Implementation wants speed (Composer). Review wants a different lens entirely.
| Without PRISMAG | With PRISMAG |
|---|---|
| One model per chat | A model per block, in one prompt |
| Switch the picker between tasks | @@opus: … @@composer: … and go |
| Manual context copy-paste between chats | Output of block N chains into block N+1 |
| Auto-routing by cost/latency (OpenRouter) | You choose the model per block |
| YAML/Python pipelines (LangGraph/CrewAI) | Chat-native @@ syntax, zero config |
Prompt with @@tags ──▶ parser ──▶ orchestrator ──▶ model backends ──▶ sectioned result
▲
└── ContextStore (in-memory · or maind)
- The trigger is `@@`, not `@`: a bare `@` collides with the IDE's mention menu, while `@@` travels as plain text through every chat surface.
- Routing is deterministic and owned by the CLI + `registry.yaml`.
- Blocks run serial + chained by default (output N → context N+1), or `--parallel` for independent blocks.
- Context flows through a pluggable store: in-memory by default, or `maind` for encrypted, cross-session memory.
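The two execution modes can be sketched as follows. This is a minimal illustration, not PRISMAG's actual orchestrator: the `Backend` function type, `Result`, `RunChained`, `RunParallel`, and `ErrBackend` are hypothetical names, and the way prior output is appended to the context is an assumption. It follows the behaviour this README describes: chained runs pass output N into block N+1 and fail fast, while parallel runs see only the shared context and tolerate partial failure.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// ErrBackend is a hypothetical error used by the fake backend below.
var ErrBackend = errors.New("backend unavailable")

// Backend is a hypothetical stand-in for a model call: it receives the
// block's task and the context accumulated so far, and returns the output.
type Backend func(task, context string) (string, error)

// Result holds one block's output, or the error that block produced.
type Result struct {
	Output string
	Err    error
}

// RunChained runs tasks in order and feeds each output into the context of
// the next task. It stops at the first error (fail fast).
func RunChained(tasks []string, shared string, call Backend) ([]Result, error) {
	results := make([]Result, 0, len(tasks))
	context := shared
	for i, task := range tasks {
		out, err := call(task, context)
		if err != nil {
			return results, fmt.Errorf("block %d: %w", i+1, err)
		}
		results = append(results, Result{Output: out})
		// Assumption: chaining is plain concatenation of earlier outputs.
		context = strings.TrimSpace(context + "\n" + out)
	}
	return results, nil
}

// RunParallel runs independent tasks concurrently with only the shared
// context. A failing block is recorded and does not cancel the others.
func RunParallel(tasks []string, shared string, call Backend) []Result {
	results := make([]Result, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task string) {
			defer wg.Done()
			out, err := call(task, shared)
			results[i] = Result{Output: out, Err: err}
		}(i, task)
	}
	wg.Wait()
	return results
}

func main() {
	// fake is an offline backend so the sketch runs without network access.
	fake := func(task, context string) (string, error) {
		if task == "fail" {
			return "", ErrBackend
		}
		return fmt.Sprintf("done(%s, ctx=%d bytes)", task, len(context)), nil
	}
	chained, err := RunChained([]string{"plan", "implement"}, "shared notes", fake)
	fmt.Println(chained, err)
	fmt.Println(RunParallel([]string{"summarize", "fail"}, "shared notes", fake))
}
```

Each goroutine in `RunParallel` writes to its own slice index, so no mutex is needed; the chained loop is deliberately sequential because every block depends on the previous output.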
go install github.com/rufus-SD/prismag@latest
git clone https://github.com/rufus-SD/prismag.git
cd prismag && make install
prismag setup
prismag init
prismag run "@@opus: plan the cache layer" "@@composer: implement it"
Or just run `prismag` with no args to drop into the interactive `prismag>` session.
PRISMAG works in two ways, from the same global config:
- **CLI / REPL**: runs in any terminal, on any OS. Executes each block via provider APIs using your keys. Universal, deterministic.
- **In your IDE**: `prismag init` writes a rule that teaches the agent to route `@@` blocks through PRISMAG. Where the IDE supports per-task subagents, each block is dispatched to its own subagent + model using your subscription (no API keys needed).
| Editor | Rule file | Dispatch |
|---|---|---|
| Cursor | `.cursor/rules/prismag-routing.mdc` + `.cursor/agents/` | subagents (any model) |
| Claude Code | `CLAUDE.md` + `.claude/agents/` | subagents (Claude) + API fallback |
| Windsurf | `.windsurf/rules/prismag-routing.md` | runs via `prismag run` |
| GitHub Copilot | `.github/copilot-instructions.md` | runs via `prismag run` |
| Cline | `.clinerules/prismag-routing.md` | runs via `prismag run` |
| Roo Code | `.roo/rules/prismag-routing.md` | runs via `prismag run` |
| Aider | `CONVENTIONS.md` | runs via `prismag run` |
| generic | `.prismag/rules.md` | runs via `prismag run` |
prismag connect cursor # or: claude, windsurf, copilot, cline, roo, aider, generic
Subagent dispatch gives true per-block model switching where the editor exposes it (Cursor, Claude Code). Everywhere else, the agent runs `prismag run` and shows the sectioned output verbatim: same routing, same result.
@@<alias>: <task>
context shared with every block goes here, before the first tag
@@opus: review the security implications of this auth module
@@composer: write the unit tests for AuthService
@@fast: summarize the diff in 3 bullets
- `@@alias` is case-insensitive and maps to a model via `registry.yaml`.
- Text before the first `@@` is shared context for all blocks.
- Serial + chained by default; `--parallel` for independent blocks.
- Chained runs fail fast; parallel runs tolerate partial failure.
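The syntax rules above are small enough to sketch as a parser. The following is an illustration under stated assumptions, not PRISMAG's own parser: `Block`, `ParsePrompt`, and the set of characters allowed in an alias are all hypothetical. It lowercases the alias (tags are case-insensitive) and treats everything before the first `@@` tag as shared context.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Block is one tagged unit of work: the alias and the task routed to it.
type Block struct {
	Alias string
	Task  string
}

// tagRe matches "@@alias:". The allowed alias characters are an assumption.
var tagRe = regexp.MustCompile(`@@([A-Za-z0-9._-]+):`)

// ParsePrompt splits a prompt into shared context (text before the first
// tag) and the tagged blocks that follow, in order.
func ParsePrompt(prompt string) (shared string, blocks []Block) {
	locs := tagRe.FindAllStringSubmatchIndex(prompt, -1)
	if len(locs) == 0 {
		// No tags at all: the whole prompt is untagged text.
		return strings.TrimSpace(prompt), nil
	}
	shared = strings.TrimSpace(prompt[:locs[0][0]])
	for i, loc := range locs {
		// A block's task runs until the next tag or the end of the prompt.
		end := len(prompt)
		if i+1 < len(locs) {
			end = locs[i+1][0]
		}
		blocks = append(blocks, Block{
			Alias: strings.ToLower(prompt[loc[2]:loc[3]]),
			Task:  strings.TrimSpace(prompt[loc[1]:end]),
		})
	}
	return shared, blocks
}

func main() {
	prompt := "Go service behind an HTTP router.\n" +
		"@@Opus: design the auth flow\n" +
		"@@composer: implement the middleware"
	shared, blocks := ParsePrompt(prompt)
	fmt.Printf("shared: %q\n", shared)
	for _, b := range blocks {
		fmt.Printf("@@%s -> %q\n", b.Alias, b.Task)
	}
}
```

Mapping each parsed alias to a concrete model is a separate step that goes through `registry.yaml`.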
aliases:
  opus:
    model: claude-opus-4-6        # concrete id + offline fallback
    match: claude-opus-4-6        # family resolved against the live model list
    provider: anthropic
    agent: opus-planner           # subagent used when routing in-IDE
    description: Deep reasoning, architecture, security review
  composer:
    model: composer-2.5-fast
    provider: cursor
    agent: composer-implementer
    description: Fast implementation, multi-file edits
  fast:
    model: gpt-5.3-codex
    provider: openai
    description: Cheap, quick summaries and simple transforms
Two optional top-level keys remove friction for everyday use:
default: opus4.8    # untagged prompts route here, so `prismag "do X"` needs no @@tag
exec:               # CLI tool-loop defaults: set permissions once, no flags per run
  enabled: true     # let blocks act on this machine (write files, …)
  shell: true       # also allow run_shell
  approve: ask      # ask = confirm each action y/N (default) · auto = no prompt
The same model has a different id in every context: `claude-opus-4-8` on the Anthropic API, `claude-opus-4-8-thinking-high` in Cursor, a local tag in Ollama. Pinning one string breaks the moment a provider renames or bumps a model.

So PRISMAG treats an alias as a family and resolves it to a currently valid id from the live model list for the active context. In the CLI that list is queried with your keys and cached for 12h; in the IDE it comes from the agent-maintained cache. PRISMAG picks the best match deterministically, self-heals across renames, and falls back to the pinned model when offline. Set `match:` to make the family explicit; otherwise `model` doubles as it. Inspect what's available any time with `prismag models`.
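A rough sketch of family resolution follows. The README does not specify how the "best match" is ranked, so the rule used here (case-insensitive substring match, shortest id wins, ties broken alphabetically) is purely an assumption, and `Resolve` is a hypothetical function name. What it does illustrate is the deterministic choice and the fallback to the pinned id when the live list is empty or has no match.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Resolve returns a live model id for the family, or the pinned id when the
// live list is empty (for example offline) or contains no match.
// Assumption: substring match, shortest id first, then alphabetical order.
func Resolve(family, pinned string, live []string) string {
	var candidates []string
	for _, id := range live {
		if strings.Contains(strings.ToLower(id), strings.ToLower(family)) {
			candidates = append(candidates, id)
		}
	}
	if len(candidates) == 0 {
		return pinned
	}
	// Sorting makes the choice independent of the order of the live list.
	sort.Slice(candidates, func(i, j int) bool {
		if len(candidates[i]) != len(candidates[j]) {
			return len(candidates[i]) < len(candidates[j])
		}
		return candidates[i] < candidates[j]
	})
	return candidates[0]
}

func main() {
	live := []string{"gpt-5.3-codex", "claude-opus-4-8-thinking-high"}
	fmt.Println(Resolve("claude-opus-4-8", "claude-opus-4-8", live))
	fmt.Println(Resolve("claude-opus-4-8", "claude-opus-4-8", nil))
}
```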
| Command | What it does |
|---|---|
| `prismag` | Interactive `prismag>` session (or onboarding on first run) |
| `prismag setup` | First-time setup: keys, model discovery, starter registry |
| `prismag init [tool]` | Wire routing into this project (auto-detects the editor) |
| `prismag connect <tool>` | Write the integration rule (+ subagents where supported) |
| `prismag run "@@..."` | Route and execute a tagged prompt (untagged → `default:` alias; `--exec` / `exec:` lets blocks act) |
| `prismag route "@@..."` | Show the delegation plan without executing (`--json` too) |
| `prismag list` | List `@@aliases` with availability marks |
| `prismag models` | Show models available right now |
| `prismag doctor` | Diagnose keys, registry, and environment |
| `prismag sessions` | List saved REPL session transcripts |
| `prismag resume [id]` | Reopen a past session with its context |
PRISMAG calls provider APIs directly: keys go straight to the vendor, never to a gateway. Keys are read from the environment, from `~/.config/prismag/.env`, or from maind (where they are stored encrypted) when it is present.
ANTHROPIC_API_KEY only: + OPENAI_API_KEY:
@@opus ✓ ready @@opus ✓ ready
@@fast ✗ needs OPENAI_API_KEY @@fast ✓ ready
Inside an IDE that dispatches subagents, blocks route via your subscription — no API keys required.
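The availability marks shown above can be approximated with a small environment check. This is a sketch only: `Status`, `keyEnv`, and `keyless` are hypothetical names, the "unknown provider" line is invented for the sketch, and it looks at environment variables alone, ignoring the `.env` file and maind. The two variable names are the ones this README mentions; local providers are treated as needing no key.

```go
package main

import (
	"fmt"
	"os"
)

// keyEnv maps a provider to the environment variable holding its API key.
// Only these two variables appear in the README.
var keyEnv = map[string]string{
	"anthropic": "ANTHROPIC_API_KEY",
	"openai":    "OPENAI_API_KEY",
}

// keyless lists providers that run locally and need no key.
var keyless = map[string]bool{"ollama": true, "vllm": true}

// Status reports whether an alias is usable. getenv is injected so the
// check can be exercised without touching the real environment.
func Status(alias, provider string, getenv func(string) string) string {
	if keyless[provider] {
		return fmt.Sprintf("@@%s ✓ ready", alias)
	}
	name, known := keyEnv[provider]
	if !known {
		// Providers outside the map are outside the scope of this sketch.
		return fmt.Sprintf("@@%s ? unknown provider %q", alias, provider)
	}
	if getenv(name) == "" {
		return fmt.Sprintf("@@%s ✗ needs %s", alias, name)
	}
	return fmt.Sprintf("@@%s ✓ ready", alias)
}

func main() {
	fmt.Println(Status("opus", "anthropic", os.Getenv))
	fmt.Println(Status("fast", "openai", os.Getenv))
	fmt.Println(Status("local", "ollama", os.Getenv))
}
```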
Route any block to a model running on your own machine — no API key, no cloud, $0 per token. Both Ollama and vLLM expose an OpenAI-compatible API, so PRISMAG talks to them natively (streaming included).
ollama pull qwen2.5-coder:7b # serves on http://localhost:11434
aliases:
  local:
    model: qwen2.5-coder:7b
    provider: ollama              # or: vllm
    description: Local model — private, free, offline
prismag run "@@local: refactor this function" # runs entirely on your box
Endpoints default to `http://localhost:11434/v1` (Ollama) and `http://localhost:8000/v1` (vLLM); override per-alias with `base_url` or globally with `OLLAMA_BASE_URL` / `VLLM_BASE_URL`. Mix freely (plan locally, implement in the cloud): `@@local: draft` then `@@opus: review`.
By default a CLI block returns text — PRISMAG is a router, not an agent. Turn on exec and a block can take real actions through a small, permission-gated tool loop: it asks before every step, so you grant rights action-by-action.
Set it once in `registry.yaml` (`exec.enabled: true`) plus a `default:` alias, and the everyday flow needs no tag and no flags, like an agent that asks first:
prismag "create a folder on my desktop named poems"
⚠ allow run_shell: mkdir -p ~/Desktop/poems ? [y/N] y
✓ run_shell: mkdir -p ~/Desktop/poems
Prefer per-run control instead? Skip the config and pass `--exec` (flags always override config):
prismag run --exec "@@opus4.8: create ~/Desktop/poem.txt with a short flower poem"
- Tools: `write_file`, `read_file`, and `run_shell` (`exec.shell: true` / `--exec-shell`).
- Every action needs approval; `approve: auto` (or `--yes`) skips the prompt (use with care), and a non-interactive shell denies by default.
- `root:` confines file actions to one tree.
- Destructive commands are refused by default: `rm -rf /`, `mkfs`, `dd of=/dev/…`, fork bombs, `shutdown`, etc. are blocked even if approved, so a careless `y` (or `approve: auto`) can't wreck your machine. Ordinary deletes still work via the normal prompt. Override only with `exec.allow_destructive: true`.
- The protocol is provider-agnostic (a fenced `prismag` JSON action), so it works on Anthropic, OpenAI, OpenRouter, and local Ollama/vLLM models alike.
- CLI-only by design: inside an IDE the agent already has its own tools, so PRISMAG just emits a delegation plan there.
- In the `prismag>` REPL, toggle it with `:exec` (`:exec shell`, `:exec yes`, `:exec off`).
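The approval rules above can be sketched as a single decision function. This is an illustration, not PRISMAG's implementation: `Policy`, `IsDestructive`, and `Decide` are hypothetical names, the regular expressions are rough assumptions that cover only the examples listed above, and letting `approve: auto` proceed in a non-interactive shell is an assumed reading of "denies by default". The ordering is the point: the destructive check runs before any approval, so neither a `y` nor `auto` can bypass it.

```go
package main

import (
	"fmt"
	"regexp"
)

// Policy mirrors the exec settings named in the README (hypothetical struct).
type Policy struct {
	Approve          string // "ask" or "auto"
	AllowDestructive bool   // corresponds to exec.allow_destructive
	Interactive      bool   // false when no terminal is attached
}

// destructive holds illustrative patterns, far from exhaustive.
var destructive = []*regexp.Regexp{
	regexp.MustCompile(`\brm\s+-\w*r\w*\s+/(\s|$)`), // rm -rf /
	regexp.MustCompile(`\bmkfs\b`),                  // mkfs, mkfs.ext4, ...
	regexp.MustCompile(`\bdd\b.*\bof=/dev/`),        // dd of=/dev/...
	regexp.MustCompile(`:\(\)\s*\{.*\}\s*;\s*:`),    // classic shell fork bomb
	regexp.MustCompile(`\bshutdown\b`),
}

// IsDestructive reports whether cmd matches one of the patterns above.
func IsDestructive(cmd string) bool {
	for _, re := range destructive {
		if re.MatchString(cmd) {
			return true
		}
	}
	return false
}

// Decide returns whether cmd may run and a short reason. confirm stands in
// for the interactive y/N prompt.
func Decide(cmd string, p Policy, confirm func(cmd string) bool) (bool, string) {
	if IsDestructive(cmd) && !p.AllowDestructive {
		return false, "refused: destructive command"
	}
	if p.Approve == "auto" {
		return true, "auto-approved"
	}
	if !p.Interactive {
		return false, "denied: non-interactive shell"
	}
	if confirm(cmd) {
		return true, "approved"
	}
	return false, "declined"
}

func main() {
	alwaysYes := func(string) bool { return true }
	ask := Policy{Approve: "ask", Interactive: true}
	fmt.Println(Decide("mkdir -p ~/Desktop/poems", ask, alwaysYes))
	fmt.Println(Decide("rm -rf /", Policy{Approve: "auto", Interactive: true}, alwaysYes))
}
```

A pattern list like this is a last line of defence rather than a sandbox; the per-action prompt remains the main control.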
PRISMAG already is the router, so it calls provider REST APIs directly with no self-hosted proxy, DB, or admin UI to trust and patch. That keeps the dependency/supply-chain surface tiny — direct APIs, a single static binary.
PRISMAG is a routing protocol any agent can speak, no SDK required. Shell out to `prismag route --json` to get a deterministic plan (which model runs which block), then dispatch with your own model access; or use `prismag run --api` to have PRISMAG execute and return the result. See INTEGRATIONS.md.
maind is the optional memory backend: an encrypted, local-first store the CLI and your IDE agent share. With both wired in, context survives across blocks, sessions, and editors.
See CONTRIBUTING.md.
See SECURITY.md for credential handling and vulnerability reporting.