TinyAgents is a recursive language-model (RLM) harness for Rust. It is a typed, durable runtime where language models call models, agents call agents, graphs run graphs, and a model can author, compile, and run the very workflow it is standing inside — all as inspectable, checkpointed, policy-checked Rust.
Most agent frameworks stuff everything into one ever-growing context window and hope the model copes. Recursive Language Models (RLMs) take a different stance: a long prompt is treated as an external environment that the model explores through a REPL — examining it, decomposing it, and recursively calling itself (or sub-models) over snippets instead of swallowing the whole thing at once. This mitigates "context rot" and lets effective context exceed the raw window.
The idea comes from recent research:
- Paper: "Recursive Language Models," Alex L. Zhang, Tim Kraska, Omar Khattab (MIT CSAIL), 2025, arXiv:2512.24601
- Blog: Alex L. Zhang, "Recursive Language Models," https://alexzhang13.github.io/blog/2025/rlm/
- Reference implementation: https://github.com/alexzhang13/rlm
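As a rough illustration of the execution model described above (not the paper's implementation and not the TinyAgents API), the sketch below treats a long context as a slice of lines and only ever hands a "model" a snippet that fits a small window. The names `Answer`, `leaf_model`, and `rlm_query` are hypothetical, and the model call is replaced by a plain substring check so the example runs offline.

```rust
/// Result of a recursive query: the answer plus the number of leaf "model calls".
#[derive(Debug, PartialEq)]
struct Answer {
    matches: usize,
    model_calls: usize,
}

/// Hypothetical stand-in for a model call over a snippet that fits the window:
/// counts the lines that mention `needle`.
fn leaf_model(lines: &[&str], needle: &str) -> usize {
    lines.iter().filter(|line| line.contains(needle)).count()
}

/// Answers "how many lines mention `needle`?" without ever passing more than
/// `window` lines to a single call, unless the depth budget is exhausted.
/// Assumes `window >= 1`.
fn rlm_query(
    lines: &[&str],
    needle: &str,
    window: usize,
    depth: usize,
    max_depth: usize,
) -> Answer {
    // Base case: the snippet fits, or we may not recurse any deeper.
    if lines.len() <= window || depth >= max_depth {
        return Answer {
            matches: leaf_model(lines, needle),
            model_calls: 1,
        };
    }
    // Recursive case: decompose the context and call "ourselves" on each half.
    let (left, right) = lines.split_at(lines.len() / 2);
    let l = rlm_query(left, needle, window, depth + 1, max_depth);
    let r = rlm_query(right, needle, window, depth + 1, max_depth);
    Answer {
        matches: l.matches + r.matches,
        model_calls: l.model_calls + r.model_calls,
    }
}
```

Splitting on line boundaries keeps each record intact, so a match is never cut in half by the decomposition. A real system would let the model choose how to decompose the context rather than hard-coding a binary split.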
TinyAgents is inspired by and architected around the RLM execution model — a production-shaped Rust harness for building RLM-style systems. It does not claim to reproduce the paper's benchmark numbers; instead it brings the execution model to Rust as concrete, implemented surfaces:
- Sub-agents (agents calling agents). A harness agent is exposed as a tool to another agent, so orchestration is literally a model calling a model (SubAgent, SubAgentSession, SubAgentTool).
- Recursion policy + depth tracking. The runtime tracks root_run_id / parent_run_id, enforces a recursion limit, and rolls child runs' events, usage, and cost up to the parent as first-class observable runs.
- Graphs that run graphs. A node can embed another compiled graph, and the .ragsh REPL can drive a graph from inside a graph node (graph → REPL → graph).
- The REPL as the RLM core. In .ragsh, context and prompts are runtime values, not just prompt text. The model writes small programs, inspects their output, calls sub-models / sub-agents / sub-graphs as functions, and iterates: the RLM/CodeAct loop.
- Self-authoring (the deepest recursion). A model can emit a .rag blueprint that compiles through the same registry-bound compiler path as a human-authored file, then runs on the same runtime the model is already executing in. The harness can describe and re-enter itself.
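The depth tracking and roll-up described in the list above can be sketched with plain structs. This is a minimal sketch under stated assumptions: only the field names `root_run_id` and `parent_run_id` come from the text, while `RunContext`, `RecursionPolicy`, `RecursionLimitExceeded`, `Usage`, and `roll_up` are hypothetical names chosen for illustration and are not the crate's actual types.

```rust
/// Identity of one run in a recursive call tree.
#[derive(Debug, Clone, PartialEq)]
struct RunContext {
    run_id: u64,
    root_run_id: u64,
    parent_run_id: Option<u64>,
    depth: u32,
}

/// Hypothetical policy: how deep the call tree may grow.
struct RecursionPolicy {
    max_depth: u32,
}

#[derive(Debug, PartialEq)]
struct RecursionLimitExceeded {
    depth: u32,
    max_depth: u32,
}

impl RunContext {
    /// A root run is its own root and has no parent.
    fn root(run_id: u64) -> Self {
        RunContext { run_id, root_run_id: run_id, parent_run_id: None, depth: 0 }
    }

    /// Derives a child run, or refuses if the policy's depth limit would be exceeded.
    fn child(
        &self,
        run_id: u64,
        policy: &RecursionPolicy,
    ) -> Result<RunContext, RecursionLimitExceeded> {
        let depth = self.depth + 1;
        if depth > policy.max_depth {
            return Err(RecursionLimitExceeded { depth, max_depth: policy.max_depth });
        }
        Ok(RunContext {
            run_id,
            root_run_id: self.root_run_id,
            parent_run_id: Some(self.run_id),
            depth,
        })
    }
}

/// Token usage that a child run reports back to its parent.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
}

impl Usage {
    /// Adds a child's usage into this (parent) total.
    fn roll_up(&mut self, child: Usage) {
        self.input_tokens += child.input_tokens;
        self.output_tokens += child.output_tokens;
    }
}
```

Carrying `root_run_id` unchanged through every child is what lets a whole call tree be queried as one unit, while `parent_run_id` preserves the tree's shape.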
Two languages, one runtime: .rag (declarative blueprint) and .ragsh (imperative REPL) both lower into the exact same graph and harness types as hand-written Rust: a language whose programs are the runtime that interprets them.
- Harness: provider-neutral model calls, typed tools, middleware, structured output, streaming, usage/cost accounting, retries and limits, response caching, memory/embeddings, summarization, steering, and a testkit.
- Graph runtime: LangGraph-style durable, typed state graphs with START / END, nodes, edges, conditional routing, commands, Send fanout, reducers/channels, checkpoints, interrupts, subgraphs, streaming, topology export, and time travel.
- Registry: a named capability catalog (models, tools, agents, graphs, stores, middleware, policy) that .rag and .ragsh bind by name.
- .rag expressive language: a declarative, side-effect-free blueprint format that compiles (lexer → parser → compiler) into the runtime; the safe boundary for agent-authored plans.
- .ragsh REPL language: imperative, capability-bound interactive orchestration; the RLM/CodeAct loop surface.
- Recursion & sub-agents: agents-as-tools, subgraphs, depth tracking, and a recursion policy so deep call trees stay bounded and observable.
- Durability & checkpoints: resume long runs, replay history, and travel back in time across superstep boundaries.
- Provider-neutral: one interface across hosted and local providers; swap models without rewriting workflows.
- Observability: normalized events, usage, and cost that roll up across recursive child runs.
- Structured output & streaming: typed responses and incremental token streams at the harness boundary.
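To make "bind by name" concrete, here is a minimal sketch of a name-keyed capability catalog, restricted to tools for brevity. `Registry`, `register_tool`, and `call_tool` are hypothetical names for illustration; the actual TinyAgents registry also covers models, agents, graphs, stores, middleware, and policy, and its API may look quite different.

```rust
use std::collections::HashMap;

/// A tool is modeled here as a boxed function from text to text.
type Tool = Box<dyn Fn(&str) -> String>;

/// Hypothetical catalog that workflows refer to by name only.
#[derive(Default)]
struct Registry {
    tools: HashMap<String, Tool>,
}

impl Registry {
    /// Registers a tool under `name`, replacing any earlier tool with that name.
    fn register_tool(&mut self, name: &str, tool: impl Fn(&str) -> String + 'static) {
        self.tools.insert(name.to_string(), Box::new(tool));
    }

    /// Resolves `name` at call time; unknown names are an error rather than a panic.
    fn call_tool(&self, name: &str, input: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => Ok(tool(input)),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```

Because a workflow holds only names, a model-authored plan can reference nothing beyond what the host application chose to register, which is one way to read the "safe boundary" claim above.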
              +-----------------------+     +-----------------------+
              |    .rag blueprint     |     |      .ragsh REPL      |
              | declarative workflow  |     |  imperative RLM loop  |
              +-----------+-----------+     +-----------+-----------+
                          \                             /
                           \ compile / lower (by name) /
                            v                         v
+-------------+        +-------------------------------------------+
| Application |------->|            Capability Registry            |
|  Rust code  |        | models | tools | agents | graphs | policy |
+------+------+        +---------------------+---------------------+
       |                                     |
       |                                     v
       |               +-------------------------------------------+
       +-------------->|           Durable Graph Runtime           |
                       | typed state | nodes | edges | checkpoints |
                       +---------------------+---------------------+
                                             |
                                             v
                       +-------------------------------------------+
                       |               Agent Harness               |
                       | prompts | tools | middleware | usage/cost |
                       +----+--------------------------+-----------+
                            |                          |
                            v                          v
                   +------------------+       +------------------+
                   | Model Providers  |       |   Typed Tools    |
                   | OpenAI/Anthropic |       | local functions  |
                   | Ollama/etc.      |       | external systems |
                   +------------------+       +------------------+
The recursion loop — agents call agents, and graphs run graphs:
                    +-------+
                    | START |
                    +---+---+
                        |
                        v
                 +-------------+    a sub-agent is just a tool,
                 | Agent Node  |    and a tool may itself be a
                 +------+------+    whole compiled graph...
                        |
      +-----------------+--------------+
      |                 |              |
 needs tool      calls sub-agent      done
      |                 |              |
      v                 v              v
+-----------+   +---------------+   +-----+
| Tool Node |   |  SubAgent /   |   | END |
+-----+-----+   | Subgraph Node |   +-----+
      |         +-------+-------+
      |                 |   depth +1, recursion policy,
      |                 |   child run rolls up usage/cost
      +--- loops back --+--- re-enters the runtime ---+
          to Agent Node     (graph -> REPL -> graph)
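The loop in the diagram can be approximated with a scripted agent, assuming each agent decision is one of three outcomes: call a tool, call a sub-agent, or finish. `Step` and `run_agent` are hypothetical names used only for this sketch; a real agent would obtain each step from a model response instead of a fixed script.

```rust
/// One decision made by the (scripted) agent node.
enum Step {
    /// "needs tool": run a tool, then loop back to the agent.
    ToolCall(&'static str),
    /// "calls sub-agent": re-enter the runtime one level deeper.
    SubAgent(Vec<Step>),
    /// "done": return the final answer.
    Done(&'static str),
}

/// Runs a scripted agent, recording what happened at each depth in `trace`.
fn run_agent(
    script: &[Step],
    depth: u32,
    max_depth: u32,
    trace: &mut Vec<String>,
) -> Result<String, String> {
    for step in script {
        match step {
            Step::ToolCall(name) => trace.push(format!("depth {depth}: tool {name}")),
            Step::SubAgent(child) => {
                // The recursion policy is checked before the child run starts.
                if depth + 1 > max_depth {
                    return Err(format!("recursion limit {max_depth} exceeded"));
                }
                let output = run_agent(child, depth + 1, max_depth, trace)?;
                trace.push(format!("depth {depth}: sub-agent returned {output}"));
            }
            Step::Done(answer) => return Ok(answer.to_string()),
        }
    }
    Err("script ended without Done".to_string())
}
```

The child run shares the parent's `trace`, which loosely mirrors how child events roll up to the parent run in the description above.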
Add TinyAgents to your project:
[dependencies]
tinyagents = "0.1"
The default build is offline. To enable hosted providers, turn on the openai feature:
[dependencies]
tinyagents = { version = "0.1", features = ["openai"] }
To explore locally:
git clone git@github.com:tinyhumansai/rustagents.git
cd rustagents
cargo run --example basic_graph
OpenAI-backed examples need the feature flag and an API key:
export OPENAI_API_KEY=...
cargo run --features openai --example openai_chat
All examples live in examples/:
- basic_graph: a minimal typed state graph (START, nodes, edges, END).
- complex_graph: conditional routing, fanout, and richer topology.
- durable_graph: checkpoints, resume, and time-travel over supersteps.
- agent_loop_tools: the agent ↔ tool loop the harness runs.
- orchestrator_subagents: recursion in action. An orchestrator agent calls sub-agents as tools, with depth tracking and rolled-up usage.
- openai_self_blueprint: the deepest recursion. A model authors a .rag blueprint that is compiled and run on the same runtime.
- rag_blueprint: load and run a declarative .rag workflow.
- openai_chat: a single provider-backed chat turn.
- openai_tools: tool calling against a hosted model.
- openai_structured: typed structured output.
- openai_graph_agent: a provider-backed agent driven inside a graph.
OpenAI-backed examples require --features openai and OPENAI_API_KEY.
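For readers who want a feel for what a "minimal typed state graph" involves before opening the examples, the following is a self-contained sketch in plain Rust. It does not use the TinyAgents API: `State`, `Graph`, `Node`, `Router`, and `counter_graph` are hypothetical, and the "checkpoint" is just a cloned state kept in memory after each step.

```rust
use std::collections::HashMap;

const START: &str = "START";
const END: &str = "END";

#[derive(Debug, Clone, PartialEq)]
struct State {
    count: u32,
}

/// A node transforms the state; a router picks the next node by name.
type Node = fn(State) -> State;
type Router = fn(&State) -> &'static str;

struct Graph {
    nodes: HashMap<&'static str, Node>,
    routes: HashMap<&'static str, Router>,
}

impl Graph {
    /// Runs from START until a router returns END, snapshotting after every step.
    fn run(&self, mut state: State, max_steps: usize) -> Result<(State, Vec<State>), String> {
        let mut checkpoints = Vec::new();
        let first = self.routes.get(START).ok_or("no edge from START")?;
        let mut current = first(&state);
        let mut steps = 0;
        while current != END {
            if steps == max_steps {
                return Err("step limit reached".to_string());
            }
            let node = self
                .nodes
                .get(current)
                .ok_or_else(|| format!("unknown node: {current}"))?;
            state = node(state);
            checkpoints.push(state.clone());
            let route = self
                .routes
                .get(current)
                .ok_or_else(|| format!("no edge from {current}"))?;
            current = route(&state);
            steps += 1;
        }
        Ok((state, checkpoints))
    }
}

fn increment(mut s: State) -> State {
    s.count += 1;
    s
}

fn after_start(_: &State) -> &'static str {
    "increment"
}

/// Conditional routing: loop until the counter reaches 3.
fn after_increment(s: &State) -> &'static str {
    if s.count < 3 { "increment" } else { END }
}

/// Builds START -> increment -> (increment | END).
fn counter_graph() -> Graph {
    let mut nodes: HashMap<&'static str, Node> = HashMap::new();
    nodes.insert("increment", increment as Node);
    let mut routes: HashMap<&'static str, Router> = HashMap::new();
    routes.insert(START, after_start as Router);
    routes.insert("increment", after_increment as Router);
    Graph { nodes, routes }
}
```

Keeping a snapshot per step is the simplest form of the checkpoint idea: resuming or "time travel" amounts to restarting the loop from one of the saved states.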
Contributors working directly in the repository should also read the checked-in architecture specification under docs/spec/README.md.
cargo fmt --check
cargo clippy --all-targets -- -D warnings
cargo build --all-targets
cargo test
TinyAgents welcomes focused contributions that improve the graph runtime, harness contracts, the registry, the .rag / .ragsh languages, provider adapters, tests, examples, and documentation.
Read CONTRIBUTING.md before opening a pull request.
TinyAgents is licensed under GPL-3.0-only.
Built by TinyHumans for the Rust agent ecosystem.