A program is a tree — building a Verbose compiler in Verbose A developer building the experimental Verbose language has written a Verbose compiler front end in Verbose itself, for a toy subset of the language. The front end, which includes a tokenizer, parser, analyses, interpreter, and type checker, compiles to a ~60 KB ELF binary that reads programs and reports errors. This work moves compiler logic from Rust into Verbose, reducing the trusted computing base to a small verifier and enabling full auditability of binaries. Verbose is a small experimental language I'm building. Its compiler proves properties about your code — like termination — and emits tiny, readable x86-64 machine code: no runtime, no GC, no libc. This post stands on its own you don't need the rest of the series . What it's about: I'm now writing a Verbose compiler in Verbose itself , and this is the foundation brick — how you represent a program as data so a compiler can work on it. English version of an article from my French series, originally on arcker.org. After cryptography, we take on something more vertiginous: a Verbose compiler written in Verbose . The language starting to describe itself. Let's be honest up front — it matters. This is not yet verbosec compiling the entirety of its own source. What exists today is a complete front end — tokenizer, parser, analyses, interpreter, type checker — written in Verbose , for a toy subset of the language. The whole thing compiled by verbosec to native machine code. Not an interpreted demo: a ~60 KB ELF binary that reads your program and tells you what's wrong with it. That's examples/vexprparse.verbose : 102 concepts, 219 rules. We'll walk through it brick by brick. This chapter lays the foundation without which nothing else exists: how to represent a program . The question deserves an answer, because it isn't just an exercise — it touches Verbose's whole thesis. Today, the compiler verbosec is written in Rust . And some of the logic — certain primitives — is Rust that emits x86-64 directly, with no Verbose source . The concrete consequence: to audit a Verbose binary, to really understand what it does, at some point you have to read Rust. And trust that Rust — and whoever, or whatever, wrote it. That's precisely what Verbose refuses. The whole series rests on four words: you don't trust, you verify . You read the source, declared and proven. If the path from source to binary runs through unverifiable Rust, trust leaks out there. Writing the front end in Verbose moves that logic into the language itself: the tokenizer, the parser, the analyses become a .verbose file, verified under Verbose's proof regime, then compiled native. The auditor reads Verbose, not Rust. The remaining Rust shrinks to a small, stable, trusted-once base the verifier . Per-binary trust moves from Rust to the proven source. And it's the ultimate dogfooding: a compiler is the hardest thing to express. If Verbose can describe its own front end, under its own proof regime, then the language isn't a toy — it holds up on the most demanding task there is. A compiler can do nothing with flat text. x + y 2 , to a human, is a string of characters; to a compiler, it's a structure — a tree, where the multiplication nests under the addition operator precedence : The text "x + y 2" is really a tree: + / \ x / \ y 2 Everything starts there. Before evaluating, type-checking, or catching an undefined variable — you first have to turn the text into that tree. That's the AST Abstract Syntax Tree . And to build it, you need a way to represent a tree as data . This is where the earlier chapters pay off. A tree is declared in Verbose as a sum type — a type that can take several shapes — some of whose shapes reference themselves : concept Ast variants: AstNum of value : number AstVar of start : number, len : number AstBin of op : number, lhs : Ast, rhs : Ast AstNeg of inner : Ast AstIf of cond : Ast, thn : Ast, els : Ast AstCall of callee start : number, callee len : number, args : ArgList ... Read AstBin : a binary operation holds an operator, a left subtree Ast, and a right subtree Ast . The type contains itself. That's the recursion of a tree: an addition whose two sides are, themselves, expressions. AstIf holds three condition, AstNum and AstVar are Ast .Our example then becomes, exactly: AstBin + , AstVar x , AstBin , AstVar y , AstNum 2 The tree drawn above, written as a value. And a.b.c ? AstField AstField AstVar a , b , c — the nesting follows the structure. One problem remains. Verbose has no heap and no pointers — one of the reasons its binaries are so small and so verifiable. So how do you build a tree of arbitrary size? The answer: an arena . All nodes live in a single bounded space, and a node points to its children by their index , not by a pointer. concept group VExpr max depth: 4096, max nodes: 65535 arena: 0 AstVar x 1 AstVar y 2 AstNum 2 3 AstBin , lhs=1, rhs=2 ← references indices 1 and 2 4 AstBin + , lhs=0, rhs=3 ← the root The tree is built bottom-up: leaves first, then the nodes that link them. max depth: 4096, max nodes: 65535 aren't decorative — they're the static bounds the verifier needs to prove everything stays finite. No dynamic allocation, no possible overflow, and yet a tree of any shape. In the same group live the tokens, the environments, and the diagnostics — all variants of VExpr , all linked by index. One arena for the whole front end. Because everything else plugs into it. The tokenizer will produce Token s in this arena. The parser will consume them to build Ast s. The analyses will walk the tree to find your mistakes. The interpreter will descend it to compute a result. Without a way to represent the tree — recursive, bounded, verifiable — there's no compiler at all. And it's the direct payoff of what we built before: the recursion of chapter 1 https://arcker.org/blog/2026-05-25-from-idea-to-binary/ , the termination proofs of chapter 3 https://arcker.org/blog/2026-05-26-proving-termination/ . An AST is the recursive structure par excellence — and Verbose represents it under the same guarantees as everything else: bounded, pointerless, proven finite. The program has become data. The next chapter builds it from raw text: the tokenizer. Originally published on arcker.org, where the full series lives.