{"slug": "a-program-is-a-tree-building-a-verbose-compiler-in-verbose", "title": "A program is a tree — building a Verbose compiler in Verbose", "summary": "A developer building the experimental Verbose language has written a Verbose compiler front end in Verbose itself, for a toy subset of the language. The front end, which includes a tokenizer, parser, analyses, interpreter, and type checker, compiles to a ~60 KB ELF binary that reads programs and reports errors. This work moves compiler logic from Rust into Verbose, reducing the trusted computing base to a small verifier and enabling full auditability of binaries.", "body_md": "**Verbose is a small experimental language I'm building.** Its compiler proves properties about your code — like termination — and emits tiny, readable x86-64 machine code: no runtime, no GC, no libc. This post stands on its own (you don't need the rest of the series). What it's about: I'm now writing a Verbose compiler *in Verbose itself*, and this is the foundation brick — how you represent a program as data so a compiler can work on it.\n\n*(English version of an article from my French series, originally on arcker.org.)*\n\nAfter cryptography, we take on something more vertiginous: **a Verbose compiler written in Verbose**. The language starting to describe itself.\n\nLet's be honest up front — it matters. This is not (yet) verbosec compiling the entirety of its own source. What exists today is a **complete front end** — tokenizer, parser, analyses, interpreter, type checker — written *in Verbose*, for a **toy subset** of the language. The whole thing compiled by verbosec to native machine code. Not an interpreted demo: a ~60 KB ELF binary that reads your program and tells you what's wrong with it. That's `examples/vexprparse.verbose`: 102 concepts, 219 rules.\n\nWe'll walk through it brick by brick. 
This chapter lays the foundation without which nothing else exists: **how to represent a program**.\n\nWhy write the front end in Verbose at all? The question deserves an answer, because it isn't just an exercise — it touches Verbose's whole thesis.\n\nToday, the compiler (verbosec) is written in **Rust**. And some of the logic — certain primitives — is Rust that emits x86-64 directly, *with no Verbose source*. The concrete consequence: to audit a Verbose binary, to *really* understand what it does, at some point you have to read Rust. And trust that Rust — and whoever, or whatever, wrote it.\n\nThat's precisely what Verbose refuses. The whole series rests on five words: *you don't trust, you verify*. You read the source, declared and proven. If the path from source to binary runs through unverifiable Rust, trust leaks out there.\n\nWriting the front end *in Verbose* moves that logic into the language itself: the tokenizer, the parser, the analyses become a `.verbose` file, verified under Verbose's proof regime, then compiled to native code. The auditor reads Verbose, not Rust. The remaining Rust shrinks to a small, stable, **trusted-once** base (the verifier). Per-binary trust moves from Rust to the proven source.\n\nAnd it's the ultimate dogfooding: a compiler is among the hardest things to express. If Verbose can describe its own front end, under its own proof regime, then the language isn't a toy — it holds up on one of the most demanding tasks there is.\n\nA compiler can do nothing with flat text. `x + y * 2`, to a human, is a string of characters; to a compiler, it's a **structure** — a tree, where the multiplication nests under the addition (operator precedence):\n\n```\n  The text  \"x + y * 2\"  is really a tree:\n\n            ( + )\n           /     \\\n         x      ( * )\n               /     \\\n             y         2\n```\n\nEverything starts there. Before evaluating, type-checking, or catching an undefined variable — you first have to turn the text into that tree. 
That's the AST (*Abstract Syntax Tree*). And to build it, you need a way to represent a tree **as data**.\n\nThis is where the earlier chapters pay off. A tree is declared in Verbose as a **sum type** — a type that can take several shapes — some of whose shapes **reference themselves**:\n\n```\nconcept Ast\n  variants:\n    AstNum  of (value : number)\n    AstVar  of (start : number, len : number)\n    AstBin  of (op : number, lhs : Ast, rhs : Ast)\n    AstNeg  of (inner : Ast)\n    AstIf   of (cond : Ast, thn : Ast, els : Ast)\n    AstCall of (callee_start : number, callee_len : number, args : ArgList)\n    ...\n```\n\nRead `AstBin`: a binary operation holds an operator, **a left subtree Ast, and a right subtree Ast**. The type contains itself. That's the recursion of a tree: an addition whose two sides are, themselves, expressions.\n\n`AstIf` holds three subtrees (condition, then-branch, else-branch). `AstNum` and `AstVar` are the leaves: they hold numbers, no subtree. Leaves and inner nodes alike are `Ast`.\n\nOur example then becomes, exactly:\n\n```\n  AstBin( + ,\n          AstVar(x),\n          AstBin( * , AstVar(y), AstNum(2)) )\n```\n\nThe tree drawn above, written as a value. And `a.b.c`? `AstField(AstField(AstVar(a), b), c)` — the nesting follows the structure.\n\nOne problem remains. Verbose has no heap and no pointers — one of the reasons its binaries are so small and so verifiable. So how do you build a tree of arbitrary size?\n\nThe answer: an **arena**. All nodes live in a single bounded space, and a node points to its children by their **index**, not by a pointer.\n\n```\n  concept_group VExpr [max_depth: 4096, max_nodes: 65535]\n\n  arena:  [0]  AstVar(x)\n          [1]  AstVar(y)\n          [2]  AstNum(2)\n          [3]  AstBin( * , lhs=1, rhs=2)    ← references indices 1 and 2\n          [4]  AstBin( + , lhs=0, rhs=3)    ← the root\n```\n\nThe tree is built bottom-up: leaves first, then the nodes that link them. 
`max_depth: 4096, max_nodes: 65535` aren't decorative — they're the **static bounds** the verifier needs to prove everything stays finite. No dynamic allocation, no possible overflow, and yet a tree of any shape.\n\nIn the same group live the tokens, the environments, and the diagnostics — all variants of `VExpr`, all linked by index. One arena for the whole front end.\n\nWhy does this brick come first? Because everything else plugs into it. The tokenizer will produce `Token`s in this arena. The parser will consume them to build `Ast`s. The analyses will walk the tree to find your mistakes. The interpreter will descend it to compute a result. Without a way to represent the tree — recursive, bounded, verifiable — there's no compiler at all.\n\nAnd it's the direct payoff of what we built before: the recursion of [chapter 1](https://arcker.org/blog/2026-05-25-from-idea-to-binary/), the termination proofs of [chapter 3](https://arcker.org/blog/2026-05-26-proving-termination/). An AST is *the* recursive structure par excellence — and Verbose represents it under the same guarantees as everything else: bounded, pointerless, proven finite.\n\nThe program has become data. 
The next chapter builds it from raw text: the tokenizer.\n\n*Originally published on arcker.org, where the full series lives.*", "url": "https://wpnews.pro/news/a-program-is-a-tree-building-a-verbose-compiler-in-verbose", "canonical_source": "https://dev.to/arcker/a-program-is-a-tree-building-a-verbose-compiler-in-verbose-4927", "published_at": "2026-06-14 09:11:25+00:00", "updated_at": "2026-06-14 09:40:59.418116+00:00", "lang": "en", "topics": ["developer-tools"], "entities": ["Verbose", "verbosec", "Rust", "x86-64", "ELF", "AST"], "alternates": {"html": "https://wpnews.pro/news/a-program-is-a-tree-building-a-verbose-compiler-in-verbose", "markdown": "https://wpnews.pro/news/a-program-is-a-tree-building-a-verbose-compiler-in-verbose.md", "text": "https://wpnews.pro/news/a-program-is-a-tree-building-a-verbose-compiler-in-verbose.txt", "jsonld": "https://wpnews.pro/news/a-program-is-a-tree-building-a-verbose-compiler-in-verbose.jsonld"}}