# A program is a tree — building a Verbose compiler in Verbose

> Source: <https://dev.to/arcker/a-program-is-a-tree-building-a-verbose-compiler-in-verbose-4927>
> Published: 2026-06-14 09:11:25+00:00

**Verbose is a small experimental language I'm building.** Its compiler proves properties about your code — like termination — and emits tiny, readable x86-64 machine code: no runtime, no GC, no libc. This post stands on its own (you don't need the rest of the series). What it's about: I'm now writing a Verbose compiler *in Verbose itself*, and this is the foundation brick — how you represent a program as data so a compiler can work on it.

*(English version of an article from my French series, originally on arcker.org.)*

After cryptography, we take on something more vertiginous: **a Verbose compiler written in Verbose**. The language starting to describe itself.

Let's be honest up front — it matters. This is not (yet) verbosec compiling the entirety of its own source. What exists today is a **complete front end** — tokenizer, parser, analyses, interpreter, type checker — written *in Verbose*, for a **toy subset** of the language. The whole thing compiled by verbosec to native machine code. Not an interpreted demo: a ~60 KB ELF binary that reads your program and tells you what's wrong with it. That's `examples/vexprparse.verbose`

: 102 concepts, 219 rules.

We'll walk through it brick by brick. This chapter lays the foundation without which nothing else exists: **how to represent a program**.

The question deserves an answer, because it isn't just an exercise — it touches Verbose's whole thesis.

Today, the compiler (verbosec) is written in **Rust**. And some of the logic — certain primitives — is Rust that emits x86-64 directly, *with no Verbose source*. The concrete consequence: to audit a Verbose binary, to *really* understand what it does, at some point you have to read Rust. And trust that Rust — and whoever, or whatever, wrote it.

That's precisely what Verbose refuses. The whole series rests on four words: *you don't trust, you verify*. You read the source, declared and proven. If the path from source to binary runs through unverifiable Rust, trust leaks out there.

Writing the front end *in Verbose* moves that logic into the language itself: the tokenizer, the parser, the analyses become a `.verbose`

file, verified under Verbose's proof regime, then compiled native. The auditor reads Verbose, not Rust. The remaining Rust shrinks to a small, stable, **trusted-once** base (the verifier). Per-binary trust moves from Rust to the proven source.

And it's the ultimate dogfooding: a compiler is the hardest thing to express. If Verbose can describe its own front end, under its own proof regime, then the language isn't a toy — it holds up on the most demanding task there is.

A compiler can do nothing with flat text. `x + y * 2`

, to a human, is a string of characters; to a compiler, it's a **structure** — a tree, where the multiplication nests under the addition (operator precedence):

```
  The text  "x + y * 2"  is really a tree:

            ( + )
           /     \
         x      ( * )
               /     \
             y         2
```

Everything starts there. Before evaluating, type-checking, or catching an undefined variable — you first have to turn the text into that tree. That's the AST (*Abstract Syntax Tree*). And to build it, you need a way to represent a tree **as data**.

This is where the earlier chapters pay off. A tree is declared in Verbose as a **sum type** — a type that can take several shapes — some of whose shapes **reference themselves**:

```
concept Ast
  variants:
    AstNum  of (value : number)
    AstVar  of (start : number, len : number)
    AstBin  of (op : number, lhs : Ast, rhs : Ast)
    AstNeg  of (inner : Ast)
    AstIf   of (cond : Ast, thn : Ast, els : Ast)
    AstCall of (callee_start : number, callee_len : number, args : ArgList)
    ...
```

Read `AstBin`

: a binary operation holds an operator, **a left subtree Ast, and a right subtree Ast**. The type contains itself. That's the recursion of a tree: an addition whose two sides are, themselves, expressions.

`AstIf`

holds three (condition, `AstNum`

and `AstVar`

are `Ast`

.Our example then becomes, exactly:

```
  AstBin( + ,
          AstVar(x),
          AstBin( * , AstVar(y), AstNum(2)) )
```

The tree drawn above, written as a value. And `a.b.c`

? `AstField(AstField(AstVar(a), b), c)`

— the nesting follows the structure.

One problem remains. Verbose has no heap and no pointers — one of the reasons its binaries are so small and so verifiable. So how do you build a tree of arbitrary size?

The answer: an **arena**. All nodes live in a single bounded space, and a node points to its children by their **index**, not by a pointer.

```
  concept_group VExpr [max_depth: 4096, max_nodes: 65535]

  arena:  [0]  AstVar(x)
          [1]  AstVar(y)
          [2]  AstNum(2)
          [3]  AstBin( * , lhs=1, rhs=2)    ← references indices 1 and 2
          [4]  AstBin( + , lhs=0, rhs=3)    ← the root
```

The tree is built bottom-up: leaves first, then the nodes that link them. `max_depth: 4096, max_nodes: 65535`

aren't decorative — they're the **static bounds** the verifier needs to prove everything stays finite. No dynamic allocation, no possible overflow, and yet a tree of any shape.

In the same group live the tokens, the environments, and the diagnostics — all variants of `VExpr`

, all linked by index. One arena for the whole front end.

Because everything else plugs into it. The tokenizer will produce `Token`

s in this arena. The parser will consume them to build `Ast`

s. The analyses will walk the tree to find your mistakes. The interpreter will descend it to compute a result. Without a way to represent the tree — recursive, bounded, verifiable — there's no compiler at all.

And it's the direct payoff of what we built before: the recursion of [chapter 1](https://arcker.org/blog/2026-05-25-from-idea-to-binary/), the termination proofs of [chapter 3](https://arcker.org/blog/2026-05-26-proving-termination/). An AST is *the* recursive structure par excellence — and Verbose represents it under the same guarantees as everything else: bounded, pointerless, proven finite.

The program has become data. The next chapter builds it from raw text: the tokenizer.

*Originally published on arcker.org, where the full series lives.*
