# How I tracked down a 36GB memory leak in a Claude Code memory server

> Source: <https://dev.to/jazzmax/how-i-tracked-down-a-36gb-memory-leak-in-a-claude-code-memory-server-27bo>
> Published: 2026-06-22 05:45:29+00:00

A debugging story about heap snapshots, native memory that `--max-old-space-size` can't touch, and a WebAssembly filesystem quietly hoarding files.

I run a small service that gives a team of Claude Code users one shared memory store. Mechanically it's a Node/Express proxy that wraps a stdio MCP server (`ruflo`) and exposes it over HTTP. You don't need the product to follow the bug, just one fact: a long-lived Node process serves memory operations, and underneath it uses **sql.js** (SQLite compiled to WebAssembly) to hold the store.

One instance in production kept growing. Not spiking — *creeping*. ~36 GB RSS over six weeks, then the cgroup OOM-killer would reap it and the clock reset. Classic leak shape.

The proxy and the wrapped MCP child are separate processes. `ps` settled it fast: the proxy sat flat at ~60 MB; the `ruflo mcp start` child was the one ballooning. So the leak was below my code, in the wrapped process. Good: a narrower problem.

First instinct on a Node leak is the V8 heap. So I looked at `process.memoryUsage()` on the live child:

```
rss            1385 MB
heapTotal        24 MB
heapUsed         21 MB
external       1286 MB
arrayBuffers    995 MB
```

This is the whole story in five numbers. `heapTotal` (the V8 JS heap) is flat at 24 MB. The growth is entirely in **external / arrayBuffers**: native memory backing `ArrayBuffer`s, allocated outside the V8 heap. That immediately kills the "obvious" fixes, `--max-old-space-size` among them: the flag only caps the V8 heap, which isn't what is growing.

So: what holds ~1 GB of `ArrayBuffer`s?
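The reading above comes down to comparing two `process.memoryUsage()` samples and asking which counter absorbed the RSS growth. Here is a minimal sketch of that comparison. The helper name `classifyGrowth`, the 0.5 cutoff, and the "earlier" sample are all made up for illustration; only the "later" numbers come from the table above.

```javascript
// Hypothetical helper: compare two process.memoryUsage() samples and report
// which counter accounts for most of the RSS growth.
// The 0.5 cutoff is an arbitrary choice for this sketch.
const MB = 1024 * 1024;

function classifyGrowth(before, after) {
  const rssDelta = after.rss - before.rss;
  const heapDelta = after.heapTotal - before.heapTotal;
  const externalDelta = after.external - before.external;
  if (rssDelta <= 0) return "no-growth";
  if (heapDelta / rssDelta >= 0.5) return "v8-heap";   // --max-old-space-size territory
  if (externalDelta / rssDelta >= 0.5) return "external"; // ArrayBuffers and other external memory
  return "other-native";
}

// "earlier" is an invented baseline; "later" mirrors the numbers in this post.
const earlier = { rss: 120 * MB, heapTotal: 24 * MB, external: 30 * MB };
const later = { rss: 1385 * MB, heapTotal: 24 * MB, external: 1286 * MB };
console.log(classifyGrowth(earlier, later));
```

Sampling on an interval and logging the classification is usually enough to tell early whether a heap flag is even the right tool.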

I opened the inspector on the live process (`kill -USR1 <pid>`, then connected over the WebSocket; Node 22 has a global `WebSocket`, so a 30-line script does it) and called `HeapProfiler.takeHeapSnapshot`. The snapshot was only ~18 MB, which is itself a clue: if the leak were *hundreds of thousands of small* JS objects, the graph would be huge. A small graph holding a lot of bytes means **a few big buffers**.
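For reference, a script along those lines might look like the sketch below. It is not the script I actually ran, just the same idea under a few assumptions: the inspector WebSocket URL is already known (Node prints it to stderr when the inspector opens, and also serves it from the inspector's `/json/list` endpoint), and `takeHeapSnapshot` and its third parameter are names chosen here so the WebSocket class can be swapped out.

```javascript
// Sketch: pull a heap snapshot over the inspector protocol and save it.
const fs = require("node:fs");

async function takeHeapSnapshot(wsUrl, outPath, WebSocketImpl = WebSocket) {
  const ws = new WebSocketImpl(wsUrl);
  const chunks = [];
  await new Promise((resolve, reject) => {
    ws.addEventListener("open", () => {
      ws.send(JSON.stringify({
        id: 1,
        method: "HeapProfiler.takeHeapSnapshot",
        params: { reportProgress: false },
      }));
    });
    ws.addEventListener("message", (event) => {
      const msg = JSON.parse(event.data);
      if (msg.method === "HeapProfiler.addHeapSnapshotChunk") {
        chunks.push(msg.params.chunk); // snapshot JSON arrives in pieces
      } else if (msg.id === 1) {
        // The reply to our request arrives after the last chunk.
        if (msg.error) reject(new Error(msg.error.message));
        else resolve();
      }
    });
    ws.addEventListener("error", () => reject(new Error("inspector connection failed")));
  });
  ws.close();
  fs.writeFileSync(outPath, chunks.join(""));
  return chunks.length;
}

// Usage (URL is a placeholder):
// takeHeapSnapshot("ws://127.0.0.1:9229/<id>", "child.heapsnapshot");
```

The resulting file loads in Chrome DevTools' Memory tab, or can be parsed directly as JSON.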

When I parsed the snapshot (the format is just `nodes` / `edges` / `strings` arrays), the top retained objects were unambiguous:

```
203 × native:system / JSArrayBufferData @ 11.0 MB = 2233 MB
```

203 buffers, **11 MB each**. And 11 MB was exactly the size of the on-disk `memory.db`. The retainer chain:

```
JSArrayBufferData (11 MB)
  <- ArrayBuffer
  <- Buffer
  <- (MEMFS file node).contents
  <- FS.nodes  (an Array)
  <- Context  (the sql.js Emscripten module — has WebAssembly.Memory, HEAPF32, createNode, /dev/tty…)
  <- SqlJsBackend.db
```

That `Context` with `createNode`, `/dev/tty`, and a `WebAssembly.Memory` is the tell: it's **Emscripten's in-memory filesystem (MEMFS)**. The file names confirmed it: each buffer was a MEMFS file called `dbfile_<random>`, and there were ~200 of them, each a full copy of the database.
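A tally like the `203 × JSArrayBufferData` line needs only a short pass over the snapshot's flat `nodes` array. The sketch below groups nodes by type and name and sums `self_size`. Note that this is self size, not retained size, which is a simplification that works here because the bytes sit directly in the buffer nodes. The function name `tallyBySelfSize` is made up for this example.

```javascript
// Sketch: tally heap-snapshot nodes by (type, name), summing self_size.
// `snapshot` is the parsed JSON of a .heapsnapshot file.
function tallyBySelfSize(snapshot, limit = 10) {
  const meta = snapshot.snapshot.meta;
  const fields = meta.node_fields;      // e.g. ["type", "name", "id", "self_size", ...]
  const stride = fields.length;         // each node occupies `stride` slots in `nodes`
  const typeAt = fields.indexOf("type");
  const nameAt = fields.indexOf("name");
  const sizeAt = fields.indexOf("self_size");
  const typeNames = meta.node_types[typeAt]; // enum of node type names

  const totals = new Map();
  for (let i = 0; i < snapshot.nodes.length; i += stride) {
    const type = typeNames[snapshot.nodes[i + typeAt]];
    const name = snapshot.strings[snapshot.nodes[i + nameAt]];
    const key = `${type}:${name}`;
    const entry = totals.get(key) ?? { key, count: 0, bytes: 0 };
    entry.count += 1;
    entry.bytes += snapshot.nodes[i + sizeAt];
    totals.set(key, entry);
  }
  // Largest total first.
  return [...totals.values()].sort((a, b) => b.bytes - a.bytes).slice(0, limit);
}

// Usage (file name is a placeholder):
// const snap = JSON.parse(require("node:fs").readFileSync("child.heapsnapshot", "utf8"));
// console.table(tallyBySelfSize(snap));
```

Walking the retainer chain additionally needs the `edges` array, which is laid out the same way using `meta.edge_fields`.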

Here's the mechanism. sql.js's `Database` constructor writes its input bytes into a MEMFS file (`dbfile_<random>`) via `FS.createDataFile`. `Database.prototype.close()` is what removes it (`FS.unlink`). And the sql.js module is a **process-wide singleton**: one MEMFS shared by every `Database` you ever open.

The backend opened the database like this, per operation path, with no caching:

```
this.db = new SQL.Database(fs.readFileSync(path)); // loads the whole 11MB image
// ...used, then the wrapper goes out of scope
```

When that JS `Database` wrapper is dropped, V8 garbage-collects the *wrapper object*, but **GC has no idea about the MEMFS file** it created inside the WASM module. Only an explicit `close()` unlinks it. No `close()` → the 11 MB `dbfile_<random>` lives in MEMFS forever. One leaked DB image per open. Multiply by traffic and you get 36 GB.

This is the trap in one sentence: **garbage-collecting a JS handle does not free native/WASM memory it allocated.** The GC sees a tiny wrapper; the cost is in a buffer the GC doesn't manage.
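If a short-lived `Database` really is needed, the defensive pattern is to tie `close()` to scope instead of hoping someone remembers it. A minimal sketch, assuming `SQL` is the object returned by `initSqlJs()` and that the callback is synchronous (sql.js queries are). `withDatabase` is a hypothetical helper name, not part of sql.js or ruflo.

```javascript
// Sketch: scope a sql.js Database so close() always runs, even on throw.
function withDatabase(SQL, bytes, fn) {
  const db = new SQL.Database(bytes); // creates a dbfile_<random> entry in MEMFS
  try {
    return fn(db); // fn must be synchronous; an async fn would outlive the finally
  } finally {
    db.close(); // removes that MEMFS file again
  }
}

// Usage (path is a placeholder):
// const rows = withDatabase(SQL, fs.readFileSync("memory.db"), (db) => db.exec("SELECT 1"));
```

This still reloads the full image on every call, so it stops the leak without removing the per-operation cost.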

**Containment (ship today).** I added an RSS watchdog to the proxy: it reads the child's RSS from `/proc/<pid>/status`, and when it crosses a threshold it gracefully respawns the child once it's idle (reusing an existing single-flight reconnect path: kill the old child, spawn a fresh one). A respawn drops the entire bloated MEMFS at once. Symptomatic, but it bounds memory with zero dropped requests.
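The check itself is small. The sketch below covers only the measuring and deciding parts, not the respawn path, and every name in it (`parseVmRssKb`, `readChildRssKb`, `shouldRespawn`, the in-flight counter) is invented for illustration rather than lifted from the proxy.

```javascript
// Sketch of the watchdog's decision logic (Linux only: relies on /proc).
const fs = require("node:fs");

// Pull the VmRSS value (reported in kB) out of /proc/<pid>/status text.
function parseVmRssKb(statusText) {
  const match = /^VmRSS:\s+(\d+)\s+kB$/m.exec(statusText);
  return match ? Number(match[1]) : null;
}

function readChildRssKb(pid) {
  try {
    return parseVmRssKb(fs.readFileSync(`/proc/${pid}/status`, "utf8"));
  } catch {
    return null; // child already gone, or /proc not available
  }
}

// Respawn only when over the limit AND nothing is in flight,
// so no request is cut off mid-call.
function shouldRespawn(rssKb, limitKb, inFlightRequests) {
  return rssKb !== null && rssKb > limitKb && inFlightRequests === 0;
}

// Usage: poll on a timer and hand off to whatever respawn path exists.
// setInterval(() => {
//   if (shouldRespawn(readChildRssKb(child.pid), 1024 * 1024, inFlight)) respawnChild();
// }, 30_000);
```

Waiting for the idle moment is what keeps the restart invisible to callers.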

**Root cause (fix it properly).** Cache the backend per database path so the DB opens **once** and is reused, instead of a fresh `SQL.Database` per call. No repeated loads → no new `dbfile_*`. I bake this into the image as a build-time patch and filed it upstream with the snapshot.
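The shape of that fix, reduced to its core, is a map from resolved path to an already-open `Database`. This is a simplified sketch of the idea, not the actual patch: `getDatabase`, `closeAll`, and the injectable `readFile` parameter are names chosen for this example.

```javascript
// Sketch: open each database file once and reuse the same sql.js Database.
const fs = require("node:fs");
const path = require("node:path");

const openDatabases = new Map(); // resolved path -> sql.js Database

function getDatabase(SQL, dbPath, readFile = fs.readFileSync) {
  const key = path.resolve(dbPath); // "memory.db" and "./memory.db" share one entry
  let db = openDatabases.get(key);
  if (!db) {
    db = new SQL.Database(readFile(key)); // the only place an image is loaded
    openDatabases.set(key, db);
  }
  return db;
}

// Call on shutdown so the MEMFS files are unlinked deliberately.
function closeAll() {
  for (const db of openDatabases.values()) db.close();
  openDatabases.clear();
}
```

One trade-off to keep in mind: a cached instance won't notice if another process rewrites the file on disk, so this only fits setups where a single process owns the database.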

The earlier hard OOM-kills had interrupted a sql.js write mid-flight and left one `memory.db` corrupted: `database disk image is malformed`, busted overflow pages in the B-tree. Recovery turned into its own adventure:

- `.recover` (SQLite's salvage mode) reconstructed the bulk of the rows by walking the B-tree fragments.
- The remainder was harder: some rows lived only in the write-ahead log (`-wal`), which `.recover` doesn't replay, and some sat on the corrupted pages. I ended up parsing WAL frames by hand (apply page images by page number) and carving SQLite leaf-page records directly to recover them.

Lesson burned in: **a WAL-mode SQLite backup is three files**: `.db` + `-wal` + `-shm`. Copy only the `.db` and you get exactly that "malformed" error, because the latest committed state is still in the WAL.
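For a sense of what "parsing WAL frames by hand" involves, the sketch below walks a WAL file and keeps the newest committed image of each page. It follows the documented layout (32-byte file header, then frames of a 24-byte header plus one page) but deliberately skips checksum verification, so treat it as a salvage aid rather than a faithful replay. The function name `latestCommittedPages` is made up here.

```javascript
// Sketch: collect the latest committed page images from a SQLite WAL buffer.
// Checksums are NOT verified; only the salt match against the header is checked.
function latestCommittedPages(wal) {
  const pageSize = wal.readUInt32BE(8);  // WAL header: database page size
  const salt1 = wal.readUInt32BE(16);
  const salt2 = wal.readUInt32BE(20);
  const frameSize = 24 + pageSize;

  const committed = new Map(); // page number -> Buffer holding the page image
  let pending = new Map();     // frames of the transaction still being read

  for (let off = 32; off + frameSize <= wal.length; off += frameSize) {
    // A salt mismatch means the frame is left over from an earlier WAL generation.
    if (wal.readUInt32BE(off + 8) !== salt1 || wal.readUInt32BE(off + 12) !== salt2) break;

    const pageNo = wal.readUInt32BE(off);
    pending.set(pageNo, wal.subarray(off + 24, off + frameSize));

    // A non-zero "database size after commit" field marks a commit frame.
    if (wal.readUInt32BE(off + 4) !== 0) {
      for (const [n, image] of pending) committed.set(n, image);
      pending = new Map();
    }
  }
  return committed; // frames after the last commit are dropped
}
```

Applying the result means writing each image at byte offset `(pageNo - 1) * pageSize` in a copy of the database file, never the original.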

- `heapTotal` flat + `external`/`arrayBuffers` rising = native leak. Don't reach for `--max-old-space-size`; it can't help.
- In the heap snapshot, the `JSArrayBufferData` nodes and their retainer chain pointed straight at the owning structure. A small snapshot holding big bytes = few large buffers.

Upstream writeup with the full retainer trace: [ruvnet/ruflo#2432](https://github.com/ruvnet/ruflo/issues/2432). The wrapper itself, if you're curious: `jazz-max/ruflo-hub`