A reverse-engineering pipeline that turns a firmware binary and its (possibly-wrong) disassembly into a working Ghidra processor specification. When you hit a proprietary processor with no documentation and no Ghidra support, this tool recovers the real encoding of each instruction (which bits are the opcode, which are registers, which are immediates) and writes out a SLEIGH spec you can load directly into Ghidra to decompile the firmware.
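To make "recovering the encoding" concrete, here is a minimal sketch that applies a known field layout to one instruction word. It uses the publicly documented MIPS R-type format, since MIPS is one of the tested architectures. The Field type and the extract and decode helpers are illustrative names, not part of this tool.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    msb: int  # most significant bit, inclusive
    lsb: int  # least significant bit, inclusive


# Documented MIPS R-type layout for a 32-bit instruction word.
MIPS_R_TYPE = [
    Field("opcode", 31, 26),
    Field("rs", 25, 21),
    Field("rt", 20, 16),
    Field("rd", 15, 11),
    Field("shamt", 10, 6),
    Field("funct", 5, 0),
]


def extract(word: int, field: Field) -> int:
    """Return the unsigned value stored in one bit field of the word."""
    width = field.msb - field.lsb + 1
    return (word >> field.lsb) & ((1 << width) - 1)


def decode(word: int, layout: list[Field]) -> dict[str, int]:
    """Split an instruction word into its named fields."""
    return {f.name: extract(word, f) for f in layout}


# add $t0, $t1, $t2 assembles to 0x012A4020 ($t0=8, $t1=9, $t2=10).
fields = decode(0x012A4020, MIPS_R_TYPE)
print(fields)
# {'opcode': 0, 'rs': 9, 'rt': 10, 'rd': 8, 'shamt': 0, 'funct': 32}
```

For an undocumented processor the layout table is the unknown; the pipeline's job is to produce the equivalent of MIPS_R_TYPE from evidence.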
Under the hood it is an agentic workflow: a fixed pipeline where each step is a large language model prompted for a narrow job. The workflow is orchestrated by deterministic code, not by the LLMs themselves, and every SLEIGH constructor generated at the end is verified by compiling it with Ghidra's sleigh binary before being accepted. Failed compilations are fed back to the model for up to three repair attempts.
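The compile-and-repair rule above can be sketched as a small loop. This is an illustration under assumptions, not the tool's code: verify_with_repairs, compile_fn, and repair_fn are hypothetical names, and the compiler and the model are passed in as plain callables so the control flow can be read (and run) without Ghidra or an API key.

```python
from typing import Callable, Optional

MAX_REPAIRS = 3  # "up to three repair attempts"


def verify_with_repairs(
    source: str,
    compile_fn: Callable[[str], Optional[str]],
    repair_fn: Callable[[str, str], str],
    max_repairs: int = MAX_REPAIRS,
) -> Optional[str]:
    """Return a source text that compiles, or None if every repair fails.

    compile_fn returns None on success, or the compiler's error text.
    repair_fn receives (source, error) and returns a new candidate.
    """
    error = compile_fn(source)
    for _ in range(max_repairs):
        if error is None:
            return source
        source = repair_fn(source, error)  # feed the failure back
        error = compile_fn(source)
    return source if error is None else None


# Toy stand-ins for the sleigh compiler and the LLM (assumptions).
def fake_compile(src: str) -> Optional[str]:
    return None if src.rstrip().endswith("}") else "syntax error: missing '}'"


def fake_repair(src: str, error: str) -> str:
    return src + " }"


fixed = verify_with_repairs(":nop is op=0 {", fake_compile, fake_repair)
print(fixed)  # :nop is op=0 { }
```

In a real setup compile_fn would wrap an invocation of the sleigh binary and repair_fn would wrap a model call; keeping both behind callables keeps the retry policy in deterministic code.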
Objdump
   │
   ▼
Bootstrap ─── deterministic clustering (no LLM)
   │
   ▼
┌─ Processing Loop ────────────────────────────┐
│ Text Interpreter → Bit Interpreter ───┐      │
│   → Knowledge Manager                 │      │
│   → Supervisor                        │      │
│   │                        split ◄────┘      │
│   └── next cluster ──────────────────────────┤
└──────────────────────────────────────────────┘
   │
   ▼
Knowledge Base
   │
   ▼
SLEIGH Generator ─── compile-verify-retry loop
   │
   ▼
Ghidra .slaspec
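The Bootstrap stage runs without an LLM. As a rough sketch of what deterministic clustering can look like, the code below groups instructions by byte size and a coarse token pattern, then computes which bits never change inside a group. The tokenizer, the cluster key, and the function names are assumptions made for illustration; the tool's actual rules may differ.

```python
import re
from collections import defaultdict


def token_pattern(text: str) -> str:
    """Abstract a disassembly line: keep the mnemonic, classify each operand."""
    mnemonic, _, operands = text.strip().partition(" ")
    kinds = []
    for op in filter(None, (o.strip() for o in operands.split(","))):
        # Crude rule (assumption): numeric literals are immediates,
        # everything else is treated as a register.
        is_imm = re.fullmatch(r"-?(0x[0-9a-fA-F]+|\d+)", op) is not None
        kinds.append("IMM" if is_imm else "REG")
    return " ".join([mnemonic, *kinds])


def fixed_bit_mask(words: list[int], nbits: int) -> int:
    """Bits that hold the same value in every word of the cluster."""
    full = (1 << nbits) - 1
    always_one = full
    always_zero = full
    for w in words:
        always_one &= w
        always_zero &= ~w & full
    return always_one | always_zero


def cluster(instructions: list[tuple[bytes, str]]) -> dict[tuple[int, str], list[int]]:
    """Group (raw bytes, disassembly text) pairs by size and token pattern."""
    groups: dict[tuple[int, str], list[int]] = defaultdict(list)
    for raw, text in instructions:
        groups[(len(raw), token_pattern(text))].append(int.from_bytes(raw, "big"))
    return dict(groups)


insns = [
    (bytes.fromhex("012a4020"), "add $t0, $t1, $t2"),
    (bytes.fromhex("02328020"), "add $s0, $s1, $s2"),
    (bytes.fromhex("21280005"), "addi $t0, $t1, 5"),
]
groups = cluster(insns)
for key, words in groups.items():
    print(key, hex(fixed_bit_mask(words, 32)))
```

With only a few samples the mask overestimates the fixed bits; it tightens toward the true opcode bits as more instances of the same instruction are seen.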
Instructions are grouped into clusters by structure (byte size, token pattern, fixed-bit mask). Each cluster is then analyzed by a chain of specialized LLM steps:
- Text Interpreter extracts the text pattern (add {REG1}, {REG2}, {REG3}).
- Bit Interpreter maps each placeholder to a bit range using field-correlation tools; it can request a split if a cluster mixes encodings.
- Knowledge Manager integrates per-cluster evidence into a typed knowledge base of registers, instructions, addressing modes, and architecture traits.
- Supervisor is primarily a deterministic gatekeeper (structural checks on match rates, unmapped placeholders, opcode overlap). It invokes an LLM only when a check fails, and it can then accept, re-run a specific agent with feedback, or escalate to the human via the TUI.
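The idea behind field correlation can be illustrated with a brute-force search: for one placeholder, find every bit range whose value equals the operand printed in the disassembly across all samples in the cluster. This is a simplified stand-in for the tool's field-correlation tools, with assumed names, a known field width, and unsigned immediates only.

```python
def correlate_field(
    samples: list[tuple[int, int]], nbits: int, width: int
) -> list[tuple[int, int]]:
    """Find (msb, lsb) ranges whose value matches the operand in every sample.

    samples: (instruction word, operand value taken from the disassembly text)
    """
    mask = (1 << width) - 1
    hits = []
    for lsb in range(nbits - width + 1):
        if all((word >> lsb) & mask == value for word, value in samples):
            hits.append((lsb + width - 1, lsb))
    return hits


# Three MIPS addi instances paired with their immediate operand.
samples = [
    (0x21280005, 5),     # addi $t0, $t1, 5
    (0x22300064, 100),   # addi $s0, $s1, 100
    (0x20A403E8, 1000),  # addi $a0, $a1, 1000
]
print(correlate_field(samples, nbits=32, width=16))  # [(15, 0)]
```

More than one surviving range means the samples are not diverse enough to pin the field down; no surviving range is a hint that the cluster mixes encodings and may need a split.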
When the knowledge base is complete, a separate SLEIGH generator builds the Ghidra spec in two phases: first a deterministic skeleton in which every constructor is marked unimpl, then an LLM pass that fills in the p-code semantics one instruction at a time, compiling each against Ghidra's sleigh binary and retrying on failure.
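The skeleton phase can be pictured as plain string templating. The sketch below renders one constructor with the SLEIGH unimpl keyword in place of a semantic section. The input shape (mnemonic, operand field names, fixed field values) is a hypothetical stand-in for a knowledge-base entry, and the field names op, funct, rd, rs, and rt are assumed to be declared elsewhere in the spec.

```python
def skeleton_constructor(
    mnemonic: str, operands: list[str], fixed: dict[str, int]
) -> str:
    """Render one SLEIGH constructor whose semantics are still unimplemented."""
    display = (f":{mnemonic} " + ", ".join(operands)).rstrip()
    constraints = [f"{name}=0x{value:x}" for name, value in fixed.items()]
    # Operand fields are repeated in the pattern so they are bound for display.
    pattern = " & ".join(constraints + operands)
    return f"{display} is {pattern} unimpl"


line = skeleton_constructor("add", ["rd", "rs", "rt"], {"op": 0x0, "funct": 0x20})
print(line)
# :add rd, rs, rt is op=0x0 & funct=0x20 & rd & rs & rt unimpl
```

Because unimpl constructors already define the bit pattern, a spec made only of them can disassemble before any p-code has been written; the second phase then swaps each unimpl for a semantic body.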
Designed as a co-pilot for the analyst, not a replacement: the TUI exposes every decision, the supervisor escalates ambiguous clusters to a human, and the full LLM conversation, tool-call, and token-usage history is written to disk.
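The deterministic part of the supervisor (match rate, unmapped placeholders, opcode overlap) can be sketched as a function that returns the list of failed checks. The ClusterReport shape, the field names, and the 0.95 threshold are assumptions chosen for illustration; only the three kinds of check come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterReport:
    """Hypothetical summary of one analyzed cluster."""
    match_rate: float  # fraction of samples the bit mapping reproduces
    unmapped_placeholders: list[str] = field(default_factory=list)
    overlapping_opcodes: list[str] = field(default_factory=list)


def failed_checks(report: ClusterReport, min_match_rate: float = 0.95) -> list[str]:
    """Return a description of each structural check that failed."""
    problems = []
    if report.match_rate < min_match_rate:
        problems.append(
            f"match rate {report.match_rate:.2f} below {min_match_rate:.2f}"
        )
    if report.unmapped_placeholders:
        problems.append(
            "unmapped placeholders: " + ", ".join(report.unmapped_placeholders)
        )
    if report.overlapping_opcodes:
        problems.append(
            "opcode overlap with: " + ", ".join(report.overlapping_opcodes)
        )
    return problems


clean = ClusterReport(match_rate=1.0)
suspect = ClusterReport(match_rate=0.80, unmapped_placeholders=["{IMM1}"])

print(failed_checks(clean))    # empty list: accept without calling an LLM
print(failed_checks(suspect))  # two failures: hand off to the LLM-backed review
```

An empty list corresponds to the no-LLM fast path; a non-empty list is what would be passed along when deciding whether to re-run an agent or escalate to the analyst.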
Tested on LEGv8, MIPS, pi32v2, and x86.
With Docker:
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
./docker/run.sh integration_tests/mips
Without Docker:
pip install -e ".[all]"
python -m main --config config.yaml
Input: a firmware binary and an objdump disassembly, even one produced against the wrong architecture. The tool does not solve the disassembly problem itself; output quality scales with input disassembly quality.
Output: a Ghidra .slaspec file plus a JSON knowledge base of registers, instruction encodings, addressing modes, and architecture traits.
Full documentation (architecture, agent internals, worked examples, configuration reference) lives in the wiki:
pip install -e ".[docs]"
cd wiki && mkdocs serve
Then open http://localhost:8000.
- Python >= 3.11
- ANTHROPIC_API_KEY environment variable
- Docker (optional, for run.sh)
- Ghidra (required for the SLEIGH compile-verify step)