cd /news/artificial-intelligence/i-built-the-unoq-s-claw-a-tiny-agent… · home topics artificial-intelligence article
[ARTICLE · art-4529] src=dev.to pub= topic=artificial-intelligence verified=true sentiment=↑ positive

I Built The UnoQ's Claw: A Tiny Agentic AI Assistant That Lives Inside an Arduino Uno Q

QClaw, an agentic AI assistant that runs entirely offline on an Arduino Uno Q board, hosting a local language model (Qwen3.5 0.8B) to autonomously generate, compile, and flash firmware to the board's microcontroller without any cloud API calls or external hardware probes. It features a dual-path runtime—an agentic path with eight tools for full sketch lifecycle management and a direct path for faster factual queries—and leverages the Uno Q's unique dual-silicon design, where the MPU controls the MCU via GPIO SWD for sub-second flashing.

read5 min views8 publishedMay 21, 2026

Every "AI on hardware" demo you have ever seen has a LLM behind it. The user talks to a board via a terminal or Telegram, and the board calls an API to have a cloud model do the work. QClaw flips that arrangement. The Arduino Uno Q hosts the language model, runs the agent loop, drives the compile toolchain, and flashes its own microcontroller.

Ask QClaw to scroll "QClaw" across the LED matrix and it does. End to end. On the board. Offline.

QClaw has an eight-tool agentic surface, a fifteen-skill pre-router, and a direct OpenOCD flash route that makes autonomous uploads actually execute. A dual-path runtime lets you pick speed or full hardware control on the same model.

Why the Uno Q Is the Right Board for This

The Arduino Uno Q is a split-silicon device. It looks like a classic Arduino on the outside, but it is two boards in a trench-coat:

The MPU and MCU share the same PCB. The MPU can hold the MCU in reset and reprogram its flash directly through GPIO pins wired to SWD via the linuxgpiod

driver. No USB cable between them. No probe. No second machine. That is the genius of QClaw.

The agentic loop orchestrates the full sketch lifecycle across the Arduino Uno Q's dual-silicon topology, with the MPU driving the loop and the MCU executing the resulting firmware. This is how QClaw generates, compiles, flashes, and observes.

The QClaw arduino

tool invokes OpenOCD directly at the correct address. The tool compiles with arduino-cli compile --fqbn arduino:zephyr:unoq --export-binaries

, picks up the resulting .elf-zsk.bin

, and pipes it through OpenOCD over the GPIO SWD bridge. No SSH, no network credentials, no remote OCD tunnel. Just MPU to MCU, on the same board. Sub-second flash once the binary is on disk.

Four gigs of RAM is also more than enough to host a Qwen3.5 0.8B Q4_0 model with an 8K context window, mlocked, with q8_0 KV cache. QClaw lives on the Uno Q at around 1.3 GB. Decode runs at roughly 8 tokens per second. Slow next to a desktop GPU, fast enough that a sketch compiles and flashes before you have finished your coffee.

**How To Use QClaw **

QClaw ships two runtimes on top of the same llama-server backend, the same SOUL.md

, and the same 23-rule pre-router. They differ in what wraps the LLM call.

Agentic path (make qclaw-agentic

). the qclaw Go gateway sits in front of the model. It runs channel adapters (terminal, SSH, Telegram), the multi-iteration agent loop, the pre-router, and the eight-tool dispatcher. This is the production default. It is the only path that can actually compile and flash a sketch.

Direct path (make qclaw-direct

). A thin Python REPL POSTs directly to llama-server

after running the same pre-router rules in Python. No loop, no tools, no Telegram. About 33 percent lower latency on pure factual prompts at equivalent correctness, because there is no tool schema in the prompt and no second iteration.

Use the agentic path when you want a sketch flashed or a frame captured. Use the direct path when you just want to ask which pins on the Uno Q do PWM.

Drop in two commands and you have a session:

git clone https://github.com/laurenvil/Uno-QClaw.git ~/ArduinoApps/QClaw     

cd ~/ArduinoApps/QClaw     

git submodule update --init --recursive     


cd yzma && make download-llama.cpp && cd ..     


mkdir -p ~/models     

wget -O ~/models/Qwen_Qwen3.5-0.8B-Q4_0.gguf \     

     'https://huggingface.co/Qwen/Qwen3.5-0.8B-GGUF/resolve/main/Qwen3.5-0.8B-Q4_0.gguf'     


make qclaw-install     


make qclaw-agentic    # full agent loop + 8 tools (compile/upload/camera/sysfs_led/network/i2cdetect)     

make qclaw-direct     # pre-router + direct API (fast Q&A, no tools)

make qclaw-install

builds the Go binary, copies the system prompt and the fifteen-skill tree into ~/.qclaw/workspace/

, installs arduino-cli

plus the arduino:zephyr

core, and runs an interactive wizard that sets up the optional Telegram gateway.

Once it is running, the agent has eight narrowly-scoped tools available:

read_file

, write_file

, list_dir

for workspace navigation

arduino

for compile and flash via OpenOCD

camera

for single-frame V4L2 capture through GStreamer

sysfs_led

for the MPU-side RGB LEDs at /sys/class/leds/*

network

for hostname, interfaces, and the default gateway, all read-only stdlib Go

i2cdetect

for listing and scanning Linux I²C buses with -y -r

only

No general exec

. No general shell. Every tool validates its arguments against an allow-list. The total tool schema is around 3.4K characters, leaving plenty of room for the system prompt at an 8K context window.

The Pre-Router: Skills, Not RAG

The pre-router is the part of QClaw that does the heavy lifting on a 0.8B model. It is not RAG. It is a flat table of 23 keyword regex rules across 15 skills. When you send a message, the pre-router scans it, finds matching rules, and inlines the relevant SKILL.md

plus its referenced files directly into the system prompt before the LLM call.

The model never has to call read_file

for canonical skill content. The content is already there. At 0.8B scale, a read_file

call costs a full LLM iteration, roughly 10 to 20 minutes of cold prefill plus decode. The pre-router amortizes that to zero.

**The skills cover: **

Sketch fundamentals: blink, breathe, button, potentiometer, servo, compile and upload, CAN bus, DAC, OPAMP

The 13x8 LED matrix with the canonical Arduino_LED_Matrix template

Uno Q hardware: pin tables, voltage rules, connectors, power

Dual-chip workflow: Bridge RPC, App Lab, Bricks

Linux-side capabilities: Wi-Fi, Bluetooth, camera, OpenCV, microphone, sysfs LEDs

Plug-and-play Modulino sensors

Each skill is just a directory under workspace/skills/<name>/

with a SKILL.md

and optional reference files. Adding a new one is a matter of writing the markdown and adding a regex rule.

**Try It **

Repo:https://github.com/laurenvil/Uno-QClaw

Issues, forks, and pull requests are welcome at https://github.com/laurenvil/Uno-QClaw. If you have an Arduino Uno Q on your desk, you can have a self-flashing AI assistant sitting on it tonight, with the Ethernet cable unplugged.

── more in #artificial-intelligence 4 stories · sorted by recency
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/i-built-the-unoq-s-c…] indexed:0 read:5min 2026-05-21 ·