Source: tomtunguz.com

Skill Distillation

A developer has created a system called "skill distillation" that uses frontier AI models like Opus 4.7 and GPT-5.1 to write procedural skill files, which are then executed by smaller local models like Qwen 35B or Gemma 26B running on personal computers. The system, built around a personal agent called Pi, transfers procedural knowledge through markdown files rather than compressing model weights, allowing the smaller model to follow step-by-step instructions without needing to understand the underlying task. This approach creates inspectable, versionable, and hot-swappable skills that can be automatically generated, tested, and refined overnight based on historical logs.

2 min read · Published May 29, 2026

I’ve been using state-of-the-art models to teach small models running on my computer how I work.

My personal agent, based on Pi, runs my inbox, my deal pipeline, my blog publishing, my calendar, & my research. It looks less like a chatbot & more like a small operating system.

The first layer is **QMD**, a local markdown knowledge base of about eighty workflow files in `~/memories`. Before answering any procedural question, the agent searches QMD for the right playbook.

The second layer is **Skills**, atomic `SKILL.md` files that describe one job each. The skills are written by a frontier model. So are the evaluations that grade them. The same system writes, tests, and rewrites each skill until accuracy converges. It also checks recall against QMD, so the right keywords always surface the right skill.
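The recall check can be pictured with a toy keyword search. This is a minimal sketch, not how QMD itself ranks results: the scoring function, the playbook names, and the test queries are all hypothetical stand-ins.

```python
import re
from collections import Counter
from typing import Optional

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query: str, doc: str) -> int:
    """Count how often the query's distinct terms occur in the document."""
    doc_counts = Counter(tokenize(doc))
    return sum(doc_counts[term] for term in set(tokenize(query)))

def search(query: str, playbooks: dict[str, str]) -> Optional[str]:
    """Return the best-matching playbook name, or None if nothing matches."""
    best = max(playbooks, key=lambda name: score(query, playbooks[name]), default=None)
    if best is None or score(query, playbooks[best]) == 0:
        return None
    return best

def recall(cases: list[tuple[str, str]], playbooks: dict[str, str]) -> float:
    """Fraction of queries whose top hit is the expected playbook."""
    hits = sum(search(query, playbooks) == expected for query, expected in cases)
    return hits / len(cases)

# Hypothetical playbooks and queries, for illustration only.
PLAYBOOKS = {
    "publish-blog": "Publish a blog post: draft, edit, schedule the post, publish.",
    "triage-inbox": "Triage the inbox: label email, archive newsletters, flag replies.",
}
CASES = [
    ("how do I publish a post", "publish-blog"),
    ("archive newsletters in my inbox", "triage-inbox"),
]

print(recall(CASES, PLAYBOOKS))
```

A recall score below 1.0 on cases like these would be the signal to rewrite a skill's wording so its keywords match the way requests are actually phrased.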

The third layer is the Agent Loop, a model running Plan → Tool Call → Observe → Refine, calling out to seventeen Rust APIs & a handful of MCP integrations.
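The shape of that loop fits in a few lines. This is a rough sketch under assumptions: `plan`, `tools`, and `refine` are hypothetical callables standing in for the model and the real integrations, and the Refine step is reduced here to a simple "are we done?" check.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentState:
    goal: str
    observations: list[str] = field(default_factory=list)
    done: bool = False

def run_agent(goal: str,
              plan: Callable[[AgentState], tuple[str, str]],
              tools: dict[str, Callable[[str], str]],
              refine: Callable[[AgentState], bool],
              max_steps: int = 5) -> AgentState:
    """Run Plan -> Tool Call -> Observe -> Refine until done or out of steps."""
    state = AgentState(goal)
    for _ in range(max_steps):
        tool_name, argument = plan(state)          # Plan: pick a tool and its input
        observation = tools[tool_name](argument)   # Tool Call
        state.observations.append(observation)     # Observe
        state.done = refine(state)                 # Refine: decide whether to stop
        if state.done:
            break
    return state

# Toy stand-ins (hypothetical); a real agent would call a model and real APIs.
def toy_plan(state: AgentState) -> tuple[str, str]:
    return ("lookup", f"{state.goal} (attempt {len(state.observations) + 1})")

toy_tools = {"lookup": lambda query: f"result for {query}"}

def toy_refine(state: AgentState) -> bool:
    return len(state.observations) >= 2

final = run_agent("find next meeting", toy_plan, toy_tools, toy_refine)
print(final.observations)
```

The `max_steps` cap is a design choice for the sketch: it keeps a confused planner from looping forever.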

One of the techniques I’ve started to use is skill distillation. A frontier model (Opus 4.7, GPT-5.1, or Gemini 3 Pro) authors & refines the skill files. A smaller model, Qwen 35B or Gemma 26B running locally, executes them. The teacher transfers procedural knowledge to the student through markdown. The skill is inspectable, versionable, & hot-swappable.
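The author-and-refine half might look something like the loop below. It is a sketch under stated assumptions: `draft` stands in for a call to the frontier model, `grade` for an evaluation run against the local model, and the 0.9 target and five-round cap are arbitrary illustrative values.

```python
from typing import Callable

def refine_skill(task: str,
                 draft: Callable[[str, str, str], str],
                 grade: Callable[[str], float],
                 target: float = 0.9,
                 max_rounds: int = 5) -> tuple[str, float, int]:
    """Rewrite a skill until its eval score reaches the target or rounds run out."""
    skill, feedback = "", ""
    best_skill, best_score = "", -1.0
    rounds_used = 0
    for rounds_used in range(1, max_rounds + 1):
        skill = draft(task, skill, feedback)   # teacher writes or rewrites the skill
        score = grade(skill)                   # evals score the student's execution
        if score > best_score:
            best_skill, best_score = skill, score
        if score >= target:
            break
        feedback = f"Previous draft scored {score:.2f}; tighten the steps."
    return best_skill, best_score, rounds_used

# Toy stand-ins (hypothetical): each rewrite adds a step, and three steps score 1.0.
def toy_draft(task: str, previous: str, feedback: str) -> str:
    return (previous or f"# {task}") + "\n- step"

def toy_grade(skill: str) -> float:
    return min(1.0, skill.count("- step") / 3)

skill_text, accuracy, rounds = refine_skill("Evaluate a company", toy_draft, toy_grade)
print(rounds, accuracy)
```

Keeping the best-scoring draft, not just the latest one, guards against a rewrite that makes the skill worse.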

This is fundamentally different from classical knowledge distillation, which compresses a big model’s soft probability outputs into a smaller model’s weights. It’s different from instruction tuning, which bakes behavior into weights through prompt-response pairs. It’s different from RAG, which retrieves facts.

Skill distillation retrieves procedures. The smaller model doesn’t have to know how to evaluate a company. It just has to know how to follow the steps.
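To make "follow the steps" concrete, here is a minimal sketch of a student walking through a skill file. The numbered-list layout of the skill and the `student` callable are assumptions for illustration; the actual SKILL.md format and local model call are not specified here.

```python
import re
from typing import Callable

# Hypothetical skill file; the real format may differ.
SKILL_MD = """\
# Evaluate a company
1. Pull the latest revenue figure.
2. Compute year-over-year growth.
3. Write a one-paragraph summary.
"""

def parse_steps(skill_md: str) -> list[str]:
    """Extract the text of each numbered step from the markdown."""
    return re.findall(r"^[ \t]*\d+\.[ \t]+(.+)$", skill_md, flags=re.MULTILINE)

def execute_skill(skill_md: str,
                  student: Callable[[str], str],
                  task_input: str) -> list[str]:
    """Feed the student one step at a time, carrying its last output forward."""
    context = task_input
    transcript = []
    for step in parse_steps(skill_md):
        prompt = f"Step: {step}\nContext so far: {context}"
        context = student(prompt)
        transcript.append(context)
    return transcript

# Toy student (hypothetical): echoes the step instead of calling a local model.
def toy_student(prompt: str) -> str:
    return "done: " + prompt.splitlines()[0].removeprefix("Step: ")

for line in execute_skill(SKILL_MD, toy_student, "Acme Corp"):
    print(line)
```

The student only ever sees one narrow instruction plus the running context, which is why a smaller model can be enough.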

Every night a system runs through historical logs to understand what new skills should be generated, mirroring the loop that Pete Koomen described at Y Combinator earlier this week.
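One simple way to picture the nightly pass: count recurring requests that no existing skill covers. This sketch assumes each log entry has already been labeled with an intent string, which is a hypothetical simplification, and the threshold of three occurrences is arbitrary.

```python
from collections import Counter

def propose_skills(log_intents: list[str],
                   existing_skills: set[str],
                   min_count: int = 3) -> list[str]:
    """Return uncovered intents seen at least min_count times, most frequent first."""
    counts = Counter(intent for intent in log_intents if intent not in existing_skills)
    return [intent for intent, n in counts.most_common() if n >= min_count]

# Hypothetical log labels, for illustration only.
logs = (["expense-report"] * 4 + ["triage-inbox"] * 5
        + ["book-travel"] * 3 + ["one-off-question"])

print(propose_skills(logs, existing_skills={"triage-inbox"}))
```

Each proposed intent would then be handed to the frontier model as the task for a new skill file.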

The frontier model becomes a teacher. The library becomes the company’s institutional knowledge. The student becomes whichever model happens to be cheapest this quarter.
