{"slug": "thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model", "title": "Thousand Token Wood: shipping a multi-agent economy on a 3B model", "summary": "A team of developers built Thousand Token Wood, a multi-agent economic simulation where five AI-powered woodland creatures trade goods using a 3-billion-parameter Qwen2.5-3B model. The simulation, created for the Build Small Hackathon, demonstrates that small models can reliably generate valid JSON and power real-time multi-agent economies when engineered with designed scarcity and sharp prompting, despite their weak reasoning capabilities. The project reveals that emergent market behaviors like bubbles, crashes, and wealth inequality arise naturally from the agents' interactions, making small models a practical and cost-effective tool for running complex simulations.", "body_md": "# Thousand Token Wood: shipping a multi-agent economy on a 3B model\n\n[Team Article](/blog)Published June 5, 2026\n\n*A Build Small Hackathon field report on what a 3-billion-parameter council of traders can and cannot do.*\n\nTry it first: the [Space](https://huggingface.co/spaces/build-small-hackathon/thousand-token-wood-sim), and the open [agent traces](https://huggingface.co/datasets/build-small-hackathon/thousand-token-wood-traces).\n\nI built **Thousand Token Wood** for the Build Small Hackathon. It is a tiny economy: five woodland creatures, each its own agent on **Qwen2.5-3B**, trade five goods for pebbles, gossip, hoard, and panic. You poke the wood and watch bubbles, crashes, and a widening wealth gap appear on their own. The model is served with vLLM on Modal; a Gradio app is the window onto the wood.\n\nThis is a field report on the engineering, written for people who build with small models. 
The short version: a 3B model is a reliable format generator and an unreliable reasoner, emergent systems need designed scarcity, and the best demos sit where a technical constraint meets something you already understand deeply.\n\n## Why small is the design, not the limit\n\nA living economy needs many agents thinking many times per run. That is exactly where a frontier model is the wrong tool: too slow and too costly to run a council of traders every tick. A small model is what makes a real-time multi-agent simulation feasible. Every creature decides in a single batched GPU call per turn.\n\n## The first economy was dead on arrival\n\nThe naive version did nothing. Production outran consumption, so every creature was self-sufficient and never had a reason to trade. The market cleared once and went silent. The fix was to engineer scarcity:\n\n- Diet variety: a creature can eat only one unit of any single food per meal, so surviving means buying foods it does not grow.\n- Spoilage: perishable food rots if hoarded, forcing surplus to be sold while it still has value.\n- A winter fuel crisis: every creature must burn firewood each turn, the need rises over time, and only one creature makes firewood.\n\nThat last mechanic drives the drama. One supplier cannot meet rising demand, so the woodcutter gets rich and everyone else competes for warmth.\n\n## Valid JSON, weak judgment\n\nWith scarcity in place, the honest small-model lesson surfaced. The 3B emitted valid JSON on 100% of calls, but its economic judgment was poor: a creature that produced acorns would post an order to buy acorns, the one thing it had in surplus.\n\nThe fix was not a bigger model, it was a sharper prompt. I told each agent what it produced and must never buy, computed the exact list of goods it was short on, and gave it one worked example. Decision quality jumped and the creatures began trading to their roles. 
The whole loop is wrapped in a tolerant JSON parse-and-repair layer, so a malformed response degrades to a no-op instead of crashing the simulation.\n\nA second lesson came from wellbeing. I first modeled it as an accumulator, and any chronic shortfall ground every creature to zero over a run, a death spiral that was no fun to watch and that punished the agents' imperfect optimization. I reframed it as a mean-reverting mood that recovers when a creature is fed and warm and never hits zero. Stakes belong in pebbles, prices, and status, not starvation.\n\n## Then it started telling stories\n\nThe feature I am most pleased with ties the project to market history. The player can draw a Wood Legend: a famous episode reskinned as woodland folklore. Tulip Mania becomes the Great Acorn Mania. The South Sea Bubble becomes the Hollow Log Trading Company. The 1929 bank runs become the Run on Oona's Hoard.\n\nThese are not flavor text. Each legend fires real shocks, and the agents react. In one run I drew the Run on Oona's Hoard, the rumor that the owl's vault was empty. Oona began liquidating her honey to raise pebbles, and the flood of supply crashed the honey price from 10 to 3 over the next turns. A reskinned bank run made an agent dump assets and moved a market price. None of it was scripted.\n\nFor that to be visible, prices had to move. They were frozen because the agents quoted back the reference price I showed them. The fix was to let the market reference drift with residual supply and demand after each round: heavy unfilled buying pushes a price up, a glut pushes it down. 
Prices now trend during scarcity and stay calm in balanced trade.\n\n## What actually happened\n\nA representative fifteen-turn run, with a drought and a winter rumor injected partway:\n\n| Metric | Result |\n|---|---|\n| Valid JSON actions | 100% (75 of 75 calls) |\n| Trades per turn | sustained 3 to 9, never silent |\n| Honey price | crashed 10 to 3 during the bank-run legend |\n| Firewood price | rose 4 to 7 as winter scarcity bit |\n| Wealth gap (Gini) | widened 0.14 to 0.38 |\n| Outcome | the woodcutter ended richest, the hoarder broke |\n\nThe reasoning behind every one of those moves is in the open [traces dataset](https://huggingface.co/datasets/build-small-hackathon/thousand-token-wood-traces): each row is a creature's full prompt, raw response, parsed actions, and private thought.\n\n## Takeaways for building with small models\n\nMost of the engineering is closing the gap between a small model's reliable formatting and its unreliable reasoning, with structure and prompting rather than scale. Emergent systems need designed scarcity; abundance is boring. And the most compelling small-model demos do not need invented drama. Three centuries of market history had it ready, and a council of 3B agents was enough to play it out.\n\nSmall models, big adventures. 
Try the [Space](https://huggingface.co/spaces/build-small-hackathon/thousand-token-wood-sim).\n\n*Originally published on Medium.*", "url": "https://wpnews.pro/news/thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model", "canonical_source": "https://huggingface.co/blog/build-small-hackathon/thousand-token-wood-sim", "published_at": "2026-06-05 22:18:46+00:00", "updated_at": "2026-06-05 22:42:06.842786+00:00", "lang": "en", "topics": ["ai-agents", "large-language-models", "ai-research", "ai-products", "ai-infrastructure"], "entities": ["Qwen2.5-3B", "vLLM", "Modal", "Gradio", "Hugging Face", "Thousand Token Wood", "Build Small Hackathon"], "alternates": {"html": "https://wpnews.pro/news/thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model", "markdown": "https://wpnews.pro/news/thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model.md", "text": "https://wpnews.pro/news/thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model.txt", "jsonld": "https://wpnews.pro/news/thousand-token-wood-shipping-a-multi-agent-economy-on-a-3b-model.jsonld"}}