{"slug": "teaching-a-computer-to-play-4x-how-the-annhexation-ai-works", "title": "Teaching a Computer to Play 4X: How the Annhexation AI Works", "summary": "A developer has built a layered AI system for the 4X strategy game Annhexation that separates strategy, planning, and execution into three distinct layers. The AI uses a prioritized goal stack that evaluates and scores potential objectives—from early expansion and military pushes to wonder racing and nuclear first strikes—against personality weights and world factors each turn. This decoupled architecture ensures the computer opponent maintains coherent long-term strategies, such as holding a military campaign goal for twenty-plus turns, rather than making twitchy turn-by-turn decisions.", "body_md": "Building a believable computer opponent for a 4X strategy game is one of those problems that turns out to be bottomless. I'd use the cliche that it looks simple from the outside, but I don't think that's true: I thought this would be a tough nut from the outset. I've built a chess-playing engine before, and getting a strong opponent out of that was far simpler, though it helps that chess is such a well-understood and well-documented problem. The player wants an opponent that *explores, expands, exploits and exterminates* with apparent intent — one that musters an army over several turns, marches it across a continent, lands it on your shore and takes your city, all while you watched it coming and couldn't quite stop it. They do **not** want an opponent that teleports units, reads your mind, or sits inert in its starting cities until you wander into range.\n\nThis post is a tour through the [Annhexation AI](https://annhexation.com) — explaining how it makes decisions, what it remembers between turns, and how the same core machinery produces eight distinct civilizations and four difficulty levels. 
Annhexation isn't open source, so rather than quote the implementation I'll describe the design and illustrate the interesting bits with pseudocode.\n\nI should note that the AI is still under development, but after a lot of bashing with a hammer it's feeling in a pretty decent place.\n\nThe single most important design decision in the Annhexation AI is that strategy, planning and execution are decoupled. These three layers are separated on purpose, and every AI turn flows through all three.\n\nThe payoff of this separation is, hopefully, coherence over time. A greedy turn-by-turn AI looks twitchy: it builds an army, gets distracted, disbands it, builds another. By contrast, an Annhexation AI that adopts a `militaryPush` goal will hold that goal for twenty-plus turns, funnelling production, research and unit movement toward a single objective until the city falls, the campaign demonstrably fails, or something seismic interrupts it. Strategy should be sticky while execution is flexible.\n\nA complete turn runs as an ordered sequence of discrete phases — from threat assessment and diplomacy through combat, movement, production and fortification:\n\n```\nfunction runTurn(player, world, aiState):\n    detectEvents(aiState, world)          # diff against last turn → fire interrupts\n    aiState.goals = evaluateStrategy(player, world, aiState)\n    plans = buildOperationalPlans(aiState.goals, player, world)\n    executeTactics(plans, player, world)  # the phase sequence (see below)\n    aiState.snapshot = snapshot(world)     # remember this turn for next time\n    return aiState\n```\n\nAt the heart of the strategic layer is a **prioritized goal stack**. 
Each turn the AI either keeps its current goals or re-evaluates them, and the menu of things it can want is rich:\n\n- `earlyExpand`: plant N cities before consolidating\n- `earlyRush`: exploit the opening with an aggressive early attack\n- `infrastructureConsolidation`: buildings, population, growth\n- `militaryPush`: sustained warfare against a chosen player\n- `defensiveWar` / `counterattack`: react to aggression, retake what was lost\n- `navalInvasion`: assault a distant landmass\n- `wonderRace`, `scienceVictoryPush`, `scoreOptimisation`: the peaceful victory paths\n- `raidWar`, `asymmetricWar`: economic harassment instead of conquest\n- `warPreparation`, `nuclearFirstStrike`, `recovery`: the situational specials\n\nGoals don't fire on rigid rules; rather, they're scored against each other and the highest-utility ones win. The scoring blends several signals:\n\nEvery score is then multiplied by a **personality weight**. Roughly:\n\n```\nfunction scoreGoals(player, world, personality):\n    scores = {}\n    for goal in CANDIDATE_GOALS:\n        base = goal.baseValue(player, world)\n        world_factors = proximity × forceBalance × catchUp × opportunity\n        scores[goal] = base × world_factors × personality.weightFor(goal)\n    return sortDescending(scores)\n\n# e.g. early-expand ≈ base × siteRatio × proximityAdj × catchUp × personality.expansion\n```\n\nTwo of those terms are about the world; one is about who this civ is. That's how the same evaluation function produces a cautious turtle and a rampaging horde.\n\nThe top goal (priority 0) drives the turn. Secondary goals queue behind it, ready to take over the moment an interrupt fires.\n\nA 4X AI that only looks at its own empire plays in a vacuum. 
Annhexation's AI explicitly models every player it has met before deciding who to fight.\n\nThe AI profiles each known rival across roughly eleven dimensions, each normalised to `[0, 1]`:\n\n- `militarisation`, `development`, `expansionism`, `techPace`\n- `exposure` and `coastalExposure` (undefended or weakly-garrisoned cities)\n- `borderTension` and `aggression` (forces massed near …)\n- `wonderFocus`, `scienceFocus`, and the all-important `isRunawayLeader` flag\n\nIt also tracks trends — rising, flat or falling over the last five turns — so the AI reacts to a rival who is *accelerating*, not just one who is currently strong. Those snapshots are kept in persistent state so trend detection survives across turns.\n\nA second pass turns those profiles into a war-target ranking. For each rival it weighs:\n\n```\nfunction scoreWarTargets(rivals, me, personality):\n    for r in rivals:\n        affinity      = personality.aggression × r.borderTension\n        winnable      = clamp(myStrength / r.militarisation)\n        reachable     = 1 / (1 + travelCost(me, r))\n        distracted    = r.aggression_elsewhere\n        r.score       = affinity × winnable × reachable × (1 + distracted)\n    return sortDescending(rivals)\n```\n\nThe winner of that scoring becomes the target of a `militaryPush`, and the magnitude feeds back as an opportunity multiplier into goal evaluation. 
An exposed, accessible, distracted neighbour is a temptation the AI is built to notice and exploit.\n\nPersonality in Annhexation isn't a single \"aggression\" slider — it's a vector of about twenty weights (military production, attack appetite, expansion, wonder-building, research, naval production, raid preference, plus early-game tuning like second-city urgency and first-build preference).\n\nOn top of that sits the doctrine system — eight civ-specific playbooks that override those weights and the AI's unit-composition preferences:\n\n| Civ | Doctrine | Signature |\n|---|---|---|\n| Mongolia | `HORSE_RUSH` | +50% military production, +50% attack, double raid preference, cavalry-heavy armies |\n| Aztecs | `WARRIOR_RUSH` | +40% military & attack, −20% expansion, melee-heavy early aggression |\n| Russia | `EXPAND_WIDE` | +40% expansion, +30% garrison commitment |\n| Rome | `INFRA_FIRST` | +40% infrastructure, +30% expansion |\n| France | `WAR_FOR_SCIENCE` | +40% research, +30% science-victory focus |\n| Greece | `STRATEGIST` | balanced militarisation across all domains |\n| Egypt | `TURTLE_WONDERS` | +50% wonders & culture, −20% military |\n| England | `COASTAL_ONLY` | +40% naval, +50% coastal-site preference, harbour priority |\n\nBecause the doctrine only modulates shared machinery, Egypt and Mongolia run the identical goal-evaluation and combat code — they simply weight it toward completely different ends. Mongolia drowns you in cavalry; Egypt hides behind wonders and culture; England fights for the coastline.\n\nCombined with unique per-civ units, this gives each civ a distinctive personality.\n\nOnce a goal is chosen, the operational layer turns intent into concrete plans.\n\nUnit quotas compute empire-wide demand for each unit class — settlers, workers, garrison, field army, reserve, naval, raiders — each scaled by goals, threat levels, personality and difficulty. 
During a `militaryPush` against a walled city, for instance, the garrison quota rises with threat level, melee demand jumps, and siege units become mandatory — you cannot crack walls without them, and the AI knows it.\n\nUnit composition picks the melee/ranged/siege/mounted ratio for an army. Against an unwalled city it loads up on ranged units (free damage); against walls it must bring siege. Doctrine tilts the mix, and resource gating caps it — no horses means no cavalry, no iron means no siege, full stop:\n\n```\nfunction targetComposition(target, doctrine, resources):\n    if target.walled: mix = {melee: 0.4, siege: 0.4, ranged: 0.2}\n    else:             mix = {melee: 0.4, ranged: 0.5, mounted: 0.1}\n    mix = applyDoctrineBias(mix, doctrine)   # HORSE_RUSH → more mounted, etc.\n    if not resources.horses: mix.mounted = 0\n    if not resources.iron:   mix.siege   = 0\n    return normalise(mix)\n```\n\nAttack plans are first-class, multi-turn objects with an explicit lifecycle:\n\n```\nmustering → gathering → advancing → besieging → assaulting\n                ↘ (naval) awaitingTransport → embarking → sailing → landing ↗\n```\n\nTarget selection scores enemy cities by proximity (−5 per hex of distance), with bonuses for being unwalled (+15), being a capital (+10), and sitting near iron or horses the AI needs (a big multiplier gated on personality and urgency). It goes for the weakest reachable target first — and it commits.\n\nCity production is a distributed priority queue: high-output cities feed global military needs first, low-output cities backfill settlers and workers. The priority cascade runs upgrades → settlers → garrison → military → naval → workers/roads → buildings → wonders, gated by the active goal.\n\nResearch follows the goal: an expanding AI beelines the wheel and animal husbandry. A science-victory AI walks a hardcoded path toward rocketry while a warring AI weights military techs. 
It searches the prerequisite tree but abandons paths longer than three techs — no hundred-turn detours. In theory!\n\nWorker management plans and caches road routes between cities and strategic resources, invalidating them when borders flip. Bottleneck detection explicitly diagnoses *why* military modernisation is stalled — waiting on a tech, lacking road access to iron, missing currency for trade — and escalates urgency the longer the bottleneck persists.\n\nWhen the planning is done, the AI executes the turn as an ordered sequence of phases. Roughly:\n\n```\nEvent detection & city-loss response      (compare against last turn's snapshot)\nEmergency garrison fill                    (enemy standing on a city tile)\nUnit upgrades & recalls\nRetreats                                   (pull damaged units that aren't committed)\nCombat                                     (city defence first, then general)\nNaval invasion lifecycle                   (drive the beachhead state machines)\nSettler escorts & transport convergence\nArmy movement                              (via the movement planner)\nBuild orders                               (worker tasks, roads)\nDiplomacy                                  (trade, war declarations)\nCity Defence Commander                     (per-city garrison assignment)\nGovernment & tech completion\nFortification & hidden-unit setup\n```\n\nA few pieces deserve a closer look.\n\n```\nfunction shouldAttack(attacker, defender, difficulty):\n    atk = attacker.strength × difficulty.combatEffectiveness\n    def = defender.strength × terrainBonus × fortifyBonus × garrisonBonus\n    winProb = clamp(0.5 + (atk - def) × 0.1, 0, 1)\n    return winProb ≥ attacker.riskTolerance\n```\n\nMovement shares a context across all units so two units never plan into the same tile (no accidental stacking). 
It uses strategic pathing with an A* fallback, plus **anti-oscillation** rules — it won't step back onto a tile it occupied in the last couple of turns unless it's hurt or there's an enemy adjacent — which kills the classic \"AI unit jitters back and forth forever\" bug.\n\nRetreat pulls units below an HP threshold (50% on Easy, down to 20% on Deity) or when outnumbered 2:1 nearby — but garrisons never retreat, assault-committed units only break below 15%, and loaded transports never run. Commitment is respected.\n\nThe City Defence Commander automates each threatened city's garrison through its own little state machine — `reinforcing → defending → critical → secure` — tracking the local force balance and issuing movement orders to defenders. Cities defend themselves intelligently without the strategic layer micromanaging every hex.\n\nNone of this multi-turn coherence works without persistence. The AI's state object is serialised between turns and carries, among other things:\n\n`counterattack` knows what to retake\n\nThat last point drives the AI's reactivity. Each turn it diffs the current world against last turn's snapshot to spot captured or lost cities, fresh war declarations, lost wonders, completed techs, detected nukes, and pillaged tiles. Any of these can fire an interrupt that pre-empts the current goal — lose a city and the AI drops what it was doing to respond; lose your capital and `counterattack` jumps the stack.\n\n```\nfunction detectEvents(aiState, world):\n    prev = aiState.snapshot\n    for change in diff(prev, world):\n        if change is CITY_LOST:        raise Interrupt(counterattack, change.city)\n        if change is WAR_DECLARED:     raise Interrupt(defensiveWar, change.by)\n        if change is NUKE_DETECTED:    raise Interrupt(recovery, change.where)\n        ...                            
# wonders lost, tiles pillaged, techs done\n```\n\nDifficulty in Annhexation is partly *competence* and partly *bonus* — and the line between them is deliberate.\n\n| | Easy | Normal | Hard | Deity |\n|---|---|---|---|---|\n| Production / Research / Gold | 0.8× | 1.0× | 1.15× / 1.1× / 1.1× | 1.3× / 1.25× / 1.2× |\n| Combat phasing & focus fire | off | on | on | on |\n| Will retreat | no | yes | yes | yes |\n| Combat effectiveness | 0.95× | 1.0× | 1.08× | 1.15× |\n| Decision accuracy | ~60% | 100% | 100% | 100% |\n| Strategy re-evaluation | every 20 turns | 12 | 10 | 8 |\n\nSo an Easy AI isn't just weaker — it genuinely plays worse: it makes suboptimal choices more often, doesn't phase its combat, doesn't retreat damaged units, and reconsiders its strategy only sluggishly. A Deity AI plays the engine to its full ability and gets economic bonuses on top.\n\nThe higher difficulties also unlock a small, clearly-scoped set of adaptive cheats: a fog-of-war peek at rival posture, conditional production boosts while pursuing a goal, completion boosts on the home stretch of a wonder or spaceship, and an increased chance of coordinating a joint attack with another AI. These are bonuses with a purpose rather than omniscience.\n\nThe Annhexation AI deliberately trades short-term tactical perfection for long-term strategic coherence. Its unit movement is somewhat greedy; it will occasionally make a locally-suboptimal step. But it musters real armies, plans amphibious invasions across several turns, reads which neighbour is weak and accessible, holds a campaign together through a dozen turns of grinding siege, and reacts when you take one of its cities.\n\nThe architecture is what makes that possible: a sticky goal stack on top, multi-turn plans in the middle, flexible greedy execution at the bottom, and a persistent memory threading it all together — with personality and difficulty as multipliers reaching into every layer. 
The result is eight civilizations that *feel* different, four difficulty levels that genuinely play differently, and an opponent whose intentions you can usually see coming. Stopping them is the game.\n\nIt doesn't take long before you realise that working on the AI means analysing a lot of games and a lot of data. You need to see why it did something: as the AI grows in complexity you'll find, as I did, that you end up with units sitting idle, units oscillating between two positions, hopeless attacks, and settlers refusing to found cities. Any of these can be triggered by the many situations that emerge from the complex set of rules the AI follows and the way events develop on the map.\n\nSo you need instrumentation, a way to interrogate it, and a way to play more games than you humanly can. At least as a solo developer!\n\nA big chunk of the work turned out not to be the AI itself but building tools to let me use it and interrogate it.\n\nPlaying the game by hand to test the AI is hopeless — turns are slow, and you need hundreds of them across many games to spot patterns. So there's a command-line testbed that runs all-AI games with no rendering and no human in the loop:\n\n```\ntestbed new   --map continent --difficulty deity --players 6   # create an all-AI game\ntestbed run   <gameId> --turns 250 --snapshot-every 10         # advance it, headless\ntestbed inspect <gameId>                                        # one-shot state summary\ntestbed list                                                    # all games + winners\n```\n\n`run` advances a game by N turns as fast as the machine will go, printing per-turn progress and bailing early if someone wins. `inspect` dumps a per-player table — civ, city count, unit count, gold, current research, alive or dead — and `list` shows every game in the diagnostics directory with its current turn and winner. 
This is what turns \"I think the Mongolian AI rushes too hard\" into \"I ran forty games and Mongolia wins by turn 90 in thirty of them\" — the difference between a hunch and a regression test. Everything is stored in a per-game directory (`state.json`, `ai-states.json`, a `run.log` of notable events like cities founded and wars declared) ready for inspection.\n\nThe CLI is great for volume but blind to *space* — it can't show you that the army is stuck because a single enemy scout is sitting on the only bridge. For that I run all-AI games inside the actual client. When a game has no human player the normal \"End Turn\" button is replaced by a testbed panel: buttons to advance 1, 5, 10, 20, 50 or 250 turns, and a \"view as\" dropdown that swaps the map's fog-of-war filter so you can watch the game unfold from any AI's perspective.\n\nLayered on top of that is an AI inspector that lets you select any AI unit or city and it surfaces the internal state that the JSON logs hold, but anchored to what you're looking at on the map:\n\n- … (`militaryPush vs player_2 → city_42`, `scienceVictory: 4/4 parts, 5 techs left`)\n- … (`gathering → besieging → assault`), unit fill (`5/8 units, siege needed`) and rally point\n\nUnderneath both of those is the thing I lean on most: every AI writes a complete, structured record of its reasoning every single turn. Point an environment variable at a directory and each turn produces a pretty-printed JSON file per AI player — `turn-014-mongolia.json` and a companion full-state `ai-state-014-mongolia.json`.\n\nThese aren't log lines; they're a forensic snapshot of the entire decision. 
A single turn file captures the goal stack with its scores, the posture and opportunity score it assigned every rival, every city's production and classification, every unit's assignment (role, target, commitment, position, HP), the active attack plans — and, crucially, a command trace: an ordered list of every command the AI issued that turn, tagged with the phase that issued it, and `success: true` or a `blocked` reason straight from the engine. So when a move silently does nothing, the log tells you the engine rejected it and why.\n\nThere are dedicated traces for the gnarly subsystems too: a combat trace of every simulated fight, a naval lifecycle narrative for debugging amphibious invasions (the single most fiddly thing in the whole AI), and a `citySiteDecisions` list recording every settle attempt and its outcome — `accepted`, `too-close-to-foreign-city`, `food-tiles-short`, `on-foreign-landmass-blocked`. That last one is the cure for the maddening \"why won't this settler settle?\" bug: the answer is right there in the file. 
Here's a heavily trimmed example of the JSON from a single turn:\n\n```\n{\n  \"turn\": 18, \"playerId\": \"player_4\", \"civilisation\": \"greece\",\n  \"doctrine\": \"STRATEGIST\", \"difficulty\": \"hard\",\n\n  \"goals\": [\n    { \"type\": \"earlyExpand\", \"priority\": 0, \"status\": \"active\", \"createdOnTurn\": 11,\n      \"targetCityCount\": 4, \"settlerCount\": 0,\n      \"bestSites\": [\n        { \"q\": 23, \"r\": 20, \"totalScore\": 111.4, \"penalties\": 0 },\n        { \"q\": 25, \"r\": 19, \"totalScore\": 109.6, \"penalties\": 0 }\n        /* … 277 more, descending … */\n      ] },\n    { \"type\": \"infrastructureConsolidation\", \"priority\": 1, \"status\": \"active\" },\n    { \"type\": \"warPreparation\", \"priority\": 2, \"status\": \"active\",\n      \"targetPlayerId\": \"player_1\", \"targetForceSize\": 4, \"currentForceSize\": 3 }\n  ],\n\n  \"postures\": {\n    \"player_2\": { \"militarisation\": 0.69, \"isRunawayLeader\": true, \"borderTension\": 0.27 }\n  },\n\n  \"cities\": [\n    { \"name\": \"Athens\", \"population\": 2, \"production\": \"library\", \"classification\": \"border\" }\n  ],\n\n  \"commandTrace\": [\n    { \"step\": \"10\", \"command\": \"moveUnit\", \"unitId\": \"unit_14\", \"role\": \"worker\",\n      \"from\": \"25,23\", \"to\": \"26,23\", \"success\": true },\n    { \"step\": \"10\", \"command\": \"buildImprovement\", \"unitId\": \"unit_14\", \"success\": true },\n    { \"step\": \"16\", \"command\": \"endTurn\", \"success\": true }\n  ]\n}\n```\n\nThe workflow ties together neatly. Run a few hundred turns headless with the CLI; spot a game that went wrong in the `list` output; either replay it in the browser with the `F3` inspector or crack open the turn-N JSON and read, in order, exactly what the AI was thinking and what the engine let it do. 
Most of the \"the AI is being dumb\" moments turn out to be one specific, fixable thing — and these tools are how you find it instead of guessing.\n\nCreating an AI for a 4X is definitely quite an undertaking. It's pretty easy to get units moving around, but getting the AI to act in ways that are both interesting and credible takes a lot of effort. It's not that the code is complicated but that there is so much interacting that small changes can produce hard-to-predict second- and third-order effects.\n\nI spent countless hours on things that seem simple on the face of it, such as \"stop a unit from oscillating between A and B\", but turn out to be really rather complex. Yes, you can put in \"don't do this\" guards, but the guards themselves can have unforeseen effects and don't fix root problems.\n\nYou also can't automate all this away. Yes, you can create test cases, and yes, you can have the AI play countless games against itself, but an AI isn't a human and it's the human the AI has to respond interestingly to.\n\nI've released Annhexation into early access now and the primary reason for that is the AI. 
I need more people to play it and then resolve the things that inevitably will emerge.\n\n[If you'd like to give it a go you can play it online, for free, now.](https://annhexation.com)", "url": "https://wpnews.pro/news/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works", "canonical_source": "https://dev.to/jamesrandall/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works-p1g", "published_at": "2026-05-30 08:15:34+00:00", "updated_at": "2026-05-30 08:41:21.049812+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-research", "ai-products"], "entities": ["Annhexation"], "alternates": {"html": "https://wpnews.pro/news/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works", "markdown": "https://wpnews.pro/news/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works.md", "text": "https://wpnews.pro/news/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works.txt", "jsonld": "https://wpnews.pro/news/teaching-a-computer-to-play-4x-how-the-annhexation-ai-works.jsonld"}}