Teaching a Computer to Play 4X: How the Annhexation AI Works

A developer has built a layered AI system for the 4X strategy game Annhexation that separates strategy, planning, and execution into three distinct layers. The AI uses a prioritized goal stack that evaluates and scores potential objectives (from early expansion and military pushes to wonder racing and nuclear first strikes) against personality weights and world factors each turn. This decoupled architecture ensures the computer opponent maintains coherent long-term strategies, such as holding a military campaign goal for twenty-plus turns, rather than making twitchy turn-by-turn decisions.

Building a believable computer opponent for a 4X strategy game is one of those problems that turns out to be bottomless. I'd use the cliché that it looks simple from the outside, but I don't think that's true: I thought this would be a tough nut from the outset. I've built a chess-playing engine before, and getting a strong opponent there was far simpler, though it helps that chess is such a well-understood and well-documented problem.

The player wants an opponent that explores, expands, exploits and exterminates with apparent intent: one that musters an army over several turns, marches it across a continent, lands it on your shore and takes your city, all while you watched it coming and couldn't quite stop it. They do not want an opponent that teleports units, reads your mind, or sits inert in its starting cities until you wander into range.

This post is a tour through the Annhexation AI (https://annhexation.com), explaining how it makes decisions, what it remembers between turns, and how the same core machinery produces eight distinct civilizations and four difficulty levels. Annhexation isn't open source, so rather than quote the implementation I'll describe the design and illustrate the interesting bits with pseudocode.
I should note that the AI is still under development, but after a lot of bashing with a hammer it's feeling in a pretty decent place.

The single most important design decision in the Annhexation AI is that strategy, planning and execution are decoupled. The three layers are separated on purpose, and an AI turn flows through them in order: strategy, then planning, then execution.

The payoff of this separation is, hopefully, coherence over time. A greedy turn-by-turn AI looks twitchy: it builds an army, gets distracted, disbands it, builds another. By contrast, an Annhexation AI that adopts a `militaryPush` goal will hold that goal for twenty-plus turns, funnelling production, research and unit movement toward a single objective until the city falls, the campaign demonstrably fails, or something seismic interrupts it. Strategy should be sticky while execution is flexible.

A complete turn runs as an ordered sequence of discrete phases, from threat assessment and diplomacy through combat, movement, production and fortification:

    function runTurn(player, world, aiState):
        detectEvents(aiState, world)                                  // diff against last turn → fire interrupts
        aiState.goals = evaluateStrategy(player, world, aiState)
        plans = buildOperationalPlans(aiState.goals, player, world)
        executeTactics(plans, player, world)                          // the phase sequence (see below)
        aiState.snapshot = snapshot(world)                            // remember this turn for next time
        return aiState

At the heart of the strategic layer is a prioritized goal stack.
Each turn the AI either keeps its current goals or re-evaluates them, and the menu of things it can want is rich:

- `earlyExpand`: plant N cities before consolidating
- `earlyRush`: exploit the opening with an aggressive early attack
- `infrastructureConsolidation`: buildings, population, growth
- `militaryPush`: sustained warfare against a chosen player
- `defensiveWar` / `counterattack`: react to aggression, retake what was lost
- `navalInvasion`: assault a distant landmass
- `wonderRace`, `scienceVictoryPush`, `scoreOptimisation`: the peaceful victory paths
- `raidWar`, `asymmetricWar`: economic harassment instead of conquest
- `warPreparation`, `nuclearFirstStrike`, `recovery`: the situational specials

Goals don't fire on rigid rules; rather, they're scored against each other and the highest-utility ones win. The scoring blends several signals about the world (in the pseudocode below: proximity, force balance, catch-up pressure and opportunity). Every score is then multiplied by a personality weight. Roughly:

    function scoreGoals(player, world, personality):
        scores = {}
        for goal in CANDIDATE_GOALS:
            base = goal.baseValue(player, world)
            world_factors = proximity × forceBalance × catchUp × opportunity
            scores[goal] = base × world_factors × personality.weightFor(goal)
        return sortDescending(scores)

    // e.g. early-expand ≈ base × (siteRatio × proximityAdj × catchUp) × personality.expansion

Two of those terms are about the world; one is about who this civ is. That's how the same evaluation function produces a cautious turtle and a rampaging horde.

The top goal (priority 0) drives the turn. Secondary goals queue behind it, ready to take over the moment an interrupt fires.

A 4X AI that only looks at its own empire plays in a vacuum. Annhexation's AI explicitly models every player it has met before deciding who to fight.
The AI profiles each known rival across roughly eleven dimensions, each normalised to [0, 1]:

- `militarisation`, `development`, `expansionism`, `techPace`
- `exposure` and `coastalExposure` (undefended or weakly-garrisoned cities)
- `borderTension` and `aggression` (forces massed nearby)
- `wonderFocus`, `scienceFocus`, and the all-important `isRunawayLeader` flag

It also tracks trends (rising, flat or falling over the last five turns), so the AI reacts to a rival who is accelerating, not just one who is currently strong. Those snapshots are kept in persistent state so trend detection survives across turns.

A second pass turns those profiles into a war-target ranking. For each rival it weighs affinity, winnability, reachability and how distracted the rival is:

    function scoreWarTargets(rivals, me, personality):
        for r in rivals:
            affinity   = personality.aggression × r.borderTension
            winnable   = clamp(myStrength / r.militarisation)
            reachable  = 1 / (1 + travelCost(me, r))
            distracted = r.aggression_elsewhere
            r.score = affinity × winnable × reachable × (1 + distracted)
        return sortDescending(rivals)

The winner of that scoring becomes the target of a `militaryPush`, and the magnitude feeds back as an opportunity multiplier into goal evaluation. An exposed, accessible, distracted neighbour is a temptation the AI is built to notice and exploit.

Personality in Annhexation isn't a single "aggression" slider. It's a vector of about twenty weights (military production, attack appetite, expansion, wonder-building, research, naval production, raid preference, plus early-game tuning like second-city urgency and first-build preference).
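To make the "same function, different weights" idea concrete, here is a minimal Python sketch of goal scoring driven by a personality vector. It is not the game's code: the `Personality` class, the `weight_for` helper, the two example personalities and every number in it are made up for illustration, and it only covers three of the goals.

```python
from dataclasses import dataclass, field


@dataclass
class Personality:
    # Hypothetical subset of the ~20 weights; a missing weight counts as neutral (1.0).
    weights: dict[str, float] = field(default_factory=dict)

    def weight_for(self, goal: str) -> float:
        return self.weights.get(goal, 1.0)


def score_goals(base_values, world_factors, personality):
    """Return (goal, score) pairs, best first: base × world factor × personality weight."""
    scores = {
        goal: base * world_factors.get(goal, 1.0) * personality.weight_for(goal)
        for goal, base in base_values.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Identical world state for both civs (illustrative numbers only).
base_values = {"earlyExpand": 10.0, "militaryPush": 10.0, "wonderRace": 10.0}
world_factors = {"earlyExpand": 1.2, "militaryPush": 0.9, "wonderRace": 1.0}

turtle = Personality({"militaryPush": 0.8, "wonderRace": 1.5})
horde = Personality({"militaryPush": 1.5, "wonderRace": 0.7})

print(score_goals(base_values, world_factors, turtle)[0][0])  # wonderRace
print(score_goals(base_values, world_factors, horde)[0][0])   # militaryPush
```

Both civs see exactly the same world; only the weights differ, and that alone is enough to flip the top goal.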
On top of that sits the doctrine system: eight civ-specific playbooks that override those weights and the AI's unit-composition preferences.

| Civ | Doctrine | Signature |
|---|---|---|
| Mongolia | HORSE RUSH | +50% military production, +50% attack, double raid preference, cavalry-heavy armies |
| Aztecs | WARRIOR RUSH | +40% military & attack, −20% expansion, melee-heavy early aggression |
| Russia | EXPAND WIDE | +40% expansion, +30% garrison commitment |
| Rome | INFRA FIRST | +40% infrastructure, +30% expansion |
| France | WAR FOR SCIENCE | +40% research, +30% science-victory focus |
| Greece | STRATEGIST | balanced militarisation across all domains |
| Egypt | TURTLE WONDERS | +50% wonders & culture, −20% military |
| England | COASTAL ONLY | +40% naval, +50% coastal-site preference, harbour priority |

Because the doctrine only modulates shared machinery, Egypt and Mongolia run the identical goal-evaluation and combat code; they simply weight it toward completely different ends. Mongolia drowns you in cavalry; Egypt hides behind wonders and culture; England fights for the coastline. Combined with unique per-civ units, this gives each civ a distinctive personality.

Once a goal is chosen, the operational layer turns intent into concrete plans.

Unit quotas compute empire-wide demand for each unit class (settlers, workers, garrison, field army, reserve, naval, raiders), each scaled by goals, threat levels, personality and difficulty. During a `militaryPush` against a walled city, for instance, the garrison quota rises with threat level, melee demand jumps, and siege units become mandatory: you cannot crack walls without them, and the AI knows it.

Unit composition picks the melee/ranged/siege/mounted ratio for an army. Against an unwalled city it loads up on ranged units (free damage); against walls it must bring siege.
Doctrine tilts the mix, and resource gating caps it: no horses means no cavalry, no iron means no siege, full stop.

    function targetComposition(target, doctrine, resources):
        if target.walled:
            mix = {melee: 0.4, siege: 0.4, ranged: 0.2}
        else:
            mix = {melee: 0.4, ranged: 0.5, mounted: 0.1}
        mix = applyDoctrineBias(mix, doctrine)      // HORSE RUSH → more mounted, etc.
        if not resources.horses: mix.mounted = 0
        if not resources.iron:   mix.siege = 0
        return normalise(mix)

Attack plans are first-class, multi-turn objects with an explicit lifecycle:

    mustering → gathering → advancing → besieging → assaulting
        ↘ (naval) awaitingTransport → embarking → sailing → landing ↗

Target selection scores enemy cities by proximity (−5 per hex of distance), with bonuses for being unwalled (+15), being a capital (+10), and sitting near iron or horses the AI needs (a big multiplier gated on personality and urgency). It goes for the weakest reachable target first, and it commits.

City production is a distributed priority queue: high-output cities feed global military needs first, low-output cities backfill settlers and workers. The priority cascade runs upgrades → settlers → garrison → military → naval → workers/roads → buildings → wonders, gated by the active goal.

Research follows the goal: an expanding AI beelines the wheel and animal husbandry, a science-victory AI walks a hardcoded path toward rocketry, and a warring AI weights military techs. It searches the prerequisite tree but abandons paths longer than three techs. No hundred-turn detours, in theory.

Worker management plans and caches road routes between cities and strategic resources, invalidating them when borders flip.

Bottleneck detection explicitly diagnoses why military modernisation is stalled (waiting on a tech, lacking road access to iron, missing currency for trade) and escalates urgency the longer the bottleneck persists.

When the planning is done, the AI executes the turn as an ordered sequence of phases.
Roughly:

1. Event detection & city-loss response (compare against last turn's snapshot)
2. Emergency garrison fill (enemy standing on a city tile)
3. Unit upgrades & recalls
4. Retreats (pull damaged units that aren't committed)
5. Combat (city defence first, then general)
6. Naval invasion lifecycle (drive the beachhead state machines)
7. Settler escorts & transport convergence
8. Army movement (via the movement planner)
9. Build orders (worker tasks, roads)
10. Diplomacy (trade, war declarations)
11. City Defence Commander (per-city garrison assignment)
12. Government & tech completion
13. Fortification & hidden-unit setup

A few pieces deserve a closer look.

Combat decisions come down to an estimated win probability checked against the attacker's risk tolerance:

    function shouldAttack(attacker, defender, difficulty):
        atk = attacker.strength × difficulty.combatEffectiveness
        def = defender.strength × terrainBonus × fortifyBonus × garrisonBonus
        winProb = clamp(0.5 + (atk - def) × 0.1, 0, 1)
        return winProb ≥ attacker.riskTolerance

Movement shares a context across all units so two units never plan into the same tile (no accidental stacking). It uses strategic pathing with an A* fallback, plus anti-oscillation rules: a unit won't step back onto a tile it occupied in the last couple of turns unless it's hurt or there's an enemy adjacent. That kills the classic "AI unit jitters back and forth forever" bug.

Retreat pulls units below an HP threshold (50% on Easy, down to 20% on Deity) or when outnumbered 2:1 nearby. But garrisons never retreat, assault-committed units only break below 15%, and loaded transports never run. Commitment is respected.

The City Defence Commander automates each threatened city's garrison through its own little state machine (reinforcing → defending → critical → secure), tracking the local force balance and issuing movement orders to defenders. Cities defend themselves intelligently without the strategic layer micromanaging every hex.

None of this multi-turn coherence works without persistence.
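To make "persistence" concrete, here is a minimal Python sketch of an AI state object surviving a save/load round trip. Everything in it is an assumption for illustration: the storage format (JSON), the `AIState` field names, and the example values are mine, chosen to mirror things mentioned earlier in the post (the goal stack, multi-turn attack plans, rival snapshots for trends, last turn's world snapshot).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AIState:
    # All field names are hypothetical stand-ins, not the game's real schema.
    goals: list = field(default_factory=list)          # goal stack, index 0 drives the turn
    attack_plans: list = field(default_factory=list)   # multi-turn plans and their lifecycle stage
    rival_history: dict = field(default_factory=dict)  # per-rival samples used for trend detection
    snapshot: dict = field(default_factory=dict)       # last turn's world, used for diffing


def save_state(state: AIState) -> str:
    """Serialise the state to a JSON string (assumed format)."""
    return json.dumps(asdict(state), sort_keys=True)


def load_state(blob: str) -> AIState:
    """Rebuild the state from a JSON string produced by save_state."""
    return AIState(**json.loads(blob))


state = AIState(
    goals=["militaryPush", "infrastructureConsolidation"],
    attack_plans=[{"target": "city-7", "stage": "gathering"}],
    rival_history={"rival-2": [0.4, 0.5, 0.6]},
    snapshot={"cities": ["city-1", "city-3"]},
)

restored = load_state(save_state(state))
print(restored == state)  # True
```

The point of the round trip is that a plan in the `gathering` stage on one turn is still a plan in the `gathering` stage on the next, rather than something the AI has to rediscover from scratch.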
The AI's state object is serialised between turns and carries, among other things, last turn's world snapshot, so `counterattack` knows what to retake.

That last point drives the AI's reactivity. Each turn it diffs the current world against last turn's snapshot to spot captured or lost cities, fresh war declarations, lost wonders, completed techs, detected nukes, and pillaged tiles. Any of these can fire an interrupt that pre-empts the current goal: lose a city and the AI drops what it was doing to respond; lose its capital and `counterattack` jumps the stack.

    function detectEvents(aiState, world):
        prev = aiState.snapshot
        for change in diff(prev, world):
            if change is CITY_LOST:     raise Interrupt(counterattack, change.city)
            if change is WAR_DECLARED:  raise Interrupt(defensiveWar, change.by)
            if change is NUKE_DETECTED: raise Interrupt(recovery, change.where)
            ...                         // wonders lost, tiles pillaged, techs done

Difficulty in Annhexation is partly competence and partly bonus, and the line between them is deliberate.

| | Easy | Normal | Hard | Deity |
|---|---|---|---|---|
| Production / Research / Gold | 0.8× | 1.0× | 1.15× / 1.1× / 1.1× | 1.3× / 1.25× / 1.2× |
| Combat phasing & focus fire | off | on | on | on |
| Will retreat | no | yes | yes | yes |
| Combat effectiveness | 0.95× | 1.0× | 1.08× | 1.15× |
| Decision accuracy | ~60% | 100% | 100% | 100% |
| Strategy re-evaluation | every 20 turns | 12 | 10 | 8 |

So an Easy AI isn't just weaker, it genuinely plays worse: it makes suboptimal choices more often, doesn't phase its combat, doesn't retreat damaged units, and reconsiders its strategy only sluggishly. A Deity AI plays the engine to its full ability and gets economic bonuses on top.

The higher difficulties also unlock a small, clearly-scoped set of adaptive cheats: a fog-of-war peek at rival posture, conditional production boosts while pursuing a goal, completion boosts on the home stretch of a wonder or spaceship, and an increased chance of coordinating a joint attack with another AI.
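As a rough illustration of how the difficulty table might be carried around in code, here is a Python sketch that encodes the rows as data and uses the re-evaluation interval to decide when the strategy gets reconsidered. The `Difficulty` class, its field names and the `should_reevaluate` helper are hypothetical. Two assumptions are baked in: the single 0.8× and 1.0× entries are applied to all three of production, research and gold, and "~60%" is stored as exactly 0.60.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difficulty:
    production: float
    research: float
    gold: float
    combat_phasing: bool
    will_retreat: bool
    combat_effectiveness: float
    decision_accuracy: float   # fraction of decisions made "correctly"
    reevaluate_every: int      # turns between strategy re-evaluations


# Values copied from the table; Easy/Normal use one multiplier for all three economy stats (assumption).
DIFFICULTIES = {
    "easy":   Difficulty(0.8, 0.8, 0.8, False, False, 0.95, 0.60, 20),
    "normal": Difficulty(1.0, 1.0, 1.0, True, True, 1.0, 1.0, 12),
    "hard":   Difficulty(1.15, 1.1, 1.1, True, True, 1.08, 1.0, 10),
    "deity":  Difficulty(1.3, 1.25, 1.2, True, True, 1.15, 1.0, 8),
}


def should_reevaluate(turn, last_evaluated_turn, difficulty, interrupted=False):
    """Re-evaluate when an interrupt fired or the difficulty's interval has elapsed."""
    return interrupted or (turn - last_evaluated_turn) >= difficulty.reevaluate_every


# Ten turns after the last evaluation: Deity reconsiders, Easy keeps plodding on.
print(should_reevaluate(30, 20, DIFFICULTIES["deity"]))  # True
print(should_reevaluate(30, 20, DIFFICULTIES["easy"]))   # False
```

Only the table values are modelled here; the adaptive cheats on the higher difficulties are left out.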
These cheats are bonuses with a purpose rather than omniscience.

The Annhexation AI deliberately trades short-term tactical perfection for long-term strategic coherence. Its unit movement is somewhat greedy; it will occasionally make a locally-suboptimal step. But it musters real armies, plans amphibious invasions across several turns, reads which neighbour is weak and accessible, holds a campaign together through a dozen turns of grinding siege, and reacts when you take one of its cities.

The architecture is what makes that possible: a sticky goal stack on top, multi-turn plans in the middle, flexible greedy execution at the bottom, and a persistent memory threading it all together, with personality and difficulty as multipliers reaching into every layer. The result is eight civilizations that feel different, four difficulty levels that genuinely play differently, and an opponent whose intentions you can usually see coming. Stopping them is the game.

It doesn't take long before you realise that working on the AI means analysing a lot of games and a lot of data. You need to see why it did something. As the AI grew in complexity I found I would end up with units sat idle, units oscillating between two positions, hopeless attacks, and settlers refusing to found cities. Any of these can be triggered by the interplay between the complex set of rules the AI follows and the situations that develop on the map. So you need instrumentation, a way to interrogate the AI, and a way to play more games than you humanly can, at least as a solo developer.

And so a big chunk of work turned out not to be the AI itself but building tools to let me use it and interrogate it. Playing the game by hand to test the AI is hopeless: turns are slow, and you need hundreds of them across many games to spot patterns.
So there's a command-line testbed that runs all-AI games with no rendering and no human in the loop:

    testbed new --map continent --difficulty deity --players 6    # create an all-AI game
    testbed run