{"slug": "games-between-programs-the-ruliology-of-competition", "title": "Games Between Programs: The Ruliology of Competition", "summary": "Stephen Wolfram has systematically analyzed all possible strategies in repeated two-player games using computational methods, applying ruliology to determine whether winning strategies tend toward complexity or simplicity. By modeling agents as programs that choose actions based on past sequences, Wolfram found that even with simple payoff structures like matching pennies, the most successful strategies often emerge from simple computational rules rather than complex ones. The research challenges conventional assumptions about competition, suggesting that winning may more frequently depend on finding simple hacks than on building elaborate strategic frameworks.", "body_md": "## The Basic Setup\n\nWhether one’s dealing with biology, economics, politics or a host of other fields, it’s common to encounter situations that can be modeled as involving two agents that repeatedly compete with each other. One imagines that at each step each agent can take one of a certain set of actions, and that then—in a classic [game theory](https://reference.wolfram.com/language/guide/GameTheory.html) way—each agent (or “player”) gets a certain fixed “payoff” based on the action they and their opponent take. But how do the agents decide what action to take? We imagine that each agent has a certain fixed procedure—or “strategy”—for making its decisions. And we imagine that the input to each of those decisions is the sequence of past actions that the agent and its opponent have taken.\n\nThere’s been lots of work done over the course of nearly a century on particular choices of strategies. But something [I’ve long been curious about](https://www.wolframscience.com/nks/notes-10-12--games-between-programs/) is what happens if one systematically considers all possible strategies. And if we think of strategies as programs this becomes a question to which we can immediately apply [ruliological methods](https://writings.stephenwolfram.com/2026/01/what-is-ruliology/). Which is what I’m going to do here.\n\nTo be more specific about the setup, let’s assume that at each step, each agent takes one of two possible actions, indicated by and . And for now let’s take the payoffs to be the ones for the classic [“match-or-not” (“matching pennies”)](https://www.wolframscience.com/nks/notes-10-12--game-theory/) game—in which player 1 has the bigger payoff when there’s a match, and player 2 has the bigger payoff when there isn’t a match:\n\nSo what happens when agents repeatedly play this game? Well, it depends on their strategies. Here are a few examples for several different choices of each agent’s strategy:\n\nPlotting the cumulative payoffs for the two agents (represented by and ) in each of these cases we get:\n\nOften we’ll consider the “winning agent” to be the one that has the numerically largest cumulative payoff (i.e. is eventually on top in these plots) after a certain number of steps. And with a criterion like this, we’ll be able to rank different programs against each other—and in general explore the ruliology of competition.\n\nWith the basic setup we’re using, we can represent all possible sequences of actions by a multiway graph:\n\nFor any given sequence of actions, there is then a cumulative payoff for each agent for our match-or-not game:\n\nIf each agent adopts a particular strategy, this will define a particular path through the multiway graph. For the strategies used in the examples above, the paths are:\n\nWhat does it take to have a winning strategy? In what follows, we’ll consider strategies based on several different types of programs. But one basic question we can always ask is whether what turn out to be the winning strategies tend to be based on programs that are more complicated, or less so—or to show behavior that is more complicated, or less so.\n\nIn other words, if you want to win, should you typically be trying to build up something complicated? Or should you instead expect to be able to find a “simple hack” that will “crack the game” and—at least usually—let you win? In effect, we’re asking whether competition tends to lead to complexity, or simplicity.\n\nI’ve recently looked at minimal models of both [biological evolution](https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/) and [machine learning](https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/), in which one is adaptively evolving programs in order to maximize some externally imposed fitness function. And what I’ve found is that even when the fitness function one uses is simple, the behavior of the programs that maximize it is normally quite complex. In other words, adaptive evolution will tend to make even a simple, fixed objective be achieved in a complicated way.\n\nSo what if instead of having a fixed, externally imposed objective, our goal is just broadly to win against other agents? Does such—potentially open-ended—competition lead us to more complex behavior (or more complex programs), or not? That’s the kind of question we’re going to be able to explore here by looking at the ruliology of competition.\n\n## Strategies from Finite State Machines\n\n[Finite state machines](https://www.wolframscience.com/nks/notes-6-7--finite-automata/) can be thought of as defining extremely simple programs (that might model pathways in biology, decision processes in economics, etc.). And to start our investigation of the ruliology of competition we’re going to look at strategies defined by finite state machines.\n\nA typical example of a finite state machine (here with 3 states) is:\n\nWe’re going to use this finite state machine to define a strategy for an agent. To see how this works, let’s say that the sequence of actions taken by the agent’s opponent have been:\n\nThe idea is to use this sequence of actions to define a path in the finite-state-machine graph, then to determine the next action from the color of the state reached. We start at the vertex with the incoming arrow, then successively follow the edge whose color matches the next move made by the opponent:\n\nAt the end of this process we’ll reach some vertex in the graph (i.e. some state in the finite state machine). In the particular case shown here, the state we reach is . And then we take the output of the strategy—i.e. the next action for the agent to take—to be .\n\nIt’s sometimes convenient to show the states of the finite state machine arranged on a line:\n\nAnd then we can summarize the path taken with a certain input by showing the successive states reached:\n\nSo what happens if two finite state machines compete? The basic idea is that the successive outputs from one machine become the successive inputs to the other, and vice versa. If our second machine is\n\nthen we can represent the behavior of the machines by:\n\nIf the payoffs we use are for the match-or-not game, then their cumulative values for these machines are\n\nso that in the end agent 2 can be considered the winner.\n\nIt’s important to note here that in the setup we’re using, everything is deterministic: at every step, each agent takes an action that is deterministically computed using its strategy from the past history of moves. It’s a different setup from what’s most often studied in game theory, where each move is in effect considered independently, but where there can be probabilities for different actions (“mixed strategies”)—and where in the end averaging is done over “different possible rolls of the dice”.\n\n## The Space of Possible Finite State Machines\n\nThe number of possible graphs for finite state machines with *s* states is (2 *s*2) s. But some of those graphs correspond to machines with identical behavior—so that the number of distinct machines is smaller:\n\n### 2-State Machines\n\nIn the 2-state case, the 22 distinct machines are\n\nwhere we’ve identified each machine by a number.\n\nSo what happens if pairs of these machines compete? Here are a few examples, where in each case we’re identifying the average payoff (here for 10 rounds of the match-or-not game):\n\n(In all competitions between pairs of finite state machines, the sequence of moves ultimately has to become periodic—with a period equal at most to the product of the number of states in each machine.)\n\nWhat happens if each of the 22 distinct 2-state machines competes against each of the other ones? We can summarize the results by showing the mean (long-term) payoff for every pair of machines (the payoff is for each machine “playing as agent 1”; in match-or-not, the payoff is negated if “playing as agent 2”):\n\nSo what machine is the “overall winner”? One way to assess this is to look at the average of the mean payoffs achieved by a given machine when competing with all other (distinct) machines:\n\nThe winner by this measure is then machine 26:\n\nRunning this machine against all (distinct) 2-state machines we get the following mean payoffs:\n\nThe actual behavior in each case—which doesn’t itself depend on the payoffs, only on the machines involved—is:\n\nWhat are the “runners-up” to the winning machine? Here are all the distinct machines, ranked by their mean payoffs:\n\nHere’s what happens if we play the top 3 runners-up against all machines:\n\nWe can summarize how a machine behaves by showing the history of its behavior when playing against all other machines (or, in effect, by putting together the first columns in pictures like the ones above). Here are the results for all the machines (for 15 steps), ordered from highest average score down:\n\n(Once again, these pictures are completely determined just from the machines involved; the payoffs in the match-or-not game determine only their ordering.)\n\nOne footnote to what we’ve been saying here has to do with how many steps of competition we are getting the machines to do. For all finite-state machines, the behavior must eventually become periodic—and for 2-state machines the maximum period is 4 steps, with a maximum transient of 3 steps. But the actual average mean payoffs vary with the total number of steps one considers:\n\nIt’s notable that at the least for the first few steps, the rankings move around:\n\nBut in this case it doesn’t take too many steps for the ultimate winner to be clear (later on we’ll see examples where it takes much longer).\n\n(There are other subtleties as well. One of them is that we are computing average payoffs by playing every machine against every other distinct machine. In principle we could also include other equivalent machines—which would slightly change the weighting of our averages. But since we’re really concerned with strategies, not machines as such, the scheme we’re using seems more appropriate.)\n\n### 3-State Machines\n\nFor the 956 distinct machines with *s* = 3 states, the corresponding “competitive array” (after 1000 steps) is:\n\nThe average mean payoff for each of the machines (i.e. the average across each row in the “competitive array”) is then\n\nwhile the distribution of these average mean payoffs is:\n\nThe top few machines for the match-or-not game are then:\n\nRunning the top machine (*s *= 3 machine 1164) against all (distinct) 3-state machines we get the following mean payoffs:\n\nThe distribution of possible limiting mean payoffs here is:\n\nAnd the most common forms of behavior seen are:\n\nThe maximum possible period for a competition between two 3-state machines is 9. Machine 1164 never quite achieves this; its maximum period of 7 occurs when competing with machines 2546 and 2755 (both giving limiting mean payoff –1):\n\nIf one looks at all possible pairs of 3-state machines, there turn out to be 792 that yield period-9 behavior, examples being:\n\n(These have no transients; the maximum transient for 3-state machines turns out to be 8.)\n\n### An Aside: What Do We Mean by “Average”?\n\nWe’ve talked about how a machine does “on average” when competing with all other (distinct) machines. But what do we mean by “on average”? So far, we’ve taken the “average” to be the mean of the payoffs obtained by competing with each other machine (and the payoffs here are themselves means across successive steps). But what if we use the median instead of the mean? Here are the median payoffs from running each machine for 1000 steps against all other machines:\n\nThe standout winning machine here is machine 1172:\n\nThe mean payoffs and their distributions in this case are:\n\nAnd the median is “anomalously high” because with this machine exactly 1/2 of all mean payoffs are +1. (The corresponding mean is pulled down by the “left tail” in the distribution of mean payoffs.)\n\n## The Complexity of Winning\n\nLet’s look (basically as above) at the actual behavior of each of the distinct 2-state finite state machines when competing against all other 2-state machines, ordered from smallest average mean payoff to largest:\n\nThe cases with 0 average mean payoff look simple in their behavior. But for other average mean payoffs, the behavior of a given machine competing against all others seems more complicated.\n\nWe can get some sense of this complexity by [looking at the compressed size](https://writings.stephenwolfram.com/2025/11/whats-special-about-life-bulk-orchestration-and-the-rulial-ensemble-in-biology-and-beyond/#measuring-mechanoidal-behavior) (as obtained from ` Compress`) of the array of behavior shown above:\n\nHere’s the corresponding result for the 956 distinct 3-state machines—showing no strong correlation between average mean payoff and our estimate of the complexity of behavior:\n\nAnd indeed among machines with the highest average mean payoffs there is still quite a diversity of levels of complexity in behavior\n\nwith the “behavior traces” of the machines indicated being\n\nand\n\nIn other words, at least in this case, we really can’t say that winning machines are characterized either by being particularly complex in their behavior, or particularly simple. It seems that it’s detailed structure, rather than overall features, that determines what machines will win.\n\n## Competitions between Machines of Different Sizes\n\nCan finite state machines with more states systematically do better (i.e. achieve larger payoffs) than ones with fewer states? The best average mean payoff any 2-state machine can achieve when competing with all other 2-state machines is about 0.151. But if, for example, we consider 3-state machines competing (for 1000 rounds) against 2-state machines, the best average mean payoff is instead 0.593:\n\nLooking at the distribution of possible average mean payoffs, we see that the distribution of average mean payoffs is wider for 3-state machines than for 2-state ones—a fact that is at least partly just a consequence of there being many more possible 3-state machines than 2-state ones:\n\nBut something that’s notable is that the very broadest distribution is for 3-state machines competing against 2-state ones: in effect it seems that with their larger collection of possible strategies, the 3-state machines can do better at “outmaneuvering” the 2-state ones.\n\nThe 3-state machine that does the best overall against 2-state machines is machine 1234:\n\nIt doesn’t always definitively win (with mean payoff +1), but does so the majority of the time:\n\nHow does it achieve this? Basically, for lots of different 2-state machines, this particular 3-state machine manages to behave just as they do:\n\nIn some sense, there are facets of the 3-state machine that “resonate” with many 2-state ones:\n\nHow about 4-state machines? The 4-state machine that does best overall against 2-state machines is machine 109828:\n\nOut of the 22 2-state machines, it only gets less than payoff +1 in 6 cases:\n\nHere’s the behavior for all 22 cases:\n\nAnd once again we can think of the 4-state machine as successfully “covering” most of the 2-state behaviors:\n\n## Adaptive Evolution of Finite State Machines\n\nIn many practical situations where there’s competition, there’s a way for the agents that are competing to evolve. So can we make a minimal model of this using finite state machines?\n\nIn what we’ve done so far, we’ve always been looking at a space of all possible finite state machines. But what about sequences of machines found by adaptive evolution? Is there, for example, a way to adaptively evolve machines to do progressively better in competitions?\n\nThe first step in doing this is to see how we might make successive mutations to finite state machines. A simple approach is to say that any given mutation can affect either a random vertex or a random edge in the graph of a machine. For a vertex, the mutation just reverses its color. For an edge, it either reverses the color, or “reroutes” the edge to a different vertex (with the constraint that doing so doesn’t disconnect the graph). Applying a sequence of such mutations at random gives for example\n\nor, with a different graph rendering:\n\n(Note that we’re mutating machines in whatever form we find them; we’re not worrying about equivalences between machines, or the canonicalization of machines.)\n\nImagine we have an opponent machine—like 3-state machine 1165—that usually forces a lose, i.e. limiting payoff –1 (for example about half the time when competing with other 3-state machines):\n\nNow we can ask whether we can adaptively evolve a machine that will win against this opponent. In order to give our adaptive evolution process some “room to maneuver” we’ll use a 4-state machine. We can start with a random such machine, say\n\nwhich “loses” (always having payoff –1) against machine 1165:\n\nTo do adaptive evolution, we now make successive random mutations to this machine, “accepting” a mutation if it doesn’t decrease the mean payoff, and otherwise rejecting it. The result is a [typical “fitness curve”](https://writings.stephenwolfram.com/2024/05/why-does-biological-evolution-work-a-minimal-model-for-biological-evolution-and-other-adaptive-processes/#the-model) in which most mutations (indicated by red dots) don’t lead to improvement in the payoff—but there are some that lead to “breakthroughs” where the payoff increases (sometimes only by a small amount), with the payoff eventually reaching the maximum value of +1:\n\nThe various “breakthroughs” progressively converge on a “perfect solution” with payoff +1:\n\nConcatenating the successive results over the course of the adaptive evolution process, we can see the eventual convergence to the perfect solution where the actions of the two agents always match:\n\nWith different random mutations, the “fitness curve” will be different in detail, though will have the same general form. And the same is true with different specific opponents.\n\nBy the way, using our way of numbering finite state machines, we can make a plot of how the process of adaptive evolution “moves the machine around in rule space”:\n\nBut what happens if we do as we have done above, and ask about the mean payoff averaged over all possible finite-state-machine opponents of a given size? For example, how well can 4-state machines do against all possible 2-state machines?\n\nStarting with the same random 4-state machine as before, a typical fitness curve is:\n\nThe fitness here increases, but never reaches +1. The behavior of successive “breakthrough” machines playing against all size-2 machines is:\n\nAnd we can see that even the best machine we get still loses to some of the 2-state machines, yielding in the end an average mean payoff of about 0.62.\n\nSo what happens if we look at machines that have more states? With 10 states, for example, it is possible to adaptively evolve to a machine that achieves limiting payoff +1 against every single 2-state machine:\n\nThe final machine obtained in this case\n\ncan be thought of as a kind of (2-state) “universal winner”—that ultimately wins against all 2-state machines:\n\nHow does it do it? In some sense the machine is big enough that it can have different “specialized parts” for different opponents. And if we look at how the machine behaves we indeed see that with different opponents the machine settles into different subsets of its complete space of states:\n\nAnd even if we consider all 956 3-state machines as opponents, our machine continues to do well. It doesn’t win in all cases, but it still achieves an average mean payoff of +0.603:\n\nSome examples where the machine doesn’t win—in effect because it doesn’t contain as a submachine something to deal with a particular opponent—include:\n\nSo far we’ve considered the adaptive evolution of a single machine competing either against a single fixed opponent, or against a collection of fixed opponents. But what if both the machine and its opponent are undergoing adaptive evolution?\n\nFor example, let’s say that on alternating adaptive evolution steps we do a mutation on a machine and on its opponent. We keep the mutation for each machine if the (mean) payoff for that machine does not decrease; otherwise we reject it.\n\nWith this setup, here’s the evolution of mean payoffs for two (initially identical) 4-state machines:\n\nThere are periods where one machine wins, and periods where its opponent wins—as visible in the actual successive behaviors of the machines:\n\nThe actual machines found by adaptive evolution move around in rule space—soon losing memory of what they initially were:\n\nNot much changes if the number of states in the machines change, or aren’t the same—though there is typically less alternation of winners for machines with more states, presumably because each individual mutation tends to have less effect on behavior if there are more states.\n\n## What About Prisoner’s Dilemma?\n\nEverything we’ve done so far has been based on the particularly simple game of match-or-not (“matching pennies”). So what happens with other games? And in particular with the famous “prisoner’s dilemma” game? Here are the payoffs for this game\n\nwhere in the usual narrative for the game one interprets as “defect” and as “cooperate”.\n\nJust as above, we can imagine defining strategies for the prisoner’s dilemma game based on finite state machines. Here are a few examples of iterated games between 2-state machines—now with payoffs determined by the prisoner’s dilemma game:\n\nIn the case of match-or-not, it was visually easy to tell whether a particular payoff was ±1 or 0 just by seeing whether the actions of the agents matched at a particular step. Here it’s not quite so visually obvious.\n\nBut using the payoffs for the prisoner’s dilemma game we can compute the cumulative payoffs for these examples (and, unlike in match-or-not, which is a zero-sum game, the payoffs for the two agents don’t sum to zero at each step):\n\nMuch as we did before, we can now consider competitions between agents whose strategies are based on all possible 2-state finite state machines (for match-or-not the zero-sum nature of the game makes the resulting array of payoffs symmetrical; here there’s symmetry only from the fact that the payoffs remain the same if one interchanges the roles of agent 1 and agent 2):\n\nWith this setup, we can now ask what machine is the “overall winner”—say in the sense that it has the largest average mean payoff playing against all other (distinct) 2-state machines:\n\nThe answer turns out to be machine 30:\n\nIn the literature of prisoner’s dilemma this is often called “grim trigger”, because it yields a strategy that starts with , then repeats this until its opponent first gives —after which it always gives .\n\nRunning this machine against all other 2-state machines we get the following behaviors\n\ncorresponding to the following mean payoffs:\n\nLooking at the average mean payoff for all 2-state machines, the ranking of these machines is:\n\nIt’s notable that machine 22 (which corresponds to the [famous “tit-for-tat” strategy](https://www.wolframscience.com/nks/notes-10-12--game-theory/))\n\nis quite far down in this ranking, even though it’s often identified as the most successful in collections of human-suggested strategies.\n\nThe rankings we’ve just given are based on average mean payoffs obtained after many iterations of the prisoner’s dilemma game. But if we do only a few iterations, the rankings can be different:\n\nZooming in at the beginning we can then see that machine 30 only starts to win after 13 steps:\n\nMachine 20 gives a constant average mean payoff of –1 obtained from\n\nwhile machine 30 yields an average mean payoff given by – – , limiting to – ≈ –0.86.\n\nSo what about 3-state machines? This gives the average mean prisoner’s dilemma payoff for each of these machines:\n\nThe distribution of these average mean payoffs is:\n\nThe machines with the highest ultimate average mean payoffs are:\n\nBut this ordering emerges only after more than 500 steps\n\nwith the crossover of average mean payoffs being surprisingly complex:\n\n(The seemingly quite random variation of average mean payoffs reflects the combining of many different periods in the always-ultimately-periodic behavior of competitions between machines.)\n\nSo how do 3-state machines do compared to 2-state machines in the prisoner’s dilemma game? Running 2-state machines against each other, machine 30 gets the highest average mean payoff of about –0.866. Meanwhile, for 3-state machines running against each other, the highest average mean payoff achieved is the very slightly smaller –0.885. What about 2-state machines running against 3-state ones? They don’t do well. Machine 30 does the best—but now it gives an average mean payoff not of –0.866 but instead of about –0.97.\n\nBut now, running 3-state machines against 2-state ones, the best average mean payoff is larger—about –0.80, as achieved by machine 2743\n\nwith the mean payoffs obtained by running it against each possible 2-state machines being:\n\nHow about 4-state machines? Running all these against 2-state machines, the overall winner is machine 336766 with average mean payoff –0.77:\n\nThe mean payoffs against each 2-state machine in this case\n\nare very similar to those for the winning 3-state machine, the only different behaviors occurring when the opponents are 2-state machines 20 and 30:\n\nSummarizing these results, the winning machines with small numbers of states that we’ve found for prisoner’s dilemma are:\n\nBut what about machines with more states—that we might find by adaptive evolution? Here’s an example of adaptive evolution for 10 states, competing against all 2-state machines:\n\nAfter 1000 steps of this adaptive evolution, we get the 10-state machine\n\nwith average mean payoff –0.73.\n\nThe behavior of this machine competing with all 2-state machines is:\n\n## The Space of All Possible Games\n\nWe’ve now looked at two specific examples of games—match-or-not and prisoner’s dilemma—and we’ve seen very similar phenomena in both cases. But what about other games?\n\nIf we allow payoffs –1 and +1 (as in match-or-not) there are a total of 256 possible games:\n\nOf these, 16 are zero sum (like match-or-not)—in the sense that the sum of the payoffs for the two agents is always zero), and 16 are symmetric (like prisoner’s dilemma)—in the sense that the payoff for the two agents is always the same.\n\nFor each of the 256 possible games, we can compute the average mean payoffs for each possible 2-state finite state machine competing with all 2-state machines:\n\nThe winning average mean payoffs for these 256 games are always –1, 0 or +1:\n\nIn most cases, many machines achieve the maximum payoff; across all games, this is the number of times each machine is a winner:\n\nWhat about when we look at more games—for example ones with payoffs –1, 0, +1? There are 6561 such games. And the story is very much the same, with some slight differences:\n\n## Cellular Automaton Strategies\n\nEverything we’ve done here so far has been based on using finite state machines as our source of strategies. Now we’re going to turn to another source of strategies: [cellular automata](https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-2--more-cellular-automata).\n\nThe setup we’re going to use takes the actions of our agents to be determined by running cellular automaton rules. The basic idea is that at each step the initial conditions for the cellular automaton are given by the sequence of actions taken by the opponent so far. The next action of our agent is then determined by the value of the cell obtained by running the cellular automaton for as many steps as there were actions taken so far by the opponent.\n\nMore specifically, let’s say the rules for our cellular automaton are:\n\nAnd let’s say the actions taken by the opponent so far have been:\n\nThen the idea is to run the cellular automaton with these as initial conditions\n\nand to extract the final cell value to determine the next action to take.\n\nSo, for example, if our two competing cellular automata have rules\n\nthen the successive steps in running them against each other give\n\nwhere in our pictures everything about the second rule has been reversed. The actions taken on each step can now be read off either from the opponent initial conditions, or from the outer diagonals of the final pattern generated:\n\nTo analyze “competition” between rules we can assign payoffs, say from the match-or-not game:\n\nAnd in this case we get the following cumulative payoffs:\n\nThere are altogether 16 possible cellular automaton rules of the kind we’re using here:\n\nRunning each one against every other we get the following array of limiting mean payoffs:\n\nSome notable “competitions” include:\n\nThe cumulative mean (match-or-not) payoffs in these cases are:\n\nFor most of these pairs of rules the winner quickly becomes clear. But for the case of rule 6 vs. rule 7 it’s more complicated—and after 500 steps it’s still not at all clear which rule will win:\n\nThe underlying behavior is:\n\nOn their own, these two rules behave in rather simple ways (indeed, rule 7 is just XOR):\n\nBut when they’re set up in competition, the effective rule that emerges has much more complex—and apparently unpredictable—behavior, with no sign, for example, of periodicity.\n\nLooking across all the rules, the one with the largest average mean payoff turns out to be rule 14:\n\nIn a sense, rule 14 finds a very “simple solution”, generating either constant or period-2 behavior, and forcing its opponent to do likewise—and in the end giving an average mean payoff of exactly – ≈ –0.69:\n\nWhat about with more complicated cellular automaton rules? Are the winners still ones with simple behavior?\n\nLet’s look at the 3-color analogs of our cellular automaton rules. There are 33 2 = 19683 of these. And in each case we can “make a decision about the next action” by looking at the final value mod 2. Running all these rules against the 16 2-color rules the distribution of scores is:\n\nAnd once again the best-performing rules (such as rule 15911) behave in rather simple ways:\n\nLooking—as we did for finite state machines—at the compressed size of patterns versus the average mean payoff in the corresponding competition\n\nwe see that the highest payoff rules tend to behave in simpler ways.\n\nThe rules with the most complicated behavior (at least by this measure) have average mean payoffs near zero. A typical example is rule 11948:\n\nSome of the more complicated competitions in this case are:\n\nWhat about different games with different payoffs? The underlying behavior of particular rules competing with each other will always be the same. But their payoffs will be different. And so, for example, in prisoner’s dilemma, the cumulative payoffs for 2-color rule 6 vs. 2-color rule 7 are now:\n\nPlaying each 2-color rule against all others the average mean payoffs obtained are:\n\nRule 13 has the highest average mean payoff (of –1), and shows fairly simple behavior:\n\nLooking at compressed size versus average mean payoff for games between 3-color and 2-color rules, the phenomenon of high payoff being associated with simpler behavior seems even more marked for prisoner’s dilemma than for match-or-not:\n\n## Cellular Automata vs. Finite State Machines\n\nWe’ve looked at finite state machines competing with finite state machines, and cellular automata competing with cellular automata. But what about cellular automata competing with finite state machines?\n\nHere’s an example of a particular step in a competition between a cellular automaton and a finite state machine\n\nand here are the cumulative payoffs in this case for the match-or-not game:\n\nRunning all 16 cellular automaton rules of this type against all 2-state finite state machines the mean payoffs are:\n\nAveraging over all finite state machines, the mean payoffs for the possible cellular automata are:\n\nRather boringly, the winning cellular automaton is rule 0, which generates in response to anything any finite state machine does:\n\nThis yields an average mean payoff of only +0.181. But what if we use 3-color cellular automata? Here are the average mean payoffs in that case—with the winning case highlighted:\n\nSummarizing the various competitions between different types of strategies, we see that—running against 2-state finite state machines—the most successful competitors are, by a small margin, 3-color cellular automata:\n\n## Adaptive Evolution of Cellular Automaton Strategies\n\nJust as we did above for finite state machines, we can consider adaptive evolution of cellular automaton rules (which is also something I’ve studied in other contexts [somewhat extensively elsewhere](https://writings.stephenwolfram.com/2024/12/foundations-of-biological-evolution-more-results-more-surprises/)). As a first case, let’s consider adaptively evolving a 4-color cellular automaton rule to get the best mean payoff against the most successful 3-state finite state machine above, machine 1165. At each step of adaptive evolution, we’ll randomly change one of the 2 = 16\n\nThe “breakthroughs” correspond to the following rules:\n\nAnd as is [often the case](https://writings.stephenwolfram.com/2025/11/whats-special-about-life-bulk-orchestration-and-the-rulial-ensemble-in-biology-and-beyond/#measuring-mechanoidal-behavior), the early breakthroughs are somewhat complicated, but in the end the “solution” that emerges shows rather simple behavior—something we can see at least some evidence for if we put the results at successive mutation steps together:\n\nWhat about adapting cellular automata to compete with other cellular automata? As an example, let’s use adaptive evolution to find a 6-color cellular automaton with the largest average mean payoff when competing with all 16 of the 2-color cellular automata we’ve considered. Here’s a typical fitness curve for this case:\n\nAfter 1000 mutation steps, it’s reached a rule that gives average mean payoff 0.91. And here’s what happens when that rule competes with all our 2-color rules:\n\nWhat if (as for finite state machines above) both a rule and its opponent are undergoing adaptive evolution—say on alternating steps? Here’s an example of the successive payoffs one gets with a pair of 4-color rules:\n\nAnd here are the corresponding actual behaviors:\n\nWhat are the underlying cellular automata doing? Here are results at a sequence of mutation steps—illustrating that adaptive evolution can select both rules with very simple behavior and ones with somewhat more complex behavior:\n\n## Turing Machine Strategies\n\nWe’ve looked at strategies based on finite state machines and strategies based on cellular automata. Now let’s talk about strategies based on [Turing machines](https://www.wolframscience.com/nks/chap-3--the-world-of-simple-programs#sect-3-4--turing-machines). For our purposes, we can think of Turing machines as in some ways interpolating between finite state machines and cellular automata—though they also introduce some entirely new features.\n\nOur basic setup will be to use the opponent’s actions as initial values on a Turing machine tape, with the latest value on the right, which is where the Turing machine head is initially placed. We then [run the Turing machine until its head goes further to the right than it’s ever gone before](https://writings.stephenwolfram.com/2026/01/p-vs-np-and-the-difficulty-of-computation-a-ruliological-approach/#the-basic-setup), at which point we determine the next action from the value that appears at the initial head position.\n\nFor example, consider a Turing machine defined by the rule:\n\nThen imagine that the sequence of opponent actions so far is:\n\nRunning the Turing machine with this as its initial condition we get the following:\n\nAnd from this we can then read off “the next move” according to our “Turing machine strategy”, in this case .\n\nIn our finite state machine and cellular automaton setups we did just one step of evolution for each step in our game. In our Turing machine setup, at every step in our game we’re running the Turing machine for as many steps as it takes for the head to go further to the right than it started.\n\nHere’s what happens if we take a particular sample 3-state finite state machine\n\nand have it compete with the Turing machine above:\n\nWith match-or-not the cumulative mean payoffs here are:\n\nThere are a total of 4096 Turing machines of the type we’re using here (with *s* = 2 states and *k* = 2 colors). Running each of these against our sample 3-state machine the mean payoffs in the match-or-not game for all the Turing machines are:\n\nThere are several Turing machines that have limiting mean payoffs of +1. An example is machine 2529:\n\nThere’s a tricky issue that comes up here, though. Our Turing machine strategy works by running a Turing machine until its head goes further to the right than it started—so that we can consider that it halts. But what if it never halts, as in:\n\nFor our purposes we’re just saying that in this case, the payoff is undefined. And if such an undefined payoff ever occurs in a particular game, we assume the mean payoff for the whole game is undefined—leaving a gap in the plot above.\n\nWhat if we have Turing machines compete against, say, all distinct 2-state finite state machines? Here are the average mean payoffs in that case (the gaps are for machines that don’t halt):\n\nThe maximum of +0.4 is achieved for Turing machine 2403\n\nwhich yields the following behaviors and limiting payoffs when\n\ncompeting with each of the 22 distinct 2-state finite state machines:\n\nSo what about Turing machines competing with Turing machines? To keep things manageable, we can look at 1-state Turing machines, of which there are only 16 (with *k* = 2). Running each of these machines against each other, the array of mean payoffs is (the gray entries correspond to cases where one of the Turing machines doesn’t halt):\n\nThe average mean payoff for each of these machines is given by:\n\nThe “winner” among the machines is Turing machine 13:\n\nRunning this machine against all other *s* = 1, *k* = 2 Turing machines the behaviors we get are:\n\nIf we look at the cumulative payoffs, we see that many give mean payoffs that approach 1, though some do not, yielding in the end an average mean payoff of about +0.81:\n\nA typical competition between 2-state Turing machines is\n\nwhich yields a slightly more complicated pattern of cumulative payoffs:\n\nWhat happens if 2-state and 1-state Turing machines compete? Here’s the array of mean payoffs for all 4096 2-state machines running against the 16 1-state machines:\n\nThe average mean payoffs for 2-state machines are as follows—again with maximum 0.81:\n\n## Discussion\n\nWe’ve now seen many examples of the ruliology of competition. And, perhaps more than anything else, it’s now clear that if we look—ruliologically—at all possible programs of particular types, the picture of how competition works is quite complicated, even when all the programs involved are simple.\n\nIn a sense, this is a typical result of [computational irreducibility](https://www.wolframscience.com/nks/chap-12--the-principle-of-computational-equivalence#sect-12-6--computational-irreducibility): to know how competitions between programs will work out, there’s basically no choice but to run them and see what happens.\n\nSometimes the programs that win do so in very simple ways—in effect “exploiting simple hacks”. But in other cases, things are more complicated. Sometimes two competing programs with both show complex behavior, and in a sense, it’ll “just so happen” that one of them wins. But sometimes the win will be more systematic. And typically this happens because the behavior effectively plugs into some pocket of computational reducibility that systematically out-competes opponents of a certain type.\n\nWe’ve mostly looked at extremely simple programs which in some sense inevitably have to “expose the same rules” to every competitor. But particularly if we have a fairly small collection of competitors, a sufficiently large program can in effect expose a different part of its rules for different competitors, and so have a “customized substrategy” that separately wins against different possible competitors.\n\nIn looking at adaptive evolution of strategies we’ve often dealt with larger programs. And we’ve typically seen that the adaptive evolution can be quite successful at finding winning strategies. But—as is typically the case with adaptive evolution—there’s no obvious way to “describe the mechanism” of the strategies that are produced. Instead, it’s more like what we’ve seen in other studies of adaptive evolution: the process of evolution puts together certain “lumps of irreducible computation” that in our case here in effect “just happen” to be competitively successful.\n\nDifferent games—corresponding to different patterns of payoffs—lead to results that are different in detail. And if one constructs a detailed narrative about the course of a game, it may well seem different for different games. But at an overall level, there seems to be remarkable similarity between different games—and the key phenomena seem very much the same.\n\nWhat does this all say about practical situations where there’s competition between agents? One thing is that it’s typically going to be difficult to “predict in advance” or “prove a theorem” about what the best strategy will be. There’s enough computational irreducibility that one will basically just have to try running different competitions and seeing what happens. And in a sense the very diversity of behavior we’ve seen here supports the idea that ruliological investigation is critical. Finding some simple parametrization of possible strategies won’t be enough to get an accurate sense of everything that can happen. There’s no choice but to systematically enumerate some version of “all computationally possible strategies”. Which is what we can do in our ruliological investigations.\n\nAnd, yes, what we’ve done here just scratches the surface of studying the ruliology of competition. For a start, one can scale up the size of the programs, and see what new phenomena occur. One can expect that mostly things will be the same—with computational irreducibility the dominant force. But there may be new and unexpected pockets of reducibility, perhaps each with their own “paths to competitive success”.\n\nOne can also imagine investigating different kinds of computational systems—that serve as metamodels appropriate for different applications. The Principle of Computational Equivalence suggests that there’ll be a certain universality to the overall results. But details will be different. And those details will potentially be important, particularly in interpreting results for very different domains. Even if what matters for ultimate purposes of competition is well captured by finite state machines—or a cellular automata—the way one gets to these from microscopic biology, human decision making, societal interactions, AI competition, etc. may be very different.\n\n## Historical & Personal Notes\n\nThere’s a long history to formal studies of games—and indeed early developments in areas like combinatorics and probability were largely driven by them. The modern field known as [game theory](https://reference.wolfram.com/language/guide/GameTheory.html) emerged in the 1940s, concentrating on the question of optimal strategies given particular patterns of payoffs. Most often the idea is to analyze what happens when each player makes a single move—albeit perhaps a probabilistic one, with averages taken over many instances. Fairly complete (though sometimes complicated) mathematical results have been derived for this kind of setup (and are now, for example, [implemented in the Wolfram Language](https://reference.wolfram.com/language/guide/GameTheory.html)). But what about repeated, or iterated, games of the kind we’ve been discussing here? In the early days of game theory there was discussion about defining strategies as arbitrary mappings from histories to actions—and various rather abstract mathematical results were proved, particularly for applications in economics.\n\nBut by the 1970s there started to emerge the idea that one should model agents as having “bounded rationality”, and corresponding to limited computational systems. And by the end of the 1970s computer experiments were being done on competition between what amounted to simple programs. A notable example was the tournament organized by Bob Axelrod for the prisoner’s dilemma game. In this tournament, a collection of particular programs were submitted by different individuals, and run against each other. The conclusion was that the “tit for tat” strategy (that can be thought of as a finite state machine) came out best—a result from which much has been made about the value of cooperation, etc.\n\nI must admit that I was always suspicious of the result. It seemed very unscientific to have just looked at programs people happened to have submitted for the tournament. Why not instead systematically enumerate all possible programs and see what happens? In my own work—starting at the beginning of the 1980s—[I was routinely doing this kind of thing](https://www.wolframscience.com/nks/chap-1--the-foundations-for-a-new-kind-of-science#sect-1-4--the-personal-story-of-the-science-in-this-book), particularly for cellular automata. I always found the setup for game theory a little arbitrary, and fiddly, and I was discovering more than I could keep up with just investigating the behavior of individual programs, without trying to have them compete with each other. Still, finally, in the mid-1990s, I did have a look at what happens when a range of possible programs (in that case, cellular automata) compete with each other. I summarized the result in a small note at the end of my book* ** A New Kind of Science*:\n\nI always meant to come back and look at this in more detail. And finally my recent work in the [foundations of biological evolution](https://writings.stephenwolfram.com/category/life-science) made me think it was time to do it. I found out that there was some literature on using models like finite state machines as strategies for iterated games. But so far as I could tell, the kind of systematic ruliological investigation I had imagined had never been done. Which is why I recently decided it was finally time to do it…\n\n## Thanks\n\nThanks to Willem Nielsen, Brian Ashiundu and Júlia Campolim of the Wolfram Institute for their extensive help. Several participants at our summer programs have done projects about games between programs that I’ve suggested: Rodrigo Bazaes, Kantaporn Danchaivijitr and Aziz Sahibnazarov. Over the course of many years, I’ve discussed game theory and related ideas with quite a few people, including Brian Arthur, Bob Axelrod, Seth Chandler, Roger Germundsson, Paul Harrald, Jozsef Konczer, Pedro Marquez-Zacarias, Eric Maskin, Zsombor Méder, Chrystopher Nehaniv, Scott Page, Jordan Pollack, John Maynard Smith, Stan Reiter, Nassim Taleb, Valeriu Ungureanu and Marc Vicuna. (Notable game theorist John Nash was a long-time user of what’s now Wolfram Language, and attended conferences about it, but I never personally met him.)\n\nPosted in:\n[Artificial Intelligence](https://writings.stephenwolfram.com/category/artificial-intelligence/), [Computational Science](https://writings.stephenwolfram.com/category/computational-science/), [Life Science](https://writings.stephenwolfram.com/category/life-science/), [Ruliology](https://writings.stephenwolfram.com/category/ruliology/)", "url": "https://wpnews.pro/news/games-between-programs-the-ruliology-of-competition", "canonical_source": "https://writings.stephenwolfram.com/2026/06/games-between-programs-the-ruliology-of-competition/", "published_at": "2026-06-05 23:36:00+00:00", "updated_at": "2026-06-05 23:48:10.886030+00:00", "lang": "en", "topics": ["ai-research"], "entities": ["Stephen Wolfram"], "alternates": {"html": "https://wpnews.pro/news/games-between-programs-the-ruliology-of-competition", "markdown": "https://wpnews.pro/news/games-between-programs-the-ruliology-of-competition.md", "text": "https://wpnews.pro/news/games-between-programs-the-ruliology-of-competition.txt", "jsonld": "https://wpnews.pro/news/games-between-programs-the-ruliology-of-competition.jsonld"}}