{"slug": "not-every-byte-gets-a-vote", "title": "Not Every Byte Gets a Vote", "summary": "The article explains that for deterministic game replay, not all data should be included in a checksum; only \"authoritative\" state that can affect future gameplay (like player health and projectile position) should be hashed, while non-authoritative data (like debug events and render interpolation) should be excluded. The author describes a system where replay stores only inputs (seed and input tape) rather than outcomes, and the simulation advances in fixed ticks with a strict phase order to ensure determinism. The checksum must be a carefully chosen boundary that includes all state capable of influencing future ticks, such as entity state and RNG, but excludes helper caches that can be rebuilt deterministically.", "body_md": "Not Every Byte Gets a Vote\nWhen I started wiring replay for a deterministic simulation, my first instinct was simple:\nEasy. Hash everything.\nIt sounds safe. No byte left behind. If anything changes, the checksum changes.\nThe catch is that \"everything\" is not a design boundary. It includes real gameplay state, but it can also include caches, debug traces, render helpers, padding, ordering accidents, and things that are useful without being authoritative.\nSo the question I ended up needing was narrower:\nWhich state is allowed to change what happens next?\nPlayer health? Yes.\nProjectile position? Yes.\nThe RNG stream? Yes.\nDebug events? Probably not. They are observations.\nRender interpolation? No. Useful, but not gameplay truth.\nPathfinding cache? Maybe worth saving, maybe worth rebuilding, but it needs to be named as one of those. It shouldn't become authoritative by accident.\nThis post is a walkthrough of that split: truth, cache, observation, and presentation. The examples are excerpts from a Zig ARPG game engine. 
The categories are local to this project: a working vocabulary that has helped me find bugs and argue with the code more clearly.\nThe tick is the unit replay can trust\nThe sim advances in fixed ticks. Not frames. Not vibes. Ticks.\nThe outer tick function mostly schedules phases:\nI like this function because it is plain, not clever.\nidle\n-> ingress\n-> control\n-> derive\n-> plan\n-> apply // movement, physics, combat\n-> cleanup\n-> idle\nReplay needs that kind of boring order. Every tick starts from idle, drains the queues it expects to drain, runs systems in a fixed order, increments time once, and returns to idle.\nIf a queue leaks between phases, the next phase can read a command that belonged to the previous one. If a system mutates state in the wrong phase, it becomes harder to explain which tick caused which result. If the world does not return to idle, the next tick starts with unfinished work already loaded.\nThe tick matters because it says where work is allowed to happen.\nReplay stores inputs, not outcomes\nFor this replay setup, the file says what went in, not what supposedly happened.\nIf the replay file stores \"the fireball hit for 18,\" replay is no longer checking the sim. It is just checking whether I can read yesterday's answer back correctly.\nThe contract is:\nseed + input tape -> ticks -> same authoritative result\nThe recorder is deliberately small:\nSeed. Inputs. Tick count. Final checksum.\nThat's the shape.\nThe harness then runs once while recording, runs again while replaying, and compares the final checksum:\nThe checksum is where this becomes easy to fool yourself. Hash too little and replay can miss real bugs. Hash too much and replay starts failing because some non-authoritative helper changed shape.\nThe checksum also doesn't make the sim deterministic by itself. 
The usual suspects still matter: fixed tick size, explicit RNG, stable iteration order, initialized state, and no hidden dependency on render timing or local machine state. The checksum just tells you when those rules failed.\nSo the checksum needs a boundary, not just appetite.\nThe checksum is a chosen surface\nThe checksum entrypoint names what counts as authoritative gameplay state:\nThe list is the point. It says these things can affect future gameplay:\n- tick count\n- current phase\n- RNG state\n- entity state\n- projectile state\n- item, ground-loot, and trinket runtime stores\n- tilemap and encounter interaction markers\n- runtime modifiers, behavior emissions, scope rewires, and rules\n- phase queues\n- encounter/node state\nThis is not \"all memory.\" It is not \"whatever fields happen to exist.\" It is a decision, and it only works if the list contains every piece of state that can affect future gameplay.\nOne wrinkle: cached stats are in this checksum because later systems read those stats as runtime facts. If the cache is wrong, future gameplay can change. A helper cache can stay out only when it is rebuilt deterministically from authoritative state before anything reads it, or when the authoritative inputs to the cache are what the sim actually branches on.\nThe list also leaves things out. Debug events are not checksum authority. AI traces are not checksum authority. Render state is not checksum authority.\nThose things still matter. Good debug output and good presentation are part of making the game work. They shouldn't make replay look broken when a debug label changes or a render helper gets nicer.\nThe checksum should catch gameplay divergence. 
It shouldn't punish better breadcrumbs.\nSnapshots ask a different question\nReplay and save/load look related, but they are checking different properties.\nReplay asks:\nIf I start over from the same seed and inputs, do I arrive at the same authoritative state?\nSnapshot asks:\nCan I freeze this world, write it to bytes, restore it, and keep going?\nThose overlap, but they are not the same test.\nA snapshot may include things that are not checksum authority. A cache might be cheap to rebuild, or it might be worth restoring because rebuilding it at that boundary is awkward. That doesn't automatically make the cache replay truth.\nThe snapshot encoder uses the same path for measuring and writing:\nIf `buffer` is null, the encoder walks the protocol and counts bytes. If `buffer` exists, it writes.\nUsing the same path keeps measuring and writing from drifting apart.\nThe field encoder is generic, but the protocol is still explicit:\nThe failure mode is the useful part. If a field type is not part of the snapshot protocol, the build stops. It doesn't silently invent a format.\nBoring in the useful way.\nCaches are allowed to be caches\nA deterministic engine can still have derived state. Recomputing everything all the time is not magically more correct.\nThe trick is naming derived state as derived.\nThe flowfield stores pathing data so AI can ask which way to move from a tile:\nAI uses this, so it matters. The cache needs a contract.\nIf the flowfield is rebuilt in the derive phase from authoritative inputs before AI reads it, the checksum can care about those inputs instead of the helper arrays. If a later tick can branch on a persisted `valid` flag or stale direction field without rebuilding, that field has joined the authority path and should be treated like it gets a vote.\nThe distinction to preserve:\nIs this source of truth, or a cached expression of truth with a deterministic rebuild contract?\nWith that contract, the cache layout can change. 
In the current code, snapshots do serialize the flowfield so save/load resumes from the exact cached shape, while replay still doesn't treat the helper arrays as checksum authority.\nEvents are receipts, not steering wheels\nThe sim tick can optionally emit events:\nThat `?` is the useful bit. The event queue may not exist.\nTurning events on or off should produce the same gameplay state. Events are receipts. They describe what committed: a skill started, a hit landed, a thing died, something interesting happened for VFX or tests.\nThey don't steer the sim.\nThis keeps observation from quietly becoming gameplay. If the renderer has to be present for the sim to behave, the replay boundary has probably leaked. If a test changes the outcome by asking for events, it is not only observing anymore.\nInput goes in. Gameplay state changes inside the tick. Events come out.\nKeep the arrows clean.\nRender does not get a vote either\nRender is allowed to be smart about presentation. It can interpolate, draw telegraphs, sort sprites, play VFX, and make the game readable.\nIt doesn't get to decide gameplay facts.\nIf something is dangerous, the sim should expose that fact. Render can color it red, pulse it, or make it dramatic. But render should not infer danger from pixels or animation timing.\nThe deadline version of this is tempting:\nJust check the animation frame.\nThat stays out of the authority path.\nIn this codebase, the same rule shows up elsewhere: app routes input, game owns session meaning, sim owns encounter truth, render presents committed truth.\nThe boundary can move as the project changes. 
If it moves, it should move intentionally.\nThe working taxonomy\nWhen new state appears, I use this checklist:\nCan this state change future gameplay?\nyes -> checksum/replay authority\nno -> keep asking\nIs it needed to restore and continue correctly?\nyes -> snapshot surface\nno -> keep asking\nIs it an observation of committed gameplay?\nyes -> event/debug/inspect surface\nno -> keep asking\nIs it only presentation?\nyes -> render/app state\nThe categories are not about importance. Render is important. Debugging is important. Snapshots are important.\nAuthority is just a different question.\nAuthoritative state is state with the right to change the future.\nEverything else can be useful without getting a vote.\nWhy bother?\nBecause bugs get easier to sort.\nIf replay fails, I can start by assuming gameplay truth diverged, not that a debug label changed.\nIf save/load fails, I inspect the snapshot protocol.\nIf a visual is wrong, I ask whether the sim emitted the right fact or render drew it wrong.\nIf an event is wrong, I can fix the observer without treating it as a replay-authority change.\nThat's the payoff. Not purity. Fewer places for bugs to hide.\nMost of the code here is phase order, bounded storage, explicit hashing, narrow codecs, optional event queues, and assertions. The hard part is deciding what each piece of state is allowed to mean.\nNot every byte gets a vote. 
That's the line this design is trying to hold.", "url": "https://wpnews.pro/news/not-every-byte-gets-a-vote", "canonical_source": "https://mitander.xyz/posts/not-every-byte-gets-a-vote/", "published_at": "2026-04-28 00:00:00+00:00", "updated_at": "2026-05-24 05:37:17.827620+00:00", "lang": "en", "topics": ["developer-tools", "data"], "entities": ["Zig"], "alternates": {"html": "https://wpnews.pro/news/not-every-byte-gets-a-vote", "markdown": "https://wpnews.pro/news/not-every-byte-gets-a-vote.md", "text": "https://wpnews.pro/news/not-every-byte-gets-a-vote.txt", "jsonld": "https://wpnews.pro/news/not-every-byte-gets-a-vote.jsonld"}}