Simulate what happens to GPT inference under space conditions — cosmic-ray bit flips and other radiation-induced faults corrupting a model's weights, activations, KV cache, and output.
See what radiation does to an AI model's output: a single-run report and an environment comparison.
See DESIGN.md for goals and the conditions we model, and ARCHITECTURE.md for the technical design.
The end-to-end loop covers the full Single-Event-Effect taxonomy across three corruptible regions, with faults either hand-specified or derived from a physical radiation environment: build a seeded nanoGPT (with a real KV cache), generate a clean baseline, get faults (manual or from the flux scheduler), inject them (weight mutations, activation forward-hooks, KV-cache mutations), regenerate with the same sampling seed, and diff.
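The loop above can be sketched with a toy stand-in. This is not the project's code: the bigram table, `build_weights`, `generate`, and `inject` are hypothetical names chosen for illustration, and the "fault" is a plain overwrite rather than a real bit flip. It shows why the sampling seed is reused: with the same seed, any difference between the two outputs comes from the injected fault and not from sampling noise.

```python
import math
import random

VOCAB = "abcd"

def build_weights(model_seed):
    """Seeded random bigram logits, indexed as weights[prev][next]."""
    rng = random.Random(model_seed)
    return [[rng.uniform(-1, 1) for _ in VOCAB] for _ in VOCAB]

def generate(weights, n_tokens, sample_seed):
    """Sample n_tokens characters; the seed fixes every sampling decision."""
    rng = random.Random(sample_seed)
    prev, out = 0, []
    for _ in range(n_tokens):
        logits = weights[prev]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        probs = [math.exp(l - peak) for l in logits]
        prev = rng.choices(list(range(len(VOCAB))), weights=probs)[0]
        out.append(VOCAB[prev])
    return "".join(out)

def inject(weights, row, col, value):
    """Return a copy with one weight overwritten (stand-in for a weight fault)."""
    faulty = [r[:] for r in weights]
    faulty[row][col] = value
    return faulty

weights = build_weights(model_seed=0)
baseline = generate(weights, 40, sample_seed=1)            # clean run
faulty_weights = inject(weights, row=0, col=0, value=1e30)  # one huge weight
corrupted = generate(faulty_weights, 40, sample_seed=1)     # same sampling seed
diff = [i for i, (a, b) in enumerate(zip(baseline, corrupted)) if a != b]
print(baseline)
print(corrupted)
print("differing positions:", diff)
```

In this toy the oversized weight makes the model emit the same character forever, which loosely mirrors the repetition failure mode.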
Fault kinds (--kind): SEU (single bit flip), MBU (multi-bit upset), STUCK_AT (cell pinned 0/1), SEL (latch-up: a whole tensor zeroed), SET (transient activation glitch), SEFI (NaN/garbage cascade).
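As a rough illustration of the three bit-level kinds (SEU, MBU, STUCK_AT), here is a standard-library sketch that operates on a single float32 value. The function names are hypothetical and are not taken from faults/bitops.py; the bit numbering assumed here is 0 for the lowest mantissa bit, 23 to 30 for the exponent, and 31 for the sign.

```python
import struct

def float_to_bits(x: float) -> int:
    """Reinterpret a value as its IEEE-754 float32 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def seu(x: float, bit: int) -> float:
    """SEU: flip exactly one bit."""
    return bits_to_float(float_to_bits(x) ^ (1 << bit))

def mbu(x: float, first_bit: int, width: int) -> float:
    """MBU: flip `width` adjacent bits starting at `first_bit`."""
    mask = ((1 << width) - 1) << first_bit
    return bits_to_float(float_to_bits(x) ^ mask)

def stuck_at(x: float, bit: int, value: int) -> float:
    """STUCK_AT: force one bit to 0 or 1 regardless of its previous state."""
    b = float_to_bits(x)
    b = b | (1 << bit) if value else b & ~(1 << bit)
    return bits_to_float(b)

print(seu(1.0, 0))    # lowest mantissa bit: barely changes the value
print(seu(1.0, 31))   # sign bit: negates the value
print(seu(0.5, 30))   # top exponent bit: the value becomes astronomically large
print(seu(1.0, 30))   # top exponent bit of 1.0: the result is infinity
```

The gap between a low mantissa flip and an exponent flip is what makes the choice of bit matter far more than the choice of weight.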
Regions (--region): weight, activation (incl. lm_head → logits), kv_cache.
Environments (--orbit): LEO, SAA, POLAR, GEO, INTERPLANETARY, SOLAR_STORM, with an optional solar-flare burst window raising λ(t) mid-inference.
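One way to picture a time-varying rate λ(t) with a burst window is to draw a Poisson count of upsets per generation step. The sketch below is an assumption about how such a scheduler could work, not a description of environment/scheduler.py; `schedule_faults`, `poisson`, and all rate values are hypothetical and carry no physical meaning.

```python
import random

def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting exponential gaps within one unit of time."""
    if lam <= 0:
        return 0
    n, elapsed = 0, rng.expovariate(lam)
    while elapsed < 1.0:
        n += 1
        elapsed += rng.expovariate(lam)
    return n

def schedule_faults(n_steps, base_rate, seed, burst=None, burst_mult=1.0):
    """Return the number of upsets drawn for each generation step.

    base_rate  -- expected upsets per step (hypothetical, already scaled)
    burst      -- optional (start, end) step window, end exclusive
    burst_mult -- factor applied to the rate inside the window
    """
    rng = random.Random(seed)
    counts = []
    for t in range(n_steps):
        lam = base_rate
        if burst is not None and burst[0] <= t < burst[1]:
            lam *= burst_mult  # the flare raises lambda(t) for these steps only
        counts.append(poisson(rng, lam))
    return counts

counts = schedule_faults(100, base_rate=0.01, seed=7, burst=(40, 60), burst_mult=1000)
print("upsets inside burst: ", sum(counts[40:60]))
print("upsets outside burst:", sum(counts[:40]) + sum(counts[60:]))
```

Seeding the scheduler separately from the sampler keeps the fault schedule reproducible across runs.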
Every run also reports a failure mode (silent_correct / subtle_wrong / repetition / garbage / nan_garbage / crash), time-to-failure, and mean KL divergence of the output distribution, and can emit a per-step RunTrace JSON (the data the upcoming visualizations consume).
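For readers unfamiliar with these metrics, a minimal sketch follows. It assumes that KL divergence is measured from the clean distribution to the faulty one at each step and that time-to-failure is the index of the first emitted token that differs; the project's actual definitions in eval/ may differ, and the function names are hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def mean_kl(clean_steps, faulty_steps):
    """Average the per-step KL over all generation steps."""
    pairs = list(zip(clean_steps, faulty_steps))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)

def time_to_failure(clean_tokens, faulty_tokens):
    """Index of the first emitted token that differs, or None if the outputs match."""
    for i, (a, b) in enumerate(zip(clean_tokens, faulty_tokens)):
        if a != b:
            return i
    return None

clean = [[0.5, 0.5], [1.0, 0.0]]
faulty = [[0.5, 0.5], [0.5, 0.5]]
print(mean_kl(clean, faulty))
print(time_to_failure("abcd", "abxd"))
```

A run can have a nonzero mean KL and still report no time-to-failure, which corresponds to the silent_correct case where the distribution shifts but the emitted tokens do not.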
cosmicgpt run --orbit SAA --flux-mult 1e4 --tokens 120
cosmicgpt run scenarios/mission_solar_storm.yaml
cosmicgpt run --orbit SOLAR_STORM --flux-mult 1e4 --report report.html
cosmicgpt report runs/storm/trace.json -o report.html
cosmicgpt compare --orbits LEO,SAA,SOLAR_STORM -o comparison.html
Reports are fully self-contained (inline CSS + inline SVG, no external assets, no matplotlib) so they're emailable and archivable.
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
cosmicgpt run scenarios/walking_skeleton.yaml
cosmicgpt run --kind SEFI --n-flips 1 --tokens 120 --fault-seed 3
cosmicgpt run --kind SEL --n-flips 8 --tokens 100
pytest
- Single faults on low-impact sites (biases, low mantissa bits) are routinely masked. This is realistic: most cosmic-ray hits do nothing visible.
- Exponent/sign flips and SEL are far more destructive than mantissa flips.
- SET (transient activation glitch) is gentle: without persistence it affects one step, and only if it lands on the emitted position.
- The model now has a real KV cache (--region kv_cache): a struck entry is mutated once but the corruption persists, because every later token re-reads the corrupted entry through attention. Region is independent of fault kind: --region weight|activation|kv_cache.
- A single short inference in LEO is essentially fault-free at realistic upset rates; meaningful corruption needs the SAA, a solar storm, or long exposure. With a flare burst window, divergence visibly begins right when the flux spikes.
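The difference between a transient SET and a persistent KV-cache strike can be seen in a toy decoder. This is a sketch under strong simplifications: a running mean over cached values stands in for attention, and `run` with its parameters is a hypothetical name, not part of the project.

```python
def run(n_steps, kv_fault_step=None, set_fault_step=None, glitch=100.0):
    """Toy decoder: each step caches one value and outputs the mean of the cache."""
    cache, outputs = [], []
    for t in range(n_steps):
        cache.append(float(t))            # value written to the cache this step
        if t == kv_fault_step:
            cache[0] += glitch            # KV strike: a stored entry is mutated once
        out = sum(cache) / len(cache)     # stand-in for attention over the cache
        if t == set_fault_step:
            out += glitch                 # SET: glitch the output, not the stored state
        outputs.append(out)
    return outputs

clean = run(8)
kv = run(8, kv_fault_step=3)
transient = run(8, set_fault_step=3)

kv_steps = [t for t in range(8) if kv[t] != clean[t]]
set_steps = [t for t in range(8) if transient[t] != clean[t]]
print("steps affected by KV strike:", kv_steps)
print("steps affected by SET:      ", set_steps)
```

The KV strike affects step 3 and every step after it, while the SET affects step 3 alone, because only the cache is re-read on later steps.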
The model is a small, seeded, randomly-initialized char-level GPT, so the baseline
text is gibberish — but that's fine for the skeleton: the point is to demonstrate the
fault-injection loop and that flips (especially in the float exponent) measurably
corrupt the output. Train a coherent model later via scripts/train_tiny.py
(roadmap).
src/cosmicgpt/
  model/        nanogpt.py (+KV cache), adapter.py, sites.py   # model + fault registry
  faults/       bitops.py, types.py, injector.py               # taxonomy + injection
  environment/  flux.py, presets.py, scheduler.py              # scaled-physical flux
  eval/         runner, metrics, classify, trace               # loop + metrics + RunTrace
  viz/          svg, diffview, timeline, report                # inline-SVG/HTML reports
  config.py, cli.py
scenarios/      walking_skeleton.yaml, sefi_cascade.yaml, mission_solar_storm.yaml
tests/          test_bitops, test_injection, test_kvcache, test_scheduler, test_eval, test_viz
See ARCHITECTURE.md §11. Next (step 6): mitigation wrappers (ECC / TMR voting / scrubbing / NaN guards) with cost-benefit experiments, then a pluggable larger-GPT backend to test whether findings generalize.