# Cosmicgpt – A GPT-in-space simulator to research SpaceX AI satellite viability

> Source: <https://github.com/davedx/cosmicgpt>
> Published: 2026-06-17 15:09:47+00:00

Simulate what happens to GPT inference under space conditions — cosmic-ray bit flips and other radiation-induced faults corrupting a model's weights, activations, KV cache, and output.

See what radiation does to an AI model's output: a [single-run report](https://davedx.github.io/cosmicgpt/report.html)
and an [environment comparison](https://davedx.github.io/cosmicgpt/comparison.html).

See [DESIGN.md](/davedx/cosmicgpt/blob/main/DESIGN.md) for goals and the conditions we model, and
[ARCHITECTURE.md](/davedx/cosmicgpt/blob/main/ARCHITECTURE.md) for the technical design.

The end-to-end loop covers the full **Single-Event-Effect taxonomy** across three
corruptible **regions**, with faults either hand-specified or derived from a physical
**radiation environment**: build a seeded [nanoGPT](/davedx/cosmicgpt/blob/main/src/cosmicgpt/model/nanogpt.py)
(with a real KV cache), generate a clean baseline, get faults (manual or from the
[flux scheduler](/davedx/cosmicgpt/blob/main/src/cosmicgpt/environment/scheduler.py)), inject them (weight mutations,
activation forward-hooks, KV-cache mutations), regenerate with the same sampling seed,
and diff.
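The last step of that loop, the diff, can be pictured with a small helper. This is an illustrative sketch rather than code from the repository: `first_divergence` is a hypothetical name, and it assumes the clean and faulted runs are available as token sequences generated with the same sampling seed, so any mismatch is attributable to the injected faults.

```python
from typing import Optional, Sequence


def first_divergence(baseline: Sequence, faulted: Sequence) -> Optional[int]:
    """Return the index of the first position where the two runs differ.

    Returns None when the sequences are identical, i.e. the fault was masked.
    """
    for step, (clean_tok, faulty_tok) in enumerate(zip(baseline, faulted)):
        if clean_tok != faulty_tok:
            return step
    # One run ended early (for example a crash): they diverge where the shorter one stops.
    if len(baseline) != len(faulted):
        return min(len(baseline), len(faulted))
    return None


# Works on token-id lists or, for a char-level model, directly on strings.
print(first_divergence("hello world", "hellp world"))  # 4
print(first_divergence("hello world", "hello world"))  # None
```

Because sampling is seeded identically in both runs, a `None` result here means the fault had no visible effect on the emitted tokens.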

- Fault kinds (`--kind`): **SEU** (single bit flip), **MBU** (multi-bit upset),
  **STUCK_AT** (cell pinned 0/1), **SEL** (latch-up: a whole tensor zeroed),
  **SET** (transient activation glitch), **SEFI** (NaN/garbage cascade).
- Regions (`--region`): **weight**, **activation** (incl. `lm_head` → logits), **kv_cache**.
- Environments (`--orbit`): **LEO, SAA, POLAR, GEO, INTERPLANETARY, SOLAR_STORM**, with an
  optional solar-flare **burst window** raising λ(t) mid-inference.
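One way to think about an environment with a burst window is as a time-varying Poisson rate λ(t). The sketch below is not the project's flux scheduler; it is a minimal stand-in under stated assumptions: the rate is expressed as expected upsets per decoding step, the burst is a hypothetical `(start_step, end_step, multiplier)` tuple, and the names `rate_at`, `sample_poisson`, and `schedule_upsets` are invented for illustration.

```python
import math
import random
from typing import List, Optional, Tuple

Burst = Tuple[int, int, float]  # (start_step, end_step, rate multiplier), assumed shape


def rate_at(step: int, base_rate: float, burst: Optional[Burst] = None) -> float:
    """Piecewise-constant lambda(t): base rate, multiplied inside the burst window."""
    if burst is not None:
        start, end, multiplier = burst
        if start <= step < end:
            return base_rate * multiplier
    return base_rate


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson count with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def schedule_upsets(n_steps: int, base_rate: float,
                    burst: Optional[Burst] = None, seed: int = 0) -> List[int]:
    """Number of upsets to inject at each decoding step, reproducible via the seed."""
    rng = random.Random(seed)
    return [sample_poisson(rng, rate_at(t, base_rate, burst)) for t in range(n_steps)]


# A quiet background rate with a flare between steps 60 and 80.
counts = schedule_upsets(n_steps=120, base_rate=1e-3, burst=(60, 80, 5e3), seed=0)
print("before flare:", sum(counts[:60]), "during flare:", sum(counts[60:80]))
```

Seeding the scheduler separately from the sampler keeps the fault schedule reproducible without disturbing the sampling seed shared by the baseline and faulted runs.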

Every run also reports a **failure mode** (silent_correct / subtle_wrong / repetition /
garbage / nan_garbage / crash), **time-to-failure**, and **mean KL divergence** of the
output distribution, and can emit a per-step [`RunTrace`](/davedx/cosmicgpt/blob/main/src/cosmicgpt/eval/trace.py)
JSON (the data the upcoming visualizations consume).
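To make the mean-KL metric concrete, here is a small pure-Python sketch. It is an assumption-laden illustration, not the project's metrics module: it assumes per-step logits are available for both runs, measures KL(baseline ‖ faulted) (the direction the project uses is not stated here), and the helper names are hypothetical. It does not handle NaN logits, which a SEFI run would need to treat separately.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one step's logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats; eps guards against log(0) when q has tiny mass."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def mean_kl(baseline_logits: Sequence[Sequence[float]],
            faulted_logits: Sequence[Sequence[float]]) -> float:
    """Average per-step KL between the clean and faulted output distributions."""
    per_step = [
        kl_divergence(softmax(clean), softmax(faulty))
        for clean, faulty in zip(baseline_logits, faulted_logits)
    ]
    return sum(per_step) / len(per_step)


clean = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
masked = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]   # fault had no effect
shifted = [[3.0, 2.0, 1.0], [0.5, 0.5, 0.5]]  # first step's distribution changed
print(mean_kl(clean, masked))   # 0.0
print(mean_kl(clean, shifted) > 0)  # True
```

A metric like this can register a fault even when the emitted tokens are unchanged, since the distribution can shift without changing the sampled token.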

```
# physically-derived faults from an orbit (flux scaled so a short run shows effects)
cosmicgpt run --orbit SAA --flux-mult 1e4 --tokens 120
# a mission with a mid-inference solar flare
cosmicgpt run scenarios/mission_solar_storm.yaml
# write a self-contained HTML report (token diff + degradation timeline + raster)
cosmicgpt run --orbit SOLAR_STORM --flux-mult 1e4 --report report.html
# regenerate a report from a saved trace — no re-inference
cosmicgpt report runs/storm/trace.json -o report.html
# compare conditions side by side (View C)
cosmicgpt compare --orbits LEO,SAA,SOLAR_STORM -o comparison.html
```

Reports are **fully self-contained** (inline CSS + inline SVG, no external assets, no
matplotlib) so they're emailable and archivable.

```
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# run the smallest scenario (SEU)
cosmicgpt run scenarios/walking_skeleton.yaml

# drive the taxonomy directly
cosmicgpt run --kind SEFI --n-flips 1 --tokens 120 --fault-seed 3
cosmicgpt run --kind SEL  --n-flips 8 --tokens 100

# verify the bit-flip foundation + injection mechanisms
pytest
```

- Single faults on **low-impact sites** (biases, low mantissa bits) are routinely *masked*. This is realistic: most cosmic-ray hits do nothing visible.
- **Exponent/sign** flips and **SEL** are far more destructive than mantissa flips.
- **SET** (transient activation glitch) is gentle: without persistence it affects one step, and only if it lands on the emitted position.
- The model now has a real **KV cache** (`--region kv_cache`): a strike there mutates an entry once, but the damage *persists*, because every later token re-reads the corrupted entry through attention.
- Region is independent of fault kind: `--region weight|activation|kv_cache`.
- **A single short inference in LEO is essentially fault-free** at realistic upset rates; meaningful corruption needs the SAA, a solar storm, or long exposure.
- With a flare **burst window**, divergence visibly begins right when the flux spikes.
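The gap between mantissa flips and exponent/sign flips is easy to reproduce on a single float32 value. The snippet below is a standalone sketch using only the standard library; it is not the project's `bitops.py`, and `flip_bit` is a hypothetical helper. It assumes IEEE 754 float32 storage with bit 0 as the least significant mantissa bit and bit 31 as the sign.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of a float32 (bit 0 = lowest mantissa bit, bit 31 = sign)."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 as a 32-bit unsigned integer, XOR the bit, convert back.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 0.5
print(flip_bit(weight, 0))   # lowest mantissa bit: about 0.50000006, effectively masked
print(flip_bit(weight, 30))  # highest exponent bit: 2**127, about 1.7e38
print(flip_bit(weight, 31))  # sign bit: -0.5
```

A low mantissa flip changes the weight by roughly one part in ten million, while a single high exponent flip turns the same weight into a value near the top of the float32 range, which is enough to saturate downstream activations.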

The model is a small, seeded, **randomly-initialized** char-level GPT, so the baseline
text is gibberish — but that's fine for the skeleton: the point is to demonstrate the
fault-injection loop and that flips (especially in the float **exponent**) measurably
corrupt the output. Train a coherent model later via `scripts/train_tiny.py` (roadmap).

```
src/cosmicgpt/
  model/        nanogpt.py (+KV cache), adapter.py, sites.py   # model + fault registry
  faults/       bitops.py, types.py, injector.py               # taxonomy + injection
  environment/  flux.py, presets.py, scheduler.py              # scaled-physical flux
  eval/         runner, metrics, classify, trace               # loop + metrics + RunTrace
  viz/          svg, diffview, timeline, report                # inline-SVG/HTML reports
  config.py, cli.py
scenarios/      walking_skeleton.yaml, sefi_cascade.yaml, mission_solar_storm.yaml
tests/          test_bitops, test_injection, test_kvcache, test_scheduler, test_eval, test_viz
```

See [ARCHITECTURE.md §11](/davedx/cosmicgpt/blob/main/ARCHITECTURE.md). Next (step 6): mitigation wrappers
(ECC / TMR voting / scrubbing / NaN guards) with cost-benefit experiments, then a
pluggable larger-GPT backend to test whether findings generalize.
