# CS2-10k: A Large-Scale Egocentric Counter-Strike 2 Dataset

> Source: <https://reka.ai/news/cs2-10k-a-large-scale-egocentric-counter-strike-2-dataset>
> Published: 2026-06-26 00:15:41+00:00

Training interactive world models requires data that is notoriously hard to find: egocentric video sequences with densely aligned action signals (keyboard inputs, camera motion, and ego state), all synchronized to the visual stream.

Real-world embodied data is costly to collect, while synthetic data often lacks the visual richness or behavioral diversity needed for generalization. Counter-Strike 2 demos offer a compelling middle ground: because matches are recorded as deterministic replays, we can reconstruct clean first-person video at any point in a match, extracting the precise control inputs that drove each visual change. For these reasons, Counter-Strike is fast becoming a popular substrate for embodied AI and world-model research, with recent efforts such as [EgoCS-400k](https://egocs-400k.github.io/#dataset) reflecting a growing community interest in it as a rich source of egocentric training data.

Today we release [CS2-10k](https://huggingface.co/datasets/RekaAI/CS2-10k), a large-scale egocentric gameplay dataset built from professional CS2 matches. It **contains 600,000+ player-round videos** spanning **10,000+ hours of first-person footage**, paired with **per-frame annotations** covering **keyboard state, mouse movement, and 3D player trajectory**. Alongside this **ready-to-use dataset**, we are also releasing the **ready-to-extend** [cs2-dem-renderer](https://github.com/reka-ai/cs2-dem-renderer), the open-source pipeline used to produce it. All of this, so we can build better world models, together.

## Dataset Overview

CS2-10k is built from public professional match demos sourced from [HLTV](https://www.hltv.org/). For each demo, we render clean first-person video at 720p, 48fps using the demo replay tool inside CS2, producing one video per player per round. Alongside each video, we store a parquet file containing per-frame annotations synchronized to the video timeline.

#### Annotation Schema

Every video clip has its corresponding annotations stored in a `.parquet` file:

| Field | Type | Description |
|---|---|---|
| | string | Map name (e.g. "mirage", "dust2") |
| | int | Round within the match |
| | int | 0 = Counter-Terrorist, 1 = Terrorist |
| | int | Total frames in the clip |
| | float | Video frame rate (48.0) |
| | float | Clip duration in seconds |
| | float | Camera field of view (90.0°) |
| `frame_data` | list[dict] | Per-frame annotation array (see below) |
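The frame count, frame rate, and duration fields are redundant with one another, which makes them a convenient sanity check when loading clips. The sketch below is illustrative only: the function names are hypothetical, it takes plain numbers rather than reading the parquet columns, and the one-frame tolerance is an assumption. The last lines are a back-of-the-envelope estimate that treats the headline figures (600,000+ videos, 10,000+ hours) as exact.

```python
def clip_duration_seconds(frame_count: int, fps: float = 48.0) -> float:
    """Duration implied by a frame count at a constant frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def is_consistent(frame_count: int, fps: float, duration: float,
                  tol_frames: float = 1.0) -> bool:
    """True if the stored duration matches frame_count / fps.

    The tolerance of one frame is an assumption, not a documented guarantee.
    """
    return abs(frame_count / fps - duration) <= tol_frames / fps


# Rough average clip length, treating the headline numbers as exact.
avg_seconds = 10_000 * 3600 / 600_000   # 60.0 seconds per clip
avg_frames = avg_seconds * 48           # 2880.0 frames per clip at 48 fps

print(clip_duration_seconds(2880))      # 60.0
print(is_consistent(2880, 48.0, 60.0))  # True
```

Under these assumptions an average clip is about one minute long, or roughly 2,880 annotated frames.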

#### Per-Frame Annotations

Each entry in `frame_data` contains:

| Field | Description |
|---|---|
| | Concatenated active keys: |
| | Horizontal camera delta (proxy for mouse X movement) |
| | Vertical camera delta (proxy for mouse Y movement) |
| | Player world position in game units |
| | Camera yaw angle (−180° to 180°) |
| | Camera pitch angle (−90° to 90°) |

The combination of video and per-frame control signals creates a tight action-observation loop.
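Because yaw is stored in the range −180° to 180°, anyone recomputing per-frame camera deltas from the yaw column has to handle the wraparound at ±180°. The following is a minimal sketch of that step. It is not a description of how the dataset's own delta fields were produced (the source does not say), and the function names and sign convention are assumptions.

```python
from typing import List


def wrap_degrees(angle: float) -> float:
    """Map any angle in degrees to the half-open range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def yaw_deltas(yaws: List[float]) -> List[float]:
    """Shortest signed yaw change between consecutive frames."""
    return [wrap_degrees(b - a) for a, b in zip(yaws, yaws[1:])]


# The camera turns steadily through the +180/-180 seam.
yaws = [170.0, 179.0, -175.0, -160.0]
print(yaw_deltas(yaws))  # [9.0, 6.0, 15.0]
```

Naive subtraction would report the middle step as −354° instead of 6°, which would look like a violent camera flick to a model trained on the deltas.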

## No Abrupt Visual Changes

Each clip is a contiguous segment of a single round from a single player's perspective. There are no mid-round cuts, no editing transitions, and no HUD. The camera moves through the world in a physically plausible way, and we hide the player's weapon to eliminate the sudden visual changes caused by recoil, reloads, and weapon switching.

## Many Use Cases

CS2-10k is designed for training interactive world models that learn how first-person visual observations change in response to player actions. The same aligned video, control, and state signals also support a range of related research workflows.

## Rendering Pipeline

If CS2-10k does not cover the scale, matches, or annotations you need, you can use our open-source pipeline at [github.com/reka-ai/cs2-dem-renderer](https://github.com/reka-ai/cs2-dem-renderer) to render your own CS2 datasets. Given a `.dem` file, it performs a two-pass parse to extract per-player spawn/death intervals and per-frame button inputs, then drives CS2's built-in demo replay system to render first-person video for each player each round. Frames are streamed in real time from CS2's movie output to ffmpeg (VAAPI HEVC), producing `.mp4` clips alongside synchronized `.parquet` annotation files. A worker mode processes entire directories of demos with automatic deduplication, making it straightforward to run at the scale of CS2-10k.
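To make the encoding step concrete, here is a rough sketch of what an ffmpeg invocation for VAAPI HEVC encoding can look like. It is not the command used by cs2-dem-renderer: the helper name `build_ffmpeg_cmd`, the raw RGB frames on stdin, the 1280x720 resolution assumed for 720p, and the render device path are all assumptions made for illustration. Only the ffmpeg options themselves are standard.

```python
import shlex
from typing import List


def build_ffmpeg_cmd(out_path: str, width: int = 1280, height: int = 720,
                     fps: int = 48,
                     device: str = "/dev/dri/renderD128") -> List[str]:
    """Build an ffmpeg command that encodes raw frames from stdin to HEVC.

    Hypothetical helper: the input format and device path are assumptions.
    """
    return [
        "ffmpeg", "-y",
        "-vaapi_device", device,          # open the VAAPI render node
        "-f", "rawvideo",                 # headerless frames on stdin
        "-pixel_format", "rgb24",
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",                        # read input from stdin
        "-vf", "format=nv12,hwupload",    # convert, then upload to the GPU
        "-c:v", "hevc_vaapi",             # hardware HEVC encoder
        out_path,
    ]


cmd = build_ffmpeg_cmd("clip.mp4")
print(shlex.join(cmd))
```

A renderer could pass this list to `subprocess.Popen(cmd, stdin=subprocess.PIPE)` and write each frame's bytes to the pipe as it is produced, which is one way to avoid storing intermediate image files on disk.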

## Citation

If you use CS2-10k in your work, please cite:
