CS2-10k: A Large-Scale Egocentric Counter-Strike 2 Dataset

Reka AI released CS2-10k, a large-scale egocentric dataset built from professional Counter-Strike 2 matches, containing over 600,000 player-round videos totaling 10,000+ hours of first-person footage with per-frame annotations of keyboard state, mouse movement, and 3D player trajectory. The dataset aims to support training interactive world models for embodied AI research, and the open-source rendering pipeline used to create it is also being released.

Training interactive world models requires data that is notoriously hard to find: egocentric video sequences with densely aligned action signals (keyboard inputs, camera motion, and ego state), all synchronized to the visual stream. Real-world embodied data is costly to collect, while synthetic data often lacks the visual richness or behavioral diversity needed for generalization. Counter-Strike 2 demos offer a compelling middle ground: because matches are recorded as deterministic replays, we can reconstruct clean first-person video at any point in a match and extract the precise control inputs that drove each visual change. For these reasons, Counter-Strike is fast becoming a popular substrate for embodied AI and world-model research, with recent efforts such as the EgoCS-400k dataset (https://egocs-400k.github.io/) reflecting growing community interest in it as a rich source of egocentric training data.

Today we release CS2-10k (https://huggingface.co/datasets/RekaAI/CS2-10k), a large-scale egocentric gameplay dataset built from professional CS2 matches. It contains 600,000+ player-round videos spanning 10,000+ hours of first-person footage, paired with per-frame annotations covering keyboard state, mouse movement, and 3D player trajectory. Alongside this ready-to-use dataset, we are also releasing the ready-to-extend cs2-dem-renderer (https://github.com/reka-ai/cs2-dem-renderer), the open-source pipeline used to produce it.
All of this, so we can build better world models, together.

Dataset Overview

CS2-10k is built from public professional match demos sourced from HLTV (https://www.hltv.org/). For each demo, we render clean first-person video at 720p and 48 fps using the demo replay tool inside CS2, producing one video per player per round. Alongside each video, we store a parquet file containing per-frame annotations synchronized to the video timeline.

Annotation Schema

Every video clip has its corresponding annotations stored in a .parquet file:

| Field | Type | Description |
|---|---|---|
| | string | Map name, e.g. "mirage", "dust2" |
| | int | Round within the match |
| | int | 0 = Counter-Terrorist, 1 = Terrorist |
| | int | Total frames in the clip |
| | float | Video frame rate (48.0) |
| | float | Clip duration in seconds |
| | float | Camera field of view (90.0°) |
| | list[dict] | Per-frame annotation array (see below) |

Per-Frame Annotations

Each entry in frame data contains:

| Field | Description |
|---|---|
| | Concatenated active keys: |
| | Horizontal camera delta (proxy for mouse X movement) |
| | Vertical camera delta (proxy for mouse Y movement) |
| | Player world position in game units |
| | Camera yaw angle (−180° to 180°) |
| | Camera pitch angle (−90° to 90°) |

The combination of video and per-frame control signals creates a tight action-observation loop.

No Abrupt Visual Changes

Each clip is a contiguous segment of a single round from a single player's perspective. There are no mid-round cuts, no editing transitions, and no UI or HUD elements. The camera moves through the world in a physically plausible way, and we hide the player's weapon to remove the sudden visual changes caused by recoil, reloads, and weapon switching.

Many Use Cases

CS2-10k is designed for training interactive world models that learn how first-person visual observations change in response to player actions.
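To illustrate how the per-frame signals could serve as action inputs for such a model, here is a minimal sketch that recomputes frame-to-frame camera deltas from absolute yaw and pitch angles. It rests on assumptions: the dictionary keys `yaw` and `pitch` are hypothetical placeholders rather than the dataset's actual field names, and the sketch assumes a camera delta is simply the difference between consecutive angles, with yaw wrapped so that crossing the ±180° boundary does not look like a near-full turn. The dataset's own delta fields may be computed differently.

```python
from typing import Dict, List

FPS = 48.0  # frame rate stated in the clip metadata


def wrap_degrees(angle: float) -> float:
    """Wrap an angle in degrees into the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def camera_deltas(frames: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Compute per-step camera deltas from absolute yaw and pitch.

    The keys "yaw" and "pitch" are hypothetical placeholders, not the
    dataset's real field names.
    """
    deltas = []
    for prev, curr in zip(frames, frames[1:]):
        deltas.append({
            # Yaw lives on a circle, so the difference must be wrapped.
            "d_yaw": wrap_degrees(curr["yaw"] - prev["yaw"]),
            # Pitch is bounded to [-90, 90], so a plain difference suffices.
            "d_pitch": curr["pitch"] - prev["pitch"],
        })
    return deltas


def clip_duration_seconds(num_frames: int, fps: float = FPS) -> float:
    """Clip duration implied by a frame count at a fixed frame rate."""
    return num_frames / fps


# Toy frames: the first step crosses the +180/-180 yaw boundary.
frames = [
    {"yaw": 170.0, "pitch": 0.0},
    {"yaw": -175.0, "pitch": -2.5},
    {"yaw": -178.0, "pitch": -2.0},
]
deltas = camera_deltas(frames)
print(deltas[0]["d_yaw"])  # 15.0, not -345.0
```

Without the wrap, the first step would register as a 345° turn in the opposite direction, which is the kind of spurious action signal a world model should not be trained on.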
The same aligned video, control, and state signals also support a range of related research workflows.

Rendering Pipeline

If CS2-10k does not cover the scale, matches, or annotations you need, you can use our open-source pipeline at https://github.com/reka-ai/cs2-dem-renderer to render your own CS2 datasets. Given a .dem file, it performs a two-pass parse to extract per-player spawn/death intervals and per-frame button inputs, then drives CS2's built-in demo replay system to render first-person video for each player in each round. Frames are streamed in real time from CS2's movie output to ffmpeg (VAAPI HEVC), producing .mp4 clips alongside synchronized .parquet annotation files. A worker mode processes entire directories of demos with automatic deduplication, making it straightforward to run at the scale of CS2-10k.

Citation

If you use CS2-10k in your work, please cite: