{"slug": "training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale", "title": "Training a Twitch chat toxicity classifier on real VOD data at scale", "summary": "A developer built a Twitch chat toxicity classifier by scraping VOD chat replay data at scale using the platform's internal `VideoCommentsByOffsetOrCursor` GraphQL endpoint, which is not publicly accessible. The project required bypassing Twitch's TLS fingerprint inspection and rate-limiting through browser-emulating HTTP libraries, residential proxies, and offset-based pagination to collect structured message data including text, emotes, badges, and subscriber status. The resulting dataset, costing approximately $0.001 per message, enables training of TF-IDF and logistic regression classifiers with features that distinguish between moderators, subscribers, and regular users.", "body_md": "Quick answer: Twitch has no public API for VOD chat replay. To build a Twitch toxicity classifier dataset you walk the internal `VideoCommentsByOffsetOrCursor` GraphQL endpoint at scale — the same one the web player uses. The Devil Scrapes Twitch VOD Chat Archive Actor does that for $0.001 per message (about $1.05 per 1,000 once the $0.05 actor-start fee is included), returning the structured fields — `message_fragments`, `badges`, `is_subscriber` — that make classifier features actually useful.\n\nIf you maintain a mod-bot (StreamElements, Nightbot, Streamlabs, or custom), or if you are an ML engineer building a Twitch-native toxicity model, your training data problem is the same: you need labelable chat messages at scale from real VODs, with enough context per row to build signal-rich features. This post walks through the full pipeline — pulling the data, loading it into pandas, training a baseline TF-IDF + logistic-regression classifier, and sketching the upgrade path to a transformer.\n\nNot in any useful sense. 
The [Twitch Helix API](https://dev.twitch.tv/docs/api/) exposes live chat via EventSub and the Chat & Messaging endpoints, but it has no endpoint for VOD chat replay — the historical timestamped record of a past broadcast. That data exists (you can watch it in the VOD player), but the only programmatic surface for it is the internal `VideoCommentsByOffsetOrCursor` persisted GraphQL query.\n\nWalking that endpoint reliably is a job in itself. Twitch inspects TLS fingerprints from incoming requests — Python's `requests` or `httpx` produce a ClientHello that no real browser sends, and the server responds with a `403` before it reads the body. Past roughly 10,000 messages on a single IP, Twitch's rate-limiting kicks in hard. The cursor-based pagination mode triggers an integrity-check challenge that needs a live browser to solve. Offset-based pagination avoids it, but only if you know to use it before you start coding.\n\nWe absorb all of that. The Actor rotates through Chrome, Firefox, and Safari TLS fingerprints via `curl-cffi`, threads residential proxies with fresh session IDs on each block, retries with exponential backoff on `408 / 429 / 5xx`, and pages exclusively by content offset to sidestep the integrity check. The result is a clean dataset of typed rows you can load straight into pandas.\n\nNot all chat APIs return the same structure. The fields the Actor returns were chosen with feature engineering in mind:\n\n**message_text** — the plain-text body of the message with emote shortcodes preserved as literal text (e.g. `\"PogChamp PogChamp OMEGALUL\"`). This is the text you label and your primary text feature.\n\n**message_fragments** — a structured array of `{type, text, emote_id}` objects. Type is either `\"text\"` or `\"emote\"`. This matters because emotes carry semantic weight a TF-IDF tokenizer cannot capture from their shortcode text alone. 
An `\"emote\"` fragment with `emote_id` lets you treat emotes as a distinct token type, deduplicate their representation, or embed them separately. Spam runs often consist almost entirely of emote fragments; that ratio is a cheap feature.\n\n**badges** — an array of `{set_id, version}` objects representing the user's active chat badges. A user carrying a `moderator` badge, a `broadcaster` badge, or a `vip` badge is structurally different from a first-time chatter — and their messages should be weighted differently in your training set. A model that does not distinguish a moderator warning from a random user saying the same thing is a weaker model.\n\n**is_subscriber** — a boolean convenience flag derived from the badges array. Subscribers hold a channel membership, and their base rate of toxic behavior may differ from that of non-subscribers. This is a fast binary feature your model can use without parsing the full badges array.\n\n**message_offset_seconds** — the message's position in the VOD timeline in seconds. Toxic spikes often coincide with in-stream events: a bad play, a controversial opinion, a raid. Including offset in your labeling pass lets you sample across the full timeline rather than front-loading training data from the first ten minutes.\n\n**commenter_id** and `commenter_login`\n\nYou need `apify-client` installed (`pip install apify-client pandas scikit-learn`). Get a free Apify API token at [apify.com](https://apify.com) — no card required, every account starts with $5 of credit.\n\nThe call below targets three VODs by ID and caps at 5,000 messages per VOD. 
At $0.001 per message plus the $0.05 actor-start fee, 15,000 messages cost $15.05.\n\n``` python\nfrom apify_client import ApifyClient\n\nclient = ApifyClient(\"YOUR_APIFY_TOKEN\")\n\n# Pull up to 5,000 messages from each of three VODs, starting at offset 0\nrun = client.actor(\"DevilScrapes/twitch-vod-chat-archive\").call(\n    run_input={\n        \"vodIds\": [\n            \"2773625679\",\n            \"2756421083\",\n            \"2741897234\"\n        ],\n        \"maxMessagesPerVod\": 5000,\n        \"startOffsetSeconds\": 0,\n        \"proxyConfiguration\": {\n            \"useApifyProxy\": True,\n            \"apifyProxyGroups\": [\"RESIDENTIAL\"]\n        }\n    }\n)\n\n# Each dataset item is one chat message\nitems = list(client.dataset(run[\"defaultDatasetId\"]).iterate_items())\nprint(f\"Pulled {len(items)} messages\")\n```\n\nFor a larger training corpus — say 100 VODs from a mix of channels — set `maxRecentVods` in `channelLogin` mode instead of listing IDs, and run it once per channel:\n\n``` python\n# Pull the 50 most recent VODs of one channel, up to 10,000 messages each\nrun = client.actor(\"DevilScrapes/twitch-vod-chat-archive\").call(\n    run_input={\n        \"channelLogin\": \"shroud\",\n        \"maxRecentVods\": 50,\n        \"maxMessagesPerVod\": 10000,\n        \"proxyConfiguration\": {\n            \"useApifyProxy\": True,\n            \"apifyProxyGroups\": [\"RESIDENTIAL\"]\n        }\n    }\n)\n```\n\nThat gives you up to 500,000 messages per channel in a single run. 
At $0.001/message that is ~$500.05 for the full 500k — but the free $5 trial credit covers 4,950 messages, enough to validate your pipeline before committing.\n\n``` python\nimport pandas as pd\n\ndf = pd.DataFrame(items)\n\n# Compute emote ratio — useful spam feature\ndef emote_ratio(fragments):\n    if not fragments:\n        return 0.0\n    emote_count = sum(1 for f in fragments if f.get(\"type\") == \"emote\")\n    return emote_count / len(fragments)\n\ndf[\"emote_ratio\"] = df[\"message_fragments\"].apply(emote_ratio)\n\n# Extract badge sets as a frozenset for grouping\ndef badge_set(badges):\n    return frozenset(b[\"set_id\"] for b in badges) if badges else frozenset()\n\ndf[\"badge_set\"] = df[\"badges\"].apply(badge_set)\n\n# is_moderator / is_broadcaster convenience columns\ndf[\"is_moderator\"] = df[\"badge_set\"].apply(lambda s: \"moderator\" in s)\ndf[\"is_broadcaster\"] = df[\"badge_set\"].apply(lambda s: \"broadcaster\" in s)\n\n# Messages per user — frequency signal\nmsg_counts = df.groupby(\"commenter_id\")[\"message_id\"].count().rename(\"user_msg_count\")\ndf = df.merge(msg_counts, on=\"commenter_id\", how=\"left\")\n\nprint(df[[\"message_text\", \"is_subscriber\", \"is_moderator\", \"emote_ratio\", \"user_msg_count\"]].head())\n```\n\nSample output row from a real VOD scrape (channel: shroud, toxic content masked):\n\n```\n{\n  \"vod_id\": \"2773625679\",\n  \"vod_title\": \"never played forza but i definitely have a drivers license so it should be easy\",\n  \"channel_login\": \"shroud\",\n  \"message_id\": \"1292e052-0561-4db5-86c7-adfc4556d628\",\n  \"message_offset_seconds\": 12,\n  \"posted_at\": \"2026-05-16T18:42:35.297Z\",\n  \"commenter_id\": \"142680597\",\n  \"commenter_login\": \"tabrexs\",\n  \"commenter_display_name\": \"tabrexs\",\n  \"message_text\": \"PewPewPew\",\n  \"message_fragments\": [\n    {\n      \"type\": \"emote\",\n      \"text\": \"PewPewPew\",\n      \"emote_id\": \"emotesv2_587405136a8147148c77df74baaa1bf4\"\n    
}\n  ],\n  \"user_color\": \"#DAA520\",\n  \"badges\": [],\n  \"is_subscriber\": false,\n  \"scraped_at\": \"2026-05-16T19:00:00Z\"\n}\n```\n\nFor a first iteration, label toxic/benign manually on a sample and train a TF-IDF + logistic-regression baseline. This is fast to iterate on and gives you a performance floor to beat with transformer fine-tuning later.\n\n**Important framing note for the labeling pass:** toxic labels in mod-tool training are typically defined by the channel's own moderation rules, not a universal taxonomy. What a family-friendly channel flags as toxic differs from what a gaming-focused one flags. Build your label schema per-channel or use an established taxonomy such as the [Perspective API categories](https://perspectiveapi.com/) for initial seeding.\n\nDo not include known-slur text in your labeled examples file in plaintext — store it masked (e.g. `[masked slur]`) and apply transformations at load time. The mod community, and any team reviewing your training data, will thank you.\n\n``` python\nimport json\nfrom sklearn.feature_extraction.text import TfidfVectorizer\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.metrics import classification_report\nfrom sklearn.pipeline import Pipeline\n\n# Load your labeled subset (human annotations: {message_id: 0 or 1})\n# 0 = benign, 1 = toxic / spam\nwith open(\"labels.json\") as f:\n    labels = json.load(f)  # {\"message_id_1\": 0, \"message_id_2\": 1, ...}\n\n# Keep only the rows that have a human label\nlabeled_df = df[df[\"message_id\"].isin(labels)].copy()\nlabeled_df[\"label\"] = labeled_df[\"message_id\"].map(labels)\n\n# Text feature — message_text is the primary signal\nX_text = labeled_df[\"message_text\"].fillna(\"\")\ny = labeled_df[\"label\"]\n\nX_train, X_test, y_train, y_test = train_test_split(\n    X_text, y, test_size=0.2, random_state=42, stratify=y\n)\n\n# Baseline: TF-IDF unigrams + bigrams, logistic regression\npipeline = Pipeline([\n    
(\"tfidf\", TfidfVectorizer(\n        ngram_range=(1, 2),\n        max_features=20000,\n        sublinear_tf=True\n    )),\n    (\"clf\", LogisticRegression(\n        C=1.0,\n        class_weight=\"balanced\",  # important: toxic is a minority class\n        max_iter=1000\n    )),\n])\n\npipeline.fit(X_train, y_train)\ny_pred = pipeline.predict(X_test)\n\nprint(classification_report(y_test, y_pred, target_names=[\"benign\", \"toxic\"]))\n```\n\n**Adding structural features alongside TF-IDF:**\n\nThe text pipeline above ignores `emote_ratio`, `is_subscriber`, and `user_msg_count`. To include them in the same model, combine sparse TF-IDF with a dense feature matrix:\n\n``` python\nfrom scipy.sparse import csr_matrix, hstack\nfrom sklearn.preprocessing import StandardScaler\n\n# Dense features, cast to float so the bool and count columns share one numeric dtype\ndense_cols = [\"emote_ratio\", \"is_subscriber\", \"is_moderator\", \"user_msg_count\"]\ndense = labeled_df[dense_cols].fillna(0).astype(float)\n\n# Select rows by index label so they follow the shuffled X_train / X_test order.\n# A boolean mask would return them in labeled_df order and misalign features and labels.\nscaler = StandardScaler()\nX_train_dense = scaler.fit_transform(dense.loc[X_train.index])\nX_test_dense = scaler.transform(dense.loc[X_test.index])\n\n# Fit TF-IDF on train split only\ntfidf = TfidfVectorizer(ngram_range=(1, 2), max_features=20000, sublinear_tf=True)\nX_train_sparse = tfidf.fit_transform(X_train)\nX_test_sparse = tfidf.transform(X_test)\n\n# Combine sparse text features with the scaled dense features\nX_train_combined = hstack([X_train_sparse, csr_matrix(X_train_dense)]).tocsr()\nX_test_combined = hstack([X_test_sparse, csr_matrix(X_test_dense)]).tocsr()\n\nclf = LogisticRegression(C=1.0, class_weight=\"balanced\", max_iter=1000)\nclf.fit(X_train_combined, y_train)\n\nprint(classification_report(y_test, clf.predict(X_test_combined), target_names=[\"benign\", \"toxic\"]))\n```\n\nIn practice the `emote_ratio` column tends to lift spam precision noticeably — pure-emote spam messages produce a ratio near 1.0 and a short `message_text` length, a combination TF-IDF alone does not capture well.\n\nThe baseline above will plateau around 75–82% F1 on a well-balanced Twitch dataset. 
The main failure modes are:\n\nThe upgrade path is to fine-tune a pre-trained model on your labeled data. `cardiffnlp/twitter-roberta-base-offensive` is a strong starting checkpoint for chat-style text: it was trained on tweets and fine-tuned for offensive-language detection, so it is likely to transfer better to Twitch than a generic BERT.\n\n``` python\n# Minimal fine-tuning sketch. output_dir and the hyperparameters are illustrative assumptions; tune them for your GPU.\nfrom transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments\nfrom datasets import Dataset\n\nmodel_name = \"cardiffnlp/twitter-roberta-base-offensive\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)\n\ntext_df = labeled_df[[\"message_text\", \"label\"]].rename(columns={\"message_text\": \"text\"}).fillna({\"text\": \"\"})\nhf_dataset = Dataset.from_pandas(text_df, preserve_index=False)\n\ndef tokenize(batch):\n    return tokenizer(batch[\"text\"], truncation=True, padding=\"max_length\", max_length=128)\n\n# Tokenize, then hold out 20% of the labeled rows for evaluation\nsplits = hf_dataset.map(tokenize, batched=True).train_test_split(test_size=0.2, seed=42)\n\nargs = TrainingArguments(output_dir=\"twitch-toxicity-roberta\", num_train_epochs=3, per_device_train_batch_size=16, learning_rate=2e-5)\ntrainer = Trainer(model=model, args=args, train_dataset=splits[\"train\"], eval_dataset=splits[\"test\"])\ntrainer.train()\nprint(trainer.evaluate())  # reports eval loss; add compute_metrics for F1\n```\n\nThe `message_fragments` field opens a further avenue: treat emote tokens as special tokens added to the tokenizer vocabulary (one token per `emote_id`), then let the model learn emote embeddings jointly with text. This is not a weekend project, but it is the difference between a model that splits `OMEGALUL` into meaningless subword pieces and one that learns it signals laughter.\n\nThe plan answers the pricing question directly. At $0.001/message plus the $0.05 actor-start fee:\n\n| Pull size | Cost | Labeled examples (assuming 10% manual label rate) |\n|---|---|---|\n| 10,000 messages | $10.05 | ~1,000 labeled rows |\n| 50,000 messages | $50.05 | ~5,000 labeled rows |\n| 100,000 messages | $100.05 | ~10,000 labeled rows |\n\nFor a TF-IDF baseline, 1,000–5,000 labeled examples is workable if your class balance is reasonable. For transformer fine-tuning, 5,000+ labeled examples per class is the typical floor for stable results. 
The free trial credit covers 4,950 messages before you spend a cent — that is enough to validate your feature extraction pipeline end-to-end before scaling up.\n\nThe full Twitch chat scraper guide covers the broader use-case landscape (esports analytics, post-broadcast review, channel back-catalog mode) if you want context beyond classifier training: [Twitch Chat Scraper: export any VOD's full chat replay for $1.05/1K](https://dev.to/devil_scrapes/twitch-chat-scraper-export-any-vods-full-chat-replay-for-1051k-1jea).\n\n**Can I use this for StreamElements / Nightbot rule testing?**\n\nYes. Pull historical chat from VODs where you know toxic events occurred, then replay the `message_text` values through your bot's filter rules in a test harness. The `badges` and `is_subscriber` fields let you simulate the trust-level rules most bots implement (moderators and subscribers often get different thresholds).\n\n**Does the Actor return deleted or banned messages?**\n\nNo. The public chat-replay endpoint does not expose moderator actions — bans, timeouts, or the content of deleted messages. Deleted messages may appear as a `<message deleted>` placeholder or may not appear at all, depending on when they were removed relative to the archive write. If you scrape the same VOD more than once, treat a message ID that is missing from the later snapshot as a soft toxic signal, not a hard one.\n\n**How do I avoid training on bot messages?**\n\nFilter on `user_msg_count` — accounts that sent more than N messages in the same VOD are candidate spam bots. The feature code above counts messages across the whole pull, so group by `vod_id` as well if your dataset spans several VODs. You can also filter out users whose `message_text` is identical across multiple rows in the same VOD (copy-paste spam). The Actor returns the stable `commenter_id` so grouping is straightforward.\n\n**Is this legal / TOS-compliant?**\n\nTwitch's public VOD chat replay is presented to any logged-out visitor; this Actor retrieves only what the VOD player shows anonymously, at a paced rate. We are not affiliated with Twitch. 
Check your own jurisdiction and use case. The [Twitch Terms of Service](https://www.twitch.tv/p/legal/terms-of-service/) govern what you may do with the collected data — notably the prohibition on commercial use of data in ways that compete directly with Twitch.\n\nThe Actor is live at **apify.com/DevilScrapes/twitch-vod-chat-archive**. Free $5 trial credit, no credit card. Pull a few thousand messages from a channel you know, run them through the pipeline above, and you will have a working baseline before the end of the day. Leave a question in the comments if you hit a snag — the `message_fragments` / feature-engineering section in particular has sharp edges worth talking through.\n\n*Built by Devil Scrapes — we do the dirty work so your dataset stays clean.* 😈", "url": "https://wpnews.pro/news/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale", "canonical_source": "https://dev.to/devil_scrapes/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale-4b05", "published_at": "2026-06-05 20:32:23+00:00", "updated_at": "2026-06-05 20:41:34.591597+00:00", "lang": "en", "topics": ["machine-learning", "natural-language-processing", "ai-tools", "ai-infrastructure", "mlops"], "entities": ["Twitch", "StreamElements", "Nightbot", "Streamlabs", "Devil Scrapes Twitch VOD Chat Archive Actor", "Helix API", "EventSub", "VideoCommentsByOffsetOrCursor"], "alternates": {"html": "https://wpnews.pro/news/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale", "markdown": "https://wpnews.pro/news/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale.md", "text": "https://wpnews.pro/news/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale.txt", "jsonld": "https://wpnews.pro/news/training-a-twitch-chat-toxicity-classifier-on-real-vod-data-at-scale.jsonld"}}