{"slug": "ask-hn-data-source-used-for-training-anthropic-s-mythos", "title": "Ask HN: Data source used for training Anthropic's Mythos?", "summary": "An anonymous user on Hacker News asked how Anthropic obtained private security bug data for training its Mythos or Fable AI models, noting that Anthropic's documentation only mentions publicly accessible data. The user speculated that such data is typically sensitive and not publicly available, raising questions about the training sources.", "body_md": "This question came up in a tangential discussion. Someone was asking for suggestions about systems architecture, and I suggested they look into RCAs (root cause analyses) of some large-scale failures. The hard part is finding this kind of data publicly. That prompted the question about the data source for Mythos/Fable training. Security bug reports and fixes are somewhat sensitive, private collections. I did check Anthropic's documentation on this; it only mentions the use of publicly accessible data. Any ideas how Anthropic would have gotten their hands on this private data, if they used it at all?", "url": "https://wpnews.pro/news/ask-hn-data-source-used-for-training-anthropic-s-mythos", "canonical_source": "https://news.ycombinator.com/item?id=48536482", "published_at": "2026-06-15 04:09:03+00:00", "updated_at": "2026-06-15 04:42:01.204519+00:00", "lang": "en", "topics": ["ai-research", "ai-safety", "large-language-models"], "entities": ["Anthropic", "Mythos", "Fable"], "alternates": {"html": "https://wpnews.pro/news/ask-hn-data-source-used-for-training-anthropic-s-mythos", "markdown": "https://wpnews.pro/news/ask-hn-data-source-used-for-training-anthropic-s-mythos.md", "text": "https://wpnews.pro/news/ask-hn-data-source-used-for-training-anthropic-s-mythos.txt", "jsonld": "https://wpnews.pro/news/ask-hn-data-source-used-for-training-anthropic-s-mythos.jsonld"}}