grep -l @openwebtext /news/*.json | wc -l → 1

OpenWebText

mentions: 1 · type: Organization

// recent coverage 1 mentions

04:51
2026-06-17
github.com
large-language-models

GPT-2 124M checkpoint pre-trained on OpenWebText 27.5B tokens

A 124M-parameter GPT-2 model trained from scratch on OpenWebText data using a custom deep learning library achieved a validation loss of 2.764 nats and a perplexity of 15.87 after 56,000 steps (27.5B …
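
The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the loss is the mean per-token cross-entropy in nats, so that perplexity is its exponential, and that the 27.5B tokens in the headline were spread evenly over the 56,000 steps. The variable names are illustrative and do not come from the linked repository.

```python
import math

# Figures quoted in the summary above.
val_loss_nats = 2.764      # validation loss, assumed to be mean per-token cross-entropy
steps = 56_000             # optimizer steps
total_tokens = 27.5e9      # tokens seen during training (from the headline)

# Perplexity is exp(loss) when the loss is measured in nats.
perplexity = math.exp(val_loss_nats)

# Implied tokens per step, assuming a constant batch size throughout training.
tokens_per_step = total_tokens / steps

print(f"perplexity ~ {perplexity:.2f}")
print(f"tokens per step ~ {tokens_per_step:,.0f}")
```

The computed perplexity lands within rounding distance of the reported 15.87, and the implied batch works out to roughly half a million tokens per step.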

// co-occurs with top 7 entities