12:18
2026-05-26
iwhalen.github.io
large-language-models
Show HN: Rogue-Bench – LLMs play the game Rogue
A new benchmark called Rogue-Bench tests how well large language models can play the classic dungeon crawler game Rogue. The tool runs a modified headless version of Unix Rogue 5.4.2, communicating wi…