12:16
2026-05-22
dev.to
artificial-intelligence
LMR-BENCH: Can LLM Agents Reproduce NLP Research Code? (EMNLP 2025)
The LMR-BENCH benchmark, introduced by researchers at the University of Texas at Dallas at EMNLP 2025, evaluates whether LLM agents can reproduce core implementations from NLP research papers by filli…