03:22
2026-06-24
microsoft.github.io
large-language-models
BenchPress: Predict any LLM's score on any benchmark
A new tool called BenchPress allows users to predict any large language model's score on any benchmark, using a score matrix derived from reported evaluations. The project invites community contributi…