BenchPress: Predict any LLM's score on any benchmark

BenchPress is a new tool that predicts any large language model's score on any benchmark, using a score matrix derived from reported evaluations. The project invites community contributions to expand its predictive capabilities.

Resources

Use the code to reproduce the paper, or download the score matrix behind the predictor.

Contribute

Report benchmark scores for a model. Include the model, benchmark, score, evaluation setting, effort, and source; we will review provenance before adding it to the matrix.
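
As an illustration only, the six fields requested for a contribution could be gathered into a single record before submitting. The sketch below is not BenchPress's actual submission format: the class name `ScoreReport`, the field types, the `missing_fields` helper, and all example values are assumptions made for this example.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ScoreReport:
    """Hypothetical record for one reported benchmark score (not an official schema)."""
    model: str
    benchmark: str
    score: float
    evaluation_setting: str  # free-form description; exact format is an assumption
    effort: str              # free-form; the page does not define this field further
    source: str              # where the score was reported, for provenance review

    def missing_fields(self) -> List[str]:
        """Return the names of text fields that were left blank."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]


# Placeholder values only; none of these refer to a real model or result.
report = ScoreReport(
    model="example-model",
    benchmark="example-benchmark",
    score=71.5,
    evaluation_setting="5-shot",
    effort="default",
    source="https://example.com/model-card",
)
print(report.missing_fields())
```

Checking for blank fields before submitting is one way to make sure a report carries enough detail for its provenance to be reviewed.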