You don't need all the LLM benchmarks
A new analysis of more than 5,400 AI models finds that benchmark scores for large language models are highly correlated: just five of the 57 subjects on the MMLU test predict the remaining 52 with 91% accuracy. Researchers ha…
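
To make the idea of predicting most subject scores from a handful more concrete, here is a minimal sketch in Python. It is not the researchers' method: the scores are synthetic, generated from a single made-up "ability" factor, and the function names (`simulate_scores`, `fit_predict`), the choice of ordinary least squares, and the use of R² as the quality measure are all assumptions for illustration. The summary above does not say how the 91% figure was computed, so this sketch does not try to reproduce it.

```python
import numpy as np


def simulate_scores(n_models=500, n_subjects=57, noise=0.05, seed=0):
    """Synthetic per-subject accuracies driven by one latent 'ability' factor.

    Hypothetical data only; it stands in for a real table of model scores.
    """
    rng = np.random.default_rng(seed)
    ability = rng.uniform(0.0, 1.0, size=(n_models, 1))
    loadings = rng.uniform(0.5, 1.0, size=(1, n_subjects))
    offsets = rng.uniform(0.0, 0.2, size=(1, n_subjects))
    jitter = rng.normal(0.0, noise, size=(n_models, n_subjects))
    return np.clip(offsets + 0.8 * ability * loadings + jitter, 0.0, 1.0)


def fit_predict(scores, predictor_idx, train_frac=0.8):
    """Fit a linear map from a few subjects to all the others.

    Returns held-out predictions, held-out true scores, and pooled R^2.
    """
    n_models, n_subjects = scores.shape
    predictor_idx = np.asarray(predictor_idx)
    target_idx = np.setdiff1d(np.arange(n_subjects), predictor_idx)

    n_train = int(train_frac * n_models)
    train = np.arange(n_train)
    test = np.arange(n_train, n_models)

    def design(rows):
        # Predictor scores plus a constant column for the intercept.
        x = scores[rows][:, predictor_idx]
        return np.hstack([x, np.ones((x.shape[0], 1))])

    # One least-squares fit covers every target subject at once.
    coef, *_ = np.linalg.lstsq(
        design(train), scores[train][:, target_idx], rcond=None
    )

    pred = design(test) @ coef
    truth = scores[test][:, target_idx]

    # R^2 pooled over all held-out models and target subjects.
    ss_res = ((truth - pred) ** 2).sum()
    ss_tot = ((truth - truth.mean(axis=0)) ** 2).sum()
    return pred, truth, 1.0 - ss_res / ss_tot
```

Calling `fit_predict(simulate_scores(), [0, 1, 2, 3, 4])` fits on 80% of the simulated models and scores the prediction of the other 52 subjects on the held-out 20%. With real data, the interesting question is which five subjects to pick; this sketch simply takes the first five.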