IFScale benchmark

mentions 1 type Person

// recent coverage 1 mentions

14:45

2026-05-12

arize.com

large-language-models

Models got an order of magnitude better at following instructions in one year

New research shows frontier AI models have improved nearly tenfold in their ability to follow instructions over the past year, according to data from the IFScale benchmark. A year ago, models began lo…

// co-occurs with top 4 entities

Dexter Horthy (1), Jaroslawicz (1), GPT 5.5 (1), DeepSeek V4 Pro (1)

// topics top 3 topics

large language models (1), artificial intelligence (1), ai research (1)