00:00
2026-05-16
sparsethought.com
large-language-models
curation all the way down: on clinical AI benchmarks
The medARC group released Medmarks v1.0, the largest fully open medical LLM evaluation suite, featuring 30 benchmarks across verifiable and open-ended subsets, covering 61 models on 71 configurations. …