ESCI

mentions: 1 · type: Organization

// recent coverage (1 mention)

04:00

2026-06-05

arxiv.org

large-language-models

Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference

Researchers have developed PRECISE, a method that combines small human-labeled datasets with large LLM-generated judgments to produce bias-corrected estimates of ranking evaluation metrics. The approa…

// co-occurs with (top 3 entities)

PRECISE (1), Prediction-Powered Inference (1), Claude 3 Sonnet (1)

// topics (top 4)

large language models (1), machine learning (1), artificial intelligence (1), ai research (1)