grep -l @qags /news/*.json | wc -l → 1

QAGS

mentions: 1 · type: Organization

// recent coverage: 1 mention

03:19
2026-06-28
arxiv.org
large-language-models

Improved LLM as a Judge Techniques

Researchers propose BINEVAL, a framework that decomposes LLM evaluation into atomic binary questions for interpretable, multi-dimensional scoring. The method matches or outperforms strong baselines on…

// co-occurs with top 6 entities