{"type": "article", "title": "Reliability without Validity: A Systematic, Large-Scale Evaluation of LLM-as-a-Judge Models Across Agreement, Consistency, and Bias", "publisher": "Web Pulse", "url": "https://wpnews.pro/news/reliability-without-validity-a-systematic-large-scale-evaluation-of-llm-as-a-and", "original_source": "https://arxiv.org/abs/2606.19544", "published": "2026-06-19T04:00:00+00:00", "accessed": "2026-06-19", "id": "reliability-without-validity-a-systematic-large-scale-evaluation-of-llm-as-a-and"}