cd/entity/InboxΒ· homeβ€Ί entitiesβ€Ί Inbox
grep -l @inbox /news/*.json | wc -l β†’ 1

Inbox

mentions 1 type Organization feed RSS

// recent coverage 1 mentions

00:00
2026-06-30
dev.to
large-language-models

The AI judge that called a half-finished audit 'exhaustive'

An engineer building a benchmark for AI coding agents discovered that an LLM judge incorrectly scored a half-finished audit as 'exhaustive' because it lacked a reference answer. The judge evaluated th…

// co-occurs with top 1 entities