cd/entity/AA-BriefcaseΒ· homeβ€Ί entitiesβ€Ί AA-Briefcase
grep -l @aa-briefcase /news/*.json | wc -l β†’ 1

AA-Briefcase

mentions 1 type Organization feed RSS

// recent coverage 1 mentions

23:57
2026-06-18
artificialanalysis.ai
ai-research

Show HN: AA-Briefcase: a frontier knowledge work evaluation

A new evaluation benchmark, AA-Briefcase, measures frontier knowledge work performance, with models like Claude Opus 4.8 averaging 24 minutes per task and achieving an Elo of 1356, while MiniMax-M3 ta…

// co-occurs with top 4 entities