FlashAttention-3

mentions: 2 · type: Organization

// recent coverage 2 mentions

12:32

2026-05-30

maltebuettner.eu

large-language-models

DocumentAI Visual Benchmark - GPT 5.5, Gemini 3.5, Qwen...

A new benchmark evaluating DocumentAI models on bounding box accuracy shows GPT-5.5 and Gemini 3.5 leading with scores of 67.7% and 67.5%, respectively, while Qwen, Kimi, and Mistral trail significantly. …

00:00

2026-05-14

maltebuettner.eu

large-language-models

documentai bbox benchmark

Malte Buettner benchmarked bounding box accuracy for Document AI models using pages from the FlashAttention-3 paper, testing Qwen, Kimi, and Mistral via OpenRouter. The evaluation scored models on cov…

// co-occurs with top 8 entities

OpenRouter (2), Qwen (2), Kimi (2), Mistral (2), ExtractBench (2), ContextualAI (2), pdfplumber (1), Malte Buettner (1)