
GPT and Claude failed Bridgewater's finance tests because the right answers were never public

Bridgewater Associates and Thinking Machines Lab fine-tuned a Qwen3-235B model for financial tasks, achieving 84.7% accuracy and outperforming GPT, Claude, and Gemini at roughly one-fourteenth the cost. The results have not been independently verified.

Published Jul 3, 2026 · 1 min read

Bridgewater and Thinking Machines Lab, the startup from former OpenAI CTO Mira Murati, have fine-tuned a Qwen3-235B model for financial tasks. According to their own testing, the model reaches 84.7 percent accuracy, beating Gemini, Claude, and GPT at roughly one-fourteenth of the cost. No one outside the two companies has verified the numbers, though.

The article GPT and Claude failed Bridgewater's finance tests because the right answers were never public appeared first on The Decoder.
