BEAVER: Enterprise benchmark for LLM Text-to-SQL from private data warehouses

wpnews.pro

[ARTICLE · art-27409] src=beaverbench.github.io ↗ pub=2026-06-15T01:20Z topic=large-language-models verified=true sentiment=· neutral

Researchers at MIT and other institutions have released BEAVER, a large-scale enterprise benchmark for evaluating LLM text-to-SQL capabilities. It contains 9,128 queries collected from private data warehouses across 19 domains. A public set of 7,978 queries is released and the rest is held out as a private test set, so that results reflect real-world enterprise SQL generation. The team invites submissions for evaluation and asks anyone who uses the data, code, or paper to cite it.

Published Jun 15, 2026 · 1 min read · 25 views

To submit a method for evaluation, please email peterbc@mit.edu with your method name, a brief description of the method, and, optionally, a link to your paper or codebase. We will follow up with detailed instructions.

| Rank | Submission Date | Method | Model | Execution Accuracy |
|---|---|---|---|---|
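
The leaderboard ranks methods by execution accuracy. The page does not describe BEAVER's evaluation harness, so the following is only a minimal sketch of how that metric is commonly computed: run the predicted and the gold SQL against the same database and count a prediction as correct when both return the same rows. Everything here is an assumption made for illustration: the `orders` table, the helper names `run_query` and `execution_accuracy`, the order-insensitive comparison, and the use of SQLite (chosen only because it ships with Python; enterprise warehouses may use other SQL dialects).

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)  # ignores row order, keeps duplicate rows


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold = run_query(conn, gold_sql)
        predicted = run_query(conn, predicted_sql)
        # A prediction that fails to execute never counts as correct.
        if gold is not None and predicted == gold:
            correct += 1
    return correct / len(pairs)


# Hypothetical toy database, not part of BEAVER.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EU', 10.0), (2, 'US', 20.0), (3, 'EU', 5.0);
""")

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT region FROM orders WHERE amount > 6",
     "SELECT region FROM orders WHERE amount >= 10"),
    # Both execute, but the results differ: counted as wrong.
    ("SELECT COUNT(*) FROM orders",
     "SELECT COUNT(*) FROM orders WHERE region = 'EU'"),
    # Predicted query references a missing column: counted as wrong.
    ("SELECT nope FROM orders", "SELECT id FROM orders"),
]
score = execution_accuracy(conn, pairs)
print(f"execution accuracy: {score:.3f}")
```

Comparing result sets rather than SQL strings means a prediction can be phrased differently from the gold query and still be scored as correct, which is the usual motivation for this metric. A real harness would also need to decide how to treat row order for `ORDER BY` queries and how to compare floating-point values.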

If you find our data, code, or the paper helpful, please cite the paper:

@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Yang, Devin and Li, Weiyue and Wenz, Fabian and Zhang, Yi and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}

BEAVER is a large-scale enterprise text-to-SQL dataset containing 9,128 queries spanning 812 tables across 19 diverse domains. Of these, 7,978 queries are publicly released, while the remaining 1,150 are held out as a private test set. Queries and databases were collected from private organizations.

To facilitate fine-grained evaluation and analysis, we provide

[Figure: Representative BEAVER tasks with question, SQL, and subtask annotations.]

source & further reading

beaverbench.github.io — original article
