{"slug": "beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses", "title": "BEAVER: Enterprise benchmark for LLM Text-to-SQL from private data warehouses", "summary": "Researchers at MIT and other institutions released BEAVER, a large-scale enterprise benchmark for evaluating LLM text-to-SQL capabilities, containing 9,128 queries from private data warehouses across 19 domains. The benchmark includes a public set of 7,978 queries and a private test set, designed to assess real-world enterprise SQL generation. The team invites submissions for evaluation and provides code and data for citation.", "body_md": "Please send an email to peterbc@mit.edu, along with your method name, a brief description of the method, and, optionally, a link to your paper or codebase. We will follow up with detailed instructions.\n\n| Rank | Submission Date | Method | Model | Execution Accuracy |\n|---|---|---|---|---|\n\nIf you find our data, code, or the paper helpful, please cite the paper:\n\n```\n@article{chen2024beaver,\n  title={BEAVER: an enterprise benchmark for text-to-sql},\n  author={Chen, Peter Baile and Yang, Devin and Li, Weiyue and Wenz, Fabian and Zhang, Yi and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\\c{C}}a{\\u{g}}atay and Stonebraker, Michael},\n  journal={arXiv preprint arXiv:2409.02038},\n  year={2024}\n}\n```\n\nBEAVER is a large-scale enterprise text-to-SQL dataset containing 9128 queries spanning 812 tables across 19 diverse domains. Of these, 7978 queries are publicly released, while the remaining portion is held out as a private test set. 
Queries and databases were collected from *private* organizations.\n\nTo facilitate fine-grained evaluation and analysis, we provide\n\nRepresentative BEAVER tasks with question, SQL, and subtask annotations.\n\n", "url": "https://wpnews.pro/news/beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses", "canonical_source": "https://beaverbench.github.io/", "published_at": "2026-06-15 01:20:49+00:00", "updated_at": "2026-06-15 01:42:22.596245+00:00", "lang": "en", "topics": ["large-language-models", "natural-language-processing", "ai-research", "ai-products"], "entities": ["MIT", "Peter Baile Chen", "Devin Yang", "Weiyue Li", "Fabian Wenz", "Yi Zhang", "Nesime Tatbul", "Michael Cafarella"], "alternates": {"html": "https://wpnews.pro/news/beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses", "markdown": "https://wpnews.pro/news/beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses.md", "text": "https://wpnews.pro/news/beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses.txt", "jsonld": "https://wpnews.pro/news/beaver-enterprise-benchmark-for-llm-text-to-sql-from-private-data-warehouses.jsonld"}}