{"slug": "boost-bigquery-with-python-managed-python-udfs-now-generally-available", "title": "Boost BigQuery with Python: Managed Python UDFs now generally available", "summary": "Google Cloud announced the general availability of BigQuery Managed Python User-Defined Functions (UDFs), enabling data practitioners to run custom Python code directly within BigQuery using standard SQL queries or BigQuery DataFrames. The fully managed, serverless feature automatically handles infrastructure, scales to billions of rows, and supports popular Python libraries like NumPy, SciPy, and pandas, as well as external API integrations. This launch enhances BigQuery's extensibility for complex procedural logic, scientific computing, and machine learning workflows.", "body_md": "SQL is the industry standard for high-performance structured data analysis. However, expressing complex procedural logic, scientific computations, advanced string manipulations, or machine learning workflows in pure SQL can be highly challenging, if not impossible. That kind of work is better done with Python. Data practitioners often take on additional infrastructure management tasks (maintaining custom images and containers, and working with additional compute services) just to run simple helper functions with custom Python code and libraries.\n\nToday, we are thrilled to announce the general availability (GA) of [BigQuery Managed Python User-Defined Functions (UDFs)](https://docs.cloud.google.com/bigquery/docs/user-defined-functions-python).\n\nThis launch represents a major milestone in BigQuery’s extensibility strategy, allowing data scientists, engineers, and analysts to execute custom Python code directly and securely inside BigQuery using standard SQL queries or [BigQuery DataFrames](https://docs.cloud.google.com/bigquery/docs/bigquery-dataframes-introduction) (BigFrames) in Python. 
With this release, Python UDFs are fully supported for production enterprise workloads and completely integrated into BigQuery's billing SKUs.\n\nBigQuery Managed Python UDFs run on BigQuery-managed serverless resources that automatically scale to billions of rows, so you don't have to set up infrastructure or manage containers. BigQuery automatically handles the compilation, image building, security patching, deployment, and execution of your Python code, making it simple to use Python functions in your SQL.\n\n**Core benefits**\n\n**Flexibility:** Access the vast Python ecosystem, including top-tier scientific and mathematical libraries like NumPy, SciPy, pandas, scikit-learn, and more, directly in your SQL SELECT statements.\n\n**Tight external API integration:** Clean and enrich your BigQuery tables in real time by securely calling external web APIs, Google Cloud services such as Cloud Translation and Gemini Enterprise Agent Platform, or custom microservices from within your queries.\n\n**Fully managed and serverless:** BigQuery handles the underlying container infrastructure and auto-scales performance dynamically.\n\nHere is an example of a Python UDF that uses a popular Python package, `beautifulsoup`, to remove HTML tags. We use this function to process StackOverflow answer bodies that are stored in a BigQuery public table:\n\n**How to query it:**\n\nFor advanced users, Python UDFs add a set of capabilities for tuning performance and monitoring usage. Here are some examples.\n\n**Vectorized processing with Pandas PyArrow** To maximize throughput, the GA release supports direct processing of vectorized input as PyArrow RecordBatches. 
By processing columns of data in bulk rather than row by row, PyArrow eliminates Python serialization and conversion overhead, boosting performance by up to 10x for data-intensive calculations.\n\n**Configurable container resources** For heavy-duty data science and ML data preparation, you can now provision container memory (up to 16 GB) and CPU (up to 4 vCPUs) per function. This enables memory-intensive workloads (such as loading large serialized models or geospatial datasets) to run directly within the sandbox.\n\n**Customizable concurrency** Optimize your throughput and resource efficiency by configuring concurrent requests per container (up to 1,000 concurrent operations). This helps keep scale-out execution cost-effective and performant under heavy parallel loads.\n\n**Streaming logs and real-time metrics** Easily debug and monitor your production workloads. The BigQuery console now features a direct link from your query results to real-time CPU, memory, and concurrency metrics in Cloud Monitoring.\n\nBigQuery Managed Python UDFs are billed under the [BigQuery Services SKU](https://cloud.google.com/bigquery/pricing#bigquery-services-pricing). This SKU is fully eligible for\n\nYou can also get cost observability through `INFORMATION_SCHEMA.JOBS`, as well as through the billing labels `MANAGED_ROUTINE_EXECUTION` and `MANAGED_ROUTINE_BUILD`.\n\nSee more details in the [Pricing](https://docs.cloud.google.com/bigquery/docs/user-defined-functions-python#pricing) section of the documentation.\n\nTo get started with BigQuery Python UDFs, first check out the [product documentation](https://docs.cloud.google.com/bigquery/docs/user-defined-functions-python).\n\nThen, try out the functions [published](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!6m3!1sbigquery-public-data!2spython_udfs!3stokenize) in the public BigQuery dataset. 
For example, run the following code in a BigQuery project to tokenize country names data from BigQuery public data. Under the hood, the token UDF utilizes the o200k_base tokenizer library.\n\nOr, try out this [code lab](https://codelabs.developers.google.com/managed-python-udfs) to explore some advanced scenarios.\n\nThen, to learn how to implement other advanced design patterns, we encourage you to explore our official public documentation guides:\n\n**BigQuery DataFrames (BigFrames) Python UDFs:** To learn how to write, deploy, and scale custom Python functions natively from standard Jupyter notebook or Colab environments using BigQuery DataFrames, visit the [Customize Python functions for BigQuery DataFrames guide](https://docs.cloud.google.com/bigquery/docs/user-defined-functions-python#bigquery-dataframes_1).\n\nBring your Python workflows out of isolation and directly into the heart of your data warehouse today!", "url": "https://wpnews.pro/news/boost-bigquery-with-python-managed-python-udfs-now-generally-available", "canonical_source": "https://cloud.google.com/blog/products/data-analytics/python-udf-in-bigquery-now-generally-available/", "published_at": "2026-06-22 17:00:00+00:00", "updated_at": "2026-06-24 00:23:54.688993+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "developer-tools", "ai-infrastructure"], "entities": ["Google Cloud", "BigQuery", "Python", "NumPy", "SciPy", "pandas", "scikit-learn", "Cloud Translation"], "alternates": {"html": "https://wpnews.pro/news/boost-bigquery-with-python-managed-python-udfs-now-generally-available", "markdown": "https://wpnews.pro/news/boost-bigquery-with-python-managed-python-udfs-now-generally-available.md", "text": "https://wpnews.pro/news/boost-bigquery-with-python-managed-python-udfs-now-generally-available.txt", "jsonld": "https://wpnews.pro/news/boost-bigquery-with-python-managed-python-udfs-now-generally-available.jsonld"}}