See which database types VDF AI Data supports, how to choose the right connector, what fields a database connection needs, and how to scope read-only access safely.
Enterprises rarely have one database. A production data estate usually includes transactional systems, analytical warehouses, federated query engines, enterprise databases, issue trackers, and a long tail of older stores that still hold important business context.
That is why database connectivity matters in an AI data platform. Before a team can profile tables, discover features, build semantic indexes, or ask natural-language questions over structured data, the platform needs a secure way to read from the systems where that data already lives.
VDF AI Data ships with first-party connectors for the most common operational and analytical stores. From the Data Connections screen, choose the database type that matches your source, scope it narrowly, and connect it with a read-only account.
This guide covers the supported database types, when to use each connector, what a database connection looks like, and how to configure access in a way that works for production teams.
Supported Database Types #
VDF AI Data supports the database types most common in enterprise data estates: PostgreSQL, MySQL, SQL Server, Oracle, SAP HANA, Exasol, Presto, generic JDBC connections, and Jira as structured data.
The short version:
| Source type | Best fit |
|---|---|
| PostgreSQL | Transactional applications, product databases, operational reporting |
| MySQL | Web applications, MariaDB-compatible deployments, managed MySQL |
| Microsoft SQL Server | Enterprise applications, Azure SQL, on-prem Microsoft estates |
| Oracle | Enterprise systems, finance, ERP-adjacent operational data |
| SAP HANA | SAP HANA Cloud and on-prem SAP analytical data |
| Exasol | High-performance analytics and MPP workloads |
| Presto | Federated querying across multiple underlying sources |
| JDBC | Warehouses and stores with a JDBC driver, including Snowflake, BigQuery via JDBC, Redshift, Trino, Vertica, and more |
| Jira | Issues, projects, backlog data, and delivery metadata as queryable structured data |
If your store is not named directly, the JDBC option covers most databases and warehouses with a published JDBC driver.
PostgreSQL Connector #
PostgreSQL is one of the most common transactional databases in modern software teams. VDF AI Data supports managed PostgreSQL deployments such as Amazon RDS, Google Cloud SQL, Azure Database for PostgreSQL, and self-hosted Postgres running in your own infrastructure.
Use the PostgreSQL connector when you want to make operational data available for:
- exploratory data analysis over application tables
- semantic search over text-heavy records
- feature discovery across customer, order, event, or product schemas
- fine-tune data preparation from production-like datasets
For production use, create a dedicated read-only PostgreSQL role. Grant access only to the database and schemas VDF AI Data should inspect. Avoid reusing the application user, since that account often has write permissions the AI data layer does not need.
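As a rough illustration of the read-only role described above, the sketch below generates the PostgreSQL statements a database administrator might run. The role name `vdf_reader`, the database `orders`, and the helper `readonly_role_sql` are hypothetical names chosen for this example, not part of VDF AI Data. Adjust the schema list to match the scope you actually want to expose.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _ident(name: str) -> str:
    """Allow only simple lowercase identifiers so names can be inlined safely."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def readonly_role_sql(role: str, database: str, schemas: list[str]) -> list[str]:
    """Return PostgreSQL statements that set up a read-only login role."""
    role, database = _ident(role), _ident(database)
    statements = [
        # The password is set separately, for example with \password in psql.
        f"CREATE ROLE {role} LOGIN;",
        f"GRANT CONNECT ON DATABASE {database} TO {role};",
    ]
    for schema in map(_ident, schemas):
        statements += [
            f"GRANT USAGE ON SCHEMA {schema} TO {role};",
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
            # Covers tables created later by the role that runs this statement.
            f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} GRANT SELECT ON TABLES TO {role};",
        ]
    return statements


# Hypothetical names, for illustration only.
for line in readonly_role_sql("vdf_reader", "orders", ["public"]):
    print(line)
```

Generating the statements rather than executing them keeps the example reviewable: the output can be checked by the database owner before anything is applied.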
MySQL and MariaDB-Compatible Databases #
MySQL remains common across web applications, commerce platforms, CRM-adjacent systems, and operational reporting. VDF AI Data supports MySQL, including MariaDB-compatible deployments and managed MySQL services across major cloud providers.
Choose the MySQL connector when your source is:
- a managed MySQL database
- a MariaDB-compatible deployment
- a self-hosted MySQL instance
- an application database with tables you want to profile, search, or use for feature engineering
As with every database connector, the right pattern is a connection-scoped user with read-only access. If only a subset of tables should be available, grant access at the schema, table, or view level instead of exposing the full database.
Microsoft SQL Server and Azure SQL #
Microsoft SQL Server is common in enterprise environments, especially where Microsoft infrastructure, ERP systems, internal tools, and legacy operational systems are already established.
VDF AI Data supports SQL Server in on-prem environments and Azure SQL. You can connect using an existing service account if it is appropriately scoped, or create a dedicated read-only login for the connection.
SQL Server data is often valuable for AI workflows because it contains business-critical operational records: orders, customer accounts, cases, invoices, inventory, and service history. Once connected, those tables can become available for EDA, semantic search, feature discovery, and downstream AI workflows without giving VDF AI Data write permissions.
Oracle Database #
Oracle remains a core enterprise database for finance, operations, ERP-adjacent systems, and high-value line-of-business applications. VDF AI Data supports enterprise Oracle deployments, including the standard listener and service-name configuration.
Use the Oracle connector when your organization needs to expose selected Oracle schemas to AI-assisted analysis while keeping access tightly controlled. For good Oracle connection hygiene:
- use a dedicated read-only database user
- grant SELECT only on approved schemas, tables, or views
- document the schema owner and business owner in the connection description
- validate the known asset count after connecting
That last point matters. If you expected 40 tables and the connection sees 4,000, the scope is too broad or the account can see more than intended.
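The asset-count comparison can be sketched as a small check. The function name `check_asset_count` and the 10% tolerance are assumptions made for this example. The guide does not prescribe a threshold, so pick one that fits how often your schemas change.

```python
def check_asset_count(expected: int, observed: int, tolerance: float = 0.10) -> str:
    """Classify the observed asset count against the documented expectation."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    upper = expected * (1 + tolerance)
    lower = expected * (1 - tolerance)
    if observed > upper:
        # The account can probably see more than intended.
        return "too_broad"
    if observed < lower:
        # Grants may be missing, or the scope points at the wrong schema.
        return "too_narrow"
    return "ok"


print(check_asset_count(expected=40, observed=4000))
```

With the numbers from the paragraph above (40 expected, 4,000 observed), the check reports that the scope is too broad.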
SAP HANA #
SAP HANA stores critical enterprise data for many organizations, both in SAP HANA Cloud and on-prem deployments. VDF AI Data supports SAP HANA connections for teams that want to make selected schemas available to AI data workflows.
The production pattern is straightforward: create a read-scoped database user with access to the schemas you want VDF AI Data to use. Keep the connection focused on specific business domains rather than exposing everything available in the SAP environment.
This is especially important for SAP-backed use cases where data may contain sensitive finance, supply chain, HR, or operational records. Narrow scoping makes the connection easier to govern, easier to audit, and easier for downstream users to understand.
Exasol #
Exasol is a massively parallel processing (MPP) database built for high-performance analytical workloads. VDF AI Data supports Exasol as a first-party connector for teams that want to bring analytical tables into AI-assisted workflows.
Use the Exasol connector when your analytical data already sits in Exasol and you want to support:
- table profiling and data quality checks
- feature discovery over analytical datasets
- semantic search over descriptive dimensions or text fields
- training dataset preparation from curated analytical sources
Because Exasol environments often contain broad analytical views, scoping is important. Connect to the database, schema, or views that represent the business domain you want VDF AI Data to work with.
Presto #
Presto is a federated query layer. Instead of connecting to a single underlying store, Presto can query across multiple systems through catalogs and connectors.
Use the Presto connector when your organization already relies on Presto to access data spread across different sources. In this setup, VDF AI Data connects to Presto as the entry point, while Presto handles access to the underlying stores. This is useful when teams want one AI data connection to reach a governed federated layer rather than creating separate connections to every backing database.
The same scoping rule applies: connect to the catalog, schema, or query surface that matches the intended use case. “Everything Presto can see” is usually too broad for production AI workflows.
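One way to reason about Presto scope is to compare the tables a connection can see against an approved list of catalog and schema pairs. The sketch below assumes fully qualified names of the form `catalog.schema.table` without quoted identifiers. The helper `in_scope` and the example catalogs are hypothetical.

```python
def in_scope(table: str, allowed: set[tuple[str, str]]) -> bool:
    """True if a catalog.schema.table name falls inside the approved scope."""
    parts = table.split(".")
    if len(parts) != 3:
        # Quoted identifiers containing dots are not handled in this sketch.
        raise ValueError(f"expected catalog.schema.table, got {table!r}")
    catalog, schema, _table = parts
    return (catalog, schema) in allowed


# Hypothetical approved scope and visible tables.
approved = {("hive", "sales")}
visible = ["hive.sales.orders", "hive.hr.salaries", "mysql.crm.accounts"]
out_of_scope = [t for t in visible if not in_scope(t, approved)]
print(out_of_scope)
```

Any table that lands in `out_of_scope` is a sign that the Presto account can see more than the intended use case needs.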
Generic JDBC Connector #
The generic JDBC connector is the fallback for databases and warehouses that are not first-class options in the connection list.
Use JDBC when your source has a published JDBC driver, including:
- Snowflake
- BigQuery via JDBC
- Amazon Redshift
- Trino
- Vertica
- other enterprise databases with JDBC support
JDBC is useful because real enterprise data estates include more than the most common database engines. If the database can be reached through a JDBC driver and the network path is available, VDF AI Data can often connect through the generic JDBC option.
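For orientation, JDBC drivers are usually addressed with a URL of the form `jdbc:<driver>://<host>:<port>/<path>`. The sketch below builds such URLs for three of the sources named above, using the prefixes those drivers document. This guide does not say whether the generic JDBC option takes a full URL or separate host and port fields, so treat the helper `jdbc_url` and the host names as illustrative assumptions.

```python
# URL prefixes used by the respective JDBC drivers.
JDBC_PREFIXES = {
    "redshift": "jdbc:redshift",
    "trino": "jdbc:trino",
    "vertica": "jdbc:vertica",
}


def jdbc_url(driver: str, host: str, port: int, path: str) -> str:
    """Build a JDBC URL; path is a database name, or catalog/schema for Trino."""
    try:
        prefix = JDBC_PREFIXES[driver]
    except KeyError:
        raise ValueError(f"no prefix known for driver {driver!r}") from None
    return f"{prefix}://{host}:{port}/{path}"


# Hypothetical hosts, for illustration only.
print(jdbc_url("trino", "trino.internal.example", 8080, "hive/sales"))
print(jdbc_url("redshift", "warehouse.internal.example", 5439, "analytics"))
```

Note that the Trino path carries both catalog and schema, which is a convenient place to apply the narrow scoping this guide recommends.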
If your team repeatedly uses a JDBC-backed source and wants a first-class connector for that database type, contact us. First-class connectors can make configuration simpler for common production patterns.
Jira as Structured Data #
Jira is not a database in the traditional sense, but Jira projects can be added as a structured connection in VDF AI Data.
This is useful when you want issues, projects, backlog items, statuses, priorities, assignees, epics, and sprint metadata to behave like queryable data rather than documents.
For example, a product or delivery team might ask:
- Which unresolved issues block the current release?
- Which epics have the most reopened tickets?
- Where are bug reports increasing by component?
- Which backlog items relate to a specific customer impact theme?
Treating Jira as structured data makes it easier to connect delivery signals with other enterprise data sources.
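To make the "queryable data rather than documents" idea concrete, the sketch below treats issues as rows and answers the reopened-tickets question with a simple aggregation. The field names and the four sample rows are invented for illustration. They are not the schema VDF AI Data exposes for Jira.

```python
from collections import Counter

# Hypothetical issue rows; real field names will differ.
issues = [
    {"key": "APP-1", "epic": "Checkout", "status": "Reopened", "component": "payments"},
    {"key": "APP-2", "epic": "Checkout", "status": "Reopened", "component": "cart"},
    {"key": "APP-3", "epic": "Search", "status": "Reopened", "component": "indexing"},
    {"key": "APP-4", "epic": "Search", "status": "Done", "component": "indexing"},
]


def reopened_by_epic(rows: list[dict]) -> Counter:
    """Count reopened issues per epic."""
    return Counter(r["epic"] for r in rows if r["status"] == "Reopened")


for epic, count in reopened_by_epic(issues).most_common():
    print(epic, count)
```

The same grouping would be awkward over free-text documents, which is the practical difference between structured and document-style access.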
What a Database Connection Looks Like #
Each database connection in VDF AI Data is a small set of fields grouped so teams can see what identifies the connection, what defines the network path, and what is secret.
| Field | What it is for |
|---|---|
| Name | A friendly label your team will recognize, such as “Production Orders DB” or “Analytics Warehouse” |
| Type | The database type, such as PostgreSQL, MySQL, Oracle, JDBC, or Jira |
| Status | The connection lifecycle state |
| Database / Store | The database, schema, catalog, or store name that scopes the connection |
| Host and port | The network address VDF AI Data uses to reach the source |
| Credentials | A read-scoped username and password or token, stored encrypted and never shown back after save |
| Description | A one-line note explaining what the connection is for |
| Assets | The number of tables, views, or other objects you expect the connection to see in the source |
Credentials can be pasted directly or referenced from a secret managed elsewhere, such as your vault or platform secrets store. Direct paste is fastest for a first connection. Secret references are the better pattern for production.
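The field grouping and the secret-reference pattern can be sketched as a small data structure. This is an illustration only: the class `DatabaseConnection`, its attribute names, and the use of an environment variable as the secret store are assumptions for this example, not VDF AI Data's actual configuration format.

```python
import os
from dataclasses import dataclass, field


@dataclass
class DatabaseConnection:
    # What identifies the connection
    name: str
    type: str
    description: str
    # What defines the network path and scope
    host: str
    port: int
    database: str
    # What is secret: a reference to the password, never the password itself
    username: str
    password_env: str = field(repr=False)
    expected_assets: int = 0
    status: str = "Configuring"

    def password(self) -> str:
        """Resolve the secret at use time so it is never stored on the object."""
        value = os.environ.get(self.password_env)
        if value is None:
            raise KeyError(f"secret {self.password_env!r} is not set")
        return value


# Hypothetical values, for illustration only.
conn = DatabaseConnection(
    name="Production Orders DB",
    type="PostgreSQL",
    description="Orders schema; owner: data platform team",
    host="db.internal.example",
    port=5432,
    database="orders",
    username="vdf_reader",
    password_env="VDF_ORDERS_DB_PASSWORD",
    expected_assets=40,
)
print(conn)
```

Printing the object shows the identifying and network fields but omits the secret reference, which mirrors the "never shown back after save" behavior described in the table.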
Connection States #
Database connections move through a small set of states. Watch the status indicator on the connection card.
| State | What it means | What to do |
|---|---|---|
| Configuring | The connection is being defined and is not active yet | Fill in the remaining fields and save |
| Connected | VDF AI Data can read from the source | Use it in EDA, search, feature discovery, or other downstream workflows |
| Needs attention | Authentication failed, the host is unreachable, or the scope changed | Update credentials, check the network path, or re-scope and re-test |
| d | The connection is temporarily disabled, typically by a workspace admin | Resume it from the connection menu when ready |
These states keep connection health visible without forcing teams to inspect logs for every routine issue.
How to Scope a Database Connection #
The most important rule for database connectivity is simple: narrower is better.
A good connection is scoped to a clear business domain. “Production Orders” is better than “everything the user can see.” “Finance reporting views” is better than “all Oracle schemas.” “Customer support Jira project” is better than “all Jira projects.”
Use these practices before putting a database connection into production:
- Scope by database, schema, catalog, view, or project instead of exposing every available source.
- Create a dedicated read-only login for VDF AI Data.
- Do not reuse the application’s database user.
- Allow only the network paths needed from the host running VDF AI Data to the database host.
- Use the Description field to record the data owner and where to ask if something changes.
- Compare the asset count against what you expected the connection to see.
VDF AI Data only reads from connected sources, but defense in depth still matters: the source account itself should be limited to read access.
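A simple way to verify that the source account is read-only is to list its privileges and flag anything other than `SELECT`. The sketch below works on (table, privilege) pairs, which could be exported from a catalog view such as PostgreSQL's `information_schema.role_table_grants`. The helper `write_grants` and the sample rows are hypothetical.

```python
READ_ONLY = {"SELECT"}


def write_grants(grants: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return every (table, privilege) pair that goes beyond read access."""
    return [(table, priv) for table, priv in grants if priv.upper() not in READ_ONLY]


# Hypothetical grants for the connection's account.
sample = [
    ("orders", "SELECT"),
    ("orders", "UPDATE"),
    ("customers", "SELECT"),
    ("audit_log", "INSERT"),
]
print(write_grants(sample))
```

An empty result is the goal. Anything returned is a privilege to revoke before the connection goes into production.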
What You Can Do With a Connected Database #
Once a database is connected, it becomes a first-class source across the Data area.
Exploratory Data Analysis (EDA) helps teams profile tables, inspect column statistics, find outliers, and surface relationships without writing queries.
Feature engineering supports feature lists, feature discovery across tables, and feature associations across a schema.
Vector indexing lets Vector DB Builder create semantic indexes over text-heavy columns, so chats and agents can search records by meaning rather than exact keyword match.
Fine-tune data preparation helps teams assemble training datasets from real production data, while still keeping access scoped to the approved connection.
Semantic search lets users ask natural-language questions over structured data with citations back to specific tables and rows.
In practical terms, a connected database is not just a data source. It becomes a governed input to AI analysis, search, retrieval, and model improvement workflows.
Choosing the Right Connector #
Use the first-party connector when your database type is listed directly. That gives the clearest setup path for PostgreSQL, MySQL, SQL Server, Oracle, SAP HANA, Exasol, Presto, and Jira.
Use JDBC when the store is not listed but has a published JDBC driver. This is the right route for many warehouses, lakehouse query engines, and enterprise databases that are not shown as first-class options yet.
Use Jira when the team wants issue and delivery data to behave like structured data. If the goal is document-style search over pages, use document or knowledge connectors instead. If the goal is queryable issue metadata, Jira as structured data is the better fit.
Further Reading #
Want to connect operational and analytical databases to governed AI workflows? Contact VDF AI or explore VDF AI Data.
Frequently Asked Questions #
What database types does VDF AI Data support? #
VDF AI Data provides first-party connectors for PostgreSQL, MySQL, Microsoft SQL Server, Oracle, SAP HANA, Exasol, Presto, and Jira (as structured, queryable data), plus a generic JDBC connector for other sources.
Can VDF AI Data connect to Snowflake, BigQuery, Redshift, Trino, or Vertica? #
Yes, in most cases. If the source has a published JDBC driver, use the generic JDBC connector. That path covers many analytical warehouses and query engines, including Snowflake, BigQuery via JDBC, Redshift, Trino, and Vertica.
Should database credentials be read-only? #
Yes. VDF AI Data only reads from connected stores, and the database account should also be limited to read-only permissions. Create a dedicated user and grant SELECT only on the schemas, tables, or views you want available.
What can I do after connecting a database to VDF AI Data? #
A connected database can be used for exploratory data analysis, feature engineering, vector indexing, fine-tune data preparation, and semantic search over structured data with citations back to specific tables and rows.