Deploy a Production-Ready NVIDIA AI-Q Blueprint on Oracle Cloud Infrastructure

NVIDIA and Oracle Cloud Infrastructure (OCI) announced the deployment of the production-ready NVIDIA AI-Q Blueprint on OCI, enabling developers to deploy multi-agent AI systems using Terraform and Helm. The blueprint, built on LangChain Deep Agents and the NeMo Agent Toolkit, supports long-horizon agents for tasks like cited answers and research reports. With this integration, users can provision OCI resources, deploy the AI-Q endpoint in their own tenancy, and tear everything down with a single command.

AI agents have changed a lot in the last two years. The first agents could only answer one question at a time. Then came multi-turn chat, where the model could keep some context across a session. Today, we have long-horizon agents: systems that plan many steps, split work between sub-agents, keep context across a long task, and run tools in a safe sandbox.

The NVIDIA AI-Q Blueprint (https://build.nvidia.com/nvidia/aiq) is an open source reference for this kind of agent. It is built on LangChain Deep Agents (https://docs.langchain.com/oss/python/integrations/providers/nvidia) and the NVIDIA NeMo Agent Toolkit (https://github.com/NVIDIA/NeMo-Agent-Toolkit). You can use it for quick cited answers, or for longer research reports with sources.

This post shows you how to deploy AI-Q 2.0 on Oracle Cloud Infrastructure (OCI), using Terraform to create the OCI resources and Helm to install the workloads on OKE. By the end, you will have a working AI-Q endpoint in your own OCI tenancy, and one command to take it all down when you are done.

Who this is for: Developers and platform engineers who are comfortable with Kubernetes, Terraform, and the shell, and who want to run AI-Q on OCI rather than on a laptop.

What you’ll learn: How AI-Q’s multi-agent architecture maps to OCI services, plus the exact commands to provision, deploy, and open the blueprint from start to finish.
More background on the multi-agent architecture (intent router, shallow research agent, deep agent, planning sub-agent, researcher sub-agent) is on the AI-Q product page (https://build.nvidia.com/nvidia/aiq) and in the NeMo Agent Toolkit docs (https://github.com/NVIDIA/NeMo-Agent-Toolkit).

Prerequisites

Make sure you have:

- OCI tenancy access with a compartment you can deploy into, and enough service limits for:
  - OKE: One enhanced cluster and one node pool
  - Block Volume: At least 10 GB, dynamically provisioned by the OKE CSI driver for the in-cluster PostgreSQL
  - Load Balancer: One flexible load balancer
  - Vault: One vault plus secrets
- API keys:
  - NGC API key from build.nvidia.com (https://build.nvidia.com/), format `nvapi-…`, used both as the NVIDIA inference key and to authenticate to the NGC container registry `nvcr.io`.
  - Tavily API key from tavily.com (https://tavily.com/), format `tvly-…`
- Local tools: `terraform` 1.5 or later, `kubectl` 1.28 or later, `helm` 3.x or later, and the `oci` CLI set up with your API signing key
- Some basic knowledge of Kubernetes, Helm charts, Terraform, and the shell. LangChain or NeMo Agent Toolkit experience is nice to have, but not required.

Architecture overview

AI-Q uses a multi-agent design. An intent router reads each user query and sends it to the right workflow.

The blueprint is built to be extensible. Every layer (models, tools, RAG backends, sub-agents, evaluators) can be swapped through YAML config or through the NeMo Agent Toolkit plugin system. We will use that extensibility in Parts 2 and 3 of this series.

OCI deployment architecture

The deployment uses Terraform for the OCI resources and Helm for the Kubernetes workloads. This gives a clean split between infrastructure and application, and one `terraform destroy` is enough to remove everything later.
| Resource | Terraform module | Purpose |
|---|---|---|
| VCN, subnets, gateways, NSGs | network | Network isolation with public and OKE subnets |
| OKE cluster + node pool | oke | Kubernetes runtime (Enhanced cluster, VCN-native CNI) |
| OCI Load Balancer | loadbalancer | Public HTTP ingress on port 80, forwarding to NodePort 30080 |
| OCI Vault + secrets | vault | AES-256 encrypted storage for API keys and credentials |

Table 1. OCI resources created by the Terraform modules in `deploy/terraform`.

The Helm chart installs three workloads on OKE:

- Backend (`aiq-backend`): A FastAPI-based agent server that runs the AI-Q workflow.
- Frontend (`aiq-frontend`): A Next.js web UI exposed over NodePort 30080.
- PostgreSQL (`aiq-postgres`): An in-cluster database for the job store, checkpoints, and summaries.

Deployment steps

Clone the repository and change into the sample directory:

    git clone https://github.com/oracle-samples/ai-q.git
    cd ai-q/oke-samples/aiq-2.0

Total time: around 20 to 25 minutes. The full reference is in aiq-2.0/README.md (https://github.com/oracle-samples/ai-q/blob/main/oke-samples/aiq-2.0/README.md).

Step 1. Configure Terraform variables

Copy the example file and edit it with your tenancy details:

    cd deploy/terraform
    cp terraform.tfvars.example terraform.tfvars

At minimum, set these variables in `terraform.tfvars`:

- `tenancy_ocid`, `compartment_id`, `region` (for example, `us-chicago-1`)
- `user_ocid`, `fingerprint`, `private_key_path` (same values as your `~/.oci/config`)
- `db_admin_password`: used to bootstrap the in-cluster PostgreSQL, stored in OCI Vault.
- `nvidia_api_key`: your NVIDIA NGC key from build.nvidia.com (https://build.nvidia.com/). Used for inference and to pull container images from `nvcr.io`.
- `tavily_api_key`: your Tavily key from tavily.com (https://tavily.com/), for web search.

Step 2. Create the infrastructure

Initialize the providers, check the plan, and apply:

    terraform init
    terraform plan
    terraform apply

This takes about 10 to 15 minutes.
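A missing CLI or a mistyped key is cheaper to catch before the apply than after it. The following is a minimal preflight sketch, not part of the AI-Q repository: the function names `check_tools` and `check_prefix` are hypothetical, and the only facts it relies on are the tool names and the key prefixes (`nvapi-`, `tvly-`) listed in the prerequisites. It does not check tool versions or whether a key is actually valid.

```shell
#!/bin/sh
# Hypothetical preflight helpers (illustrative only, not from the AI-Q repo).

# check_tools TOOL...
# Returns non-zero if any of the named CLIs is missing from PATH.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# check_prefix NAME VALUE PREFIX
# Returns non-zero unless VALUE starts with PREFIX and has at least one
# more character after it. This is a format check only.
check_prefix() {
  local name="$1" value="$2" prefix="$3"
  case "$value" in
    "$prefix"?*) return 0 ;;
    *) echo "$name does not start with $prefix" >&2; return 1 ;;
  esac
}

# Example usage (uncomment once your keys are exported in the shell):
# check_tools terraform kubectl helm oci
# check_prefix NGC_API_KEY "$NGC_API_KEY" "nvapi-"
# check_prefix TAVILY_API_KEY "$TAVILY_API_KEY" "tvly-"
```

Both helpers only report problems and return a status, so they can be chained with `&&` in front of `terraform apply` if you want the apply to be skipped when a check fails.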
Terraform creates the VCN, OKE cluster, Load Balancer, and the Vault with the NGC and Tavily API keys encrypted at rest.

Check: `terraform output` should show values for `oke_cluster_id` and `lb_public_ip`. If either is empty, run `terraform apply` again; the apply is safe to repeat.

Capture the two values you’ll need in the next step:

    export OKE_CLUSTER_ID="$(terraform output -raw oke_cluster_id)"
    export LB_PUBLIC_IP="$(terraform output -raw lb_public_ip)"

Step 3. Install AI-Q from the NGC Helm chart

The chart and container images are published on NGC, so there’s nothing to build locally. We point `kubectl` at the new OKE cluster, create the secrets the chart consumes, then run `helm pull` and `helm install`.

3a. Configure kubectl for the OKE cluster

The command below uses `us-ashburn-1`; set `--region` to the region you configured in `terraform.tfvars`.

    # configure kubectl for the OKE cluster
    oci ce cluster create-kubeconfig \
      --cluster-id "$OKE_CLUSTER_ID" \
      --file ~/.kube/config \
      --region us-ashburn-1 \
      --token-version 2.0.0 \
      --kube-endpoint PUBLIC_ENDPOINT

    # sanity check. nodes should be ready
    kubectl get nodes

3b. Export the API keys

Reuse the same NGC and Tavily keys you put in `terraform.tfvars`. The NGC key does double duty: it’s both the inference key and the `nvcr.io` pull credential.

    export NGC_API_KEY="nvapi-..."      # from build.nvidia.com
    export TAVILY_API_KEY="tvly-..."    # from tavily.com
    export DB_USER_PASSWORD="