{"slug": "deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure", "title": "Deploy a Production-Ready NVIDIA AI-Q Blueprint on Oracle Cloud Infrastructure", "summary": "NVIDIA and Oracle Cloud Infrastructure (OCI) announced the deployment of the production-ready NVIDIA AI-Q Blueprint on OCI, enabling developers to deploy multi-agent AI systems using Terraform and Helm. The blueprint, built on LangChain Deep Agents and the NeMo Agent Toolkit, supports long-horizon agents for tasks like cited answers and research reports. This integration allows users to provision OCI resources, deploy the AI-Q endpoint in their own tenancy, and remove everything with a single teardown command.", "body_md": "AI agents have changed a lot in the last two years. The first agents could only answer one question at a time. Then came multi-turn chat, where the model could keep some context across a session. Today, we have long-horizon agents: systems that plan many steps, split work between sub-agents, keep context across a long task, and run tools in a safe sandbox.\n\nThe [NVIDIA AI-Q Blueprint](https://build.nvidia.com/nvidia/aiq) is an open source reference for this kind of agent. It is built on [LangChain Deep Agents](https://docs.langchain.com/oss/python/integrations/providers/nvidia) and the [NVIDIA NeMo Agent Toolkit](https://github.com/NVIDIA/NeMo-Agent-Toolkit). You can use it for quick cited answers, or for longer research reports with sources.\n\nThis post shows you how to deploy AI-Q 2.0 on Oracle Cloud Infrastructure (OCI), using Terraform to create the OCI resources and Helm to install the workloads on OKE. 
By the end, you will have a working AI-Q endpoint in your own OCI tenancy, and one command to take it all down when you are done.\n\n**Who this is for:** Developers and platform engineers who are comfortable with Kubernetes, Terraform, and the shell, and who want to run AI-Q on OCI rather than on a laptop.\n\n**What you’ll learn:** How AI-Q’s multi-agent architecture maps to OCI services, plus the exact commands to provision, deploy, and open the blueprint from start to finish.\n\nMore background on the multi-agent architecture (such as the intent router, shallow research agent, deep agent, planning sub-agent, and researcher sub-agent) is on the [AI-Q product page](https://build.nvidia.com/nvidia/aiq) and the [NeMo Agent Toolkit docs](https://github.com/NVIDIA/NeMo-Agent-Toolkit).\n\n**Prerequisites**\n\nMake sure you have:\n\n- **OCI tenancy access** with a compartment you can deploy into, and enough service limits for:\n  - OKE: One enhanced cluster and one node pool\n  - Block Volume: At least 10 GB (dynamically provisioned by the OKE CSI driver for the in-cluster PostgreSQL)\n  - Load Balancer: One flexible load balancer\n  - Vault: One vault plus secrets\n- **API keys:**\n  - NGC API key from [build.nvidia.com](https://build.nvidia.com/), format `nvapi-`…, used both as the NVIDIA inference key and to authenticate to the NGC container registry (`nvcr.io`)\n  - Tavily API key from [tavily.com](https://tavily.com/), format `tvly-`…\n- **Local tools:** `terraform` 1.5 or later, `kubectl` 1.28 or later, `helm` 3.x or later, and the `oci` CLI set up with your API signing key\n- **Some basic knowledge** of Kubernetes, Helm charts, Terraform, and the shell. LangChain or NeMo Agent Toolkit experience is nice to have, but not required.\n\n## Architecture overview\n\nAI-Q uses a multi-agent design. An **intent router** reads each user query and sends it to the right workflow.\n\nThe blueprint is built to be extensible. 
Every layer (models, tools, RAG backends, sub-agents, evaluators) can be swapped through YAML config or through the NeMo Agent Toolkit plugin system. We will use that extensibility in Parts 2 and 3 of this series.\n\n## OCI deployment architecture\n\nThe deployment uses **Terraform** for the OCI resources and **Helm** for the Kubernetes workloads. This gives a clean split between infrastructure and application, and one `terraform destroy` is enough to remove everything later.\n\n| Resource | Terraform module | Purpose |\n|---|---|---|\n| VCN, subnets, gateways, NSGs | `network` | Network isolation with public and OKE subnets |\n| OKE cluster + node pool | `oke` | Kubernetes runtime (Enhanced cluster, VCN-native CNI) |\n| OCI Load Balancer | `loadbalancer` | Public HTTP ingress on port 80, forwarding to NodePort 30080 |\n| OCI Vault + secrets | `vault` | AES-256 encrypted storage for API keys and credentials |\n\n*Table 1. OCI resources created by the Terraform modules in `deploy/terraform`.*\n\nThe Helm chart installs three workloads on OKE:\n\n- **Backend** (`aiq-backend`): A FastAPI-based agent server that runs the AI-Q workflow.\n- **Frontend** (`aiq-frontend`): A Next.js web UI exposed over NodePort 30080.\n- **PostgreSQL** (`aiq-postgres`): An in-cluster database for the job store, checkpoints, and summaries.\n\n## Deployment steps\n\nStart by cloning the sample repository:\n\n```\ngit clone https://github.com/oracle-samples/ai-q.git\ncd ai-q/oke-samples/aiq-2.0\n```\n\nTotal time: around 20 to 25 minutes. The full reference is in [aiq-2.0/README.md](https://github.com/oracle-samples/ai-q/blob/main/oke-samples/aiq-2.0/README.md).\n\n**Step 1. 
Configure Terraform variables**\n\nCopy the example file and edit it with your tenancy details:\n\n```\ncd deploy/terraform\ncp terraform.tfvars.example terraform.tfvars\n```\n\nAt minimum, set these variables in `terraform.tfvars`:\n\n- `tenancy_ocid`, `compartment_id`, `region` (for example `us-chicago-1`)\n- `user_ocid`, `fingerprint`, `private_key_path` (same values as your `~/.oci/config`)\n- `db_admin_password`, used to bootstrap the in-cluster PostgreSQL and stored in OCI Vault\n- `nvidia_api_key`, your NVIDIA NGC key from [build.nvidia.com](https://build.nvidia.com/), used for inference and to pull container images from `nvcr.io`\n- `tavily_api_key`, your Tavily key from [tavily.com](https://tavily.com/), for web search\n\n**Step 2. Create the infrastructure**\n\nInitialize the providers, check the plan, and apply:\n\n```\nterraform init\nterraform plan\nterraform apply\n```\n\nThis takes about 10 to 15 minutes. Terraform creates the VCN, OKE cluster, Load Balancer, and the Vault with the NGC and Tavily API keys encrypted at rest.\n\n**Check:** `terraform output` should show values for `oke_cluster_id` and `lb_public_ip`. If either is empty, run `terraform apply` again; the apply is safe to repeat.\n\nCapture the two values you’ll need in the next step:\n\n```\nexport OKE_CLUSTER_ID=\"$(terraform output -raw oke_cluster_id)\"\nexport LB_PUBLIC_IP=\"$(terraform output -raw lb_public_ip)\"\n```\n\n**Step 3. Install AI-Q from the NGC Helm chart**\n\nThe chart and container images are published on NGC, so there’s nothing to build locally. We point `kubectl` at the new OKE cluster, create the secrets the chart consumes, then `helm pull` and `helm install`.\n\n**3a. 
Configure kubectl for the OKE cluster**\n\n```\n# Configure kubectl for the OKE cluster.\n# Replace us-ashburn-1 with the region you set in terraform.tfvars.\n\noci ce cluster create-kubeconfig \\\n  --cluster-id \"$OKE_CLUSTER_ID\" \\\n  --file ~/.kube/config \\\n  --region us-ashburn-1 \\\n  --token-version 2.0.0 \\\n  --kube-endpoint PUBLIC_ENDPOINT\n\n# Sanity check: the nodes should be in Ready state.\n\nkubectl get nodes\n```\n\n**3b. Export the API keys**\n\nReuse the same NGC and Tavily keys you put in `terraform.tfvars`. The NGC key does double duty: it’s both the inference key and the `nvcr.io` pull credential.\n\n```\nexport NGC_API_KEY=\"nvapi-...\"         # from build.nvidia.com\nexport TAVILY_API_KEY=\"tvly-...\"       # from tavily.com\nexport DB_USER_PASSWORD=\"<same value as db_admin_password in Step 1>\"\n```\n\n**3c. Create the namespace and secrets**\n\n```\nkubectl create namespace ns-aiq --dry-run=client -o yaml | kubectl apply -f -\n\n# Application credentials (NVIDIA + Tavily inference, Postgres user)\nkubectl create secret generic aiq-credentials -n ns-aiq \\\n  --from-literal=NVIDIA_API_KEY=\"$NGC_API_KEY\" \\\n  --from-literal=TAVILY_API_KEY=\"$TAVILY_API_KEY\" \\\n  --from-literal=DB_USER_NAME=\"aiq\" \\\n  --from-literal=DB_USER_PASSWORD=\"$DB_USER_PASSWORD\"\n\n# Image-pull secret for nvcr.io (NGC container registry)\nkubectl create secret docker-registry ngc-secret -n ns-aiq \\\n  --docker-server=nvcr.io \\\n  --docker-username='$oauthtoken' \\\n  --docker-password=\"$NGC_API_KEY\"\n```\n\n**3d. 
Pull and install the chart from NGC**\n\n```\ncd ../helm     # from deploy/terraform to deploy/helm\n\nhelm pull https://helm.ngc.nvidia.com/nvidia/blueprint/charts/aiq2-web-2.0.0.tgz \\\n  --username='$oauthtoken' \\\n  --password=\"$NGC_API_KEY\"\n\nhelm upgrade --install aiq aiq2-web-2.0.0.tgz \\\n  -n ns-aiq \\\n  --wait --timeout 10m \\\n  -f values-oci-ngc.yaml\n```\n\nThe OCI overlay (`values-oci-ngc.yaml`) is intentionally tiny: it only pins the frontend service to NodePort 30080 (the port the OCI Load Balancer health-checks) and names the `ngc-secret` image-pull secret. Image repositories, the Postgres init SQL, and the dynamically provisioned 10 Gi Block Volume PVC all come from the chart’s own defaults.\n\n**Check:** `kubectl get pods -n ns-aiq` should show the `aiq-backend`, `aiq-frontend`, and `aiq-postgres` pods in `Running` state after 3 to 5 minutes.\n\n**Step 4. Open AI-Q**\n\nThe LB IP is already in your shell from Step 2:\n\n```\necho \"http://$LB_PUBLIC_IP\"\n```\n\nIf you opened a new shell since then, re-export it from Terraform:\n\n```\ncd ../terraform\n\nexport LB_PUBLIC_IP=\"$(terraform output -raw lb_public_ip)\"\n\necho \"http://$LB_PUBLIC_IP\"\n```\n\nOpen `http://<lb_public_ip>` in your browser. You should see the AI-Q frontend.\n\nTry a simple question first, for example, “What is the NeMo Agent Toolkit?”, to confirm the routing works. Then try a deeper one, for example, “Compare the top three open-source deep-research agents by benchmark score and cost”, to see the deep agent in action.\n\n### Troubleshooting\n\n- **`terraform apply` fails on OKE creation with a quota error.** Check the service limits for your compartment for “Cluster count” and “Node count”, and ask for more quota if needed.\n- **Pods stuck in `ImagePullBackOff`.** 
Check that the image-pull secret was created (`kubectl get secret -n ns-aiq`) and that your `NGC_API_KEY` was correct when you ran the `kubectl create secret docker-registry ngc-secret` command in Step 3c. To rotate, delete the secret and re-create it, then run `kubectl rollout restart deployment -n ns-aiq aiq-backend aiq-frontend`.\n- **`postgres` pod stays in `Pending` for more than 2 minutes.** The Block Volume PVC didn’t get dynamically provisioned. Run `kubectl describe pvc -n ns-aiq`. Typical causes are the OKE CSI driver not running, the default StorageClass missing, or insufficient Block Volume quota. Check the storage class with `kubectl get sc` and your compartment’s Block Volume service limit.\n- **Load Balancer IP comes back as `null`.** OCI can take a minute or two after Terraform returns to finish provisioning the Load Balancer. Run `terraform refresh` and then `terraform output lb_public_ip` again.\n- **Frontend loads but queries return `500`.** Look at `kubectl logs -n ns-aiq deploy/aiq-backend`. The most common cause is a wrong or missing `NVIDIA_API_KEY` or `TAVILY_API_KEY` in the `aiq-credentials` secret you created in Step 3c.\n\n### Learn more\n\nYou now have a working AI-Q 2.0 deployment on OCI, and one command (`terraform destroy`) to remove it cleanly when you are done. A few things to keep in mind as you go further:\n\n- **Cost:** The OKE node pool and the Load Balancer keep costing you while they run. Destroy the stack between experiments, or scale the node pool down to zero.\n- **Secrets:** Terraform stores the NGC and Tavily keys in OCI Vault at provision time (for audit and disaster recovery), but the running pods read them from the `aiq-credentials` Kubernetes secret you created in Step 3c. To rotate, delete and re-create that secret with the new values, then run `kubectl rollout restart deployment -n ns-aiq aiq-backend`. 
Editing `terraform.tfvars` alone won’t reach the pods.\n- **Extensibility:** Everything you just deployed is driven by YAML and by the NeMo Agent Toolkit plugin system. Swapping an LLM, adding a sub-agent, or plugging in a new RAG backend is a configuration change, not a rewrite.\n\nClone the [AI-Q in OCI repo](https://github.com/oracle-samples/ai-q.git) and share the solution you built and the problem you solved on the [NVIDIA Developer Forum](https://forums.developer.nvidia.com/).", "url": "https://wpnews.pro/news/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure", "canonical_source": "https://developer.nvidia.com/blog/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure/", "published_at": "2026-06-26 19:00:45+00:00", "updated_at": "2026-06-26 19:17:26.114132+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-tools", "ai-research", "developer-tools"], "entities": ["NVIDIA", "Oracle Cloud Infrastructure", "LangChain", "NeMo Agent Toolkit", "Terraform", "Helm", "NGC", "Tavily"], "alternates": {"html": "https://wpnews.pro/news/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure", "markdown": "https://wpnews.pro/news/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure.md", "text": "https://wpnews.pro/news/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure.txt", "jsonld": "https://wpnews.pro/news/deploy-a-production-ready-nvidia-ai-q-blueprint-on-oracle-cloud-infrastructure.jsonld"}}