# Building an AI Cloud Cost Intelligence Platform That Doesn't Let AI Make Infrastructure Decisions

> Source: <https://dev.to/upshivam786/building-an-ai-cloud-cost-intelligence-platform-that-doesnt-let-ai-make-infrastructure-decisions-5gbn>
> Published: 2026-06-29 04:11:00+00:00

Most AI-powered cloud optimization demos follow a simple approach:

```
Azure Resources
      ↓
Large Language Model
      ↓
Recommendations
```

At first glance, this seems impressive.

But while building my own Azure Cost Intelligence Platform, I ran into a problem that completely changed my architecture.

The AI was generating infrastructure recommendations that looked correct, but some of them referred to things that simply didn't exist.

For example, it suggested VM SKUs that Azure doesn't support and even produced Azure CLI commands with incorrect parameters. That was a wake-up call.

Instead of trying to "prompt engineer" my way out of the problem, I redesigned the application.

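Rather than letting AI make decisions, I split the system into independent components: a backend that gathers facts, a recommendation engine that makes the decision, and an LLM that explains it.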

The AI never decides **what** to do—it only explains decisions that have already been made.

Instead of asking the LLM:

"What should I do with this VM?"

the backend first gathers facts about the resource, such as its current SKU:

`Standard_B2ts_v2`

The recommendation engine then determines:

```
Resize VM
↓
Target SKU: Standard_B2ats_v2
↓
Estimated Savings: $0.74/month
```
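A minimal sketch of what such a deterministic recommendation step could look like. It is not the platform's actual code. The `VmFacts` and `Recommendation` types, the CPU threshold, the downsize rule table, and the prices are all hypothetical. It also assumes `Standard_B2ts_v2` is the VM's current SKU, and the placeholder prices were picked only so the result matches the $0.74 figure above; they are not real Azure prices.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical monthly prices in USD. Placeholders, NOT real Azure pricing;
# a real system would load these from a verified pricing source.
MONTHLY_PRICE_USD = {
    "Standard_B2ts_v2": 7.50,
    "Standard_B2ats_v2": 6.76,
}

# Hypothetical business rule: current SKU -> cheaper candidate SKU.
DOWNSIZE_CANDIDATE = {"Standard_B2ts_v2": "Standard_B2ats_v2"}

# Assumed threshold below which a VM counts as underutilized.
CPU_UNDERUTILIZED_PCT = 10.0


@dataclass(frozen=True)
class VmFacts:
    name: str
    sku: str
    avg_cpu_pct: float


@dataclass(frozen=True)
class Recommendation:
    action: str
    target_sku: str
    monthly_savings_usd: float


def recommend(facts: VmFacts) -> Optional[Recommendation]:
    """Return a resize recommendation, or None if no rule applies."""
    if facts.avg_cpu_pct >= CPU_UNDERUTILIZED_PCT:
        return None  # VM is busy enough; leave it alone
    target = DOWNSIZE_CANDIDATE.get(facts.sku)
    if target is None:
        return None  # no known cheaper SKU for this VM
    if facts.sku not in MONTHLY_PRICE_USD or target not in MONTHLY_PRICE_USD:
        return None  # never recommend without verified prices
    savings = round(MONTHLY_PRICE_USD[facts.sku] - MONTHLY_PRICE_USD[target], 2)
    if savings <= 0:
        return None
    return Recommendation("Resize VM", target, savings)
```

Every branch is a plain lookup or comparison, so the same facts always produce the same recommendation, and an unknown SKU yields no recommendation at all rather than a guess.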

Finally, the LLM produces a human-friendly explanation like:

"This virtual machine is significantly underutilized. Resizing it can reduce monthly costs without affecting current workloads."

Notice that the AI never invents the recommendation or the command—it only explains verified data.
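One way to keep the LLM in that explain-only role is sketched below, under some assumptions. The prompt wording, the `explain` helper, and the SKU-matching regex are hypothetical, and `llm` stands in for whatever client function sends a prompt and returns text. The guard rejects any explanation that mentions a SKU other than the two the backend already verified.

```python
import re
from typing import Callable

# Rough pattern for SKU-like tokens such as "Standard_B2ats_v2" (an assumption,
# not an official Azure naming grammar).
SKU_PATTERN = re.compile(r"Standard_[A-Za-z0-9_]+")


def build_prompt(vm_name: str, current_sku: str, target_sku: str,
                 savings_usd: float) -> str:
    """Build a prompt that contains only facts the backend already verified."""
    return (
        "Explain the following decision in two sentences for an engineer. "
        "Do not suggest other actions, SKUs, or commands.\n"
        f"VM: {vm_name}\n"
        f"Current SKU: {current_sku}\n"
        f"Decision: resize to {target_sku}\n"
        f"Estimated savings: ${savings_usd:.2f}/month\n"
    )


def explain(llm: Callable[[str], str], vm_name: str, current_sku: str,
            target_sku: str, savings_usd: float) -> str:
    """Ask the LLM for an explanation; fall back to a template if it drifts."""
    prompt = build_prompt(vm_name, current_sku, target_sku, savings_usd)
    text = llm(prompt)
    allowed = {current_sku, target_sku}
    mentioned = set(SKU_PATTERN.findall(text))
    if mentioned - allowed:
        # The model mentioned a SKU the backend never verified: discard it.
        return (
            f"{vm_name} can be resized from {current_sku} to {target_sku}, "
            f"saving an estimated ${savings_usd:.2f}/month."
        )
    return text
```

The fallback text is built from the same verified values, so even the failure path cannot surface an invented SKU.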

This project taught me an important lesson:

**AI should not replace engineering logic.**

Cloud infrastructure decisions should come from verified APIs, business rules, and deterministic code. AI adds the most value when it helps engineers understand those decisions, not when it generates them.
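The same idea applies to commands. Here is a rough sketch of generating the Azure CLI command from a fixed template in code instead of asking the LLM for it. The `build_resize_command` helper and the `VERIFIED_SKUS` set are hypothetical; in practice the verified set would presumably come from an Azure API or CLI query such as `az vm list-vm-resize-options`, not a hard-coded constant.

```python
import shlex

# Hypothetical stand-in for a SKU list fetched from Azure for this VM.
VERIFIED_SKUS = frozenset({"Standard_B2ts_v2", "Standard_B2ats_v2"})


def build_resize_command(resource_group: str, vm_name: str, target_sku: str,
                         verified_skus: frozenset = VERIFIED_SKUS) -> str:
    """Render an `az vm resize` command, refusing any unverified SKU."""
    if target_sku not in verified_skus:
        raise ValueError(f"SKU not verified: {target_sku}")
    # shlex.join quotes each argument, so odd characters in names stay safe.
    return shlex.join([
        "az", "vm", "resize",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--size", target_sku,
    ])


print(build_resize_command("rg-demo", "vm-demo", "Standard_B2ats_v2"))
```

Because the parameters are fixed in the template, the only things that vary are values the backend has already checked.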

That's the architecture I'll continue using as I expand this platform with historical cost trends, anomaly detection, and multi-cloud (AWS & GCP) support.

If you're building AI-assisted DevOps or FinOps tools, I'd love to hear how you're balancing automation with reliability.
