# Why Every CISO Needs an AIBOM in 2026 — And What Most Vendors Get Wrong

> Source: <https://dev.to/grumpysage/why-every-ciso-needs-an-aibom-in-2026-and-what-most-vendors-get-wrong-3ki9>
> Published: 2026-06-04 16:22:06+00:00

A friend of mine runs security at a mid-size fintech. Last month she got a Slack DM from her general counsel at 9:47 on a Tuesday night. The Italian DPA had sent a questionnaire. Standard stuff on paper: list the AI systems processing EU resident data, the foundation models behind them, the training data sources, the fine-tunes you've deployed, and the third-party inference endpoints you call. Forty-eight hours to respond.

She opened a ticket with her platform team. By midnight she had a partial answer for the two flagship products. By noon the next day she had partial answers for six more. By the deadline she had submitted a document that, in her words, "was 70% true and 30% me hoping." The part that made her sick wasn't the gaps. It was that engineering had stood up a new RAG pipeline against a self-hosted Mixtral fine-tune the week before, and nobody in her org had known until she went looking. Not the platform team. Not procurement. Not her. A whole production AI system with customer PII flowing through it, and the security team learned about it from a regulator.

She told me this story over coffee and asked the question every CISO I talk to is asking right now: "How do other people have an actual answer to this?" The honest reply is that most don't. They have an SBOM, maybe a vendor inventory in a GRC tool, and a Notion page someone updated in October. That is not an AIBOM. That is a wish.

An AI Bill of Materials isn't an SBOM with "model name" added as a column. It's a fundamentally different artifact because AI systems have layers SBOMs were never designed to capture: weights, training data, fine-tunes, prompts, retrieval indexes, agent tool graphs, and the runtime serving stack. Every one of those layers is a security boundary. Every one of them changes independently. If you can't enumerate them with the same rigor you enumerate Python packages, you can't reason about your risk, you can't respond to regulators, and you can't tell your board what you actually run. In 2026, that gap is no longer tolerable.

Let me be concrete, because vagueness is how this whole space went sideways.

A real AIBOM has seven layers. The first is the **model layer** itself: which base models you depend on, with versions, licenses, and provenance. Llama-3.3-70B-Instruct is not the same as a quantized GGUF derivative someone pulled from Hugging Face on a Thursday. The hash matters. The source matters. The license matters because Meta's community license has acceptable use clauses your legal team has probably never read.
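To make "the hash matters" concrete, here is a minimal sketch of pinning a weights file to a digest and recording provenance next to it. The function names and record fields are my own illustration, not any tool's schema. Treat it as one way to do it, assuming the artifact is a single file on disk.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a (possibly multi-gigabyte) weights file through SHA-256."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in fixed-size chunks so the whole file never sits in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_record(path: Path, source: str, license_id: str) -> dict:
    """Hypothetical model-layer entry: artifact identity plus provenance."""
    return {
        "file": path.name,
        "sha256": sha256_of_file(path),
        "source": source,        # where the artifact was pulled from
        "license": license_id,   # whatever identifier your legal team tracks
    }
```

Two files with the same model name and different digests are different entries. That is the whole point of recording the hash instead of the label.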

The second is the **adapter layer**: every LoRA, every QLoRA, every full fine-tune sitting on top of a base. Each of these inherits the base model's vulnerabilities and adds its own attack surface. A fine-tune trained on internal support tickets will memorize PII. Your AIBOM has to know that fine-tune exists, where the training data came from, who approved it, and which products consume it.

The third is the **data layer**: training corpora, evaluation sets, retrieval indexes, vector databases, and the embedding models that populate them. This is where the GDPR questions live. A vector store containing customer email embeddings is a database of personal data, full stop, and most orgs treat it like a cache.

The fourth is the **inference layer**: where models actually run. Self-hosted on Ollama? Behind vLLM with continuous batching? Through OpenAI's API? Bedrock? An employee's laptop running LM Studio on Tailscale? Each of these has different threat models, different network exposure, and different telemetry. The number of orgs I've seen with shadow vLLM servers running on a forgotten GPU box is genuinely funny if you don't think about it too hard.

The fifth is the **prompt and tool layer**: system prompts, prompt templates, RAG retrieval logic, agent tool definitions, and MCP server endpoints. This is the layer that the OWASP LLM Top 10 actually maps to. It's also the layer that changes most often and is documented least.

The sixth is the **identity and access layer**: which humans and which services can call which models, with which permissions, against which data. If your AIBOM doesn't tie back to your IAM model, it's a museum exhibit.

The seventh is the **dependency layer**: the boring software supply chain underneath everything. Transformers version. CUDA version. Triton kernels. The PyTorch wheel. The 1,815 rules in our scanner catch a lot here, but the point is that AI infrastructure is software, and software has the supply chain problems software has always had.

You need all seven. Drop any of them and your AIBOM has a blind spot a regulator or attacker will find.
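The seven layers can be sketched as a record with one list per layer, plus a check for which layers are empty. This is an illustration of the shape, not a standard schema: `AIBOMEntry`, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AIBOMEntry:
    """One AI system, with a list of items per layer (all names hypothetical)."""
    system: str
    models: list[str] = field(default_factory=list)         # 1. base models
    adapters: list[str] = field(default_factory=list)       # 2. LoRAs, fine-tunes
    data: list[str] = field(default_factory=list)           # 3. corpora, indexes
    inference: list[str] = field(default_factory=list)      # 4. runtimes, endpoints
    prompts_tools: list[str] = field(default_factory=list)  # 5. prompts, tools, MCP
    identity: list[str] = field(default_factory=list)       # 6. callers, permissions
    dependencies: list[str] = field(default_factory=list)   # 7. software supply chain


def blind_spots(entry: AIBOMEntry) -> list[str]:
    """Return the layers that have nothing recorded for this system."""
    return [f.name for f in fields(entry)
            if f.name != "system" and not getattr(entry, f.name)]


rag = AIBOMEntry(system="support-rag",
                 models=["example-base-model"],
                 inference=["example-runtime@gpu-box"])
print(blind_spots(rag))
# ['adapters', 'data', 'prompts_tools', 'identity', 'dependencies']
```

An empty layer is not proof that nothing exists there. It only tells you nobody has recorded anything yet, which is usually the more worrying reading.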

Here's where I'm going to make some people unhappy.

Most "AIBOM" tooling on the market right now is one of three things. It's a CSPM with a tab labeled "AI." It's a discovery tool that scrapes git repos for `import openai` and calls it inventory. Or it's an SBOM generator that added a JSON field for model name. None of these are AIBOMs in any meaningful sense.

The CSPM-with-a-tab products will tell you which Azure OpenAI deployments exist in which subscriptions. Useful. Stops at the cloud boundary. Can't see your self-hosted Ollama instance, can't see the LoRA your data science team trained last quarter, can't see the prompts, can't see the vector store. You get maybe 20% of the picture and a confident dashboard about it.

The code-scraping discovery tools find the calls. They don't find the runtime. They don't know if that `client.chat.completions.create()` is actually wired up in production or if it's dead code from a prototype. They especially don't know about the *responses*: what data flows back, what gets logged, what gets cached.

The SBOM-with-extra-fields products are the most cynical of the three. SBOM formats like SPDX and CycloneDX have been bolting on ML-BOM extensions, and the extensions are fine as a serialization format. But generating one requires the underlying inventory work, and the vendors selling "we support ML-BOM" usually mean "we let you fill in the fields manually." That's not inventory. That's a form.

The other thing vendors get wrong is treating the AIBOM as a static artifact. Your AIBOM has to be a continuously regenerated view, not a PDF you ship to a regulator once. Models get swapped. Adapters get retrained. RAG indexes get rebuilt nightly. A snapshot from Tuesday is wrong by Thursday. Anyone selling you an "AIBOM" without a story for continuous generation is selling you a fossil.

I want to spend a minute on self-hosted inference because it's the part nobody wants to talk about.

The center of gravity in enterprise AI has moved. Three years ago every interesting AI workload ran on someone else's API. Today, a huge fraction of the AI that actually touches sensitive data runs on infrastructure you own, on runtimes like Ollama, vLLM, TGI, LocalAI, Triton, LM Studio, and llama.cpp. Each of those has a different deployment story, a different default network posture, and a different set of known vulnerabilities.

A vLLM server in production with no authentication on its OpenAI-compatible endpoint is a fully unauthenticated inference oracle for anyone who can reach it. An Ollama instance bound to 0.0.0.0 on an internal subnet is a model exfiltration risk and, if it's loaded with a fine-tune of your proprietary data, an IP exfiltration risk. Triton has had a real CVE history. LocalAI ships with defaults that I would charitably describe as "trusting."
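Once you have a runtime's bind address and know whether it requires authentication, classifying its exposure is simple. The sketch below uses only Python's standard `ipaddress` module. The function name and the labels it returns are hypothetical, and it assumes something else has already collected the bind address and auth setting for each runtime.

```python
import ipaddress


def classify_exposure(bind_host: str, requires_auth: bool) -> str:
    """Label a runtime's network posture from its bind address and auth setting.

    bind_host must be an IP literal such as "0.0.0.0" or "127.0.0.1";
    hostnames like "localhost" raise ValueError and need resolving first.
    """
    addr = ipaddress.ip_address(bind_host)
    if addr.is_loopback:
        return "local-only"
    if requires_auth:
        return "network-authenticated"
    if addr.is_unspecified:  # 0.0.0.0 or ::, meaning every interface
        return "open-on-all-interfaces"
    return "open-on-one-interface"
```

The two "open" labels are the ones that belong at the top of the list: a model endpoint anyone on the subnet can query without credentials.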

Our radar scanner covers all of these because we built it specifically to map the self-hosted runtime layer. You cannot have an AIBOM in 2026 that omits self-hosted inference. That's not opinion. That's where the workloads are. If your vendor's discovery story stops at the cloud control plane, you don't have an AIBOM. You have an Azure inventory with extra steps.

The mechanics matter. Let me walk through what we actually do, because I think it's the right shape and I want other vendors to copy it.

You start with a scan of the code repositories. Not for `import openai` calls, but for the full surface: model loading code, training scripts, prompt templates, agent tool definitions, RAG pipelines, MCP server registrations, evaluator harnesses. Our static scanner catches all of this across 75+ languages with 1,815 rules. That gives you the declared intent of your AI systems.

Then you scan the network. You enumerate inference runtimes that are actually live, fingerprint them, and check them against a known-issue catalog. This is where the radar piece earns its keep. The intent layer tells you what engineers said they were building. The runtime layer tells you what's actually running. The delta between those two is where the worst surprises live.
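The intent-versus-runtime delta is, at its core, a set comparison. Here is a minimal sketch that assumes both scans have already been normalized to comparable identifiers, which is the hard part in practice. The function name, bucket names, and identifiers are hypothetical.

```python
def intent_runtime_delta(declared: set[str], observed: set[str]) -> dict[str, set[str]]:
    """Compare what the code says should exist with what is actually live."""
    return {
        "shadow": observed - declared,     # running, but nobody declared it
        "stale": declared - observed,      # declared in code, not seen live
        "confirmed": declared & observed,  # both scans agree
    }


delta = intent_runtime_delta(
    declared={"vllm:support-bot", "openai:summarizer"},
    observed={"vllm:support-bot", "ollama:gpu-box-7"},
)
print(sorted(delta["shadow"]))  # ['ollama:gpu-box-7']
```

The `shadow` bucket is the one that matches the fintech story: something live that no declared inventory mentions.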

Then you scan the web-facing surface. Any agent that takes external input, any RAG system with user-controlled retrieval, any model behind a public endpoint needs to be probed. We run 22 categories of fuzzing against these surfaces and convert about 95% of community templates so you're not reinventing prompt injection wheels.

Then you correlate. A finding on a runtime is interesting. A finding on a runtime that you can prove is reachable from a public agent endpoint that has tool access to a database with PII is a Sev 1. Correlation is where AIBOMs become useful instead of just exhaustive.
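Correlation of this kind can be sketched as reachability over a directed graph of "can call" or "can read" edges. The code below is an illustration under stated assumptions, not a description of any product's engine. The graph shape, the node names, and the `needs-review` label are hypothetical. Only the Sev 1 condition comes from the paragraph above.

```python
from collections import deque


def reachable(graph: dict[str, list[str]], start: str) -> set[str]:
    """Breadth-first search: every node reachable from start, including start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def triage(graph: dict[str, list[str]], runtime: str,
           public_endpoints: set[str], pii_stores: set[str]) -> str:
    """Escalate a runtime finding when a public endpoint reaches both it and PII."""
    for endpoint in public_endpoints:
        downstream = reachable(graph, endpoint)
        if runtime in downstream and downstream & pii_stores:
            return "sev1"
    return "needs-review"
```

With a graph like `{"agent": ["vllm", "sql_tool"], "sql_tool": ["customers_db"]}`, a finding on `vllm` escalates as soon as `agent` is marked public and `customers_db` is marked as a PII store. The same finding on a runtime with no public path stays in the review queue.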

And then you publish. Through an API, through a dashboard, and through an MCP server that exposes the inventory as 10 tools an LLM can query. That last piece sounds gimmicky until you've watched an incident response where the responder asks "what models touch the payments service" and gets an answer in three seconds instead of three hours.

A CLI example, because the abstraction is doing too much work:

```
cybrium aibom generate --org acme --output cyclonedx-mlbom.json
cybrium aibom diff --from last-week --to now
cybrium aibom query "models with PII training data exposed to internet agents"
```

The diff is the part that matters operationally. AIBOM as a verb, not a noun.
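Conceptually, a diff between two inventory snapshots is small. The sketch below is not how the CLI above is implemented. It assumes each snapshot maps a component identifier to a fingerprint such as a weights hash or index build ID, and every name in it is hypothetical.

```python
def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {component_id: fingerprint} snapshots of an inventory."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # Present in both snapshots, but the fingerprint moved.
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


last_week = {"base-model": "sha-aaa", "support-lora": "sha-bbb", "kb-index": "build-41"}
now = {"base-model": "sha-aaa", "support-lora": "sha-ccc", "rag-index-v2": "build-1"}
print(diff_snapshots(last_week, now))
# {'added': ['rag-index-v2'], 'removed': ['kb-index'], 'changed': ['support-lora']}
```

The `changed` bucket catches the quiet cases: an adapter retrained under the same name, or an index rebuilt overnight.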

I get asked constantly why we built scan, radar, and web fuzz as one platform instead of three products. The answer is correlation, but the deeper answer is that the layers of an AI system don't respect product boundaries.

A prompt injection in a web-facing agent is a web vulnerability. The model it exploits is a runtime vulnerability. The tool it calls is a code vulnerability. The data it exfiltrates is a data governance failure. If your AIBOM lives in one product, your runtime scanner lives in another, and your web fuzzer lives in a third, you have three inventories and zero correlation. You have to write the JOIN yourself, in a spreadsheet, at 2 a.m., while a regulator is waiting.

Single platform isn't a marketing position. It's the only way the math works. The MCP server exposing the inventory matters because it lets the same correlation graph answer questions from a SOC analyst, a compliance auditor, an engineering lead, and an LLM agent doing automated triage. One graph. Ten tools. Many consumers.

What I'm watching happen across the industry in 2026 is a recomposition of the security stack around AI as a first-class asset class. We had CMDBs for hardware. We got CSPMs for cloud. We got SBOMs for software supply chain. Now we need AIBOMs for AI, and the orgs that treat it as a checkbox on an existing tool are going to be the orgs explaining to regulators why their inventory was wrong.

The reason this matters more than the earlier asset-class additions is that AI systems are stochastic, composable, and rapidly mutating. A package version is a fact. A model's behavior is a distribution. A fine-tune changes that distribution. A prompt change changes it again. RAG retrievals change it on every query. Your AIBOM isn't just an inventory of nouns. It's an inventory of behaviors-in-context, and the tooling has to reflect that.

The EU AI Act high-risk system documentation requirements landed for real this year. NIST AI RMF mappings are now showing up in customer security questionnaires. SOC 2 auditors are starting to ask about model inventory in the controls walkthrough. The market is converging on AIBOM as a required artifact, and the vendors who got there with real architectural answers are going to look very different from the ones who bolted a tab onto an existing dashboard.

My friend at the fintech is rebuilding her program around continuous AIBOM generation right now. The Italian DPA letter was the forcing function, but she would have gotten there anyway because the alternative is doing this work manually every time a regulator, customer, or board member asks. Manual doesn't scale past the third question.

If you're a CISO in 2026 and you can't answer "what AI runs in our environment, what data touches it, what's reachable from the internet, and what changed this week" in under five minutes, you have an AIBOM problem. The good news is it's a solvable problem. The better news is the tooling finally exists to solve it without an army of analysts.

Start with `cyscan` for the code surface, layer in `cyradar` for the runtime, add `cyweb` for the web-facing agents, and pipe the whole correlation graph through the MCP server so your team can ask it questions in English. That's an AIBOM. Everything else is a spreadsheet with ambitions.

If you want to talk through yours, find me at `anand@cybrium.ai`.
