Your SBOM Won't Save You From AI Audits

Your SBOM can track packages, but not the AI shaping application behavior. Here’s why AI-BOMs are emerging as the missing layer in software supply chain security and compliance.

Over the past year, I have spent a significant amount of time talking with customers about AI and AppSec. Not just about their AI strategy, but about what it means practically to govern the AI components shipping in production software.

The same story keeps coming up. Teams are being asked by customers, auditors, legal, and leadership to document how they use AI in their products. What models are running? Where did they come from? Who approved them? What risks have been assessed? What licenses apply? The uncomfortable truth is that often nobody knows, and teams have no idea how to produce that documentation. This isn’t a program failure – the tools these teams rely on simply weren’t built to provide that kind of audit trail.

There’s also uncertainty about scope. What even counts as an AI component, and how do you decide which ones need to be tracked? Only LLMs? What about MCP servers, agents, prompts, or datasets? There are no clear guidelines, and that ambiguity makes this problem harder to resolve.

Why the SBOM Is Not Enough

When I tell AppSec teams that AI components need to be documented for compliance, their first instinct is almost always the same: point to the SBOM. It’s the supply chain inventory artifact most teams have already invested in, it feeds compliance workflows, and it’s supposed to tell you what is in your software. It’s a capability we include in Checkmarx’s supply chain security solution for exactly that reason.

But that’s the problem. The SBOM was built for a world of packages declared in manifests, versioned discretely, with vulnerability histories tracked in maintained databases. That model holds up well for open-source libraries, but it doesn’t hold up for AI components.

Here’s why: an LLM integration doesn’t declare itself in a requirements.txt or package.json. It shows up in source code as an API call, with a provider name and model identifier buried in a configuration string. The SBOM captures the SDK; it says nothing about which model is behind it, whether that model is version-pinned, where it originated, or what risks it carries. An agent framework appears in the dependency graph as a single package entry, the framework version, nothing more. What tools the agent can invoke, how broadly it can act, and whether anyone with security responsibility has reviewed that scope are nowhere in the manifest. An MCP server, and these are now common in production codebases, may exist entirely in a runtime configuration file that no scanner currently treats as a security artifact.

Then there is the risk taxonomy problem. CVEs cover code-level vulnerabilities in versioned packages, but the risks that matter most for AI components don’t fit neatly into that model. Model weights tampered with during training, datasets poisoned at the source, floating version references that silently change application behavior when a provider pushes an update, licensing obligations buried in dataset terms that no license scanner was designed to evaluate: none of these exist within the vulnerability intelligence infrastructure that makes the SBOM actionable.
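
To make the floating-reference risk concrete, here is a minimal Python sketch of a check that separates pinned model identifiers from floating ones. It rests on assumptions that are not in any standard: that a pinned identifier ends in an explicit date snapshot, and that aliases such as "latest" float. The model names are hypothetical, and real providers use different naming conventions.

```python
import re

# Assumption: a "pinned" identifier ends in a date snapshot
# (YYYY-MM-DD or YYYYMMDD). This is a heuristic, not a provider rule.
_DATE_SUFFIX = re.compile(r"-(\d{4}-\d{2}-\d{2}|\d{8})$")

# Assumption: these alias suffixes can be re-pointed by a provider at any time.
_FLOATING_ALIASES = ("latest", "stable", "preview")


def is_floating(model_id: str) -> bool:
    """Return True when the identifier may resolve to a different model over time."""
    lowered = model_id.lower()
    if any(lowered == alias or lowered.endswith("-" + alias)
           for alias in _FLOATING_ALIASES):
        return True
    # No explicit date snapshot: treat the reference as floating.
    return _DATE_SUFFIX.search(lowered) is None


# Hypothetical identifiers, for illustration only.
for model_id in ("acme-chat", "acme-chat-latest", "acme-chat-2025-01-15"):
    status = "floating" if is_floating(model_id) else "pinned"
    print(f"{model_id}: {status}")
```

A package-level SBOM would list the same SDK version in all three cases, which is why this kind of check has to look at the identifier itself.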

That’s the gap an AI-BOM (AI Bill of Materials) is designed to fill. From my perspective, this feels like the same moment the industry went through with the SBOM about five years ago. The requirement is taking shape, the standards are emerging, and the teams that start now will be ready when compliance becomes unavoidable.

This Is a Global Compliance Problem, Not a European One

I recently co-presented a session with Carsten Huth, our Global Director of AppSec Advisory, on supply chain security and regulatory compliance. He made a compelling point: AppSec is becoming the technical implementation layer for CRA compliance. Security is no longer optional – it’s mandatory and it’s regulated.

The EU Cyber Resilience Act (CRA) and the EU AI Act get the most attention because their timelines are defined and their penalties are significant. But they’re just a part of a broader, global shift in how governments and enterprises need to be thinking about AI accountability.

In the United States, the NIST AI Risk Management Framework (AI RMF) has evolved from optional guidance into a reference framework embedded not only in federal procurement, but increasingly in enterprise vendor assessments across regulated industries. In the UK, the AI Safety Institute is developing frameworks that mirror many of the EU’s documentation and transparency expectations. ISO 42001, the international standard for AI management systems, is becoming a global procurement requirement as enterprises extend their vendor risk programs to cover AI-enabled software.

At the same time, regulators in sectors like financial services, healthcare, and critical infrastructure are incorporating AI governance directly into existing compliance frameworks. In practical terms, this means that AppSec teams shipping software globally aren’t just managing a European compliance problem. They’re operating within a broader expectation for AI governance – one that’s formalized differently across jurisdictions, but ultimately converges on the same core requirement: you need to be able to demonstrate, with clear and documented evidence, that you know where AI exists in your products and that you are actively managing it.

What I Hear From AppSec Teams Around the World

In our session, Carsten shared something I’ve been hearing over and over again in customer conversations across regions: while regulatory frameworks differ, the security requirements they impose are similar. Whether a team is preparing for CRA, aligning with NIST AI RMF, or responding to an enterprise ISO 42001 vendor questionnaire, they’re being asked to do the same thing – know your AI components, assess their risks, document governance processes, and produce evidence.

The challenge isn’t a lack of awareness; most AppSec leads understand what’s required. The real issue is having the tooling and processes that meet those expectations. That gap showed up clearly when we surveyed over a hundred AppSec and development professionals: Shadow AI is already the reality. 43% have no formal governance over which AI components their developers are permitted to use. At the same time, 70% expect AI components in production by the end of 2026, with 24% already there today.

Production codebases most commonly surface three categories of AI assets: LLMs, AI libraries, and AI agents – with the most frequently detected providers ranking as #1 OpenAI, #2 Google, and #3 Linux Foundation. These aren’t experimental integrations; they are core dependencies that influence application behavior, data handling, and decision-making. But in most organizations, security teams don’t have a structured inventory or record of them.

When we asked customers what they need most, their priorities were clear: first, enforcement in CI/CD and PR, then compliance and audit reporting. That ranking says a lot. Teams closest to this problem already understand that enforcement and documentation need to go hand in hand. You can’t produce credible compliance evidence for AI components if their use was never controlled in the first place. The audit trail has to start at integration – not be reconstructed when an audit request arrives.

Why AI-BOM Is the Right Artifact

I want to be clear about what an AI-BOM is, because I think there’s still genuine confusion in the market about what compliance frameworks are asking for.

An AI-BOM is a structured, machine-readable inventory of the AI components embedded in or consumed by a software application. Where a traditional SBOM tracks packages, an AI-BOM tracks the things that define how AI behaves in your product:

  • Models: provider, version, provenance, and fine-tuning lineage
  • Datasets: origin and licensing terms
  • Agent frameworks and SDKs: the libraries that power AI behavior in code
  • Agents: what tools they can invoke and what constraints govern their execution
  • MCP servers and clients: external connection configuration
  • System prompts: the instructions that shape model behavior in ways that are compliance-relevant even when they never appear in a package manifest

This broader scope reflects the reality of what’s running in production AI-enabled applications, which goes beyond what most programs currently track. And just as important as the content is the format. The CycloneDX specification has extended its schema to support AI and machine learning components, providing a standard way to make AI-BOMs interoperable with existing compliance infrastructure.

At the same time, the OWASP AI Exchange has been doing important work defining what a complete AI-BOM should include. Aligning with these standards ensures the output plugs into existing workflows instead of requiring entirely new ones.
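
As a rough illustration of what such an inventory can look like, the Python sketch below builds a small document loosely following the CycloneDX JSON layout, using the "machine-learning-model" and "data" component types introduced for ML support. Every component name, version, supplier, and property key here is hypothetical. Recording an agent's tool scope through generic "properties" entries is an assumption made for this example, not something the specification or any product prescribes.

```python
import json

# Hand-built example document; all names and values are hypothetical.
ai_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "machine-learning-model",
            "bom-ref": "model-1",
            "name": "acme-chat",
            "version": "2025-01-15",
            "supplier": {"name": "Acme AI"},
        },
        {
            "type": "data",
            "bom-ref": "dataset-1",
            "name": "support-tickets-finetune",
            "licenses": [{"license": {"name": "Internal use only"}}],
        },
        {
            "type": "library",
            "bom-ref": "agent-fw-1",
            "name": "acme-agent-framework",
            "version": "1.4.2",
            # Assumption: agent scope captured as free-form properties.
            "properties": [
                {"name": "example:agent:allowed-tools", "value": "search,calendar"},
            ],
        },
    ],
}


def components_of_type(bom: dict, component_type: str) -> list[dict]:
    """Return every component in the BOM with the given type."""
    return [c for c in bom.get("components", []) if c.get("type") == component_type]


# Which models does this application depend on?
for model in components_of_type(ai_bom, "machine-learning-model"):
    print(model["name"], model["version"], model["supplier"]["name"])

# Serialize for hand-off to whatever tooling consumes the BOM.
print(json.dumps(ai_bom, indent=2))
```

The point of the structure is that a question like "which models ship in this release, and from whom?" becomes a simple query rather than a manual investigation.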

Regulators and enterprise customers are looking for documentation that can answer specific questions about specific components, with enough detail to support meaningful assessment. That’s exactly what a well-structured AI-BOM provides – and what an SBOM alone can’t deliver.

What We Built at Checkmarx and Why

When we designed Checkmarx AI Supply Chain Security, we made several decisions that shaped the product. The most important was this: detection had to be deterministic. We made a deliberate choice not to use AI to detect AI components.

The reason is straightforward. If a finding in an AI-BOM cannot be traced back to a specific location in source code or configuration, it won’t hold up as audit evidence. When an auditor asks where a component reference comes from, they need something concrete – a file name and a line number – not a confidence score. Deterministic detection, based on imports, API call patterns, configuration files, and string constants, produces findings that developers can act on and auditors can trust.
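
To show the general idea of pattern-based detection with traceable findings, here is a minimal Python sketch. It is not the Checkmarx implementation: the two rules, the `acme_sdk` package, and the sample source are all hypothetical, and a real detector would need far broader rule coverage. What it demonstrates is that each finding carries a rule name, a file path, and a line number, and that the same input always yields the same output.

```python
import re
from dataclasses import dataclass

# Hypothetical rule set: rule names and regexes are illustrative only.
RULES = {
    "sdk-import": re.compile(r"^\s*(?:import|from)\s+(acme_sdk)\b"),
    "model-constant": re.compile(r"model\s*=\s*[\"']([\w.\-]+)[\"']"),
}


@dataclass(frozen=True)
class Finding:
    rule: str   # which rule matched
    value: str  # the captured SDK name or model identifier
    path: str   # file the match came from
    line: int   # 1-based line number


def scan_source(path: str, text: str) -> list[Finding]:
    """Apply every rule to every line and record where each match occurred."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            match = pattern.search(line)
            if match:
                findings.append(Finding(rule, match.group(1), path, line_no))
    return findings


# Hypothetical application code that calls a hypothetical SDK.
SAMPLE = '''import acme_sdk

client = acme_sdk.Client()
reply = client.chat(model="acme-chat-latest", prompt="hi")
'''

for f in scan_source("app/chat.py", SAMPLE):
    print(f"{f.path}:{f.line} [{f.rule}] {f.value}")
```

Because the rules are plain pattern matches, anyone reviewing a finding can open the file at the reported line and confirm it by eye.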

The second decision was to build AI-BOM generation directly into Checkmarx One, rather than offering it as a standalone product. The rationale is similar to why teams adopted SCA inside an AppSec platform: governance works best when it is embedded into existing workflows, not running alongside them as a separate process.

Checkmarx One generates the AI-BOM in CycloneDX format, aligned with the schema extensions for AI and machine learning components. When a customer or auditor requests a bill of materials, the AI layer is already included.

Policy enforcement works through the same mechanisms teams already use for open-source governance. Provider allowlists and blocklists are enforced in pull requests and CI/CD pipelines, floating model references get flagged before they introduce undocumented changes, and agent scope constraints surface alongside SAST and SCA findings in the same security gate. The discipline is familiar; the coverage now just extends to AI.
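
The shape of such a gate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's policy engine: the `Policy` fields, the component dictionary keys (`provider`, `model`, `pinned`), and the provider names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # An empty allowlist means "no allowlist restriction" in this sketch.
    allowed_providers: set[str] = field(default_factory=set)
    blocked_providers: set[str] = field(default_factory=set)
    require_pinned_models: bool = True


def evaluate(components: list[dict], policy: Policy) -> list[str]:
    """Return one human-readable violation per broken rule."""
    violations = []
    for c in components:
        provider, model = c["provider"], c["model"]
        if provider in policy.blocked_providers:
            violations.append(f"{provider}/{model}: provider is blocklisted")
        elif policy.allowed_providers and provider not in policy.allowed_providers:
            violations.append(f"{provider}/{model}: provider is not on the allowlist")
        if policy.require_pinned_models and not c.get("pinned", False):
            violations.append(f"{provider}/{model}: floating model reference")
    return violations


# Hypothetical policy and detected components.
policy = Policy(allowed_providers={"acme-ai"}, blocked_providers={"shady-llm"})
components = [
    {"provider": "acme-ai", "model": "acme-chat-2025-01-15", "pinned": True},
    {"provider": "acme-ai", "model": "acme-chat-latest", "pinned": False},
    {"provider": "shady-llm", "model": "mystery-7b", "pinned": True},
]

violations = evaluate(components, policy)
for v in violations:
    print("POLICY VIOLATION:", v)
# A CI job or PR check could fail the build whenever `violations` is non-empty.
```

Running the check at pull-request time is what lets the audit trail start at integration instead of being reconstructed later.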

For teams navigating multiple compliance frameworks simultaneously – which, given the global regulatory landscape, is most enterprise teams – Checkmarx One also provides compliance posture reporting that maps AI-BOM output to the documentation requirements of CRA, the EU AI Act, NIST AI RMF, ISO 42001, and others. When an assessment arrives, the documentation does not need to be assembled from scratch; it’s already versioned per release, traceable to specific code, and structured to answer the questions auditors are asking.

As Carsten put it: “The SBOM with AI-BOM as structured documentation, providing tooling output for audit-ready security evidence.” That is exactly what we set out to build.

Timing Is Everything in the AI Era

The question I hear most often from AppSec leads is simple: how urgently do we need to move on this?

My honest answer is that the urgency is already here, even if it doesn’t feel that way yet. For teams shipping into European markets, the CRA timeline is fixed: vulnerability and incident reporting obligations apply from September 2026, and full compliance is required by December 2027. In the U.S., alignment with the NIST AI RMF is already showing up in vendor assessments. In sectors like financial services and healthcare, regulators are extending existing compliance frameworks to cover AI. And across the board, enterprise procurement teams are starting to ask for AI-BOM documentation in security questionnaires, driven by their own compliance requirements.

At the same time, building the ability to produce a compliance-ready AI-BOM takes longer than most teams expect. Achieving meaningful detection coverage across applications, integrating into CI/CD pipelines, defining policies, and validating outputs against regulatory expectations aren’t quick wins; they require deliberate investment and take time.

But beyond timelines and deadlines, there’s a more immediate reality: AI components are already in production. They’re influencing product behavior in ways that carry real risk, and that risk is invisible in a traditional SBOM. Governing those AI components properly isn’t about preparing for future compliance obligations – it’s about closing a security gap that exists right now.

Tags: Agentic AI, AI Security, AppSec, Software Supply Chain Security, Supply Chain