
AI Model Confusion: An LLM/AI Model Supply Chain Attack

Checkmarx Zero research reveals the AI Model Confusion attack pattern against registries like Hugging Face, building on Dependency Confusion in open-source library registries. Learn what it is and how to defend yourself.

A robot wearing the HuggingFace logo is trying to hug a male developer. The developer is reacting in horror.

Recently, our team conducted an in-depth analysis of supply chain security with a focus on the AI ecosystem. During this investigation, we uncovered a new supply-chain attack vector that can compromise code that insecurely loads local models.

Today, we’re introducing this new supply-chain attack against registries of LLMs and other AI models: we formally call it Model Confusion. But we like to think of it as “unwanted hugs from strangers”.

See also our prior research on risks to be aware of when using Hugging Face in our “Free Hugs” series: start with Part 1 on our blog.

Avoiding Hugs From Strangers: AI Model Confusion

Model Confusion resembles the well-known Dependency Confusion attack in software development. For context, dependency confusion occurs when a project references a local package that may not exist on the developer’s machine. If the package is missing locally but exists in a public package registry, the package manager may automatically download the remote version. This creates an opportunity for attackers to publish malicious packages under the same names as internal dependencies, targeting developers inside the organization.

Model Confusion operates similarly but affects AI models rather than software packages. The implications are significant, as mistakenly downloading a malicious model could lead to severe security risks, including remote code execution (RCE) or the usage of compromised models.

Nothing here is hypothetical — we’ve already found popular open‑source code samples from top tech firms that could execute malicious code if run as‑is.

The Discovery

It all started with this line:

To get a pretrained model, you need to load the weights into the model. This is done by calling from_pretrained() which accepts weights from the Hugging Face Hub or a local directory.

– Hugging Face Documentation

Do you smell dependency confusion creeping in here, like we did?
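To see why, consider how that single string argument can resolve to two very different sources. Below is a simplified sketch of the resolution behavior (not the actual transformers implementation; checkpoints/some-model is a hypothetical path used only for illustration):

import os

def resolve_model_source(name_or_path: str) -> str:
    """Illustrates the ambiguity that enables Model Confusion.

    This is NOT the real transformers resolution logic - just a simplified sketch.
    """
    if os.path.isdir(name_or_path):
        # A matching local directory exists, so the local weights are used.
        return f"local directory: {name_or_path}"
    # No local directory: the same string is treated as a Hub repo ID
    # ("<namespace>/<model_name>") and fetched from huggingface.co.
    return f"Hugging Face Hub repo: https://huggingface.co/{name_or_path}"

# If ./checkpoints/some-model does not exist on this machine, the call below
# falls through to the Hub - exactly the gap Model Confusion abuses.
print(resolve_model_source("checkpoints/some-model"))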

Before we dive into how to identify and mitigate Model Confusion in your codebase, let’s see that in action:

Miniature demo – some details redacted due to contract obligations

On the other hand, if the model exists locally, you can see that no calc is spawned and the local model is the one that gets loaded:

Complete demo – some details redacted due to contract obligations

This demo shows how an official code sample (from one of the Fortune 500 companies) led to Remote Code Execution via Model Confusion. We’ve reported the issue to them, and they resolved it within a single day.

Technical Analysis

Similar to Dependency Confusion, the core challenge in this supply-chain attack is identifying which local model names to target to maximize the likelihood that victims will take the bait. However, Model Confusion relies on different discovery and exploitation techniques. While we developed several internal methods for identifying sensitive names, we won’t disclose them to avoid enabling malicious use. What we can say is that Model Confusion is effective because users often store local models in predictable, easy-to-guess directories.

Later, when the code is shared with other developers, either privately or via open-source projects, and the expected local models are missing, anyone who downloads and runs the code may inadvertently load a malicious remote model instead of the intended local one.
Organizations should proactively secure predictable identifiers, as failure to do so can allow threat actors to exploit naming ambiguity and mislead users.

Prerequisites

During our research, we also gathered the prerequisites that must be met for the attack to be possible. While this might look like a long list, in practice these conditions can be found quite easily in the wild:

  1. The expected local model is missing from the victim’s machine or is in a different path.
  2. The model path does not start with ./ or / (those imply local-only access), and it does not contain characters that Hugging Face does not allow in usernames.
  3. The path has exactly two components (for example: checkpoints/model-name).
  4. The parameter local_files_only is not explicitly set to True.
  5. The attacker owns the username matching the sensitive directory name.
  6. While not a “real” prerequisite for this attack, the parameter trust_remote_code determines whether the code is vulnerable to RCE (when set to True) or to a compromised model (when set to False or not set).

This may not be the entire list, but we believe it is comprehensive enough to give a good indication of whether you need to act and how quickly.

To make these prerequisites easier to understand, let’s look at a few practical code examples:

Note: These code samples assume that an attacker has claimed the checkpoints username and uploaded a malicious model named some-model, and that the appropriate local model doesn’t exist on the victim’s machine.

from transformers import AutoTokenizer

# --- Vulnerable below ---

# Example 1 - Vulnerable to RCE (`trust_remote_code` is set to `True`)
tokenizer = AutoTokenizer.from_pretrained("checkpoints/some-model", trust_remote_code=True)

# Example 2 - Vulnerable to the usage of a compromised model
# same code as the first example, without setting `trust_remote_code` to `True`
tokenizer = AutoTokenizer.from_pretrained("checkpoints/some-model")


# --- Safe below ---

# Example 3 - violates prerequisite #2
# The "./" prefix enforces local directory only
tokenizer = AutoTokenizer.from_pretrained("./checkpoints/some-model", trust_remote_code=True)

# Example 4 - violates prerequisite #3
# The path has more than two components, so it cannot map to a valid <HF-username>/<model> repo
tokenizer = AutoTokenizer.from_pretrained("another-directory/checkpoints/some-model", trust_remote_code=True)

# Example 5 - violates prerequisite #4
# `local_files_only` is set to `True`, preventing Model Confusion
tokenizer = AutoTokenizer.from_pretrained("checkpoints/some-model", local_files_only=True, trust_remote_code=True)

Particularly Interesting Directories/Namespaces

To protect the community during our research, we captured several usernames that could have been used for exploitation and reported them to Hugging Face. This triggered a brief security lockout on our end:

Screenshot of a Hugging Face login error message displayed in a light red alert box. The message reads: “The Hugging Face account associated with user █████████ has been locked out following the violation of our Terms of Service. If you think this is mistake, you can contact safety@huggingface.co.”
Directory lockout message

We guess it’s true that no good deed goes unpunished. (Luckily, the HF team has since restored our account.)

Anyway, let’s take a look at one particularly interesting case – the checkpoints directory. This directory is commonly used for saving checkpoints, i.e., snapshots of fine-tuned models. We found that this username was still available for registration at the time. Rest assured, we’ve captured this username (among others) to prevent malicious actors from exploiting it.

In fact, the recorded demo demonstrates how Model Confusion can be exploited by utilizing the checkpoints directory. The sample looks for a fine‑tuned model in the checkpoints folder, but the repository does not include that file.

If an attacker controls the checkpoints username and uploads a model containing malicious Python code, anyone running the sample will download that malicious model and be exposed to Remote Code Execution (RCE) because of trust_remote_code=True.

However, this directory is not the only one that should be considered “sensitive”.
Additional examples we found in the wild:

  • checkpoints
    • Discovered in real-world code from a Fortune 500 company.
  • outputs
    • Discovered in real-world code from a Fortune 100 company.
  • models-tmp, results, ckpts, and more.

Unfortunately, it’s impossible to catch them all: some users had already claimed sensitive names before we discovered this attack vector. Although we haven’t found any malicious activity and cannot determine the real intentions behind these usernames, it’s still worth highlighting some examples that may require attention:

  • namespace – used by the huggingface_hub library and captured by this user.
  • pretrained – used by open-mmlab/Amphion and captured by this user.
  • Among others: output, result, tmp, checkpoint, etc.

Am I vulnerable?

To know if you’re vulnerable, first check that every model accessed with the format <single_dir_OR_organization>/<model_name> originates from a remote organization you trust.

If not, either of the following conditions means you need to take action:

  • The username exists on Hugging Face, but you don’t recognize it or do not trust it
  • An organization with the same name as the directory doesn’t exist

Note: As mentioned in the prerequisites section, HF restricts certain special characters in usernames. If your directory name contains those characters (e.g., underscore), you’re probably safe for now. However, this may change in the future; therefore, we highly recommend not relying on this as a prevention mechanism and applying the following mitigations in these cases as well.
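To make this check concrete, here is a minimal sketch that flags ambiguous identifiers. It assumes a recent huggingface_hub release that exposes repo_exists, and the model identifiers in the list are hypothetical examples of what you might collect from your codebase’s from_pretrained() calls:

import os

from huggingface_hub import repo_exists  # assumes a recent huggingface_hub release

# Hypothetical identifiers collected from your code's from_pretrained() calls.
model_ids = [
    "checkpoints/some-model",
    "./checkpoints/some-model",
    "sentence-transformers/all-MiniLM-L6-v2",
]

for model_id in model_ids:
    parts = model_id.split("/")
    # Only bare two-component paths are ambiguous (see prerequisites #2 and #3).
    if model_id.startswith(("./", "../", "/")) or len(parts) != 2:
        print(f"[ok]      {model_id}: local-only path or not a valid Hub repo ID")
        continue
    if os.path.isdir(model_id):
        print(f"[ok]      {model_id}: a local directory exists and will be used")
        continue
    # Missing locally, so from_pretrained() would resolve this string on the Hub.
    if repo_exists(model_id):
        print(f"[review]  {model_id}: resolves to an existing Hub repo - do you trust '{parts[0]}'?")
    else:
        print(f"[warning] {model_id}: missing locally and unclaimed on the Hub - a claimable namespace")

A match on the namespace alone is not proof of malice, but any two-component identifier that is missing locally and resolves (or could resolve) on the Hub deserves a manual review.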

Mitigations

  • If you know the models are only supposed to be loaded from a local directory, you can use one of these options (see the sketch after this list):
    • Set the HF_HUB_OFFLINE=1 environment variable, which applies globally.
    • Set the local_files_only flag to True in the code, which applies at the code level.
  • Another alternative is to use an absolute path to the model, eliminating the risk of resolving to a remote model. If a relative path is required, you can prepend it with ./ or ../ (for example, ./dir-name/model-name) to enforce a local path.
  • For a more robust security posture, the Hugging Face security team recommends that large organizations subscribe to Enterprise Hub and configure an allowlist of approved models that users can download.
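As a quick reference, here is a minimal sketch of those options applied in code; checkpoints/some-model is the same hypothetical path used in the earlier examples, and each option is independent (you would normally pick one):

import os

# Option A - force offline mode globally for this process; no Hub access at all.
# Set it before importing transformers/huggingface_hub so it is picked up.
os.environ["HF_HUB_OFFLINE"] = "1"

from transformers import AutoTokenizer

# Option B - restrict this specific call to local files only.
# If the local model is missing, this raises an error instead of silently
# downloading a remote model.
tokenizer = AutoTokenizer.from_pretrained("checkpoints/some-model", local_files_only=True)

# Option C - make the path unambiguously local with a "./" prefix (or use an absolute path).
tokenizer = AutoTokenizer.from_pretrained("./checkpoints/some-model")

The environment-variable route is the broadest control, since it covers every loader in the process rather than only the calls you remember to change.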

Disclosure Timeline

Hugging Face

  • First report – 22 Oct 2025
  • Thanks to the Hugging Face security team for adding a couple of additional possible mitigations

Amphion

  • First report – 08 Dec 2025