Mutation Cross-Site Scripting (mXSS) Vulnerabilities Discovered in Mozilla-Bleach


July 8, 2020

As part of the beta testing phase that took place earlier this year for our recently launched Software Composition Analysis solution, CxSCA, the Checkmarx Security Research Team investigated Mozilla-Bleach, finding multiple concerning security vulnerabilities. Patches were released in mid-March 2020, with Checkmarx CxSCA customers using Bleach receiving notice of the issues in advance. Given that the patches have been in-market for some time, giving Bleach users sufficient time to update their software versions, we’re now publishing the full technical report and proof-of-concept video for educational purposes.


According to its documentation, “Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes and is intended for sanitizing text from untrusted sources.” In simpler terms, Bleach is a user-friendly HTML sanitizer whose main purpose is to keep disallowed markup out of its output (e.g., script tags and event-handler attributes that carry JavaScript (JS)) in order to prevent cross-site scripting (XSS).
After some fuzzing and a few different approaches, Checkmarx researchers suspected that a mutation XSS (mXSS) vulnerability might exist. Further digging confirmed these suspicions, and several mXSS vulnerabilities were discovered in the Mozilla-Bleach Python package.
An attacker abusing these vulnerabilities would be able to execute arbitrary JavaScript code in the user’s browser, via sites or projects that use Bleach.

Mutation XSS (mXSS)

An mXSS vulnerability occurs when the sanitizer and the client (the browser) parse the same markup differently. The following example should make this clearer.
Let’s see how a standard browser interprets invalid HTML. When we enter the data below into the innerHTML of the page:
The browser will modify the data to make it valid HTML. In this case, this is what the output looks like:
Now let’s try to change the div tag to a different type of tag, for example:

Doing so will generate the result below:

The two examples behave differently because the data inside a tag is parsed differently depending on the tag type. Imagine the parser reading from left to right. In the first case, after entering the div tag, the parser stays in HTML mode and opens an a tag with a title attribute (because the “closing” div tag is just text inside an attribute value, it does not close the div).
In the second case, when the parser enters the style tag, it switches to CSS (raw text) parsing, which means no a tag is created, and the style tag is closed at the point where the attribute value was supposed to be.
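This difference can be reproduced outside a browser. Below is a minimal sketch using Python’s standard-library `html.parser`, which also treats the content of a style tag as raw text. The two payload strings and the `TagCollector` helper are illustrative assumptions, not the exact inputs used in this research, and `html.parser` is not a full HTML5 parser, so treat it as a rough model only.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the parser recognizes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def start_tags(markup):
    """Parse the markup and return the (tag, attributes) pairs that were seen."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Illustrative payloads: only the outer tag differs.
IN_DIV = '<div><a title="</div><img src=x onerror=alert(1)>">'
IN_STYLE = '<style><a title="</style><img src=x onerror=alert(1)>">'

if __name__ == "__main__":
    for markup in (IN_DIV, IN_STYLE):
        print(markup)
        for tag, attrs in start_tags(markup):
            print("   ", tag, attrs)
```

Inside the div, the img markup stays harmlessly inside the title attribute. Inside the style tag, the same characters end the style element and a real img tag with an onerror handler appears.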
So, how can this information help us find vulnerabilities? Imagine a tag that is parsed differently under different conditions, for example, the noscript tag. The trick here is that HTML treats the content of noscript differently depending on whether JavaScript (JS) is enabled or disabled. When JS is enabled, the data inside the tag is treated as raw text rather than markup. But when it’s disabled, the data is parsed as HTML. In nearly all cases, JS is enabled in browsers.
Let’s take a look at how the following input is being interpreted with, and without, JS enabled:

Here, JS is disabled:

Here, JS is enabled:

Vulnerability: CVE-2020-6802

When we tried to pass the above input to Bleach, it sanitized the ‘<’ characters in the attribute, but it also closed the a tag! This means that it parsed the data inside noscript as HTML.

The only thing left, then, was to avoid this sanitization. To do that, we brought one more parsing context into the equation.

This provided the outcome we were anticipating.
Sanitizer view:
Enters noscript and keeps parsing as HTML, then opens a style tag and starts parsing its content as CSS (raw text). Nothing after the opening style tag is parsed as HTML, so from the sanitizer’s viewpoint there is neither a closing noscript tag nor an img tag.
Browser view:
Enters noscript and, because JS is enabled, the parser switches to raw text. Now the “<style>” is just text, not a tag. The closing tag, in this case, actually closes the noscript tag, and from there on everything is parsed as HTML.
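The two views can be approximated with Python’s standard-library `html.parser`. This is only a sketch under stated assumptions: the payload is an illustrative input in the spirit of the description above, both recorder classes are hypothetical helpers, and adding noscript to the parser’s raw-text elements is our stand-in for a browser with JS enabled. It does not run Bleach or html5lib.

```python
from html.parser import HTMLParser


class StartTagRecorder(HTMLParser):
    """Records start tags; models the sanitizer, which parses noscript as HTML."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


class ScriptingEnabledRecorder(StartTagRecorder):
    """Assumption: treating noscript as raw text mimics a browser with JS on."""

    CDATA_CONTENT_ELEMENTS = tuple(HTMLParser.CDATA_CONTENT_ELEMENTS) + ("noscript",)


def seen_tags(parser_cls, markup):
    """Return the start tags that the given parser class recognizes."""
    parser = parser_cls()
    parser.feed(markup)
    parser.close()
    return parser.tags


# Illustrative payload: a style tag opened inside noscript and never closed.
PAYLOAD = "<noscript><style></noscript><img src=x onerror=alert(1)>"

sanitizer_view = seen_tags(StartTagRecorder, PAYLOAD)
browser_view = seen_tags(ScriptingEnabledRecorder, PAYLOAD)

if __name__ == "__main__":
    print("sanitizer view:", sanitizer_view)
    print("browser view:  ", browser_view)
```

In the sanitizer view, the img tag never shows up because it is swallowed by the unclosed style element. In the browser view, the style tag never shows up and the img tag does.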
The conditions for successful exploitation are: the noscript tag is allowed, together with either HTML comments or at least one of the following tags: title, textarea, script, style, noembed, noframes, iframe, xmp.
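For teams auditing their own configuration, these conditions can be expressed as a quick check. The sketch below is a hypothetical helper, not part of Bleach, and it encodes our reading of the conditions stated above: noscript must be allowed, plus either HTML comments or one of the listed tags. The function name and the `comments_allowed` parameter are assumptions made for illustration.

```python
# Tags listed in the exploitation conditions above.
RAW_TEXT_TAGS = frozenset(
    {"title", "textarea", "script", "style", "noembed", "noframes", "iframe", "xmp"}
)


def allows_noscript_mutation(allowed_tags, comments_allowed=False):
    """Return True if an allow-list matches the conditions for CVE-2020-6802.

    allowed_tags: iterable of tag names the sanitizer is configured to allow.
    comments_allowed: True if HTML comments are kept in the output.
    """
    tags = {tag.lower() for tag in allowed_tags}
    if "noscript" not in tags:
        return False
    return comments_allowed or bool(tags & RAW_TEXT_TAGS)


if __name__ == "__main__":
    print(allows_noscript_mutation(["a", "b", "em"]))
    print(allows_noscript_mutation(["noscript", "style"]))
    print(allows_noscript_mutation(["noscript"], comments_allowed=True))
```

In Bleach, comments are kept when `clean` is called with `strip_comments=False`, so that setting would map to `comments_allowed=True` here.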

Vulnerability: CVE-2020-6816

Shortly after, the Checkmarx Security Research Team discovered another mXSS vulnerability in Mozilla-Bleach, this time with the use of svg/math tags.
The caveat here is that content inside those tags is parsed more like XML. So if we enter, for example, a style tag, its content is parsed differently depending on whether the style tag sits inside or outside an svg/math tag.
Inside an svg tag:

Without an svg tag:

This shows how differently the data inside the style tag is being parsed. In addition, some unwanted tags inside the svg/math will automatically pop out of the svg/math and will be parsed as HTML (e.g., <img>).
When the team tried to put a malicious img tag in svg/math->style->img, Bleach acted strangely.
When the img tag was whitelisted, Bleach parsed it the way the browser does and sanitized unwanted attributes as expected. When the “strip” parameter was set to true (meaning disallowed markup is deleted instead of sanitized; the default is false), the tag was deleted. But when “strip” was left at its default, we could use any tag that wasn’t allowed and bypass Bleach.

After further investigation, we saw that html5lib (the parser behind Bleach) does recognize the data inside svg->style as tags. But for some reason, Bleach doesn’t sanitize unwanted tags.


According to GitHub, more than 72,000 repositories are dependent on Bleach. Among them are major vendors, including multiple Fortune 500 tech companies.

Summary of Disclosure and Events

When the first vulnerability was discovered, our research team ensured that they could reproduce the process of exploiting it. Once that was confirmed, the Checkmarx team responsibly notified Mozilla of its findings. Subsequently, a Bugzilla ticket was opened, in which the team helped Mozilla find a proper mitigation approach, and the issue was fixed rapidly.
Soon after that, the second vulnerability was discovered by the research team. Again, a responsible notification was sent to Mozilla, and a Bugzilla ticket was quickly opened and resolved.
Checkmarx customers using CxSCA were automatically notified to update Mozilla-Bleach.

Bugzilla tickets

CVE-2020-6802 –
CVE-2020-6816 –

Timeline of Disclosure

  • 13-Feb-2020: First vulnerability reported
  • 14-Feb-2020: Checkmarx customers who were using Bleach were warned, without exposing the vulnerability’s details
  • 19-Feb-2020: Fixed version v3.1.1 and an advisory on GitHub were released
  • 25-Feb-2020: CVE-2020-6802 was assigned
  • 11-Mar-2020: Second vulnerability reported
  • 11-Mar-2020: Checkmarx customers who were using Bleach were warned, without exposing the vulnerability’s details
  • 17-Mar-2020: Fixed version v3.1.2 and an advisory on GitHub were released
  • 19-Mar-2020: CVE-2020-6816 was assigned
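Based on the fixed versions in this timeline, a project can check whether its installed Bleach release predates either fix. The following is a minimal sketch: the helper names are hypothetical, it assumes plain numeric version strings such as "3.1.0", and it only compares against the fixed versions named above; it does not prove that a given older release is exploitable.

```python
# Fixed versions taken from the timeline above.
FIXED_IN = {
    "CVE-2020-6802": (3, 1, 1),
    "CVE-2020-6816": (3, 1, 2),
}


def parse_version(version):
    """Turn a string like 'v3.1.1' into a comparable tuple such as (3, 1, 1)."""
    parts = version.strip().lstrip("v").split(".")
    return tuple(int(part) for part in parts[:3])


def missing_fixes(version):
    """Return the CVEs whose fixed version is newer than the given version."""
    installed = parse_version(version)
    return [cve for cve, fixed in FIXED_IN.items() if installed < fixed]


if __name__ == "__main__":
    for candidate in ("3.1.0", "3.1.1", "3.1.2"):
        print(candidate, missing_fixes(candidate))
```

On Python 3.8 and later, the installed version string can be obtained with `importlib.metadata.version("bleach")` and passed to `missing_fixes`.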

Final Words

Discovering vulnerabilities like the ones documented in this report is why the Checkmarx Security Research Team performs investigations into open source packages. With open source making up the vast majority of today’s commercial software projects, security vulnerabilities must be taken seriously and handled more carefully across the industry. Solutions like CxSCA are essential in helping organizations identify, prioritize, and remediate open source vulnerabilities more efficiently to improve their overall software security risk posture.


For more information or to speak to an expert about how to detect, prioritize, and remediate open source risks in your code, contact us.