This is the third and final blog on Exploitable Path – a unique feature that allows our customers to prioritize vulnerabilities in open-source libraries. In the first blog, we introduced the concept of Exploitable Path and its importance. The conclusion was that a vulnerability in a library is considered exploitable when:
- The vulnerable method in the library is called, directly or indirectly, from the user’s code.
- An attacker can supply carefully crafted input that reaches this method and triggers the vulnerability.
We also covered how Exploitable Path is built:
- It uses a query language over the CxSAST engine to abstract queries over source code. This allows a more language-agnostic approach, so that Exploitable Path works for every programming language supported by CxSAST.
- We walked through the various CxSAST queries required to build a full call graph of a user’s source code and its libraries’ source code. By crossing the call graph with vulnerability data, we can determine whether a vulnerability is exploitable.
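To make the idea of crossing a call graph with vulnerability data concrete, here is a minimal sketch in Python. It is not how CxSAST queries are written; the call graph is a plain dictionary, and all method names (`app.handle_request`, `lib._parse`, and so on) are hypothetical. The sketch only checks the first condition, whether the vulnerable method is reachable from the user's code.

```python
from collections import deque


def find_call_path(call_graph, entry_points, vulnerable_method):
    """Return one call path from an entry point to vulnerable_method, or None.

    call_graph maps a caller name to the list of methods it calls.
    A breadth-first search yields a shortest path when one exists.
    """
    queue = deque((entry, [entry]) for entry in entry_points)
    visited = set(entry_points)
    while queue:
        method, path = queue.popleft()
        if method == vulnerable_method:
            return path
        for callee in call_graph.get(method, ()):
            if callee not in visited:
                visited.add(callee)
                queue.append((callee, path + [callee]))
    return None


# Hypothetical call graph covering user code ("app.*") and a library ("lib.*").
call_graph = {
    "app.handle_request": ["app.render", "lib.load"],
    "lib.load": ["lib._parse"],
    "lib._parse": [],
    "lib.unused_helper": ["lib._unsafe_eval"],
}

reachable = find_call_path(call_graph, ["app.handle_request"], "lib._parse")
unreachable = find_call_path(call_graph, ["app.handle_request"], "lib._unsafe_eval")
print(reachable)    # ['app.handle_request', 'lib.load', 'lib._parse']
print(unreachable)  # None
```

In this toy graph, a vulnerability in `lib._parse` is reachable from user code, while one in `lib._unsafe_eval` is not, because nothing in the user's code ever calls `lib.unused_helper`.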
Challenge no. 1 – Supporting Multiple Library Versions

The public data on a CVE usually lists the affected versions, but how can we use this information to support Exploitable Path across versions? In other words, if a library’s source code changes between versions, how can we have the data Exploitable Path requires for each of them? Let’s assume we have a user’s source code that uses a single open-source library. This library contains a vulnerability, and using MITRE, we can figure out the affected versions. To assess whether the vulnerability is exploitable, we need the following for each version of the library:
- A call graph of the library’s code. This can be done automatically using CxSAST.
- Whether that version is vulnerable.
- If it is, the inner method in which the exploitation occurs.
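The three items above can be pictured as one record per library version. The following Python sketch is only an illustration under assumed names: `LibraryVersionData`, `vulnerable_method_for`, the version strings, and the method names are all hypothetical and do not describe the actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LibraryVersionData:
    """Per-version data needed to assess exploitability (hypothetical model)."""
    version: str
    call_graph: Dict[str, List[str]] = field(default_factory=dict)
    vulnerable: bool = False
    vulnerable_method: Optional[str] = None  # only meaningful when vulnerable


def vulnerable_method_for(records, version):
    """Return the vulnerable inner method for a version, or None if not affected."""
    record = records.get(version)
    if record is None or not record.vulnerable:
        return None
    return record.vulnerable_method


# Hypothetical library: the vulnerable method was renamed between versions,
# and the issue was fixed in 2.0.0.
records = {
    "1.0.0": LibraryVersionData("1.0.0", {"lib.load": ["lib._parse"]}, True, "lib._parse"),
    "1.1.0": LibraryVersionData("1.1.0", {"lib.load": ["lib._parse_v2"]}, True, "lib._parse_v2"),
    "2.0.0": LibraryVersionData("2.0.0", {"lib.load": ["lib._safe_parse"]}, False),
}

print(vulnerable_method_for(records, "1.1.0"))  # lib._parse_v2
print(vulnerable_method_for(records, "2.0.0"))  # None
```

The example shows why the affected version range alone is not enough: the same vulnerability can live in differently named methods in different versions, so the call graph and the vulnerable method have to be tracked per version.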
Challenge no. 2 – Data Flow

Just because your code calls a vulnerable method doesn’t mean you are automatically at risk. To assess the risk properly (and avoid false positives), it’s crucial to have both a call graph and a DFG (Data Flow Graph) of the code.

Let’s start with an example, and assume that a method called parse(content) has a DoS (Denial of Service) vulnerability given the right input. If parse() is only called with a constant value, meaning parse(CONSTANT_VALUE), there is no attack surface for an attacker to exploit and cause a DoS. On the other hand, if a user of the application controls the input parameter of parse(), it’s a different story. For example, this input can be a comment or other data provided by the user. In such a case, the attacker can easily craft the required input and exploit the vulnerability.

The reality is more complex, as there are various ways data can be transferred in code:
- Input parameters
- Global or class members
- The return value of another method invocation
- A constant value. This is not exploitable, of course.
- An input parameter of a method that is not called by other methods. This is a potential data flow compromise, as in the context of the static code scan, we don’t know how the method is invoked.
- An internal method of the language is called, such as open() in Python.
- A method of a different library is called, and its source code is not available.
- As a rule of thumb, mark those methods as potential data flow compromises, since their inner implementation is unknown.
- Mark specific methods as definite data flow compromises: for example, those that read contents from a database, pipe, or file. The same goes for parsing HTTP packets, pulling a message from a message queue, etc.
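As a rough illustration of these rules, the sketch below classifies where the value passed to a vulnerable method ultimately comes from. It is a simplification under stated assumptions: the origin kinds (`constant`, `uncalled_parameter`, `external_call`), the `Risk` levels, and the method names in `DEFINITE_SOURCES` are all hypothetical, and a real data flow analysis would first have to trace the value back through parameters, members, and return values to find that origin.

```python
from enum import Enum


class Risk(Enum):
    NONE = "not exploitable"
    POTENTIAL = "potential data flow compromise"
    DEFINITE = "definite data flow compromise"


# Hypothetical names of methods known to return externally controlled data.
DEFINITE_SOURCES = {"read_file", "db_query", "http_parse_request", "queue_pull_message"}


def classify_origin(kind, name=None):
    """Classify the terminal origin of a value reaching a vulnerable method.

    kind is one of:
      "constant"            - a literal or constant value
      "uncalled_parameter"  - a parameter of a method that nothing else calls
      "external_call"       - return value of a method whose source is unavailable
    name is the called method's name for "external_call".
    """
    if kind == "constant":
        return Risk.NONE
    if kind == "uncalled_parameter":
        # The static scan cannot see how the method is invoked.
        return Risk.POTENTIAL
    if kind == "external_call":
        # Unknown implementation: potential by default, definite if listed.
        return Risk.DEFINITE if name in DEFINITE_SOURCES else Risk.POTENTIAL
    raise ValueError(f"unknown origin kind: {kind!r}")


# parse(CONSTANT_VALUE): no attack surface.
print(classify_origin("constant").value)
# parse(comment), where comment came from a parsed HTTP request.
print(classify_origin("external_call", "http_parse_request").value)
# parse(data), where data came from a library method with no source available.
print(classify_origin("external_call", "other_lib_transform").value)
```

Defaulting unknown methods to "potential" rather than "not exploitable" errs on the side of reporting, while the explicit list keeps the "definite" label for sources that clearly carry external input.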