
The Complete Guide to AI: Application Security Testing



Summary: The use of LLMs and other AI advances has changed the way developers generate code and deploy applications, but these tools don’t come without risk. This article is a deep dive into the application of AI in cyber security, supporting DevSecOps teams in leveraging GenAI and LLMs while reducing the growing risk landscape associated with AI.

The application of AI in cyber security is a huge topic, and one that’s changing all the time. In this guide, we will look specifically at AI application security solutions, and drill down into areas such as AI-driven software composition analysis, the application of AI in SAST (Static Application Security Testing), and the application of AI in security as it relates to generating and deploying more secure code at every stage of the Software Development Lifecycle (SDLC).

 

Going deeper into practical strategies and products, this guide will also discuss a variety of tools that leverage AI – including SAST tools, AI tools for IaC (Infrastructure as Code), and Checkmarx features that provide SCA (Software Composition Analysis) – and describe how they can reduce risk while supporting DevOps and AppSec teams in keeping up with the pace of development.

2024 AppSec Executive Survey: “What is your level of concern about security threats stemming from developers using AI code generation tools to write code?”

 

If you’ve been asking yourself, ‘What are the risks of the growing use of AI in application security?’, ‘Where does GenAI come into the picture?’, and ‘How can teams leverage AI without adding risk factors to their organizations?’ you’re in the right place!  

The Rise in the Application of AI in Cybersecurity

The truth is, the application of AI in security is nothing new. The global market for AI in cyber security is predicted to hit $133.8 billion by 2030. AI tools free cyber security professionals from time-consuming manual tasks, so they can work in a smarter and more efficient way against the growing number of threats organizations face. AI can already take on roles across varied areas of cyber security, such as threat detection, data analysis, automation of manual tasks, alerting, reporting and more.

More recently, GenAI – a subset of AI that involves creating new content such as images, text or code using complex algorithms and models – has stolen many a headline. This branch of AI, which focuses on creating original and creative outputs rather than simply analyzing or classifying existing data, is behind the Large Language Models (LLMs) that are increasingly prevalent, including ChatGPT, Bard, Copilot, and more. LLMs for developers focus on generating code, making it quicker and easier for Dev teams to do their work.

However, as development teams increase their use of AI to speed up their work, as well as create their own LLMs to enhance their products, security concerns need to be front and center. Organizations should take some time to consider how they can support application security teams with making security a core part of both application development and deployment. 

Already, 82.5% of developers are using GenAI to write code, and 42% say they trust the output of LLMs, with just 31% holding back and saying they are unsure. 77% of respondents say they like the experience of using AI tools, and up to 80% of developers expect their development workflow to look extremely different 12 months from now as a direct result of AI tools. There’s no doubt that AI-based code generation is the future, from writing code itself to getting feedback and troubleshooting during development.

 

But before rushing into a new way of working, it’s critical for businesses to understand the risks. How does AI work, how can LLMs do so much so quickly, and what threats should security teams be aware of before giving developers the freedom to use AI in their day-to-day work? 

Understanding the Risks of GenAI for Development Teams 

To help fully grasp the risks involved in using LLMs, let’s look to the OWASP Top 10 for LLMs, a list of the most critical vulnerabilities found in applications that utilize LLMs. The list is a practical guide that helps developers and security professionals understand the threat landscape, whether they are leveraging AI and LLMs in their work or building LLMs into their product or service. It categorizes the challenges businesses will face, helping them implement the right security tools to close gaps in visibility and control, and to enable AppSec velocity while securing new ways of working.

1. Prompt injection

Threat actors can manipulate an LLM with crafted prompts that cause the model to carry out malicious intentions. This can be done directly – often called jailbreaking – where the underlying system prompt is overwritten or exposed, giving the attacker access to data or backend systems.

Survey chart: biggest AppSec-related concerns – prompt injection

It can also happen indirectly, when an LLM accepts input from other sources that could be controlled by an attacker – and the injected content may not even be visible to the human engaging with the LLM. Prompt injection can lead to data leakage, social engineering attacks, and more.
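To make this concrete, here is a minimal, hypothetical sketch (not from the article) of how naive prompt concatenation invites injection, and how keeping the system instructions and untrusted input in separate message roles – plus screening for obvious override attempts – reduces the blast radius. The `call_llm` client, model behavior and keyword list are all assumptions for illustration only.

```python
# Hypothetical sketch: reducing prompt injection risk.
# `call_llm` is a placeholder for whatever chat-completion client you use.

SYSTEM_PROMPT = "You are a support assistant. Only answer questions about our product."

def call_llm(messages: list[dict]) -> str:
    raise NotImplementedError("wire up your LLM client here")

# Risky: untrusted input is spliced straight into the instructions,
# so "ignore previous instructions..." can override them.
def answer_unsafe(user_input: str) -> str:
    return call_llm([{"role": "user",
                      "content": SYSTEM_PROMPT + "\n\nUser question: " + user_input}])

SUSPICIOUS = ("ignore previous instructions", "reveal your system prompt")

# Safer: keep instructions and untrusted input in separate roles,
# screen obvious override attempts, and never echo the system prompt.
def answer_safer(user_input: str) -> str:
    lowered = user_input.lower()
    if any(marker in lowered for marker in SUSPICIOUS):
        return "Sorry, I can't help with that request."
    return call_llm([
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ])
```

Keyword screening is not a complete defense on its own; treat it as one layer alongside output validation and least-privilege access for the model.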

2. Insecure output handling

LLM-generated content is controlled by prompts, which creates challenges in validating information and sanitizing data, and gives threat actors openings to inject malicious content before it is passed downstream. This can result in remote code execution on backend systems, privilege escalation, or XSS and CSRF in web browsers. In some cases, third-party plugins may not validate the input they receive from the LLM at all, or the LLM holds privileges it does not need, making this threat even more of a risk.
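A simple way to picture the mitigation is to treat LLM output exactly like any other untrusted input. The following minimal sketch (an illustration, not from the article) assumes a web context: it escapes model output before rendering it, and refuses to run anything the model suggests that is not on a fixed allowlist.

```python
# Hypothetical sketch: treat LLM output as untrusted before it goes downstream.
import html

def render_llm_answer(raw_answer: str) -> str:
    # Escape HTML so an injected <script> tag becomes inert text (mitigates XSS).
    return "<p>" + html.escape(raw_answer) + "</p>"

ALLOWED_COMMANDS = {"status", "version"}

def run_llm_suggested_command(raw_answer: str) -> str:
    # Never pass free-form LLM text to a shell; allow only known, exact commands.
    command = raw_answer.strip().lower()
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"Refusing to run unexpected command: {command!r}")
    return command  # safe to dispatch to a fixed handler
```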

3. Training data poisoning

All LLMs use raw data, or training data, to learn what they need to know to generate the right outputs, and fine-tuning data to narrow down an LLM’s remit. Using neural networks, LLMs learn to find patterns and create outputs based on their training data. However, if attackers manipulate the original data, the model can be compromised or contain vulnerabilities. Users may automatically trust an LLM when it is actually surfacing malicious or incorrect information. Training data poisoning is a kind of integrity attack, and it is more prevalent when a model is trained on external data sources over which the LLM owners have no control.
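One practical control is to verify the integrity of every training source before it reaches a fine-tuning pipeline. Below is a minimal, hypothetical sketch: the file names and digests in the manifest are placeholders, and the approach assumes your team publishes pinned SHA-256 hashes for approved datasets.

```python
# Hypothetical sketch: verify training data against pinned SHA-256 hashes
# before it is used for fine-tuning, so silently altered sources are rejected.
import hashlib
from pathlib import Path

# Illustrative manifest: file name -> expected SHA-256 digest (placeholders).
TRUSTED_SOURCES = {
    "support_tickets.jsonl": "3f0c5e3c3b9a...",
    "product_docs.jsonl": "9a1d44f0e2bb...",
}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_training_set(data_dir: Path) -> list[Path]:
    approved = []
    for name, expected in TRUSTED_SOURCES.items():
        path = data_dir / name
        if not path.exists() or sha256_of(path) != expected:
            raise RuntimeError(f"Training source failed integrity check: {name}")
        approved.append(path)
    return approved
```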

4. Model denial of service

Similar to a traditional Denial of Service attack, where a target is overwhelmed by an exceptionally high volume of traffic, a model denial of service attack is when an attacker consumes enough resources from an LLM to reduce service quality or drive up costs for the owner. One approach is to manipulate the context window – the maximum length of text that an LLM can work with, which is tightly linked to the complexity an LLM can manage. Attackers can attempt a model denial of service attack by sending unusually resource-intensive queries, continuous input overflow, or repetitive long inputs, or by forcing recurring resource usage through queued tasks.
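Basic resource guardrails in front of the model go a long way here. The sketch below is hypothetical and the thresholds are illustrative: it caps prompt length and enforces a simple per-user rate limit before a request is ever forwarded to the LLM.

```python
# Hypothetical sketch: cheap guardrails against model denial of service.
import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 8_000          # keep requests well inside the context window
MAX_REQUESTS_PER_MINUTE = 20      # illustrative per-user budget

_request_log: dict[str, deque] = defaultdict(deque)

def admit_request(user_id: str, prompt: str) -> bool:
    if len(prompt) > MAX_PROMPT_CHARS:
        return False  # reject oversized inputs before they reach the model
    now = time.monotonic()
    window = _request_log[user_id]
    while window and now - window[0] > 60:
        window.popleft()  # drop requests older than the one-minute window
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False  # user exceeded their per-minute budget
    window.append(now)
    return True
```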

5. Supply chain vulnerabilities

There are many players involved in the world of GenAI and LLMs, including third parties who provide pre-trained models or training data, as well as LLM plugin extensions. Unlike traditional supply chain vulnerabilities, machine learning supply chain risks are not limited to software components: developers often pull in third-party models, datasets and packages without considering the wider risk. Threats in this category include poisoned crowd-sourced data, vulnerable pre-trained models, and the use of deprecated or outdated models.
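One mitigation pattern is to only consume models that appear on an internally approved list, with pinned versions and digests, and to refuse anything deprecated. The sketch below is hypothetical: the registry structure, model names and digests are placeholders for whatever your AppSec team maintains.

```python
# Hypothetical sketch: resolve models only from an approved internal registry,
# and refuse versions that have been deprecated.
from dataclasses import dataclass

@dataclass(frozen=True)
class ApprovedModel:
    name: str
    version: str
    sha256: str          # pinned artifact digest (placeholder values below)
    deprecated: bool = False

# Illustrative registry maintained by the AppSec team.
APPROVED_MODELS = {
    ("sentiment-classifier", "1.4.2"): ApprovedModel(
        "sentiment-classifier", "1.4.2", sha256="ab12cd34...", deprecated=False
    ),
}

def resolve_model(name: str, version: str) -> ApprovedModel:
    model = APPROVED_MODELS.get((name, version))
    if model is None:
        raise PermissionError(f"{name}=={version} is not on the approved list")
    if model.deprecated:
        raise PermissionError(f"{name}=={version} is deprecated; upgrade required")
    return model
```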

6. Sensitive information disclosure

In today’s compliance-heavy landscape, personally identifiable information needs to be protected at all costs. However, LLMs can reveal sensitive information, including customer data, algorithms, trade secrets and intellectual property. If a user unintentionally inputs information, it can resurface at another time. Yet many users interact with LLMs without data sanitization in place – the safeguard that prevents user data from becoming further training data.

Survey chart: biggest AppSec-related concerns – data leakage

Security teams should look out for incomplete filtering of sensitive information in responses, unintended disclosure of confidential information, and memorization of sensitive data in the training process. 
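On the input side, a lightweight mitigation is to sanitize prompts before they leave the organization. The sketch below is hypothetical and the regular expressions are illustrative, not exhaustive; real deployments typically pair this with dedicated DLP or secrets-detection tooling.

```python
# Hypothetical sketch: redact obvious sensitive values before a prompt is sent
# to an external LLM. The patterns are illustrative, not exhaustive.
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "<CARD_NUMBER>"),
    (re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*\S+"), r"\1=<REDACTED>"),
]

def sanitize_prompt(prompt: str) -> str:
    for pattern, replacement in REDACTIONS:
        prompt = pattern.sub(replacement, prompt)
    return prompt

# Example: sanitize_prompt("my password: hunter2") -> "my password=<REDACTED>"
```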

7. Insecure plugin design

For added agility, LLM plugins can be called automatically by a model, usually without the risk-reduction benefit of application control. According to OWASP, they often accept free-text inputs without validation or type checking, which means an attacker could even send configuration strings instead of parameters. A threat actor can use this weakness to craft a malicious request that pushes their agenda through the model directly, leading to privilege escalation, data exfiltration, and even remote code execution. Tracking authorization across plugins is critical, as you cannot simply assume that the inputs were sent by the end user.
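The core defensive pattern is to give plugins typed, validated parameters instead of a free-text string the model can abuse. The sketch below is hypothetical; the plugin, its parameters and the allowed values are invented purely for illustration.

```python
# Hypothetical sketch: a plugin handler that only accepts typed, validated
# parameters instead of a free-text string the model could abuse.
from dataclasses import dataclass

ALLOWED_REPORT_TYPES = {"daily", "weekly"}

@dataclass(frozen=True)
class ReportRequest:
    report_type: str
    customer_id: int

def parse_plugin_args(raw: dict) -> ReportRequest:
    report_type = str(raw.get("report_type", "")).lower()
    if report_type not in ALLOWED_REPORT_TYPES:
        raise ValueError(f"Unsupported report_type: {report_type!r}")
    try:
        customer_id = int(raw["customer_id"])
    except (KeyError, TypeError, ValueError):
        raise ValueError("customer_id must be an integer")
    if customer_id <= 0:
        raise ValueError("customer_id must be positive")
    return ReportRequest(report_type, customer_id)
```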

8. Excessive agency

LLMs are not predictable – we all know that by now. When an LLM presents an unexpected or ambiguous output, the controls may not be in place to restrict what happens next. Excessive agency usually falls into three categories: the functionality the LLM can reach, the permissions it has, or the autonomy it holds. In each case, the developer has granted the model too much agency. Examples of each appear below, followed by a brief mitigation sketch:

 

  • Excessive functionality: An LLM with access to plugins that can modify or delete documents, or plugins that should have been removed during testing.
  • Excessive permissions: An LLM plugin that connects to a document repository with a generic, privileged identity just so it can provide answers. 
  • Excessive autonomy: A vulnerability where LLMs do not verify actions before completing them, such as deleting documents without checking with the user.
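As referenced above, here is a minimal, hypothetical mitigation sketch: the agent may only call tools on an allowlist, and any destructive tool requires explicit human confirmation. Tool names and the dispatch function are placeholders.

```python
# Hypothetical sketch: constrain agent tool use and require confirmation
# before any destructive action is carried out.
READ_ONLY_TOOLS = {"search_documents", "read_document"}
DESTRUCTIVE_TOOLS = {"delete_document"}

def dispatch(tool: str, args: dict) -> str:
    # Placeholder for the real tool implementations.
    return f"{tool} executed with {args}"

def execute_tool_call(tool: str, args: dict, confirm_with_user) -> str:
    if tool in READ_ONLY_TOOLS:
        return dispatch(tool, args)
    if tool in DESTRUCTIVE_TOOLS:
        # Excessive autonomy guard: a human must approve destructive actions.
        if not confirm_with_user(f"Allow the assistant to run {tool} with {args}?"):
            return "Action cancelled by user."
        return dispatch(tool, args)
    # Excessive functionality guard: anything not on a list is rejected outright.
    raise PermissionError(f"Tool {tool!r} is not permitted for this assistant")
```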

9. Overreliance

Especially when leveraging a trusted source, or one that has performed well for them in the past, users often take the output of an LLM as gospel. This can create overreliance, which becomes an issue if an LLM is hallucinating or if the tool has been tampered with. LLM-generated source code used without oversight or validation can introduce vulnerabilities into any environment, risking the operational safety and security of an application. OWASP suggests continuous validation mechanisms for all content taken from an LLM.

10. Model theft

Even a trustworthy LLM can be accessed and manipulated by threat actors, leading to vulnerabilities. Attackers can gain unauthorized access via network or application security misconfigurations, or simply by querying the model API using prompt injection and then creating a shadow model. LLM model theft is a growing concern, as these models hold a huge wealth of data and the trust of their customers. A comprehensive security framework for LLMs needs to include access control, data encryption, and scanning and monitoring processes.

Examples of Attack Scenarios Through AI and LLMs

Let’s look at just two common attack scenarios that can help you to understand in practice how developer use of LLMs can tangibly add risk or open your business up to an attack. 

 

The first is connected to hallucinations. There’s no doubt that LLMs hallucinate – any situation where the model makes up information to close gaps, or makes assumptions that are incorrect. When a developer asks for code and a suggested package is a hallucination, it’s easy to consider this merely an annoyance: you thought a package would be helpful, and it doesn’t even exist. In reality, however, attackers can leverage hallucinations for something far more sinister. When attackers see that a package suggestion is a hallucination, they can create that exact package, so that when the next person is given the same hallucination in response to a similar prompt, they will pull it down, unwittingly opening the organization to an attack.
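A lightweight check against this scenario is to vet any LLM-suggested package before installing it. The sketch below is hypothetical and assumes a Python/PyPI workflow: it queries the public PyPI JSON endpoint to see whether the package exists at all and how many releases it has; the threshold is illustrative, and `requests` is a third-party library you would need installed.

```python
# Hypothetical sketch: sanity-check an LLM-suggested package against PyPI
# before installing it. A missing or very new package deserves extra scrutiny.
import requests  # third-party, widely available

def vet_suggested_package(name: str) -> str:
    response = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if response.status_code == 404:
        return f"'{name}' does not exist on PyPI - possible hallucination."
    response.raise_for_status()
    data = response.json()
    release_count = len(data.get("releases", {}))
    if release_count < 3:
        return f"'{name}' exists but has only {release_count} release(s) - review before use."
    return f"'{name}' exists with {release_count} releases; still review its maintainers and code."
```

Existence alone is not proof of safety – as the attack above shows, a package may exist precisely because an attacker created it – so this is a first filter, not a replacement for full SCA.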

 

Our second example is an LLM arbitrary code exploit, which is similar to a supply chain attack. On a platform such as Hugging Face, where the machine learning community collaborates on AI and machine learning, there are many LLMs for developers to pull down and use. However, without robust scanning and governance, an attacker could add a malicious payload to a model and simply reupload it under a slightly different name. The idea is that users will download and run it, without realizing they are executing infected code.
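A basic governance step before loading any downloaded model file is to verify it against a checksum you trust and to treat pickle-based formats (which can execute code on load) with suspicion. The sketch below is hypothetical and generic, not a feature of any particular platform; the pinned digest and file suffixes are placeholders.

```python
# Hypothetical sketch: basic governance before loading a downloaded model file.
import hashlib
from pathlib import Path

PINNED_SHA256 = "d2a1b3c4..."  # placeholder digest published by your internal registry
PICKLE_SUFFIXES = {".bin", ".pt", ".pkl"}  # formats that can execute code on load

def check_model_artifact(path: Path, allow_pickle: bool = False) -> None:
    if path.suffix in PICKLE_SUFFIXES and not allow_pickle:
        raise RuntimeError(
            f"{path.name} uses a pickle-based format; prefer safer formats or scan it first"
        )
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != PINNED_SHA256:
        raise RuntimeError(f"{path.name} does not match the pinned checksum")
```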

Implementing the Right Strategies in the Era of AI Application Security

Understanding these risks doesn’t mean hitting the brakes on using GenAI or benefiting from LLMs for developers. Instead, it means that organizations need to focus on implementing the right safeguards and tools to securely leverage AI in their environment – and empower developers to benefit from the latest innovation, without negatively impacting risk reduction. 

 

No matter what, each AI solution will have a level of risk. To reduce that risk, organizations need to go through three discrete stages:

 

  1. Assess the situation: There is no one-size-fits-all solution for AI. Every early adopter of a new LLM or a new AI-based tool needs to consider what the risk might be. Ask yourself questions such as, is the LLM connected to public data? What community is it available to? When you know what you’re dealing with, you can create guidelines accordingly. 
  2. Define your needs: Now that you understand the risk, have you got visibility and control? What policies do you want to put in place for usage and governance? You may need to onboard new technologies specifically for AI security, and you will almost certainly have to implement education for developers, security teams, and the wider organization. One example could be introducing them to tools such as modelscan.ai, which can help developers to make safer choices when leveraging LLMs. 
  3. Execute your solution: Now that you know the risk, and you know your preferred mitigation, it’s time to execute. This could be anything from implementing new processes and tools to launching new education programs. Make sure you can detect threats and protect your environment, and include governance and reporting that can scale with growing usage of AI and be iterated on moving forward. 

 

Remember, it’s not about dropping the DevSecOps processes you already have in place. AI isn’t coming to replace your existing processes. Instead, think about how you can leverage AI to enforce the policies and processes that already work for your teams, so you can meet a fast-changing tech landscape and put new tools like LLMs to work for you. With a strong strategy around AI, you can expand the scope of what you do, while building solutions that enable smart use of the latest innovation. 

 

Let’s look at a few areas where Checkmarx has innovated with AI to accelerate AppSec teams, reduce AI-based attacks, and enable the developer workflow. 

The Application of AI in SAST Software from Checkmarx

At Checkmarx, we’ve been thinking about protecting applications from AI risks, and using AI more widely across our platform for some time. In particular, the application of AI in SAST (Static Application Security Testing) is one area we have built solutions for – empowering teams to use GenAI to enhance SAST and support developers in using AI and LLMs in a secure way. 

 

AI Security Champion with auto-remediation is a great example. Of course applications will have vulnerabilities, but the real test is whether you can get them mitigated quickly and before your release date. In Checkmarx’ Future of Application Security Report, we found that 29% of AppSec managers knowingly release vulnerable applications to meet a deadline. 

 

To support teams in both identifying and solving vulnerabilities in a significantly reduced amount of time, AI Security Champion finds issues in the application and provides the specific code that can be used in the development workflow to fix the vulnerability. Developers can review the issue and implement the fix without any bottlenecks or further support – enabling them to fix vulnerabilities at speed and scale without being security experts. As Checkmarx One is fully integrated into the development workflow, the whole process takes place directly within the IDE. The developer is presented with a Confidence Score between zero and 100 that indicates how exploitable the vulnerability is in context, an explanation of the vulnerability generated by OpenAI, a customized code snippet to remediate the issue, and the ability to ask additional questions where necessary. 

 

Development velocity is one of the key issues that stands between security and development teams, and so this feature is a powerful tool for collaboration, too – to ensure security isn’t ever put on the back burner, while enabling developers to keep to even their most ambitious timelines. 

 

Another powerful SAST tool that leverages AI is our AI Query Builder. Queries support AppSec teams in avoiding false positives and false negatives, and prioritizing the most critical issues within your environment. 

 

At Checkmarx, we use pre-built queries, as well as presets – collections of queries optimized for a specific type of application – to define SAST scans ahead of time. Queries are written in CxQL, our own language, and identify the most common security issues, supporting customers in securing their applications as soon as they start working with Checkmarx. You can also customize them to suit your specific needs. Queries can help search for issues such as SQL injection, insecure access controls, and cross-site scripting, to name just a few. 

 

Our AI Query Builder for SAST takes this approach to the next level, using AI to help AppSec teams write new custom queries or modify existing ones. With AI Query Builder, organizations can fine-tune their queries to increase accuracy and minimize the impact of false positives or negatives. Instead of manually creating queries, managers and developers (even those without a high level of technical knowledge) can generate tailored queries that improve risk-reduction processes and cover a far broader range of vulnerabilities across the organization. 

Using AI for IaC Guided Remediation

Instead of being handed an increasing number of problems to fix, we’ve found that guided remediation empowers developers to be 60-80% closer to solving an issue. That’s why we champion our AI Guided Remediation for IaC security and KICS. 

 

Powered by GPT-4, our guided remediation solution for Infrastructure as Code (IaC) guides teams through the process of fixing IaC misconfigurations, whether they have been identified through Checkmarx or via KICS (Keeping Infrastructure as Code Secure), a free open-source solution that performs static analysis on IaC files. Just like auto-remediation for SAST, everything takes place within the IDE, making it simpler for developers to implement fixes and work with their day-to-day tools and processes.

 

Whenever a vulnerability is uncovered, developers can either select from common questions or use the free-text option to ask their own questions. Without the need for pre-existing knowledge, developers can use the AI to follow actionable steps to remediate the issue, in real-time. They can then rescan, to validate that the risk has been removed. Altogether, issues in their IaC templates are resolved faster, management no longer needs to get involved with every vulnerability, and developers can feel empowered to deliver secure applications at speed. 

 

Of course, in line with OWASP best practices, at Checkmarx, we integrate secrets detection and removal into the guided remediation process, so that sensitive information like passwords and encryption keys cannot be inadvertently shared at a later date. 
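To illustrate the general idea behind this kind of safeguard (this is a hypothetical sketch, not the Checkmarx implementation), pattern-based secrets detection flags likely credentials in a file before its contents are ever shared with a GenAI service. The regular expressions below are illustrative only.

```python
# Hypothetical illustration (not the Checkmarx implementation): flag likely
# secrets in a file before its contents are shared with a GenAI service.
import re
from pathlib import Path

SECRET_PATTERNS = [
    re.compile(r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b\s*[:=]\s*['\"]?[^\s'\"]{6,}"),
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
]

def find_likely_secrets(path: Path) -> list[tuple[int, str]]:
    findings = []
    for number, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            findings.append((number, line.strip()))
    return findings  # block or redact these lines before anything leaves the IDE
```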

AI-Driven Software Composition Analysis

Traditionally, software composition analysis (SCA) is a technology that protects organizations from the risks inherent in open-source software. While open-source components aid development velocity, they can also introduce security vulnerabilities. SCA identifies all third-party components used by an application, transitively scans dependencies, assesses all components for known risks, and recommends remediation actions. SCA also evaluates relevant third-party software license requirements and restrictions, to avoid potential compliance issues or other legal complications. 

 

When it comes to AI, a significant part of the risk in trusting LLMs such as ChatGPT and Copilot is very similar to the issues described above. Our AI Security offering treats AI-generated code much as SCA treats open-source components. Checkmarx GPT provides real-time scanning of code generated by GitHub Copilot within the IDE, to validate the safety of the generated code, line by line, and to provide additional insight. For example, is the code a hallucination? Do AI-suggested open-source packages include any known vulnerabilities or malicious code? This tool is seamlessly integrated into Visual Studio, so that developers can detect and highlight potential vulnerabilities as the code is generated, giving them the power they need directly in their workflow. 

 

If and when a package is found to be unsafe, Checkmarx GPT’s real-time scanning capability immediately provides this information to the developer. If there is not a lot of information available about a particular package, the tool educates the team about hallucinations and other risks, and may suggest alternatives to the potentially malicious package. 

 

Going further, Checkmarx GPT does more than just scan code; the tool can suggest specific package versions that present the least risk and share all the open-source licenses associated with packages – capabilities that ChatGPT and other large language models generally do not provide. Altogether, your developers and your managers can be confident that the code being pulled in is accurate, safe, the best version for the job, and free of unknown license risks. 

Checkmarx’ Application of AI in Security

At Checkmarx, we understand the potential of AI, specifically GenAI and LLMs, and we don’t think the risks should slow your development teams down. We want to empower customers to benefit from the development velocity and knowledge-sharing opportunities that come with LLMs, by offering the peace of mind that any output will be vetted by a trusted resource – and where it is dangerous, your teams will be informed ahead of time. 

 

To make this happen, our approach to the application of AI in cyber security is built around two pillars: 

 

    1. Accelerate AppSec with GenAI: AI has huge potential for supporting AppSec teams in getting products and features to market faster, and with less risk. With IaC guided remediation for developers, and AI Security Champion with auto-remediation for SAST, our AI-based tools ensure developers are closer than ever to fixing problems, with clear and robust results instead of a sea of challenges and false positives. While generic AI tools won’t always give you the insight and intelligence you need, Checkmarx GPT helps to identify licenses, components and even alternative packages, and acts as full software composition analysis where it’s needed most.
    2. Prevention and Protection: At the same time, the industry can’t afford to ignore the threat landscape that’s growing as a direct result of LLMs and GenAI. Prompt injection, AI hallucinations and AI secrets leakage all have their own unique risks attached. Checkmarx GPT and real-time scanning for GitHub Copilot in the IDE scan your LLM-generated code for vulnerabilities, while features such as AI Query Builder let you get granular about the specific protections and scans you need at the earliest possible stages of the SDLC. In addition, to avoid the risk of leaking sensitive information via prompt injection, our partnership with Prompt Security provides browser and IDE extensions that detect the difference between secrets and code when information is shared to a GenAI or collaboration platform. Prompt Security obfuscates secrets such as credentials or IP, while sharing only the code that Checkmarx has confirmed to be non-proprietary.

 

Our own CEO, Sandeep Johri, describes the opportunity best, “Nothing more perfectly represents the decision-making tension faced by CISOs than the existence of both significant opportunities and new vulnerabilities presented by open source and GenAI-generated code. Checkmarx has long been a pioneer in application security for enterprise customers and, with GenAI playing an increasing role in application development, we’re pleased to provide the first solution to help protect against the new generation of attacks already emerging.”

 

15% of companies have banned AI code generation, yet 99% of security professionals see it in use. We surveyed over 900 AppSec managers and CISOs to understand the benefits and challenges surrounding AI.

Download the exclusive Checkmarx-commissioned report “7 Steps to Safely Use Generative AI in Application Security” now.

 

Looking to accelerate the use of AI in Application Security, while simultaneously securing GenAI-related threats in an increasingly complex developer environment? Schedule a demo of our AI-based application security tools.