Securing the software supply chain was hard enough when the main challenge was the proliferation of publicly available third-party libraries, modules, and packages, which made it all too easy for developers to import potentially insecure code into their applications.
Now, the advent of generative AI coding tools has added another major challenge. Code generated by AI is essentially third-party code that, like other forms of external code (such as code from open source repositories), could introduce security vulnerabilities and risks into a business.
Hence the need to add AI risk management protections to traditional application security testing procedures. Read on as we explain why and how businesses can mitigate the security risks of AI coding tools.
The ubiquity of AI-generated code
Ever since GitHub Copilot – the first AI-powered coding solution to go mainstream – was introduced in 2021, using AI for coding has become a commonplace practice among software developers. As Stack Overflow reported based on a 2024 developer survey, 76 percent of coders now employ AI to help write software. Even at large enterprises known for hiring talented developers, AI-generated code has gone mainstream. At Microsoft, for example, 30 percent of code is now written by AI, according to CEO Satya Nadella.
This means that, like it or not, significant portions of the code within modern applications originate from AI tools. The amount of AI-generated code is likely only to grow as the sophistication of AI-assisted development tools increases.
In many respects, this is a good thing. From a productivity standpoint, AI-generated code has much to offer because it can significantly reduce the time it takes developers to write and test software. Back in 2023 (when AI-assisted coding tools and integrations were not as mature as they are today), McKinsey concluded that AI can speed up total coding time by as much as 45 percent.
There is also evidence that AI can improve the developer experience. Some 75 percent of coders report feeling more “fulfilled” when they use AI, presumably because AI can help automate tedious tasks like generating repetitive boilerplate code.
How secure is AI-generated code?
From a security perspective, however, AI-generated code presents some potentially serious risks. Code produced by AI tools can contain multiple types of vulnerabilities and risks, such as the following (illustrated in the short sketch after this list):
- Lack of proper input validation, potentially leading to injection attacks.
- Memory management problems, which threat actors could exploit to launch buffer overflow attacks.
- Insecure dependencies, which could lead to software supply chain risks.
- Insecure management of secrets (such as passwords, access keys, and tokens) that applications use to connect to other applications, databases, and so on.
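To make a couple of these categories concrete, here is a minimal Python sketch contrasting a risky pattern an assistant might plausibly produce with a safer equivalent. It is entirely hypothetical: the function names, table, and token are invented for illustration, not taken from any real AI tool's output.

```python
import os
import sqlite3

# Risky pattern an assistant might emit: string-built SQL plus a hardcoded secret.
API_TOKEN = "sk-example-not-a-real-secret"  # secret committed to source (bad)

def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Unvalidated input concatenated straight into the query -> SQL injection.
    query = "SELECT id, email FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()

# Safer equivalents: a parameterized query, and a secret read at runtime.
def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    query = "SELECT id, email FROM users WHERE name = ?"
    return conn.execute(query, (username,)).fetchall()

def get_api_token() -> str:
    # Fails loudly if the secret is missing instead of shipping a baked-in value.
    return os.environ["API_TOKEN"]
```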
To be sure, these security shortcomings may be present in code written by humans, too. However, from a security perspective, there are several key distinctions between human-generated code and AI-generated code that can make security problems in the latter more common or severe.
#1. Lack of review
Code written by a human receives careful consideration and analysis as the coder writes it. In contrast, there is no guarantee that a human will analyze, or think in any detail about, AI-generated code. If the code runs – as it often does – developers can blindly integrate it into their applications without checking whether it includes risks like insecure dependencies or memory management oversights.
#2. Lack of contextual awareness
Most AI coding tools have limited understanding of contextual factors that could influence security best practices. This makes it harder for them to adhere to security best practices, especially those that are specific to a particular organization.
For example, a human developer might know which types of data input an application should accept based on the role that the application plays in the business. An AI tool would have no way of knowing this because it doesn’t know how the business works. All AI knows how to do is generate code. As a result, the AI tool would not be able to write input validation code as reliably as a human.
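As a minimal sketch of what that business context can look like in practice, an organization-specific allow-list check might be as simple as the following. The order-ID format and function name below are invented assumptions for illustration, not a real standard.

```python
import re

# Hypothetical business rule: internal order IDs are "ORD-" plus exactly eight
# digits. A human developer knows this; a general-purpose AI assistant does not.
ORDER_ID_PATTERN = re.compile(r"ORD-\d{8}")

def validate_order_id(raw: str) -> str:
    """Allow-list validation based on organization-specific knowledge."""
    candidate = raw.strip()
    if not ORDER_ID_PATTERN.fullmatch(candidate):
        raise ValueError("Rejected input: not a recognized order ID format")
    return candidate
```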
#3. Inefficiency of fixing security mistakes
If a human makes a security mistake when writing code and catches it quickly, the fix is usually straightforward. But in workflows where developers generate code using AI and then commit it to a repository without security testing or analysis, they may miss critical security issues until later in the software development lifecycle (SDLC).
This is a problem because the longer an issue persists within the SDLC, the more time and effort it typically requires on the part of developers to fix the problem. They may have to update other code that depends on or integrates with the insecure code as well as mitigate the insecure code itself. Overall, 68 percent of developers say that they are now spending more time resolving security vulnerabilities than they did prior to using AI-generated code.
#4. Hallucination risks
Generative AI tools can “hallucinate,” meaning they produce information that is inaccurate. This flaw may lead to security risks like package hallucination, which occurs when AI code refers to packages or other dependencies that don’t actually exist. If threat actors are able to plant packages containing malicious code in public repositories and assign names that match those of the hallucinated packages, applications that refer to the packages could end up downloading and executing malicious software.
In some respects, these types of attacks are a variation on typosquatting, a method threat actors have long used to take advantage of mistakes made by developers when writing out package names or repository URLs. But AI opens the door to new types of attacks in this vein – and ones that are harder to detect because, unlike package names that contain typos, hallucinated package names often look legitimate.
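One lightweight safeguard (a sketch only, not a feature of any particular product) is to flag declared dependencies that do not resolve in the public registry at all, a strong sign of hallucination, so they can be removed before an attacker has a chance to register them. The example below checks the names in a requirements.txt file against PyPI's public JSON API; the file parsing is deliberately simplified.

```python
import re
import sys
import urllib.request
from urllib.error import HTTPError

def package_exists_on_pypi(name: str) -> bool:
    """Return True if the package name resolves on PyPI's public JSON API."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False  # no such package: likely hallucinated
        raise

def check_requirements(path: str = "requirements.txt") -> None:
    with open(path) as f:
        # Simplified parsing: take the bare name before any version specifier.
        names = [
            re.split(r"[\[<>=!~; ]", line.strip(), maxsplit=1)[0]
            for line in f
            if line.strip() and not line.startswith("#")
        ]
    missing = [n for n in names if n and not package_exists_on_pypi(n)]
    if missing:
        print("Possible hallucinated packages (not found on PyPI):", missing)
        sys.exit(1)

if __name__ == "__main__":
    check_requirements()
```

Run in CI before dependencies are installed, a check like this turns a hallucinated dependency name into a build failure rather than a potential supply chain compromise.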
Testing AI-generated code for security flaws
While AI introduces novel types of security challenges, the good news is that developers and DevSecOps teams don’t need fundamentally new types of security solutions for AI risk management. In general, the same types of security testing tools and techniques that help to protect traditional code can secure AI-generated code as well.
Specifically, businesses should deploy the following types of tests for AI-generated code:
- Software Composition Analysis (SCA) scanners, which can detect insecure dependencies and software supply chain security risks. Ideally, SCA tools used to test AI-generated code will be able to detect risks like hallucinated packages in addition to more traditional third-party code risks (such as an external library affected by known vulnerabilities).
- Static Application Security Testing (SAST) tools, which check source code and executable files for flaws like injection and buffer overflow vulnerabilities. Emphasize scanners that evaluate code in real time, before the first commit. This “extreme shift left” provides immediate feedback the moment code (whether human- or AI-created) introduces a potential security risk, before the code is committed and before a complete SAST scan runs (see the sketch after this list).
- Dynamic Application Security Testing (DAST) scanners, which evaluate running applications for security risks.
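To illustrate what scanning before the first commit can look like in practice, here is a sketch of a Git pre-commit hook. The `security-scanner` command is a placeholder for whatever real-time scanning tool a team adopts, not a real CLI.

```python
#!/usr/bin/env python3
"""Git pre-commit hook sketch: scan staged files before they are ever committed.

Save as .git/hooks/pre-commit and mark it executable. "security-scanner" is a
placeholder command, not a real CLI; swap in the scanner your team actually uses.
"""
import subprocess
import sys

def staged_files() -> list:
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [f for f in result.stdout.splitlines() if f]

def main() -> int:
    files = staged_files()
    if not files:
        return 0
    # Placeholder scanner invocation; a nonzero exit code means findings exist.
    scan = subprocess.run(["security-scanner", "--", *files])
    if scan.returncode != 0:
        print("Commit blocked: resolve the reported security findings first.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```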
Because no single type of security testing tool can guarantee protection against all types of risks, it’s a best practice to deploy each of these types of testing solutions, preferably as a part of a unified platform that can correlate risk across the different tools and provide a single, unified view of risk. This is true of traditional, human-written code, but it’s all the more important for AI-generated code, since the latter may not undergo thorough review by developers prior to being committed to a repository or compiled into an executable.
Adding AI security to application security testing with Checkmarx One
As a comprehensive application security solution, Checkmarx One delivers all of the key capabilities – including SCA, SAST, and DAST – that businesses need to secure their code, no matter who or what writes it. Learn more by requesting a demo.