A recent study finds that software engineers who use code-generating AI systems are more likely to introduce security vulnerabilities into the applications they develop. The paper, co-authored by a team of researchers affiliated with Stanford, highlights the potential pitfalls of code-generating systems as vendors like GitHub begin marketing them in earnest.
“Code generation systems are currently not a replacement for human developers,” Neil Perry, a PhD candidate at Stanford and co-lead author of the study, told TechCrunch in an email interview. “Developers using them to complete tasks outside of their own areas of expertise should be concerned, and those using them to speed up tasks that they are already skilled at should carefully double-check the outputs and the context in which they are used throughout the project.”
Codex, OpenAI’s code-generating system, was trained on billions of lines of public code to suggest additional lines of code and functions given the context of existing code. The system surfaces a programming approach or solution in response to a description of what a developer wants to accomplish (e.g. “Say hello to the world”), drawing on both its knowledge base and the current context.
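To picture that workflow: the developer writes a natural-language description as a comment, and the system proposes an implementation. The snippet below is purely illustrative, not actual Codex output.

```python
# Illustrative only: a natural-language prompt (the comment below) and the
# kind of completion a system like Codex might suggest in response.

# Say hello to the world
def say_hello() -> str:
    return "Hello, world!"

print(say_hello())
```

The suggested completion is shaped by both the prompt and whatever code already surrounds it, which is why the researchers stress checking the context a suggestion lands in, not just the suggestion itself.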
According to the researchers, study participants who had access to Codex were more likely to write incorrect and “insecure” (in the cybersecurity sense) solutions to programming problems compared to a control group. More worryingly, they were also more likely to say that their insecure answers were secure compared with people in the control group.
Megha Srivastava, a graduate student at Stanford and the second co-author of the study, pointed out that the results aren’t a complete condemnation of Codex and other code-generating systems. For one, the study participants lacked the security expertise that would have let them better spot code vulnerabilities. That aside, Srivastava believes code-generating systems are reliably useful for tasks that aren’t high-risk, like exploratory research code, and could improve their coding suggestions with fine-tuning.
“Companies that develop their own [systems], perhaps further trained on their internal source code, might be better off, as the model can be encouraged to generate outputs more in line with their coding and security practices,” Srivastava said.
So how could vendors like GitHub prevent the introduction of security holes by developers using their code-generating AI systems? The co-authors have a few ideas, including a mechanism to “tune” user prompts to be safer – much like a supervisor reviewing and revising code drafts. They also suggest that developers of cryptography libraries ensure that their defaults are secure, as code-generating systems tend to stick to defaults that are not always exploit-free.
“AI assistant code-generation tools are a really exciting development, and it’s understandable that so many people are eager to use them. But these tools raise issues to consider going forward… Our goal is to make a broader statement about the use of code generation models,” Perry said. “More work needs to be done to explore these issues and develop techniques to address them.”
GitHub’s attempt to address this is a filter, first introduced to Copilot in June, that checks code suggestions, along with roughly 150 characters of surrounding code, against public GitHub code and hides suggestions when there is a match or “near match.” But it is an imperfect measure. Tim Davis, a computer science professor at Texas A&M University, found that enabling the filter still caused Copilot to emit large chunks of his copyrighted code, with no attribution or license text.
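GitHub hasn’t published the filter’s actual algorithm, but the behavior described above can be sketched roughly as follows. The 150-character window, the stand-in public-code corpus, and the similarity threshold are all assumptions for illustration:

```python
import difflib

# Stand-in for an index of public GitHub code (assumption: the real filter
# queries a much larger corpus with a purpose-built matcher).
PUBLIC_CODE = [
    "for (int i = 0; i < n; i++) { sum += a[i]; }",
]

def should_hide(context: str, suggestion: str, threshold: float = 0.9) -> bool:
    """Hide a suggestion when it, together with its surrounding context,
    closely matches known public code ("near match")."""
    # Compare only the trailing ~150 characters, mirroring the window
    # the filter is described as checking.
    window = (context + suggestion)[-150:]
    return any(
        difflib.SequenceMatcher(None, window, known).ratio() >= threshold
        for known in PUBLIC_CODE
    )
```

Davis’s finding suggests the real matcher’s notion of “near match” can miss verbatim copies once they are sliced differently, which is exactly the failure mode a similarity threshold like the one above can exhibit.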
“[For these reasons,] we largely express caution against using these tools to replace training entry-level developers in solid coding practices,” Srivastava added.