Using ChatGPT creates risks for code development

Updated May 26, 2024

Using code created by artificial intelligence without proper testing can lead to a number of risks and consequences for organizations.

Untested code may contain bugs and errors that cause crashes and poor performance, may fall short of quality standards and best practices, and may violate regulatory and compliance requirements, exposing the organization to legal and financial consequences.

To ensure high-quality code, IT leaders must prioritize human oversight and continuous monitoring, and combine these measures with rigorous testing and code validation to confirm that AI-generated code complies with security protocols.

"Conduct testing when, where, and how your developers work," says Scott Gerlach, co-founder and CSO of StackHawk, "Think about testing requirements in advance and involve all key stakeholders in the process design to ensure buy-in." He recommends making testing an integral part of the software development lifecycle, automating testing as part of continuous integration and continuous delivery (CI/CD) while developers work on the code.

"Educate developers with targeted, pattern-based training in the context of their code and business importance," he adds. "You also need to provide self-service tools to help developers understand what problems occur, why they're important, and how to recreate the problem so they can fix it, adopt and document solutions."

Jim Scheibmayr, a senior research director at Gartner, explained via email that using code from artificial intelligence coding assistants carries the same risk as copying and pasting code from Stack Overflow or other online resources.

"We need to utilize artificial intelligence coding assistants to create code documentation to improve understanding and knowledge of the solution and speed up the process," Scheibmayr says.

Human-centered code review processes

Randy Watkins, CTO of Critical Start, advises organizations to build their own policies and methodology when it comes to incorporating artificial intelligence-generated code into their software development practices.

"In addition to some standard coding best practices and technologies, such as static and dynamic code analysis and secure CI/CD techniques, organizations should continue to monitor the evolving field of software development and security," he told InformationWeek via email.

He said organizations should use code generated by artificial intelligence as a starting point, but bring in human developers to test and refine the code so it meets standards.

John Bambenek, chief threat hunter at Netenrich, adds that management should "value secure code" and ensure that all code sent to production goes through at least automated testing.

"Ultimately, many of the risks of generative AI code can be addressed through effective and thorough mandatory testing," he said in an email.

He explains that the CI/CD pipeline needs to ensure mandatory testing of all production commits and regular comprehensive evaluation of the entire code base.

"Keep an inventory of software libraries in use so you can check for updates or include packages with typos, and manage secrets to keep keys and credentials out of code repositories," Bambenek says.

Paving the way to clarity

A recent Sauce Labs survey of 500 U.S. developers found that more than two-thirds (67%) of respondents admitted to sending code to production without testing, and six out of 10 developers surveyed admitted to using untested code created with ChatGPT.

Jason Baum, director of community engagement at Sauce Labs, says it's about leaders being able to take action and forge a clear path amidst the clutter.

"With code generated by artificial intelligence, we're often flying blind on context and functionality, so rigorous testing is not just prudent, it's essential to avoid financial and reputational losses," he explains. "When we set crystal clear expectations, we're not just speeding code to market, we're supporting a culture where quality and safety are honored, not compromised."

Baum says balancing AI efficiency with code quality is like expecting fresh coffee to be served straight from the coffee bean: skipping the grinding and brewing process is not an option.

"Just like journalists won't let ChatGPT publish a story without a review, we can't let code generated by artificial intelligence go into production without a thorough review," he explains. "It's about educating our developers and having a strong review network to catch the unseen, ensuring our code gets to the finish line quickly and safely."

Josh Thorngren, head of developer advocacy at ForAllSecure, agrees that quality and security testing should be made as frictionless as possible so it doesn't pull developers out of the code/build/ship workflow.

For example, if an organization runs a security testing tool in the CI process, developers should receive that tool's results through their issue tracker or CI tool, without having to log into the security product to see them.
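
A minimal sketch of that pattern, assuming the security scanner writes a JSON report and the team tracks work in GitHub Issues; the repository name and report fields are placeholders.

```python
# file_findings.py -- forward security-scan findings into the team's issue tracker
# so developers never have to log into the security product itself.
# Assumes a findings.json report, the `requests` package, and a GITHUB_TOKEN
# environment variable; the repository name and report fields are placeholders.
import json
import os

import requests

REPO = "example-org/example-app"  # placeholder repository
API_URL = f"https://api.github.com/repos/{REPO}/issues"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

with open("findings.json") as fh:
    findings = json.load(fh)

for finding in findings:
    issue = {
        "title": f"[security] {finding['rule']} in {finding['file']}",
        "body": f"{finding['description']}\n\nHow to reproduce:\n{finding['reproduce']}",
        "labels": ["security", "from-ci"],
    }
    response = requests.post(API_URL, headers=HEADERS, json=issue, timeout=30)
    response.raise_for_status()
    print(f"Filed issue: {response.json()['html_url']}")
```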

"We also have to create a culture where the balance between quality and speed will not always tilt toward speed," he adds. "These are not new problems, but the speed of AI code generation increases their impact on safety, stability and quality, increasing the complexity of each."
