As we continue to explore the legal implications of using AI-generated code, I wanted to extend a big thanks to ZDNET commenter @pbug5612 for inspiring us to journey down this rabbit hole.
In our first article of the series, we looked at who owns the code created by AI chatbots like ChatGPT. In this article, we’ll discuss issues of liability.
Functional liability
To frame this discussion, I’ll turn to attorney and long-time Internet Press Guild member Richard Santalesa. With his tech journalism background, Santalesa understands this stuff from both a legal and a tech perspective. (He’s a founding member of the SmartEdgeLaw Group.)
“Until cases grind through the courts to definitively answer this question,” Santalesa advises, “the legal implications of AI-generated code are the same as with human-created code.”
Keep in mind, he continues, that code generated by humans is far from error-free. You’ll never see a service level agreement warranting that code is perfect or that users will have uninterrupted use of the services.
He also points out that it’s rare for all parts of a software application to be entirely home-grown. He says, “Most coders use SDKs and code libraries that they have not personally vetted or analyzed, but rely upon nonetheless. I think AI-generated code — for the time being — will be in the same bucket as to legal implications.”
Send in the trolls
Sean O’Brien, a lecturer in cybersecurity at Yale Law School and founder of the Yale Privacy Lab, pointed out a risk for developers that’s undeniably worrisome:
The chances that AI prompts might output proprietary code are very high, if we’re talking about tools such as ChatGPT and Copilot which have been trained on a massive trove of code of both the open source and proprietary variety.
We don’t know exactly what data trained the chatbots. And so we don’t know whether segments of code output by ChatGPT and similar tools are genuinely generated by the AI or merely echoed back from code it ingested during training.
If you’re a developer, it’s time to sit down. Here’s O’Brien’s prediction:
I believe there will soon be an entire sub-industry of trolling that mirrors patent trolling, but this time surrounding AI-generated works. As more authors use AI-powered tools to ship code under proprietary licenses, a feedback loop is created. There will be software ecosystems polluted with proprietary code that will be the subject of cease-and-desist claims by enterprising firms.
As soon as O’Brien mentioned the troll factor, the hairs on the back of my neck stood up. This is going to be very, very messy. Think about that and try not to get ill.
Here’s another thought. There will be those who attempt to corrupt the training corpora (the sources of knowledge that AIs draw on to produce their results). One of the things we humans do is find ways to game the system. So not only will there be armies of legal trolls looking for someone to sue; there will also be hackers, criminals, rogue nation states, high school students, and crackpots, all attempting to feed erroneous data into every AI they can find, either for the lulz or for far more nefarious reasons.
Maybe try not to think about that too much.
Canadian attorney Robert Piasentin, a partner in the Technology Group at McMillan LLP, a Canadian business law firm, points out that chatbots could have been trained on open-source work and legitimate sources, but they may also have been trained on copyrighted work. And that training data might include flawed or biased data (or algorithms) as well as corporate proprietary data.
Here’s how he describes it: “If the AI draws on incorrect, deficient or biased information, the output of the AI tool may give rise to various potential claims depending on the nature of the potential damage or harm that the output may have caused (whether directly or indirectly).”
Who is at fault?
What none of the lawyers discussed is who is at fault if the code generated by an AI results in some catastrophic outcome.
For example: The person or company delivering a product shares some responsibility for, say, choosing a library with known deficiencies. If a product ships using a library with known exploits, and that product causes tangible harm, who owns the failure? The product maker, the library’s coder, or the company that chose to use the product?
Usually, it’s all three.
Now add AI code into the mix. Clearly, most of the responsibility falls on the shoulders of the coder who chooses to use code generated by an AI. After all, it’s common knowledge that such code may not work as intended and needs to be thoroughly tested.
But in a comprehensive lawsuit, will claimants also go after the companies that produce the AIs and even those organizations whose content was taken (even if without permission) to train those AIs?
As every attorney has told me, there is very little case law thus far. We won’t really know the answers until something goes horribly wrong, parties wind up in court, and it’s adjudicated thoroughly.
We’re in uncharted waters here. My best advice, for now, is to test your code thoroughly. Test, test, and then test some more.
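What does that look like in practice? Here’s a minimal sketch in Python. The `slugify` function and its behavior are purely hypothetical stand-ins for something an AI might generate; the point is to treat generated code as untrusted and pin down its behavior with automated tests before shipping:

```python
# Hypothetical AI-generated helper we want to vet before shipping.
def slugify(title: str) -> str:
    """Turn a title into a URL-safe slug (assume an AI wrote this)."""
    return "-".join(title.lower().split())

# Treat generated code as untrusted: assert its expected behavior,
# including edge cases, before it gets anywhere near production.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("") == ""
    assert slugify("  Extra   Spaces  ") == "extra-spaces"

test_slugify()
```

A test suite like this won’t settle any of the legal questions above, but it’s the one piece of due diligence that’s entirely within your control.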
You can follow my day-to-day project updates on social media. Be sure to follow me on Twitter at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.