
Code generation with AI: Trust But Verify

4 min read · Apr 11, 2025


After experimenting with various AI tools (ChatGPT, Gemini, Microsoft Copilot, Claude, and others) for software engineering, I want to share some of my observations and conclusions.

Trust but verify

I serve as a board member for a software company with massive amounts of data, so what better way to experiment than to build a web-based dashboard to monitor the company's leading key performance indicators?

Building a web application from scratch involves a lot of boilerplate coding, and using an AI tool such as ChatGPT or JetBrains AI boosts the speed of writing code.

Now that my application has been deployed and is running, it is time to reflect on the benefits and downsides of using AI tools.

These tools generate code much faster than I can think and write. The code is usually syntactically correct and compiles and runs as expected. Furthermore, the code generated primarily uses idiomatic (best practice) coding patterns.

However, you only get what you ask for, and sometimes, the code generated has severe security vulnerabilities and maintenance issues, even though it is syntactically correct and compiles and executes as expected.

Such flaws are often called “zero-day” vulnerabilities. They refer to developers having had zero days to fix them because they didn't even know the vulnerability existed.

Zero-day vulnerability example: generating JSON Web Token (JWT) security code

A JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties.

Using a short-lived JWT access token combined with a long-lived JWT refresh token increases security by minimising the impact of token theft:

• Short-lived access tokens expire quickly (e.g., in minutes), so they can only be used for a short time, even if stolen.

• Refresh tokens are kept securely (e.g., server-side or in an HTTP-only cookie) and used to obtain new access tokens when needed.

• This setup limits exposure and allows for revocation or detection of abuse through refresh token controls (e.g., rotation, IP checks, device binding).

It reduces the attack window and lets you safely maintain sessions without compromising security.
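
As a concrete illustration (not the exact code from my project), here is a minimal sketch of how the two tokens could be issued, assuming the jsonwebtoken crate, an HS256 shared secret, and illustrative claim fields and lifetimes:

use jsonwebtoken::{encode, EncodingKey, Header};
use serde::{Deserialize, Serialize};
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Serialize, Deserialize)]
struct Claims {
    sub: String, // the user this token was issued to
    exp: usize,  // expiry, in seconds since the Unix epoch
}

fn now_secs() -> usize {
    SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs() as usize
}

// Issue a short-lived access token (15 minutes) and a long-lived refresh token (7 days).
fn issue_tokens(user_id: &str, secret: &[u8]) -> (String, String) {
    let key = EncodingKey::from_secret(secret);
    let access = encode(
        &Header::default(),
        &Claims { sub: user_id.to_owned(), exp: now_secs() + 900 },
        &key,
    ).expect("signing access token");
    let refresh = encode(
        &Header::default(),
        &Claims { sub: user_id.to_owned(), exp: now_secs() + 604_800 },
        &key,
    ).expect("signing refresh token");
    (access, refresh)
}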

The tokens are generated during login and returned to the browser as two separate cookies. The short-lived JWT access token is supplied by the browser on every interaction with the server, such as when making an HTTP REST request, but it needs to be refreshed often (every 15 minutes in my case).

To request another access token, the client supplies the refresh token.

This technique relies on the fact that the browser only includes an HTTP cookie in a request when the cookie's Path attribute matches the path of the requested REST resource.
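
To sketch the refresh flow under the same assumptions (reusing the Claims type and issue_tokens helper from the sketch above): when the access token expires, the refresh endpoint validates the refresh token and, if it is still valid, mints a fresh access token to send back to the browser.

use jsonwebtoken::{decode, DecodingKey, Validation};

// Called when the client hits the refresh endpoint with its refresh cookie.
// Returns a new access token, or None if the refresh token is invalid or expired.
fn refresh_access_token(refresh_token: &str, secret: &[u8]) -> Option<String> {
    // Validation::default() verifies the HS256 signature and the exp claim.
    let claims = decode::<Claims>(
        refresh_token,
        &DecodingKey::from_secret(secret),
        &Validation::default(),
    )
    .ok()?
    .claims;

    // Re-issue a short-lived access token for the same user; it is then
    // returned to the browser as a cookie, as discussed below.
    let (access, _refresh) = issue_tokens(&claims.sub, secret);
    Some(access)
}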

Writing boilerplate code for implementing authentication with JWT is boring, but it is a perfect job for an AI tool. After a couple of iterations with code generation, refinement and some more AI prompts, the generated Rust code looked good, was organised into separate files, and followed the DRY principle.

However, I noticed these lines of code:

.... many lines of code were deleted for readability...

// INSECURE: DON'T USE THIS CODE!
let access_cookie = format!("access={access_token}; HttpOnly; Max-Age=900; Path=/; SameSite=Strict");
let refresh_cookie = format!("refresh={refresh_token}; HttpOnly; Max-Age=604800; Path=/; SameSite=Strict");

..... even more deleted code here

Can you spot the vulnerability? The code seems perfectly fine at first glance, but beware: Path=/ is identical for both tokens! Consequently, the browser will send both tokens with every HTTP request, because the path / matches any path on the server side.

This piece of code effectively increases the attack window rather than reducing it.

So here is what the code should have looked like:

.... many lines of code were deleted for readability...

// The access token is only transmitted for URLs starting with /protected/api
let access_cookie = format!("access={access_token}; HttpOnly; Max-Age=900; Path=/protected/api; SameSite=Strict");
// The refresh token API has a separate and unique path
let refresh_cookie = format!("refresh={refresh_token}; HttpOnly; Max-Age=604800; Path=/token/refresh; SameSite=Strict");

..... even more deleted code here

Lessons learned

As a die-hard software engineer, I love and trust my code-generating LLM friends to generate high-quality code that works well. However, inserting the generated code without review is risky from many perspectives:

  • Generated code could introduce zero-day security vulnerabilities, performance bottlenecks, violations of the overall architecture principles, etc.
  • Generated code can be as challenging to maintain as other people's code once they have left the building. Hey, I struggle to remember the thinking behind code I wrote myself after some time, even more so when someone else has generated it.
  • There is no built-in incentive to write unit and integration tests for generated code, yet they remain imperative; I usually write the tests before or in parallel with the implementation (see the small test sketch after this list).
  • A senior engineer needs to review the generated code before deployment to production. Peer review is considered a best practice, so keep doing it.
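
On the testing point above, here is one small, hedged example of the kind of unit test worth writing for this article's fix; build_cookies is a hypothetical helper standing in for wherever the corrected cookie strings are produced:

// Hypothetical helper wrapping the corrected cookie construction shown earlier.
fn build_cookies(access_token: &str, refresh_token: &str) -> (String, String) {
    let access_cookie = format!("access={access_token}; HttpOnly; Max-Age=900; Path=/protected/api; SameSite=Strict");
    let refresh_cookie = format!("refresh={refresh_token}; HttpOnly; Max-Age=604800; Path=/token/refresh; SameSite=Strict");
    (access_cookie, refresh_cookie)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn cookies_are_scoped_to_distinct_paths() {
        let (access, refresh) = build_cookies("a-token", "r-token");
        // The access token must only travel with protected API requests.
        assert!(access.contains("Path=/protected/api"));
        // The refresh token must only travel to the refresh endpoint.
        assert!(refresh.contains("Path=/token/refresh"));
        // Neither cookie may fall back to the site-wide path the AI originally generated.
        assert!(!access.contains("Path=/;"));
        assert!(!refresh.contains("Path=/;"));
    }
}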

Happy coding! Trust your new best friend, but remember: you get what you ask for, so trust but verify!
