Vercel v0-GPT-5: a big improvement
I was pleasantly surprised to see that, amid all the hoopla about GPT-5 this week, Vercel had quietly chosen to introduce v0-GPT-5 as one of their models! And at the same cost as the v0 "medium" model. I was curious to see if this was a genius fruit of two AI companies at the top of their game, or if this was a demented Frankenstein's AI monster to steer clear of in my own projects. This was a great real-world test of GPT-5's utility.
Vercel v0 is best known for generating UIs with Tailwind, shadcn/ui, and other beloved packages. It is not known for general-purpose programming. I decided to put the new model to the test in one challenging UI area: real-time refresh in Next.js.
Full disclosure: Claude Code was messing this up on one of my projects, so I thought: why not create a sample that I can then get Claude to imitate?
GPT-5 thought hard for about 20 seconds, with no way to see its thoughts. Then it suddenly started generating code, and it was possible to see that it was planning the work before carrying it out. In a single attempt, it created all of the samples I wanted - and they all worked! (Well... mostly!)
I saw that it was using APIs instead of server actions to retrieve the fake payload - early signs of insubordination? It actually had a reasonable explanation, and it didn't immediately apologize and offer to undo it like a lot of LLMs tend to do! It's good when an LLM isn't swinging back and forth like a weathervane.
A matter of semantics, sure, but it made sense, so I decided not to ask it to change to server actions. I proceeded to try out what it had already one-shotted.
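For context, the API-route approach amounts to something like this sketch (the file name and payload shape are my inventions, not v0's actual output):

```typescript
// app/api/updates/route.ts -- an App Router route handler serving a
// fake payload for the demos (my sketch, not v0's generated code)
import { NextResponse } from "next/server";

export async function GET() {
  return NextResponse.json({
    message: `Update at ${new Date().toISOString()}`,
    value: Math.round(Math.random() * 100),
  });
}
```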
The simple polling and TanStack Query polling samples worked out of the box.
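The TanStack Query version boils down to its refetchInterval option. Here is a minimal sketch of the pattern, assuming an /api/updates route like the one above and an app already wrapped in a QueryClientProvider:

```typescript
// components/polling-demo.tsx -- polling via TanStack Query's
// refetchInterval (my sketch of the pattern, not the generated code)
"use client";

import { useQuery } from "@tanstack/react-query";

export function PollingDemo() {
  const { data, isError } = useQuery({
    queryKey: ["updates"],
    queryFn: () => fetch("/api/updates").then((res) => res.json()),
    refetchInterval: 2000, // re-fetch every 2 seconds
  });

  if (isError) return <p>Polling failed</p>;
  return <pre>{JSON.stringify(data, null, 2)}</pre>;
}
```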
However, the SSE failed:
I asked GPT-5 to fix it. Again the long inscrutable wait with no insight into its "thoughts" for about 20 seconds:
And finally it turns out to be a rookie mistake - you can't define server code in the same file as a client-only component... At least it was able to find its own error, but this time the train of thought was completely hidden, even after the fact.
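In App Router terms, the fix amounts to keeping the stream in its own route handler, separate from the "use client" component. A hedged sketch of the shape (file names and payload are mine):

```typescript
// app/api/sse/route.ts -- the server side lives in its own file,
// never alongside a "use client" component
export async function GET() {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      // Emit a fake event every second, then close the stream
      const timer = setInterval(() => {
        controller.enqueue(encoder.encode(`data: tick ${Date.now()}\n\n`));
      }, 1000);
      setTimeout(() => {
        clearInterval(timer);
        controller.close();
      }, 10_000);
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```

And the client-only side, in a separate file:

```typescript
// components/sse-demo.tsx -- consumes the stream with EventSource
"use client";

import { useEffect, useState } from "react";

export function SseDemo() {
  const [events, setEvents] = useState<string[]>([]);

  useEffect(() => {
    const source = new EventSource("/api/sse");
    source.onmessage = (e) => setEvents((prev) => [...prev, e.data]);
    source.onerror = () => source.close();
    return () => source.close();
  }, []);

  return <ul>{events.map((e, i) => <li key={i}>{e}</li>)}</ul>;
}
```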
The DX is nice - the references and suggestions are potential time savers. I wonder if this is a Vercel thing or a GPT-5 mannerism?
I try out the fix - nothing seems to happen.
In DevTools, I see an event source error:
Once again, it quietly thinks for about 20 seconds. Then I realize the thoughts are displayed very rapidly - you hardly have time to notice them before they get hidden again!
And it proposes another fix:
This is similar to a previous fix, but it doesn't actually fix the problem. I'm wondering if it can work in the Vercel "sandbox" at all.
What I would do in this situation is try to get more information, so I ask it to freestyle and come up with ways to learn more.
This is a lot more like working with a pro coder than other models, especially the v0 models!
I follow the instructions and discover the same thing, i.e. the wrong content-type. I can't say I know much about how SSEs use MIME types, so hopefully GPT-5 can fix this! :)
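For the record, the browser's EventSource refuses anything that isn't served as text/event-stream, so the check boils down to something like this (a hypothetical helper of my own, not GPT-5's exact instructions):

```typescript
// check-sse.ts -- verify the SSE endpoint's content-type (runnable with
// Node 18+; the URL is my local placeholder)
async function checkSseContentType(url: string) {
  const res = await fetch(url, { headers: { Accept: "text/event-stream" } });
  const type = res.headers.get("content-type") ?? "(none)";
  console.log(
    type.startsWith("text/event-stream")
      ? `OK: ${type}`
      : `Wrong content-type for SSE: ${type}`
  );
  res.body?.cancel(); // don't hang on the open stream
}

checkSseContentType("http://localhost:3000/api/sse");
```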
Unfortunately this new information just triggers some "hardening" of the solution, but no improvement. :(
I decide to move on for now to the WebSockets demo, and it works right off the bat!
I notice some fine print mentioning it's a "mocked" WebSocket server!
And just like that, it actually implements the real thing - but falls back to the mock, which is all that will work in the Vercel environment.
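The fallback pattern amounts to something like this (my reconstruction of the idea; the URL and mock behavior are placeholders):

```typescript
// connect-with-fallback.ts -- try a real WebSocket first, fall back to a
// fake timer-driven "socket" when the connection fails (my sketch)
type Listener = (msg: string) => void;

export function connectWithFallback(url: string, onMessage: Listener) {
  const ws = new WebSocket(url);
  let mockTimer: ReturnType<typeof setInterval> | undefined;

  ws.onmessage = (e) => onMessage(String(e.data));
  ws.onerror = () => {
    ws.close();
    // Mock mode: emit fake ticks on a timer instead of a live socket
    mockTimer = setInterval(() => onMessage(`mock tick ${Date.now()}`), 1000);
  };

  // Caller uses this to stop both the socket and the mock
  return () => {
    ws.close();
    if (mockTimer) clearInterval(mockTimer);
  };
}
```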
The tRPC sample doesn't work in this preview either, so I go back to trying to convince it to use a function as a server, thinking there's probably no way to do it. And to its credit, it remains firm that this cannot be achieved in the Vercel sandbox - no hallucination! Other models would have tried to keep coding, so I think this is an improvement.
So now I use the handy download and run the project on my local machine. The SSE works right away, with several messages sent to the client!
However, the WebSockets demo fails, even though a socket is opened. No messages get sent, and it flips back to the mock sockets, which I never asked for.
GPT-5's answer is to go deeper into complexity...
What a surprise: this version fails the same way! It closes the socket almost immediately:
I complain to GPT-5 and ask it to remove the unbidden mocks.
I was expecting some pushback, as LLMs tend to self-justify their unnecessary coding. I was pleasantly surprised to see that the thinking was pretty logical, without merely repeating what I told it. Unfortunately, it broke the app with some change in fonts.
Now it kind of blames it on my dev server:
There's another "appearance" error after that:
It provides yet another fix, but I decide to challenge why we're going down this rabbit hole, and I get a lot of "Chatsplaining" for my troubles.
In fact, as expected, it's focusing on appearance issues when they are quite secondary to the work we are doing:
It seems to be making it about me, not itself... Not satisfied, I push further:
I get the sense GPT-5 thinks these are unfair questions! I decide to ask it for a 5 Whys analysis, and I must say this is way beyond previous models!
A year ago, with Sonnet 3.5, I was amazed to be able to give it an RCA process to follow and have it actually do so with some success. It was laborious. Here, GPT-5 gives a detailed and insightful analysis of some obscure behavior on its part - impressive! And it sounds like it would be very useful and time-saving for me. Nevertheless, the fundamental lack of judgment is still glaring.
I push it to analyze the situation and run another test, which shows us that the code only runs on Vercel Edge!
I tell it to implement plain old WebSockets, no Edge, which it does in a slightly curmudgeonly way. I guess Vercel trained it to optimize for Vercel!
But I have to run it as a separate plain Node server. I'm learning things here! So I ask a clarifying question.
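The standalone server boils down to something like this sketch using the ws package (the port and payload are my placeholders, not the generated code):

```typescript
// ws-server.ts -- a plain Node WebSocket server, run separately from the
// Next.js dev server (e.g. `npx tsx ws-server.ts`); sketch using ws
import { WebSocketServer } from "ws";

const wss = new WebSocketServer({ port: 3001 });

wss.on("connection", (socket) => {
  // Push a fake update every second until the client disconnects
  const timer = setInterval(() => {
    socket.send(JSON.stringify({ at: Date.now(), value: Math.random() }));
  }, 1000);
  socket.on("close", () => clearInterval(timer));
});

console.log("WebSocket server listening on ws://localhost:3001");
```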
Now that this is working, I try the tRPC sample, and it gives me an unhelpful error:
When asked, GPT-5 finds a basic typo in the code. I push it about the unhelpful error, and it proposes a way to better understand what's going on.
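I don't have its exact code, but the idea behind the "Check Endpoint" button is a plain fetch against the tRPC route that surfaces the raw status and body instead of swallowing them; roughly this (the URL is a hypothetical example):

```typescript
// check-endpoint.ts -- probe the tRPC route directly and show the raw
// response instead of a vague client-side error (my reconstruction)
async function checkEndpoint(url: string) {
  try {
    const res = await fetch(url);
    const body = await res.text();
    console.log(`HTTP ${res.status}:`, body.slice(0, 500));
  } catch (err) {
    console.error("Request failed before reaching the server:", err);
  }
}

checkEndpoint("http://localhost:3000/api/trpc/updates.latest");
```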
And thanks to the "Check Endpoint" button, I get a much better build error:
This time it blames me for the failure!
I decide to have a little fun at its expense and feign indignation about its accusatory tone:
After another failed attempt, I decide to lay on some pressure and question its know-how:
This sounds like more of the same, but - it actually works this time!
Conclusions
GPT-5 is slow. That’s a problem when you have hundreds of coding decisions to make in a day!
But it can generate a lot of code in just one prompt - it generated sample code for the 5 kinds of client updates in about 20 seconds, which is actually really good.
GPT-5 is mostly unemotional, although you can see slight annoyance when it sometimes tries to deflect blame. But it ultimately admits its mistakes and gets back to work. Not as slippery as the “o” models.
It is very strong at root-cause analysis and solved the various problems with the samples in minimal time.
It “hallucinated” a requirement that wasted time: that the WebSockets should run on Vercel’s Edge, since that was the only way to do so. It didn’t bother telling me about that change! In general, though, hallucination was very minimal.
In general, the error handling was minimal, but with just a little prodding it knew how to add effective troubleshooting information.
This is a big improvement over Vercel’s earlier models, at the same price! I was focusing on the more technical level where v0 was weak; it tends to do better with layout and styling. This new model gives Vercel a big lift in creating full apps.
Would I use GPT-5 for my own coding? It actually looks really solid, meaning reliable! But I only used it as “v0-GPT-5,” so who’s to say how good it would be on a more “free-form” task, or how expensive, given that the big cost of thinking models like this one is the output of “thinking” tokens, which can be pretty huge.
This is a step up for OpenAI, a good win compared to previous releases, and I’m looking forward to seeing if Claude 4.1 can show a similar increment!
Martin Béchard enjoys trying new flavors of AI cooking. If you need to spice up your development with some AI Coding, please reach out at martin.bechard@devconsult.ca!
Want to talk about this article (free)? Schedule a quick chat
Need some help with your project? Book an online consultation