Software Engineering with Generative AI in 2025: A Reality Check
The Gap Between AI Hype and Field Reality: Where Are Developers Heading?
In recent years, as artificial intelligence (AI) has permeated every aspect of our lives, generative AI (GenAI), in particular, is increasingly making headlines with its promise to revolutionize the world of software engineering. From Microsoft's CEO claiming that 30% of code is written by AI to Anthropic's CEO's bold prediction that "all code will be generated by AI within a year," the executive narrative in this field is often dazzling and equally ambitious. But how well do these glossy portrayals align with the experiences of software engineers on the ground? Or is there a significant gap between the hype and the reality?
This blog post will delve into this question, drawing on the direct observations and anecdotes presented in a talk titled "Software engineering with LLMs in 2025: reality check" by Gergely Orosz, known for The Pragmatic Engineer newsletter. The talk offers valuable insights from leading AI development tool companies, major tech giants, next-generation startups, and independent software engineers. The focus will be on why this topic matters: Is AI fundamentally changing our software development practice, or is it merely another set of tools? The answers to these questions will play a critical role in shaping future strategies for both individual engineers and technology companies.
Executive Narrative vs. Field Reality: Why the Discrepancy?
Anyone following the AI agenda has likely encountered striking headlines in recent months. Even engineers like Jeff Dean from Google suggest that AI could reach the "level of a junior coder" within a year.
However, when we look at the ground reality, the picture shifts somewhat. For instance, a report published in January revealed that Devin, an autonomous AI tool costing $500 per month, introduced a bug that unnecessarily generated 6 million events, incurring an additional $700 cost for the company. This serves as a clear reminder that AI is not flawless. Furthermore, Microsoft's attempt to have Copilot agents fix issues in the .NET codebase around its Build conference turned into a fiasco: the agents introduced code that broke existing tests, requiring significant effort from engineers to clean up. This demonstrates that even large corporations face challenges in AI integration. While Microsoft's transparency on this matter is commendable, it's clear that such tools still have serious limitations.
This situation highlights the significant disconnect between executive vision and engineering practice. Executives often paint an optimistic picture driven by market strategies and product launches, while engineers on the ground approach with more caution, aware of AI's current capabilities and limitations. Understanding what is truly happening by listening to the experiences of software engineers is a highly valuable input.
The State of AI Developer Tool Startups
Startups selling AI developer tools are naturally the companies that use this technology most intensively and expect to benefit most from it. These companies develop their products by "dogfooding" (using their own products internally).
Anthropic: Surprising Adoption with Claude Code
During discussions with the Anthropic team (who, admittedly, might be biased), when asked about the internal adoption rates of Claude Code, the response was quite surprising: "When we gave our engineers access to Claude Code, they all started using it every day." This was observed internally months before Claude Code was publicly released, and the tool instantly gained significant traction. Despite Claude Code being a command-line interface (CLI) running in the terminal, its rapid internal adoption is noteworthy.
Even more striking, Anthropic claims that 90% of their product code is written with Claude Code. While this figure sounds excessively high, the engineers spoken to were not the type to inflate numbers. A 160% increase in usage within a month of Claude Code's release further demonstrates the strong demand for the product.
Anthropic also developed a protocol called Model Context Protocol (MCP) and open-sourced it. This protocol allows IDEs or AI agents to connect to various services such as databases, GitHub, and Google Drive. According to Anthropic, the protocol, open-sourced in November, was picked up by smaller companies between December and February, and by major players like OpenAI, Google, and Microsoft in March and April. Today, thousands of MCP servers are estimated to be active. This could mark the beginning of a new era for inter-system automation and interaction.
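To make the idea concrete, here is a minimal sketch of what exposing a capability over MCP can look like, using the open-source Python SDK and its FastMCP helper; the server name, tool, and data below are invented purely for illustration:

```python
# Minimal MCP server sketch. Assumes the official "mcp" Python SDK is installed
# (pip install "mcp[cli]"); the tool and its data are made up for the example.
from mcp.server.fastmcp import FastMCP

# A named server that MCP-aware clients (IDEs, AI agents) can connect to.
mcp = FastMCP("demo-tools")

@mcp.tool()
def count_open_tickets(project: str) -> int:
    """Return the number of open tickets for a project (stubbed data)."""
    fake_data = {"checkout": 12, "payments": 3}
    return fake_data.get(project, 0)

if __name__ == "__main__":
    # Serves over stdio by default, so an MCP-capable client can call the tool.
    mcp.run()
```

An MCP-aware client such as an IDE or a coding agent can then discover count_open_tickets and call it mid-conversation, which is exactly the kind of glue that makes agents useful against real systems.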
Windsurf and Cursor: Different Experiences
Another AI-powered IDE, Windsurf, claims that 95% of its code is written with its own tools. This figure is again strikingly high and somewhat surprising, and it illustrates how extensively these companies dogfood their own products.
In discussions with Cursor, a more candid picture emerged. They stated that approximately 40-50% of their code is written with AI. Their honest remark, “Some of it works, some of it doesn't,” suggests that AI tools are still in the maturation phase. While this honesty is appreciated, it's important to remember that these companies aim for 100% AI-assisted coding.
AI Integration in Major Tech Giants
Major tech companies are quietly making massive investments in integrating AI tools. These companies typically use their own proprietary tools and infrastructure, making it difficult for outsiders to fully understand what's happening internally.
Google: AI Everywhere and Covert Automation
Five engineers spoken to at Google emphasized that everything is proprietary. They use their own tools like Borg instead of Kubernetes, their own code repository instead of GitHub, and Critique for Code Review. Their IDE, Cider (Cloud Integrated Development Environment and Repository), currently functions as a fork of VS Code, integrated with the entire Google ecosystem.
Engineers stated that AI has permeated everywhere within Google. LLMs integrated into Cider offer autocompletion and a chat-based IDE experience. The Critique code review tool provides "sensible" and "useful" feedback. Google's internal search tool, Code Search, with LLM support, instantly retrieves specific parts of the codebase, speeding up engineers' workflows.
A former Google employee noted that these tools were not widely used until six months ago, but a rapid evolution has occurred since. A current engineer commented that Google is taking a cautious, deliberate approach, aiming to earn engineers' trust.
Other tools used at Google include NotebookLM, which allows chatting with documents, and an LLM prompt playground similar to OpenAI's. Notably, the LLM-backed internal knowledge base built around the Moma search engine is in constant use by engineers. As one Google employee anonymously stated: "This is what leadership wants to see, so organization-specific GenAI tools are being developed everywhere, and honestly, that's how more funding is secured these days." This indicates that AI investments in large organizations like Google are driven not only by technological advancements but also by corporate and financial motivations.
The most striking piece of information came from a former SRE: “From what I hear from my SRE friends at Google, they are preparing for 10x the lines of code going into production.” This implies they are strengthening their infrastructure, deployment pipelines, code review tools, and feature flagging systems. What is Google seeing that we are not yet aware of? This question is a strong indication that a fundamental shift in software development methodologies might occur in the future.
Amazon: From API-First Approach to MCP-First Transformation
While Amazon may not be as prominent in the AI space as Google, it's making significant internal progress. Engineers spoken to stated that almost all developers use a tool called Amazon Q Developer Pro. It's said to be very effective, especially for AWS-related coding; they even expressed surprise that developers outside Amazon are not more familiar with it. Engineers who were not as enthusiastic six months ago now say Q works very well.
Amazon developers also make heavy use of Claude internally, thanks to the company's relationship with Anthropic. They benefit greatly from Claude in writing tasks, particularly for PR/FAQ documents (Amazon's press release and FAQ format for new initiatives) and during performance review periods.
Amazon's most interesting aspect is the internal prevalence of MCP servers. Thanks to Jeff Bezos's famous API-focused mandate in 2002, all of Amazon's internal services communicate via APIs. This partly explains how AWS was born. It's extremely easy to add an MCP server on top of a service that has an API. This allows IDEs or AI agents to interact with systems using existing APIs.
Reportedly, most internal tools and websites at Amazon already have MCP support. Developers who say that "automation is happening everywhere" report that they are automating many workflows, from ticketing systems to emails and internal systems, and are quite pleased with the results. Some have even automated a significant portion of their workflows. The API-first approach Amazon adopted back in 2002 is perhaps transforming it into an "MCP-first" company in 2025. This points to a quiet automation revolution taking place.
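As a rough illustration of why having an API already in place makes this easy, the sketch below wraps a hypothetical internal REST endpoint as an MCP tool; the endpoint URL and payload are invented, and the same Python SDK as in the earlier sketch is assumed:

```python
# Sketch: exposing an existing internal HTTP API as an MCP tool.
# Assumes the "mcp" Python SDK and "httpx"; the endpoint below is hypothetical.
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticketing-bridge")

@mcp.tool()
def get_ticket(ticket_id: str) -> dict:
    """Fetch a ticket from a (hypothetical) internal ticketing API."""
    response = httpx.get(f"https://tickets.internal.example.com/api/v1/tickets/{ticket_id}")
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    mcp.run()
```

Because the heavy lifting is already done by the existing API, the MCP layer is mostly a thin adapter, which is presumably why API-first services lend themselves so naturally to this kind of automation.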
The Situation in Independent Startups and Individual Engineers
Unlike large tech companies, startups that did not start as AI-focused but are trying to leverage AI, along with independent software engineers, are also an important part of this transformation.
Incident.io: A Culture of Trial and Error
Startups like Incident.io, which didn't begin as AI-focused but are integrating AI into their business processes, are gaining significant momentum in this area. According to Lawrence Jones, their teams are intensively using AI to accelerate their work. They share tips and tricks within the company via Slack.
For instance, an engineer realized that an MCP server was very effective for well-defined tickets and shared this with the team. Another engineer noted that asking the AI to "give me a few options" was their new favorite trick: asking for several alternative ways to implement something, pasting in an error and asking for possible explanations, and iterating on how the prompt is phrased. This shows an experimental culture of learning and sharing forming within the company. Claude Code's widespread adoption by the Incident.io team in just three weeks highlights the practical value of these tools.
An Anonymous Biotechnology Startup: Why It Doesn't Always Work
However, the picture is not the same for every company. An unnamed biotechnology AI startup uses AI and machine learning models in complex areas like protein design. With a team of approximately 50-100 people, the company runs Python-heavy automated numerical pipelines.
An engineer from this company stated that they experimented with various LLMs, but none truly stuck: "It's still faster to write the correct code myself than to review the LLM's code and fix all of its problems." Even when using the latest models (e.g., Claude Sonnet 3.7 or 4), the situation remained unchanged. The engineer noted that the general AI hype in the industry made them feel like they were in a "weird niche." That's also why they preferred not to be named; they didn't want to be labeled an "AI skeptic." This clearly shows that AI is not yet a one-size-fits-all solution for every field and every problem set. Especially for teams building new, never-before-built software, reviewing and correcting AI-generated code can take more time than writing it from scratch.
Independent Engineers: "The Love of Coding" Is Back
In the age of AI, the experiences of independent software engineers are also quite intriguing. These individuals, who have been coding for years and are passionate about their craft, are redefining their relationship with AI.
Armin Ronacher (Creator of the Flask Framework): Armin Ronacher, creator of the Flask framework and a founding engineer at Sentry, recently announced great excitement about AI development. In an article published a few weeks ago, he wrote: "Even six months ago, if you had told me I'd prefer being an engineering lead to a virtual programmer intern (i.e., an agent), I wouldn't have believed you." Ronacher attributes this change to the improved quality of Claude Code and to overcoming his own adoption barriers by using LLMs intensively. Most importantly, he notes that model errors (like hallucinations) are largely caught when the agent can run the code itself, see the results, and adjust based on that feedback. This highlights the potential of autonomous agents.
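A rough sketch of that feedback loop is shown below; generate_patch is a hypothetical stand-in for whatever model call an agent actually makes, and only the loop structure is the point:

```python
# Sketch of the "run it, look at the result, feed the error back" loop that
# coding agents implement. generate_patch() is a hypothetical stand-in for a
# real LLM call; the test command would depend on the actual codebase.
import subprocess

def generate_patch(task: str, feedback: str | None) -> str:
    """Hypothetical: ask a model for code that solves `task`, given prior feedback."""
    raise NotImplementedError("stand-in for a real LLM call")

def agent_loop(task: str, max_attempts: int = 5) -> bool:
    feedback = None
    for _ in range(max_attempts):
        patch = generate_patch(task, feedback)
        with open("generated.py", "w") as f:
            f.write(patch)
        # Run the project's tests and capture the output.
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        if result.returncode == 0:
            return True  # tests pass, stop iterating
        feedback = result.stdout + result.stderr  # hand the failure back to the model
    return False
```

The structure is what matters: compiler and test output gives the model something objective to react to, which makes hallucinated or broken code far easier to catch than in a pure chat workflow.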
Peter Steinberger (Creator of PSPDFKit): Peter Steinberger, a well-known figure in the iOS world and creator of PSPDFKit, also expressed great excitement about AI. After selling his company, Steinberger had stepped back from building for a while, but he says AI "brought the spark back" and that he hadn't felt this much excitement, astonishment, and admiration for technology in a long time.
According to him, languages and frameworks are now less important because switching between languages has become very easy thanks to AI. Peter said he even started writing code in languages like TypeScript, which he never thought he'd touch before. He believes a skilled engineer can produce much more output with AI. He even shared on social media that all his tech friends are fascinated and having trouble sleeping. Another engineer observed that many developers experiencing burnout are returning to the field thanks to AI.
Birgitta Boeckeler (Distinguished Engineer at ThoughtWorks): Birgitta, a distinguished engineer at ThoughtWorks, views LLMs as a tool that can be used at every level of abstraction. In her view, this is a horizontal movement: it creates change across the entire stack rather than just adding a new layer on top of the existing structure, and that is the fundamental reason LLMs are exciting. Coming from someone who has been thinking about LLM-assisted development for years and was already a successful engineer, this comment provides an important clue about the transformative potential of the technology.
Simon Willison (Co-creator of Django): Simon Willison, co-creator of Django and a blogger for 23 years, summarized the current state of AI development tools as follows: "Coding agents actually work. You can run them in a loop, have them run compilers and tests, and all that other stuff." According to him, model improvements in the last six months have been a turning point, and these tools are now truly useful. Coming from an independent perspective, Simon's comment indicates that AI tools have reached a significant stage in their maturation.
Emerging Patterns and Unanswered Questions
Based on the observations above, some clear patterns and critical questions that still await answers emerge regarding the impact of the AI revolution on software engineering.
General Trends:
- AI developer tool startups (Anthropic, Windsurf, Cursor) report the highest shares of AI-written code, driven by aggressive dogfooding.
- Big tech companies like Google and Amazon are quietly investing in tooling and infrastructure, in some cases preparing for far more AI-generated code in production.
- MCP is rapidly emerging as the connective layer between agents and existing systems.
- Experienced, senior engineers are among the most enthusiastic and effective adopters.
- Results vary by domain: teams building novel, specialized software still see limited benefit.
Unanswered Questions:
However, despite all these positive developments, there are still important questions on my mind:
- What is Google seeing that makes it prepare for 10x more code going into production, and can review, testing, and deployment pipelines keep up?
- Why do these tools still fall flat in some domains, such as the biotech startup's novel, Python-heavy numerical work?
- How will the non-deterministic nature of LLM-generated code change how we test and verify software?
Are We Experiencing a Step Change? A Philosophical Look at the Future
Bringing all this data together, the general impression is that a significant step change is occurring in our software development practice. The enthusiasm of company executives and founders about AI may partly reflect financial goals, especially at AI-focused companies, which is understandable. It's also a logical strategy for major tech companies to invest in AI, and it's expected that startups experiment with these tools. What's most striking, however, is that experienced engineers who have been in the industry for many years are achieving more success with AI and are more eager to use these tools. This indicates that the technology is moving beyond being just a tool for beginners and is starting to integrate deeply into the workflows of seasoned professionals.
Martin Fowler's comment on this is quite thought-provoking. He believes that LLMs will change software development at a level similar to the transition from assembly language to high-level programming languages, which delivered a comparable jump in productivity. Fowler suggests that LLMs will offer a similar efficiency boost, but with one major difference: for the first time, the programming abstraction is non-deterministic. The same prompt may not produce the same code every time, yet each result can still be functional. This could bring about radical changes in software testing and verification processes.
Another figure supporting this philosophical view is Kent Beck, a seasoned software engineer who has been coding for 52 years. In a long podcast conversation, Kent said something surprising: "I haven't had this much fun programming in 52 years." Kent, who admitted he had grown tired of learning new technologies and constantly switching to new frameworks, says that LLMs are giving him the opportunity to take on much more ambitious projects. He is currently working on a Smalltalk server he has wanted to build for years.
Kent Beck compares LLMs to revolutionary technological changes like microprocessors, the internet, and smartphones. According to him, these changes transformed "the entire landscape of what is cheap and what is expensive." Things we didn't do before because they were expensive or difficult have become incredibly cheap thanks to AI.
In conclusion, we are living in a changing world, and more experimentation is needed. Just like startups, we need to understand what works and what doesn't, and discover what's cheap and what's still expensive. Software development is no longer just about writing code; it's also becoming a new craft that involves interacting with AI agents, understanding their output, and guiding them. This is an exciting period that requires every engineer to experiment in their own workflow and explore these new tools.