AI Writes It. You Ship It. Who's Responsible?

What happens when we stop understanding the thing we built?

Summary:

  • The new tension: speed versus observability
  • From metal/code to models/outcome
  • Matrix: What do we lose when we gain all this productivity?
  • We don’t own the code anymore, we own the result
  • Conclusion: more sugar


The new tension: speed versus observability

Modern developers update their codebases with Copilot, Cursor, or Windsurf, and build full-stack apps with generators like Bolt, v0, or Base44. The entire flow that used to start with "OK, where is the code for this? Where should I start? Which pattern should I use?" is abstracted away. The code itself often becomes disposable. In disposable apps, do we even care what language is used? It's irrelevant. AI fills in the blanks, a platform deploys it, we use it, and observability dashboards tell us whether it works. That's good: outcome matters more than output, right?

Yet every abstraction hides complexity. When a single prompt outputs 5,000 lines of code (as some Cursor users demonstrate), no one is going to review it. We’ve never been more productive, yet we’ve never been further from understanding what we ship, or from the unknown risks hiding inside it.

This is the new tension: speed versus observability. We’re shifting from building systems to building outcomes, and hoping everything in between behaves.


From metal/code to models/outcome

You know this xkcd, right?

[Image: xkcd 2347, "Dependency" — a towering stack of components, all resting on one small block maintained by a single random person]

Decades ago, we punched holes in cards and waited hours to see if programs ran. Then came assembly, C, and higher-level languages. Each step brought more power and better tooling, and opened new domains: better games, more powerful operating systems, web apps. Productivity exploded.

Between 2008 and 2020, the number of mobile apps on the App Store and Google Play grew from a few thousand to over 5 million combined. Low-code platforms and rapid deployment tools—like Vercel, Netlify, Railway, Heroku, and Supabase—accelerated this growth even further. The bar to ship something became incredibly low. Everyone could build—few needed to understand the stack. But with every leap, we moved one layer further from the bare metal, and the number of developers who could reason about what's happening underneath got smaller.

Then came the cloud. We stopped managing servers. With Kubernetes, we handed over orchestration. With serverless, we stopped thinking about infrastructure altogether. Each step: more abstraction, more leverage, less control.

Then with GenAI, we hit a new curve. We don’t just abstract infrastructure—we abstract implementation. Describe what you want, and the model generates the code, the config, even the tests. AI agents plan, loop, and act across tools. LLMs auto-generate Terraform, SQL, or API scaffolds. We’re no longer writing the system—we’re prompting it.

This shift is massive. It turns engineers from creators into stewards: less about syntax, more about outcomes. We’re slowly moving from "hey, let me type the code" to "okay, is it working? Is it delivering what we expected?"


Matrix: What do we lose when we gain all this productivity?

We lose people who understand the foundations. We lose the ones who know how the system works at its core. The number of developers who can write low-level code, tune compilers, or optimize memory handling keeps shrinking. Those skills are fading. They're still needed—in chip design, in real-time systems, in critical infrastructure—but fewer people are learning them.

The more productive we become, the more foundational knowledge we lose. That’s the trade-off. It’s not just a trend—it’s a pattern that repeats with every leap in abstraction. Fewer engineers can explain memory layout, thread contention, or backpressure. It’s harder to find someone who can debug a misconfigured TCP stack or optimize IO throughput.

Legacy systems still run banks. AS/400s and COBOL are not just alive; they're essential to daily operations. Yet few engineers understand them anymore. The institutional knowledge is vanishing, and the teams that can maintain or migrate these systems are aging out. As a result, large-scale migration projects (costly, risky, and often multi-year) have become urgent and unavoidable.

In modern stacks, the same drift happens. We deploy databases with Terraform, autoscale them using Kubernetes operators, and wire in metrics with OpenTelemetry. When the dashboard is green, we assume the system is healthy. We forget what tuning levers exist—memory pressure thresholds, connection pool settings, retry budgets. Over time, we stop even knowing what "normal" looks like.
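One of those forgotten tuning levers is the retry budget: capping retries to a fraction of total traffic so a struggling backend isn't buried under a retry storm. Here is a minimal sketch of the idea; the 10% ratio is an illustrative default, not a recommendation from any particular tool.

```python
class RetryBudget:
    """Allow retries only while they stay below a fixed fraction of
    total requests, so retries can't amplify an outage (illustrative)."""

    def __init__(self, ratio=0.1):
        self.ratio = ratio      # at most 10% of calls may be retries
        self.requests = 0
        self.retries = 0

    def record_request(self):
        self.requests += 1

    def record_retry(self):
        self.retries += 1

    def can_retry(self):
        # No traffic yet: nothing to budget against.
        if self.requests == 0:
            return False
        return self.retries / self.requests < self.ratio
```

Service meshes and HTTP clients expose similar knobs under different names; the point is that the lever exists whether or not the dashboard ever mentions it.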

AI tooling amplifies this. With Copilot or Windsurf, we don’t read code—we skim it. We paste in a prompt and trust the output.

It's common now to YOLO-merge large AI-generated changes without review; after all, if it's not "human" code, who is to blame? For disposable apps, who cares. But in stateful, enterprise-grade systems with real operational risk and revenue impact, this is reckless. AI isn't accountable, and it can't be trusted as if it were.

We still review code in production systems. We still discuss schema evolution, Protobuf versions, or Avro compatibility. But for how long? What happens when even review is abstracted? When the model decides what gets pushed to main? (That is ultimately where we are heading.)

When systems get too deep to inspect, we stop debugging and start reacting. If the dashboard is green, we assume all is well. If not, we prompt to patch the outcome ("I want green") rather than the root cause (code or configuration lost somewhere down the stack).

More and more, we rely on behavior, not understanding.

That’s not control. That’s dependency.

Remember Zion in The Matrix? Relying on old machines just to survive. Humanity thought it had freed itself, but it only traded one layer of servitude for another, less visible one. We trust systems we no longer understand, hoping they keep working. And when they don’t, we’ll discover how little control we truly have.

[Image: Zion in The Matrix, where humanity depends on aging machines just to survive]


We don’t own the code anymore, we own the result

This is where we are heading: engineering for outcomes.

With GenAI, we've started to operate differently. We tune RAG pipelines, tweak context windows, test reranking strategies, and adjust retrieval logic. We don’t just measure accuracy, we evaluate hallucinations, cost, latency, and track whether the agent loop converges. The engineering shifts from writing control flow to designing learning flow.
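Evaluating across those dimensions can be as simple as rolling per-question scores into the outcome-level gates a release actually ships on. A minimal sketch, assuming a "grounded in retrieved context" flag as a crude hallucination proxy; the thresholds are illustrative:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class EvalRecord:
    question: str
    grounded: bool       # answer supported by retrieved context (hallucination proxy)
    latency_ms: float
    cost_usd: float

def summarize(records, max_hallucination_rate=0.05, max_latency_ms=2000):
    """Aggregate per-question evals into a single pass/fail gate."""
    hallucination_rate = 1 - mean(r.grounded for r in records)
    avg_latency = mean(r.latency_ms for r in records)
    return {
        "hallucination_rate": hallucination_rate,
        "avg_latency_ms": avg_latency,
        "total_cost_usd": sum(r.cost_usd for r in records),
        "pass": (hallucination_rate <= max_hallucination_rate
                 and avg_latency <= max_latency_ms),
    }
```

Note what is absent: nothing here inspects the model or the retrieval internals. The gate is purely behavioral, which is exactly the shift the paragraph above describes.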

LLMOps is a real discipline:

  • prompt tracing
  • token budgeting
  • caching
  • eval tracking
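Token budgeting, for instance, is often just a spend guard in front of every model call. A minimal sketch; the model names and per-1K-token prices below are made up for illustration, not real pricing:

```python
# Hypothetical price table: USD per 1,000 tokens (illustrative values).
PRICE_PER_1K = {"small-model": 0.0005, "large-model": 0.01}

class TokenBudget:
    """Refuse model calls once a daily spend limit would be exceeded."""

    def __init__(self, daily_usd_limit):
        self.limit = daily_usd_limit
        self.spent = 0.0

    def charge(self, model, tokens):
        cost = tokens / 1000 * PRICE_PER_1K[model]
        if self.spent + cost > self.limit:
            raise RuntimeError("token budget exhausted")
        self.spent += cost
        return cost
```

Real LLMOps platforms layer tracing and caching on top of the same primitive: meter every call, attribute the cost, and cut off runaway loops before they hit the invoice.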

We rely on tools like LangSmith, Traceloop, and PromptLayer, not to inspect model internals, but to monitor how LLM and agentic stacks behave. We trust outcomes because we can’t (and won’t) trace all the internal logic. With GitOps and dbt models, we declare what we want as static code (our intents) without checking what's really out there. We're trusting behavior over code. We no longer ask “Is this the best implementation?” We ask “Is this working?”

  • From delivering features to delivering value (what we always want)
  • From writing code to measuring impact (business observability)

The real impact won’t come from writing better code, but from thinking better about what the code should achieve. Outcome-thinkers will outperform output-producers. A sharp junior aligned with business outcomes can have more impact than a lone senior engineer coding in isolation. This is where the industry is moving (look at all these layoffs).


Conclusion: more sugar

As humans, we love abstractions and lighter cognitive load; they make our lives easier and our time-to-satisfaction shorter. 🍬 The only way to keep control is to own the outcomes and define the limits. AI didn’t replace engineering. It just raised the bar on what matters.
