Hidden Technical Debt in AI

That little black box in the middle is machine learning code.

I remember reading Google’s 2015 paper Hidden Technical Debt in Machine Learning Systems & thinking how little of a machine learning application was actual machine learning.

The vast majority was infrastructure, data management, & operational complexity.

With the dawn of AI, it seemed large language models would subsume these boxes. The promise was simplicity: drop in an LLM & watch it handle everything from customer service to code generation. No more complex pipelines or brittle integrations.

But in building internal applications, we’ve observed a similar dynamic with AI.

Agents need lots of context, just like a human: how is the CRM structured, what do we enter into each field? But all that input is expensive to feed the hungry, hungry AI model.

Reducing cost means writing deterministic software to replace the reasoning of AI.

For example, automating email management means writing tools to create Asana tasks & update the CRM.

As the number of tools increases beyond ten or fifteen, tool calling no longer works reliably. Time to spin up a classical machine learning model to select tools.
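Here’s a minimal sketch of that selection step. The tool names & descriptions are hypothetical, & the bag-of-words cosine similarity is a stand-in for whatever classifier or embedding model you’d actually train:

```python
from collections import Counter
import math

# Hypothetical tool registry: names & short descriptions (illustrative only).
TOOLS = {
    "create_asana_task": "create a new task in Asana with title and assignee",
    "update_crm_contact": "update a contact record field in the CRM",
    "send_email_reply": "draft and send a reply to an email thread",
    "search_calendar": "find free slots on the calendar for a meeting",
}

def _bow(text):
    """Bag-of-words vector: a Counter of lowercase tokens."""
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_tools(request, k=2):
    """Return the k tool names whose descriptions best match the request."""
    q = _bow(request)
    ranked = sorted(TOOLS, key=lambda n: _cosine(q, _bow(TOOLS[n])), reverse=True)
    return ranked[:k]
```

In practice, only the selected tools’ schemas go into the prompt, shrinking fifteen tool definitions down to two or three.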

Then there’s observability to watch the system, evaluation to check whether it’s performing, & routing to the right model. In addition, there’s a whole category of software around making sure the AI does what it’s supposed to.

Guardrails prevent inappropriate responses. Rate limiting stops costs from spiraling out of control when a system goes haywire.
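The rate-limiting piece can be as small as a token bucket sitting in front of the model client. A sketch, with made-up numbers:

```python
import time

class TokenBucket:
    """Token-bucket limiter: refuse calls once the budget is spent.

    `capacity` is the burst size; `refill_rate` is tokens added per
    second. The defaults are illustrative, not a recommendation.
    """
    def __init__(self, capacity=10, refill_rate=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1.0):
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Call `allow()` before each model request & fail fast (or queue) when it returns False, so a runaway loop burns a bounded budget instead of an unbounded bill.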

Information retrieval (RAG, retrieval-augmented generation) is essential for any production system. In my email app, I use a LanceDB vector database to find all emails from a particular sender & match their tone.
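LanceDB’s actual API works differently (you connect to a database & call `search` on a table), so here is a dependency-free sketch of the underlying idea: filter by sender metadata, then rank by embedding similarity. The emails & tiny 3-dimensional vectors are invented for illustration; real embeddings come from an embedding model.

```python
import math

# Toy corpus: (sender, text, embedding). In production the embeddings
# would come from an embedding model & live in a vector database.
EMAILS = [
    ("ana@example.com", "Thanks so much! Talk soon.",       [0.9, 0.1, 0.0]),
    ("ana@example.com", "Appreciate the quick turnaround.", [0.8, 0.2, 0.1]),
    ("bob@example.com", "Per my last email, see attached.", [0.1, 0.9, 0.2]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def similar_from_sender(sender, query_vec, k=2):
    """Metadata filter first (sender), then rank by vector similarity."""
    hits = [(text, cosine(vec, query_vec))
            for s, text, vec in EMAILS if s == sender]
    hits.sort(key=lambda h: h[1], reverse=True)
    return [text for text, _ in hits[:k]]
```

The retrieved emails then go into the prompt as style examples, so the draft reply matches the sender’s tone.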

There are other techniques for knowledge management around graph RAG & specialized vector databases.

More recently, memory has become much more important. The command line interfaces for AI tools save conversation history as markdown files.

When I publish charts, I want the Theory Ventures caption at the bottom right, a particular font, colors, & styles. Those are now all saved within .gemini or .claude files in a series of cascading directories.

The original simplicity of large language models has been subsumed by enterprise-grade production complexity.

This isn’t identical to the previous generation of machine learning systems, but it follows a clear parallel. What appeared to be a simple “AI magic box” turns out to be an iceberg, with most of the engineering work hidden beneath the surface.

Rhett Sampson

Founder and CTO at GT Systems

3w

100% Tomasz Tunguz we collapse this debt into #SPAN_AI the #semanticfabric for the #agenteconomy. See link below. Love to have a chat. https://www.linkedin.com/feed/update/urn:li:activity:7350522766414012416/


Tomasz, deeply agree. Your post perfectly illustrates why we've built a developer-focused platform-as-a-service for AI agents. My bet is: the surface of white boxes in your picture will continue to grow and multiply in complexity. We provide that surface as opinionated managed services - with a dead simple DX. To add some spice to the mix, we've also made it trivial to create embarrassingly parallel fleets of collaborating agents - so they can achieve goals orders of magnitude faster.

Paresh Yadav

AI/Agentic AI/AIOPs/MLOPs/AI Agents/GCP- Architect/Engineer

3w

Oh, and we haven't shown the backstage processes/work needed, like version control, CI/CD, code promotion from Dev to QA to Prod, maintaining multiple versions of the code base for different clients (if applicable), maintaining shared code dependencies if any between this product and other products, etc.

Mary Mendoza

Salesforce Certificated Administrator & Platform App Builder Certified | 4x Trailhead Ranger | 3 Trailhead Super Badges | Boston #SalesforceSaturday Co-Lead

3w

And this doesn't even begin to address how clean or not clean the data is that the LLM or AI is using. Throw in a little bias, and you have a really spicy mix. Now, while that may be good for food, any chef can tell you that not everyone appreciates spicy food, and my instincts are that even fewer will appreciate the spicy results of data that isn't clean and is full of bias.

Chris Parsons

Levelling up tech teams with AI that works | CTO | Agent builder | Cherrypick co-founder

3w

Definitely seeing the same thing in my agent builds, and in my training cohorts. People are figuring out that determinism is actually really valuable to structure the system and LLMs add the magic at set points. Still a lot of work to be done to figure out the best interfaces.
