The Day We Stopped Fighting the System
July 3
The day started with what seemed like a simple mission: complete UI testing for Piper Morgan. You know how these things go by now, I’m sure. Six hours later, I’m having philosophical conversations with my Lead Developer about architectural integrity and the nature of document summarization.
Let me walk you through this particular journey into the depths of technical debt, POC contamination, and the humbling realization that sometimes your AI knows better than you do.
The ghost of POC past
We kicked off with what looked like a straightforward import error:
ImportError: cannot import name 'WorkflowDefinition' from 'services.domain.models'
“This should be quick,” I thought. Famous last words, right?
What we uncovered was a classic case of POC haunting: code from a month-old proof of concept had survived to contaminate the MVP implementation. The POC used input_data and output_data fields, while the production code had moved to context and result. Tests were passing because they were mocked to death, but the actual persistence layer was failing silently.
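To make the haunting concrete, here’s a minimal sketch of the failure mode. The field names (input_data/output_data vs. context/result) are from our actual models; the class, function, and test names are invented for illustration:

```python
from dataclasses import dataclass, field
from unittest.mock import MagicMock

# The current production model uses `context` and `result`.
@dataclass
class WorkflowStep:
    name: str
    context: dict = field(default_factory=dict)
    result: dict | None = None

# Leftover POC persistence code, still writing the old field names.
def save_step_poc(db, step):
    # Against the real model this raises AttributeError, because
    # `input_data` and `output_data` no longer exist...
    db.insert({"input": step.input_data, "output": step.output_data})

# ...but a fully mocked test never notices: MagicMock happily
# fabricates any attribute you ask it for.
def test_save_step_passes_despite_the_bug():
    db, step = MagicMock(), MagicMock()
    save_step_poc(db, step)  # green, and meaningless
    db.insert.assert_called_once()
```

That’s how tests stay green while the persistence layer fails silently: the mock absorbs the very attribute access that would have blown up in production.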
My gut check to the Lead Developer: “Are we still properly following domain-driven design?” Often, this question helped nudge us back on track, but this time the answer was yes. It’s just that we’d allowed POC code to corrupt our clean architecture. Time for an exorcism.
The repository that wasn’t
After successfully removing the POC contamination, we hit our next puzzle. File queries were returning “file not found” despite the file clearly existing in the database. This led to my favorite exchange of the session:
Lead Developer: [Writes elaborate grep commands to investigate repository patterns]
Me: “Why were you guessing at all? Just ask me for a tree or for files if you need to inspect directly?”
Touché. Sometimes the PM needs to remind the architect-bot that they have a human with direct file access right there. (This is one of those things they don’t teach you in distributed team management courses.)
What we discovered was architecturally fascinating: Piper Morgan was using a two-tier data access pattern, with structured metadata going through the ORM while file content is read directly from storage.
This wasn’t a bug — it was an intentional performance optimization. Files don’t need ORM overhead. But I hadn’t reviewed the architecture docs recently enough to remember this design “decision.”
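Here’s a rough sketch of what a pattern like this can look like, assuming a SQLAlchemy-style session with a `session.get(Model, pk)` lookup. Every name here is illustrative rather than Piper Morgan’s actual code:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class DocumentRecord:
    """Illustrative stand-in for an ORM-mapped metadata row."""
    id: str
    title: str

class DocumentRepository:
    """Two-tier access: ORM for metadata, direct reads for file content."""

    def __init__(self, orm_session, storage_root: Path):
        self.session = orm_session
        self.storage_root = storage_root

    def get_metadata(self, doc_id: str):
        # Tier 1: structured queries go through the ORM as usual.
        return self.session.get(DocumentRecord, doc_id)

    def get_content(self, doc_id: str) -> bytes:
        # Tier 2: file bytes bypass the ORM entirely; no object-mapping
        # overhead for what is ultimately a blob read.
        return (self.storage_root / doc_id).read_bytes()
```

The catch, of course, is that a query hitting tier 1 can happily report “file not found” while the bytes sit right there in tier 2, which is exactly the puzzle we were staring at.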
The case of the stubborn LLM
Here’s where things got genuinely interesting. We wanted document summarization to be a simple QUERY operation. The LLM had other ideas:
Intent classified as - Category: IntentCategory.SYNTHESIS, Action: generate_summary
We updated prompts. We added examples. We restructured the classification logic. The LLM remained unmoved — summarization was SYNTHESIS, and that was final. I started wondering if we had somehow mistrained something, but was assured we weren’t even doing any of our own training yet.
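For context, the classifier’s output had roughly this shape. The category and action values are straight from our logs; the surrounding types are a simplified reconstruction:

```python
from dataclasses import dataclass
from enum import Enum

class IntentCategory(Enum):
    QUERY = "query"          # retrieve existing information
    SYNTHESIS = "synthesis"  # generate new content from existing material

@dataclass
class Intent:
    category: IntentCategory
    action: str

# However we reworded the prompts, "summarize this document"
# kept classifying the same way:
stubborn = Intent(IntentCategory.SYNTHESIS, "generate_summary")
```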
My response: “OK back… what’s next? I can do a few more steps then need to run out to pick up dinner.”
This perfectly captured our incremental approach — life happens, architecture endures. But also: I was starting to feel that familiar itch of fighting the system instead of listening to it.
The architectural reckoning
As we tried to patch our way to a working summarization feature, I had to drop the wisdom bomb on myself:
“I am concerned we may be losing our architectural perspective if we keep patching bugs as we find them.”
I sometimes feel like the stern papa asking my bots if they really brushed their teeth and washed their hands before bed, but the question needed to be asked.
We were fighting the system instead of working with it. The LLM classified summarization as SYNTHESIS because it is synthesis — creating new content from existing material. Our attempts to force it into a QUERY pattern were architectural hubris.
You know that feeling when you’re trying to force a USB cable in upside down? That’s what we were doing to our intent classification system.
The plot twist
The real discovery? SYNTHESIS was completely unimplemented. It had no routing, no handlers, just a generic “I’ll help you create that” response. We’d been trying to shoehorn functionality into the wrong category when the right category was sitting there, abandoned and waiting.
It’s like finding out you’ve been trying to unlock your front door with your car key when the right key was in your other pocket the whole time.
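In code terms, the abandoned category looked something like this. This is a reconstruction, reusing the Intent types from the earlier sketch, not the actual source:

```python
# Reusing Intent and IntentCategory from the sketch above.
def route_intent(intent: Intent) -> str:
    if intent.category is IntentCategory.QUERY:
        return handle_query(intent)  # real routing, real handlers
    if intent.category is IntentCategory.SYNTHESIS:
        # No routing, no handlers: every synthesis request,
        # summarization included, dead-ended in a canned reply.
        return "I'll help you create that"
    return "Sorry, I don't know how to help with that yet."

def handle_query(intent: Intent) -> str:
    ...  # the QUERY path had actual implementations behind it
```

All our prompt wrestling had been an attempt to steer requests away from that dead end, when the actual fix was to build out the branch the classifier kept choosing.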
Some things I learned (or how systematic thinking saved us from ourselves)
1. Consult architecture docs first
My failure to review the architecture docs cost us investigation time. The two-tier data pattern was documented — I just didn’t look. This is the PM equivalent of RTFM, and I earned that lesson the hard way.
2. Work WITH the system
When the intent classifier consistently makes a choice, maybe it knows something we don’t. Document summarization IS synthesis, not a query. Sometimes the AI is trying to tell you something about the nature of the work itself.
3. TDD works
Our FileQueryService succeeded because we followed TDD strictly: Red → Green → Refactor. No shortcuts. The discipline pays off every time, even when (especially when) you’re tempted to skip it.
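If you haven’t felt the rhythm before, the Red step looks something like this in pytest. The module path and constructor here are invented stand-ins for our actual service:

```python
import pytest

def test_missing_file_raises_not_found():
    # Red: this import failed on the very first run, because the
    # service did not exist yet. The failing test came first.
    from services.files import FileQueryService  # hypothetical path

    service = FileQueryService(store={})  # invented constructor
    with pytest.raises(FileNotFoundError):
        service.get_file("no-such-id")
```

Only after watching that fail do you write the minimal code to make it pass, then refactor with the test as a safety net.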
4. Know when to stop
The moment we started our third “quick fix,” we should have stopped. Multiple patches indicate architectural work is needed, not more patches.
5. Session logs are critical
I noticed the Lead Developer wasn’t maintaining a session log as I generally ask each bot to do. Creating one, even retrospectively, immediately helped track our journey and decisions. Documentation isn’t bureaucracy — it’s institutional memory.
The code we shipped
Despite the challenges, we accomplished significant work: the POC contamination is gone, the FileQueryService exists and is fully tested, and we now actually understand our own two-tier data access pattern.
The code we didn’t ship
More importantly, we chose NOT to ship a hacky summarization implementation. Instead, we documented exactly what still needs to be built: a real SYNTHESIS implementation, with proper routing and handlers instead of a canned reply.
Sometimes the best code is the code you choose not to write. Know when to fold ’em and all that, right?
The human side
This session reminded me why being the primate in the loop working with AI agents is so valuable. My interventions weren’t disruptions — they were course corrections.
Each intervention prevented technical debt and maintained architectural integrity. It’s like having a very patient pair-programming partner who never gets tired of answering the question, “Are we sure this is the right approach?”
The denouement
We ended the session not because we ran out of time, but because we chose to stop. The next session would begin with architectural design, not bug fixes.
This is what systematic thinking looks like in practice: recognizing when you’re fighting the system instead of working with it, and having the discipline to pause and reassess.
Final thought: If your AI assistant argues with you about categorization, maybe listen. It might be trying to tell you something about the nature of the work itself.
Next on Building Piper Morgan: We’ll implement SYNTHESIS properly and finally get that document summary in “The Day We Taught Piper to Summarize (Almost).” But first, architecture.
Have you ever found yourself fighting your own system’s wisdom? When has stepping back and reassessing turned a frustrating debugging session into an architectural insight? I’d love to hear your stories of learning to work with, rather than against, the systems you’re building.