When Your Tests Tell You What Your Code Should Do
June 27
We were down to just two failing tests out of 64, and my first instinct was what it always is when tests fail after escaping a crisis: these tests must be wrong.
I mean, it made sense. We’d just come through the 48-hour rollercoaster — that epic journey to complete disaster and back, capped by a “LIFE SAVER !!!” redemption. The system was working. File uploads were processing. Analysis was running end-to-end. Surely a couple of stubborn test failures were just… artifacts from the chaos.
Turns out, the tests weren’t wrong. They were trying to teach me about my own architecture.
Coming back from chaos
After the 48-hour rollercoaster, we had made this beautiful recovery: 64 tests, clean architecture, everything working. But when I ran the full suite, two tests were still failing in the DocumentAnalyzer.
Classic developer reaction: “Well, those tests are probably just outdated expectations from before we fixed everything.”
The failing tests were complaining that DocumentAnalyzer was throwing FileAnalysisError exceptions instead of returning AnalysisResult objects with error metadata. My first thought? “These tests just haven’t been updated to match our new exception-handling approach.”
So I started to “fix” the tests. (Always a bad instinct.)
The moment everything clicked
But then I did something I don’t always remember to do: I looked at the other analyzers to see what pattern they were actually following (OK, I asked Cursor to look and tell me).
CSVAnalyzer: Returns AnalysisResult with error info in metadata.
TextAnalyzer: Returns AnalysisResult with error info in metadata.
DocumentAnalyzer: Throws FileAnalysisError exceptions.
Hmm.
One of these things was not like the others.
That’s when it hit me: the tests weren’t wrong about what they expected. DocumentAnalyzer was doing it wrong.
Tests as architectural documentation
Here’s what I learned from those two “failing” tests: they weren’t testing what the code currently did. They were documenting what the code should do according to our domain model.
The AnalysisResult domain object was designed with a clear contract: analyzers always return results, with errors going into the metadata field. Never throw exceptions. Always return something useful.
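For concreteness, here’s a minimal sketch of what that contract might look like in Python. The field and property names here are my shorthand for illustration, not the actual Piper Morgan classes:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AnalysisResult:
    """Domain object every analyzer returns, on success or failure."""

    content: dict[str, Any] = field(default_factory=dict)   # whatever analysis completed
    metadata: dict[str, Any] = field(default_factory=dict)  # errors, warnings, etc. live here

    @property
    def has_errors(self) -> bool:
        # The caller checks metadata instead of catching exceptions.
        return "error" in self.metadata
```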
CSVAnalyzer and TextAnalyzer were following this contract perfectly. But somewhere along the way — probably during one of those frantic “let’s just get this working” moments — someone (okay, it was probably me) had allowed DocumentAnalyzer to throw exceptions instead.
The tests were trying to tell me: “Hey, you have a domain contract here. DocumentAnalyzer is violating it.”
When code drifts from intention
This is one of those subtle ways systems drift from their original intentions. It’s not dramatic architectural decay — it’s just inconsistency. One component doing things slightly differently from its siblings.
But that inconsistency compounds. Today it’s “DocumentAnalyzer throws exceptions while everyone else returns error metadata.” Tomorrow it’s “well, some analyzers throw exceptions, so I guess PDFAnalyzer can too.” Before you know it, your error handling is a mess and nobody remembers what the original pattern was supposed to be.
Mind you, this kind of drift happens to everyone. Especially when you’re moving fast, fixing urgent issues, or trying to get something working after a crisis. The question is whether you catch it before it spreads.
The beauty of domain contracts
What struck me about this whole episode was how clear the domain contract actually was, once I paid attention to it.
AnalysisResult was designed to always be returned. Success or failure, you get an AnalysisResult object. If something went wrong, the error information goes in the metadata field, but you still get a proper result object with whatever analysis could be completed.
This isn’t just a nice-to-have pattern. It’s what allows the rest of the system to handle analysis results consistently, regardless of which analyzer produced them or whether everything went perfectly.
When DocumentAnalyzer started throwing exceptions instead, it broke that contract. The calling code had to start handling two different error patterns: sometimes you get an AnalysisResult with error metadata, sometimes you get an exception. That’s cognitive overhead for every developer who touches the code.
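Here’s roughly what that overhead looks like at a call site, building on the AnalysisResult sketch above and assuming FileAnalysisError is an ordinary Exception subclass:

```python
class FileAnalysisError(Exception):
    """Stand-in for the real exception type in the codebase."""


def run_analysis(analyzer, file_path: str) -> AnalysisResult:
    # With a consistent contract, this would be the whole function:
    #     return analyzer.analyze(file_path)
    #
    # Because one analyzer raises instead of returning, every call site
    # grows a second error path that exists only for that outlier.
    try:
        return analyzer.analyze(file_path)
    except FileAnalysisError as exc:
        return AnalysisResult(metadata={"error": str(exc)})
```

Every caller ends up carrying a branch that only exists because one analyzer broke the pattern.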
Following the test guidance
So instead of changing the tests to match the exception-throwing behavior, I changed DocumentAnalyzer to honor the domain contract.
The fix was straightforward: wrap the analysis logic in try/except and, when things go wrong, return an AnalysisResult with the error information in metadata instead of raising the exception.
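As a sketch (again using the AnalysisResult stand-in from above, with a hypothetical _extract_text helper in place of the real parsing logic), the change looks something like this:

```python
class DocumentAnalyzer:
    def analyze(self, file_path: str) -> AnalysisResult:
        try:
            text = self._extract_text(file_path)  # hypothetical internal helper
            return AnalysisResult(content={"text": text})
        except Exception as exc:
            # Honor the domain contract: never raise, always return a result,
            # with the failure recorded in metadata.
            return AnalysisResult(
                metadata={"error": str(exc), "analyzer": "DocumentAnalyzer"}
            )

    def _extract_text(self, file_path: str) -> str:
        # Stand-in for the real document parsing logic.
        with open(file_path, encoding="utf-8") as fh:
            return fh.read()
```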
The moment I made that change: 64/64 tests passing.
Tests as conversation partners
What this experience taught me is that tests can be conversation partners in architectural decisions, not just validators of current behavior.
When tests fail, especially after a period of rapid change, the question isn’t just “what do I need to change to make this pass?” It’s “what is this test trying to tell me about my system’s intentions?”
Those two “failing” tests weren’t obstacles to overcome. They were documentation of a better way to structure error handling across the analysis layer. They were architectural guidance disguised as test failures.
The larger pattern
This connects to something larger about how tests function in a mature codebase. Good tests don’t just verify that code works — they document what “working” means according to your system’s design principles.
When you write tests, you’re not just checking current behavior. You’re encoding architectural intentions. You’re creating a conversation between your current understanding and your future self’s implementation decisions.
The pattern recognition — seeing that DocumentAnalyzer was the outlier — came from slowing down enough to actually look at what the tests were expecting versus what the code was doing.
When to trust your tests
So when should you trust your tests over your code? Here are some signals I’ve learned to watch for:
Trust tests when: They’re checking domain contracts, not just implementation details. When multiple similar components follow one pattern and one outlier follows another. When tests are failing after rapid changes or crisis periods.
Question tests when: They’re testing implementation details that could reasonably change. When they’re checking specific error messages instead of error handling patterns. When they were written for a different phase of the project’s evolution.
The key is learning to distinguish between tests that encode architectural wisdom and tests that encode temporary implementation choices.
The 64/64 moment
There’s something satisfying about seeing 64/64 tests pass after making an architectural decision based on test guidance. It’s not just “yay, green checkmarks.” It’s validation that your system has a coherent design philosophy, and that your tests are documenting it accurately.
That moment when DocumentAnalyzer fell into line with the established pattern, and suddenly all the error handling across the analysis layer was consistent — that’s what good architecture feels like. Not perfect code, but coherent code.
Listening to your codebase
The broader lesson here is about listening to your codebase. Tests are one way it talks to you. Patterns across similar components are another. Inconsistencies that make you pause and think “wait, why does this one work differently?” — those are conversations waiting to happen.
Your codebase is constantly trying to tell you about its own design principles. Sometimes through test failures. Sometimes through code that feels awkward to write. Sometimes through inconsistencies that make onboarding new team members harder than it should be.
The trick is slowing down enough to listen.
Next on Building Piper Morgan: “Following Your Own Patterns,” on that magical state where your architecture makes new features feel inevitable rather than difficult.
What’s your experience with tests as architectural guidance? Have you had moments when “failing” tests led you to better design decisions? I’d love to hear about times when your codebase taught you something you didn’t expect to learn.