Bug Introduced During Refactoring Despite 100% Branch Coverage in Automated Tests
Many years ago, a colleague and I, both senior developers, were tasked with refactoring a legacy backend Java method of roughly 200 lines of spaghetti code in a semiconductor-industry system. The refactoring was needed to extend the method for a new feature, so before implementing the feature we created a separate merge request that contained only the refactored code.
The Java method modified fields in a SQL table and achieved 100% branch coverage in automated tests. However, the tests only verified a few columns from the first row, leaving other data — including critical numeric fields — unvalidated.
During the refactoring, we introduced a bug in the data processing logic, causing some of those unvalidated values to be modified incorrectly. The tests still passed because they didn’t assert the correctness of those parts of the data.
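To illustrate the gap between branch coverage and data coverage, here is a minimal, self-contained sketch using JUnit 5 and an in-memory H2 database; the wafer_lot table, its columns, and the inline UPDATE standing in for the real ~200-line method are hypothetical stand-ins, not our actual code. Every branch of the update logic runs, yet only two columns of the first row are asserted, so a corrupted numeric column never fails the test.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.junit.jupiter.api.Test;

class BranchCoverageWithoutDataCoverageTest {

    @Test
    void assertsOnlyAFractionOfTheData() throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {

            stmt.execute("CREATE TABLE wafer_lot (lot_id VARCHAR PRIMARY KEY, "
                    + "status VARCHAR, yield_pct DECIMAL(5,2))");
            stmt.execute("INSERT INTO wafer_lot VALUES "
                    + "('LOT-001', 'NEW', 97.50), ('LOT-002', 'NEW', 93.25)");

            // Stand-in for the legacy method under refactoring: it updates the
            // status correctly but (like our bug) corrupts a numeric column.
            stmt.execute("UPDATE wafer_lot SET status = 'RELEASED', yield_pct = 0");

            try (ResultSet rs = stmt.executeQuery(
                    "SELECT lot_id, status FROM wafer_lot ORDER BY lot_id")) {
                rs.next();
                // Only two columns of the first row are checked, so the
                // corrupted yield_pct values never fail the test.
                assertEquals("LOT-001", rs.getString("lot_id"));
                assertEquals("RELEASED", rs.getString("status"));
            }
        }
    }
}
```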
After that incident, we started verifying complete SQL tables by saving them to CSV files and comparing them against those saved snapshots on subsequent test runs (snapshot testing).
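Below is a minimal sketch of that snapshot approach, assuming plain JDBC and JUnit 5; the class name, the naive CSV dump (no quoting or escaping of values), and the first-run behavior of writing the snapshot file are illustrative choices, not our production code.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;
import java.util.StringJoiner;

final class TableSnapshot {

    /** Dumps every row and column of the given table as naive CSV text
     *  (no quoting or escaping; good enough for simple test data). */
    static String dump(Connection connection, String table) throws Exception {
        StringBuilder csv = new StringBuilder();
        try (Statement stmt = connection.createStatement();
             // The table name comes from the test itself, not from user input.
             ResultSet rs = stmt.executeQuery("SELECT * FROM " + table + " ORDER BY 1")) {
            ResultSetMetaData meta = rs.getMetaData();
            StringJoiner header = new StringJoiner(",");
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                header.add(meta.getColumnName(i));
            }
            csv.append(header).append('\n');
            while (rs.next()) {
                StringJoiner row = new StringJoiner(",");
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    row.add(String.valueOf(rs.getObject(i)));
                }
                csv.append(row).append('\n');
            }
        }
        return csv.toString();
    }

    /** Compares the current table contents against a committed CSV snapshot;
     *  on the very first run the snapshot file is created instead. */
    static void assertMatchesSnapshot(Connection connection, String table, Path snapshotFile)
            throws Exception {
        String actual = dump(connection, table);
        if (!Files.exists(snapshotFile)) {
            Files.writeString(snapshotFile, actual);
            return;
        }
        assertEquals(Files.readString(snapshotFile), actual);
    }
}
```

A test would then call something like TableSnapshot.assertMatchesSnapshot(connection, "wafer_lot", Path.of("snapshots/wafer_lot.csv")) after running the refactored method and commit the generated CSV, so that any unexpected change to any row or column fails the next run.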
Data testing becomes more relevant as software grows over the years, becomes more complex, and eventually turns into a legacy system. Such systems typically generate large volumes of data, which makes it crucial to validate the data processing logic through automated tests and thereby improve data coverage.
Practices such as pair programming, small merge requests, and snapshot testing are highly effective in preventing serious bugs in data processing logic.