Every code
coverage tool is
measuring the
wrong thing
(on purpose)
A deep(-ish) dive
into
code coverage
About Me
Sean Reilly
@seanjreilly
Who uses code
coverage now?
Why use code
coverage?
“We want to know how
well
our code is tested”
Code
coverage is
good
• Testing code coverage is
good
• How it’s done is often not so
good
• Why is that?
The Starting
Point
• “How well is our code
tested?”
• This is a qualitative measure
• Computers don’t do
qualitative
• Can we make it quantitative?
A Quantitative
Measure
“How many lines* of code can
I delete without causing any
tests** to fail?”
*statements, methods, branches, etc.
**or compilation
Why is this a good
measure?
• Direct translation of the
qualitative question
• Makes sense
• Minimises code written for a
set of tests
This is
expensive
• Really, really expensive
• n statements/branches/methods
= n(n-1) compile and test
cycles
• We need something cheaper
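To put a number on "really, really expensive", here is a minimal sketch of the arithmetic, taking the n(n-1) formula from the slide as given. The class and method names are made up for illustration.

```java
// Hypothetical helper; the names are illustrative and not from the talk.
final class DeletionCost {

    // Compile-and-test cycles for the deletion-based measure,
    // using the n(n-1) formula quoted on the slide.
    static long cycles(long n) {
        return n * (n - 1);
    }

    public static void main(String[] args) {
        // 23 removable statements is the figure given in the speaker notes.
        System.out.println(cycles(23)); // prints 506
    }
}
```

Because the cost grows roughly with the square of the code size, a codebase ten times larger needs about a hundred times as many cycles.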
Downgrade from
business class to
cattle class
Find a lower cost
approximation
A (low-budget)
Quantitative Measure
“How many lines* of code are
executed when all of the tests
are run?”
*statements, methods, branches, etc.
A (low-budget)
Quantitative Measure
• Much cheaper
• Approximately* the same
thing
* All the fun happens here!
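One way to see where "approximately" breaks down: a statement can be executed by a test without being checked by it. The sketch below is an invented example (AuditLog and the test method are hypothetical), and it stands in for "deleting the statement" with an overriding subclass so that it stays runnable in one file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical production class; every name here is illustrative.
class AuditLog {
    final List<String> entries = new ArrayList<>();

    void record(String event) {
        entries.add(event); // executed by the test below, so reported as covered
    }
}

class ExecutedNotTested {

    // Stand-in for a unit test: it runs record() but checks nothing about the outcome.
    static boolean testRecord(AuditLog log) {
        log.record("login");
        return true;
    }

    public static void main(String[] args) {
        AuditLog real = new AuditLog();
        // Simulates deleting the statement inside record().
        AuditLog deleted = new AuditLog() {
            @Override
            void record(String event) { }
        };
        System.out.println(testRecord(real));    // prints true
        System.out.println(testRecord(deleted)); // prints true: the deletion goes unnoticed
    }
}
```

Under the low-budget measure the statement counts as covered. Under the deletion measure it does not, because removing it makes no test fail.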
The
differences
Problem areas
• Synthetic methods
• Things you do to “make it
compile”
• Java 7 features
• Useless code
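A rough illustration of the first two problem areas, using reflection to show members that exist in the class file although nobody wrote them in source. The types below are invented for the example, and the catch block is only one guess at what "make it compile" code looks like.

```java
import java.io.UnsupportedEncodingException;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

enum Colour { RED, GREEN }   // no methods written in source

class Empty { }              // no constructor written in source

// Hypothetical demo class; all names are illustrative.
class ProblemAreas {

    // Names of the methods the compiled class declares, whether or not they appear in source.
    static Set<String> declaredMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            names.add(m.getName());
        }
        return names;
    }

    // "Make it compile" code: the checked exception must be handled,
    // but every Java platform supports UTF-8, so no test can reach the catch block.
    static byte[] utf8(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The set includes values and valueOf, which the compiler added to the enum.
        System.out.println(declaredMethodNames(Colour.class));
        // The compiler also added a default constructor to Empty.
        System.out.println(Empty.class.getDeclaredConstructors().length); // prints 1
    }
}
```

A tool that counts these members or the unreachable catch block as coverable code will report them as uncovered unless a test happens to touch them, even though there is nothing there worth testing.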
Learnings
• All code coverage engines
are flawed
• Some are profoundly flawed
• It’s possible to lower
coverage by deleting
untested code
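The last point is just ratio arithmetic. The speaker notes quote 82% instruction coverage with a spurious method in place and 70% without it, but they do not give the underlying counts, so the instruction counts below are hypothetical numbers chosen to reproduce those two percentages.

```java
// Hypothetical helper; the names and the instruction counts are assumptions.
final class CoverageRatio {

    // Coverage as a whole-number percentage of executed instructions.
    static long percent(long covered, long total) {
        return Math.round(100.0 * covered / total);
    }

    public static void main(String[] args) {
        long total = 100;    // assumed instructions in the class
        long covered = 82;   // assumed instructions executed by the tests
        long spurious = 40;  // assumed size of the do-nothing method, all of it executed

        System.out.println(percent(covered, total));                       // prints 82
        // Deleting the method removes its instructions from both sides of the ratio.
        System.out.println(percent(covered - spurious, total - spurious)); // prints 70
    }
}
```

The code got better and the number got worse, because the deleted instructions were executed (and so counted as covered) without being tested.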
Proof
Proof
Learnings
• You cannot reliably enforce:
• “X% or higher coverage”
• “Coverage always goes up”
Automatic
enforcement of
coverage levels is
troublesome
(Unless the coverage level is 100%)
What to do
instead?
• Spot check
• Manually use the more stringent
measure
• Compare to last week, not last
commit
• If the number goes down, know
why
• Separate covered and uncovered
code
Two more
things
Should I test getters
and setters?
• No
• But…
• Delete getters and setters
that you can delete without
making a test fail
Should I enforce 100%
code coverage?
• It depends…
• Why you’re doing it
• Who decides
• If it feels like work
Mutation
Testing?
What is mutation
testing?
• Mutate statements instead of
deleting them
• Every mutation should make
a test fail
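A hand-written sketch of the idea, using the "subtract instead of add" mutation mentioned in the speaker notes. Real tools generate mutants automatically; here the mutant is written out by hand and the two "tests" are plain methods, all with invented names.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical sketch; a mutation testing tool would generate the mutant itself.
final class MutationSketch {

    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;
    static final IntBinaryOperator MUTANT = (a, b) -> a - b; // subtract instead of add

    // Weak test: executes the code but asserts nothing.
    static boolean weakTest(IntBinaryOperator add) {
        add.applyAsInt(2, 3);
        return true;
    }

    // Strong test: checks the result.
    static boolean strongTest(IntBinaryOperator add) {
        return add.applyAsInt(2, 3) == 5;
    }

    public static void main(String[] args) {
        System.out.println(weakTest(MUTANT));     // prints true: the mutant survives
        System.out.println(strongTest(MUTANT));   // prints false: the mutant is killed
        System.out.println(strongTest(ORIGINAL)); // prints true
    }
}
```

Both tests give the addition full line coverage, but only the second one fails when the statement is mutated.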
Thoughts on
mutation testing
• Seems decent for loop logic or
math logic
• Doesn’t know how to mutate a lot
of statements
• Doesn’t mutate source code, just
object code
• Based on a traditional coverage
run
UNITED KINGDOM
+44 203 603 7830
helloUK@equalexperts.com
Equal Experts UK Ltd
30 Brock Street
London NW1 3FG
INDIA
+91 20 6607 7763
helloIndia@equalexperts.com
Equal Experts India Private Ltd
Office No. 4-C
Cerebrum IT Park No. B3
Kumar City, Kalyani Nagar
Pune, 411006
CANADA
+1 403 775 4861
helloCanada@equalexperts.com
Equal Experts Devices Inc
205 - 279 Midpark way S.E.
T2X 1M2
Calgary, Alberta
PORTUGAL
+351 211 378 414
helloPortugal@equalexperts.com
Equal Experts Portugal
Avenida Dom João II, Nº35
Edificio Infante 11ºA
1990-083 Parque das Nações
Lisboa – Portugal
USA
helloUSA@equalexperts.com
Equal Experts Inc
315 Hudson Street
9th Floor
New York City, NY 10013
Thanks!

Every code coverage tool is measuring the wrong thing (on purpose)

Editor's Notes

  • #5: And why? Sometimes, this is “because my boss makes me”, but setting that aside…
  • #7: This is a thing I hope most of us can get behind. The problem is not that this is done, but how it’s done. Sometimes these are people problems (we’ll get into that later), but sometimes they aren’t.
  • #9: This isn’t the only measure… “can I change a statement without making a test fail?” is also a good one.
  • #11: The sample java later in this presentation has 23 statements that could potentially be removed (excluding trivial things like throws clauses and import statements). That’s 506 build and test cycles.
  • #15: 1 instrumented test run, which is more expensive than normal, but cheaper than hundreds or thousands of test runs
  • #17: Synthetic methods: default constructors, methods on enums. Java 7: ARM (try-with-resources) blocks. The compiler purposefully puts in more blocks than will be executed (null checks in the finally, etc.), knowing that the JIT will optimise the extra ones away. Coverage tools don’t even attempt to detect useless code.
  • #18: Profoundly flawed = Java 7 support, etc. Delete an untested method that does nothing but was executed during a test… coverage goes down slightly. In our example, with the spurious method, instruction coverage is 82%. Without it, coverage is 70%.
  • #20: Note that branch coverage went from 100% to undefined!
  • #21: Profoundly flawed = Java 7 support, etc.
  • #22: If it’s 100%, and you delete a chunk of untested code, it should still be 100%… because all of the code that’s left should still be covered. This also holds for 0% coverage. I assume we’re all happy to ignore that case.
  • #23: Separate code: consider one module with 100% (or high) enforced coverage, and another module without enforced coverage. Move things into the covered module over time.
  • #25: Trivial getters and setters don’t need to be tested directly. Tests are executable documentation, and documentation isn’t needed for that.
  • #26: Should you enforce 100% code coverage? Twice in my career I’ve been on teams where we were close to 100% code coverage. In 2013 we were three or four statements/branches away for a while, spot checking every week or so. So finally, we put in explicit tests to cover those three or four spots. A week later, we were still at 100%. I talked to some of the guys on the team… should we fail the build if coverage isn’t 100%? Let’s try it… see what happens. We turned it on, and forgot about it for a couple of weeks. Then the first few times we tripped it, it was definitely areas where we had forgotten to write a test… so we decided to keep it. Also, when somebody asks “what’s your code coverage?” and you can say 100% without checking anything, you feel like an absolute boss. Good for political reasons sometimes. :-)
  • #27: Should you enforce 100% code coverage? Twice in my career I’ve been on teams where we were close to 100% code coverage. In 2013 we were three or four statements/branches away for a while, spot checking every week-ish. So finally, we put in explicit tests to cover those three or four spots. A week later, we were still at 100%. I talked to some of the guys on the team… should we fail the build if coverage isn’t 100%? Let’s try it… see what happens. We turned it on, and forgot about it for a couple of weeks. Then the first few times we tripped it, it was definitely areas where we had forgotten to write a test… so we decided to keep it. Also, when somebody asks “what’s your code coverage?” and you can say 100% without checking anything, you feel like an absolute boss. Good for political reasons sometimes. :-)
  • #28: The last two times I’ve done this talk, people have mentioned mutation testing — specifically PIT. (Which seems to be the viable option in the Java world)
  • #29: Example mutations: return null instead of a value, subtract instead of add, that sort of thing
  • #30: The class of problem PIT is really good at catching is tests that don’t assert anything. To improve performance, PIT does a single traditional coverage run, which it then uses to learn which tests to run against which mutations. That means it has a gap for statements that aren’t executed by any tests… same old problem. Mutating object code rather than source code means we can’t tell whether a mutation would have stopped the code from compiling. False positives mean that improving code can still make the coverage percentage go down. An example of this would be removing one of two duplicate methods.