Mutation Testing

Hernán Wilkinson (UBA - 10Pines), hernan.wilkinson@gmail.com
Nicolás Chillo (UBA), nchillo@gmail.com
Gabriel Brunstein (UBA), gaboto@gmail.com

What is Mutation Testing?

Technique to verify the quality of the tests

What is Mutation Testing?

Tests → verify the quality of the Source Code
Mutation Testing → verifies the quality of the Tests

How does it work?
1st Step: Create the Mutant

The Source Code → Mutation Process (applies the Mutation “Operator”) → The “Mutant”

Examples
DebitCard>>= anotherDebitCard
    ^(type = anotherDebitCard type)
        and: [ number = anotherDebitCard number ]

Operator: Replace #and: with #or:

DebitCard>>= anotherDebitCard
    ^(type = anotherDebitCard type)
        or: [ number = anotherDebitCard number ]

Examples
Purchase>>netPaid
    ^self totalPaid - self totalRefunded

Operator: Replace #- with #+

Purchase>>netPaid
    ^self totalPaid + self totalRefunded

How does it work?
2nd Step: Try to Kill the Mutant

A Killer (The Test Suite) tries to kill the “Mutant”!

All tests pass → The Mutant Survives!!!
A test fails or errors → The Mutant Dies

Meaning…

The Mutant Survives → The case generated by the mutant is not tested

The Mutant Dies → The case generated by the mutant is tested

Example: The mutant survives
Original:
    ^(type = anotherDebitCard type) and: [ number = anotherDebitCard number ]

Mutant:
    ^(type = anotherDebitCard type) or: [ number = anotherDebitCard number ]

DebitCardTest>>testDebitCardWithSameNumberShouldBeEqual
    self assert: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 123).

Example: The mutant dies
Original:
    ^(type = anotherDebitCard type) and: [ number = anotherDebitCard number ]

Mutant:
    ^(type = anotherDebitCard type) or: [ number = anotherDebitCard number ]

DebitCardTest>>testDebitCardWithSameNumberShouldBeEqual
    self assert: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 123).

DebitCardTest>>testDebitCardWithDifferentNumberShouldBeDifferent
    self deny: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 789).
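
Why the second test matters can be checked in a workspace. The sketch below is only an illustration: two blocks stand in for the original and mutated versions of DebitCard>>=, and each card is reduced to a type -> number association (both simplifications are assumptions, not the real DebitCard class).

```smalltalk
"Illustrative stand-ins: blocks instead of DebitCard>>=, associations instead of cards."
| original mutant sameCards differentNumbers |
original := [ :a :b | (a key = b key) and: [ a value = b value ] ].
mutant := [ :a :b | (a key = b key) or: [ a value = b value ] ].

sameCards := { #visa -> 123. #visa -> 123 }.
differentNumbers := { #visa -> 123. #visa -> 789 }.

"Equal cards: both versions answer true, so the equality test alone cannot kill the mutant."
original value: sameCards first value: sameCards last.                 "true"
mutant value: sameCards first value: sameCards last.                   "true"

"Same type, different number: the versions disagree, so the deny: test fails against the mutant."
original value: differentNumbers first value: differentNumbers last.   "false"
mutant value: differentNumbers first value: differentNumbers last.     "true"
```

Only an input on which #and: and #or: disagree can kill this mutant; the first test never provides one.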

Example: The mutant survives
Purchase>>netPaid
    ^self totalPaid - self totalRefunded

Operator: Replace #- with #+

Purchase>>netPaid
    ^self totalPaid + self totalRefunded

Purchase>>testNetPaid
    | purchase |
    purchase := Purchase for: 20 * euros.
    self assert: purchase netPaid = (purchase totalPaid - purchase totalRefunded)
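
A short workspace check suggests why this mutant survives. It assumes that a new purchase has no refunds (totalRefunded is zero) and uses plain numbers in place of the euros measure.

```smalltalk
"Assumption: a fresh purchase of 20 has nothing refunded."
| totalPaid totalRefunded |
totalPaid := 20.
totalRefunded := 0.
(totalPaid - totalRefunded) = (totalPaid + totalRefunded).   "true: #- and #+ agree, the test cannot tell them apart"

"With a refund of 10 the two operators disagree."
totalRefunded := 10.
(totalPaid - totalRefunded) = (totalPaid + totalRefunded).   "false: 10 versus 30"
```

A test needs a non-zero refund before the mutated method can answer something different from the original.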

Example: The mutant dies
Purchase>>netPaid
    ^self totalPaid - self totalRefunded

Operator: Replace #- with #+

Purchase>>netPaid
    ^self totalPaid + self totalRefunded

Purchase>>testNetPaidWithOutRefunds   (renamed!)
    | purchase |

Purchase>>testNetPaidWithRefunds
    | purchase |
    purchase addRefundFor: 10 * euros.

How does it work? - Summary
• Changes the original source code with special “operators” to generate “Mutants”
• Runs the test suite related to the changed code
• If a test errors or fails → the Mutant is killed
• If all tests pass → the Mutant survives
• Surviving Mutants reveal untested cases

The Important Thing!

MuTalk

Mutation Testing Tool for Smalltalk (Pharo
and Squeak)

MuTalk – How does it work?
• Runs the tests first to be sure that they all pass
• For each method m
  • For each operator o
    • Changes m's AST using o
    • Compiles the mutated code
    • Changes the method dictionary
    • Runs the tests
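
The loop above can be sketched in a workspace. This is not MuTalk's code: plain blocks stand in for compiled methods, a Dictionary stands in for the class's method dictionary, the mutants are written by hand instead of being produced from the AST, and names such as #isRefunded are made up for the example.

```smalltalk
"Hypothetical model: blocks as methods, a Dictionary as the method dictionary."
| methodDictionary mutants tests allPass report |
methodDictionary := Dictionary new.
methodDictionary at: #netPaid put: [ :paid :refunded | paid - refunded ].
methodDictionary at: #isRefunded put: [ :paid :refunded | refunded > 0 ].

"Hand-written mutants: selector, operator name, mutated code."
mutants := OrderedCollection new.
mutants add: { #netPaid. 'Replace #- with #+'. [ :paid :refunded | paid + refunded ] }.
mutants add: { #isRefunded. 'Replace #> with #<'. [ :paid :refunded | refunded < 0 ] }.

"Tests look methods up at run time, so they see whatever is currently installed."
tests := {
    [ ((methodDictionary at: #netPaid) value: 20 value: 0) = 20 ].
    [ (methodDictionary at: #isRefunded) value: 20 value: 10 ] }.
allPass := [ tests allSatisfy: [ :test | test value ] ].

"Step 1: the unmutated code must pass."
allPass value ifFalse: [ self error: 'Baseline tests must pass first' ].

"Step 2: install each mutant, run the tests, restore the original."
report := OrderedCollection new.
mutants do: [ :each | | selector original |
    selector := each first.
    original := methodDictionary at: selector.
    methodDictionary at: selector put: each last.
    report add: each second -> (allPass value ifTrue: [ #survived ] ifFalse: [ #killed ]).
    methodDictionary at: selector put: original ].
report
```

With these two tests, the #netPaid mutant survives (the only input has no refund) and the #isRefunded mutant is killed.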

MuTalk – Operators
• Boolean messages
• Remove #not
• Replace #and: with #eqv:
• Replace #and: with #nand:
• Replace #and: with #or:
• Replace #and: with #secondArgResult:
• Replace #and: with false
• Replace #or: First Condition with false
• Replace #or: Second Condition with false
• Replace #or: with #and:
• Replace #or: with #xor:

• Magnitude messages
• Replace #'<=' with #<
• Replace #'<=' with #=
• Replace #'<=' with #>
• Replace #'>=' with #=
• Replace #'>=' with #>
• Replace #'~=' with #=
• Replace #< with #>
• Replace #= with #'~='
• Replace #> with #<
• Replace #max: with #min:
• Replace #min: with #max:

• Collection messages
• Remove at:ifAbsent:
• Replace #reject: with #select:
• Replace #select: with #reject:
• Replace Reject block with [:each | false]
• Replace Reject block with [:each | true]
• Replace Select block with [:each | false]
• Replace Select block with [:each | true]
• Replace detect: block with [:each | false] when #detect:ifNone:
• Replace detect: block with [:each | true] when #detect:ifNone:
• Replace do block with [:each |]
• Replace ifNone: block with [] when #detect:ifNone:
• Replace inject:aValue into:aBlock with aValue
• Replace sortBlock:aBlock with sortBlock:[:a :b| true]

• Number messages
• Replace #* with #/
• Replace #+ with #-
• Replace #- with #+
• Replace #/ with #*

• Flow control messages
• Remove Exception Handler Operator
• Replace #ifFalse: receiver with false
• Replace #ifFalse: receiver with true
• Replace #ifFalse: with #ifTrue:
• Replace #ifFalse:ifTrue: receiver with false
• Replace #ifFalse:ifTrue: receiver with true
• Replace #ifTrue: receiver with false
• Replace #ifTrue: receiver with true
• Replace #ifTrue: with #ifFalse:
• Replace #ifTrue:ifFalse: receiver with false
• Replace #ifTrue:ifFalse: receiver with true

It is not new… History

Begins in 1971, R. Lipton, “Fault Diagnosis of
Computer Programs”

Generally accepted in 1978, R. DeMillo, R. Lipton and F. Sayward,
“Hints on test data selection: Help for the
practicing programmer”

Why is it not widely used?

Maturity Problem: Because Testing is not
widely used YET!
(Although it is increasing)


Integration Problem: Inability to successfully
integrate it into the software development
process
(TDD plays a key role now)


Technical Problem: It is a Brute Force
technique!

Technical Problems
• Brute force technique

NxM
N = number of tests
M = number of mutants

Aconcagua
• Number of Tests: 666
• Number of Mutants: 1005
• Time to create a mutant/compile/link/run: approx. 10 seconds each?
• Total time:
– 6693300 seconds
– 1859 hours, 15 minutes
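
The total follows directly from N x M runs at roughly 10 seconds each, as this workspace arithmetic shows (the variable names are only for illustration):

```smalltalk
| tests mutants secondsPerRun totalSeconds |
tests := 666.
mutants := 1005.
secondsPerRun := 10.
totalSeconds := tests * mutants * secondsPerRun.   "6693300"
totalSeconds // 3600.                              "1859 hours"
totalSeconds \\ 3600 // 60.                        "15 minutes"
```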

Another way of doing it…
CreditCard>>= anotherCreditCard
    ^(anotherCreditCard isKindOf: self class)
        and: [ number = anotherCreditCard number ]

CreditCard>>= anotherCreditCard
    MutantId = 12 ifTrue: [
        ^(anotherCreditCard isKindOf: self class) or: [ number = anotherCreditCard number ] ].
    MutantId = 13 ifTrue: [
        ^(anotherCreditCard isKindOf: self class) nand: [ number = anotherCreditCard number ] ].
    MutantId = 14 ifTrue: [
        ^(anotherCreditCard isKindOf: self class) eqv: [ number = anotherCreditCard number ] ].
    "Assumed fallback: no mutant selected, so answer as the original method does"
    ^(anotherCreditCard isKindOf: self class) and: [ number = anotherCreditCard number ]

Aconcagua
• Number of Tests: 666
• Number of Mutants: 1005
• Time to create the
metamutant/compile/link: 2 minutes?
• Time to run the tests per mutant: 1 sec
• Total time:
– 1125 seconds
– 18 minutes 45 seconds
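
Here the cost is one build of about 2 minutes plus about 1 second of test running per mutant. The same workspace arithmetic, with illustrative variable names:

```smalltalk
| mutants buildSeconds secondsPerMutant totalSeconds |
mutants := 1005.
buildSeconds := 2 * 60.
secondsPerMutant := 1.
totalSeconds := buildSeconds + (mutants * secondsPerMutant).   "1125"
totalSeconds // 60.                                            "18 minutes"
totalSeconds \\ 60.                                            "45 seconds"
```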

MuTalk Optimizations
Running Strategies

Mutate all methods, run all tests per mutant
– Create a mutant for each method
– Run all the tests for each mutant
– Disadvantage: Slower strategy

Mutate covered methods, run all tests per mutant
– Takes coverage running all tests
– Mutate only covered methods
– Run all tests per mutant
– Relies on coverage

Mutate all methods, run only tests that cover the mutated method
– Run coverage, keeping for each method the tests that covered it
– Create a mutant for each method
– For each mutant, run only the tests that covered the original method

Mutate covered methods, run only tests that cover the mutated method
– Run coverage, keeping for each method the tests that covered it
– Create a mutant for only covered methods
– For each mutant, run only the tests that covered the original method
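
A rough way to compare the strategies is to count test runs. The sketch below uses a made-up coverage map (three methods, four tests, three mutants per method); none of these figures come from MuTalk or Aconcagua.

```smalltalk
"Hypothetical coverage data: method selector -> tests that covered it."
| allTests coverage mutantsPerMethod covered mutateAllRunAll mutateCoveredRunAll runCovering |
allTests := #(#testA #testB #testC #testD).
coverage := Dictionary new.
coverage at: #netPaid put: #(#testA #testB).
coverage at: #= put: #(#testC).
coverage at: #printOn: put: #().
mutantsPerMethod := 3.
covered := coverage select: [ :coveringTests | coveringTests notEmpty ].

"Mutate all methods, run all tests per mutant."
mutateAllRunAll := coverage size * mutantsPerMethod * allTests size.       "36"

"Mutate covered methods, run all tests per mutant."
mutateCoveredRunAll := covered size * mutantsPerMethod * allTests size.    "24"

"Run only covering tests: uncovered methods add no test runs either way."
runCovering := coverage
    inject: 0
    into: [ :sum :coveringTests | sum + (mutantsPerMethod * coveringTests size) ].   "9"
```

In this toy model the two "run only covering tests" strategies cost the same number of test runs; they differ only in whether mutants are still created for uncovered methods.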

MuTalk - Aconcagua Statistics
• Mutate All, Run All: 1 minute, 6 seconds
• Mutate Covered, Run Covering: 36 seconds
• Result:
• 545 Killed
• 6 Terminated
• 83 Survived

MuTalk Optimizations
Terminated Mutants

The Test Suite tries to kill the Mutant!

Sometimes the killer itself has to be “Terminated”

MuTalk - Terminated Mutants

• Record the time each test takes the first time it runs
• If a test then takes more than 3 times that long, terminate it
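
The rule can be sketched with a timeout around the test. This is an assumption about how such a limit could be written, not MuTalk's implementation: it relies on Time millisecondsToRun: and BlockClosure>>valueWithin:onTimeout: being available in the image, and Delays stand in for a real test run against the original code and against a slow mutant.

```smalltalk
"Hypothetical sketch: Delays play the role of the test being run."
| baselineMs limit outcome |

"First run, against the original code: record how long the test takes."
baselineMs := Time millisecondsToRun: [ (Delay forMilliseconds: 20) wait ].

"Allow the mutated run up to 3 times the baseline."
limit := Duration milliSeconds: (baselineMs max: 1) * 3.

"Mutated run: this one is far too slow, so it is cut off."
outcome := [ (Delay forMilliseconds: 500) wait. #finished ]
    valueWithin: limit
    onTimeout: [ #terminated ].
outcome
```

A mutant that sends a test into a very long or endless loop is then reported as terminated instead of blocking the whole run.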

Let’s redefine MuTalk as…

Mutation Testing Tool for Smalltalk (Pharo
and Squeak) that uses meta-facilities to
run faster and provide immediate feedback

Work in progress

• Operators Categorization based on how
useful they are to detect errors
• Filter Operators on View
• Cancel process

Future work

• Make Operators more “intelligent”
  • a = b ifTrue: [ … ]
  • a = b ifFalse: [] is equivalent to a ~= b ifTrue: []
• Suggest tests using not killed mutants
• Use MuTalk to test MuTalk?

Why does it work?

“Complex faults are coupled to simple faults
in such a way that a test data set that detects
all simple faults in a program will detect most
complex faults” (Coupling effect)
Demonstrated in 1995, K. Wah, “Fault coupling in finite
bijective functions”

Why does it work?

“In practice, if the software contains a fault,
there will usually be a set of mutants that can
only be killed by a test case that also detects
that fault”
Geist et al, “Estimation and enhancement of real-time
software reliability through mutation analysis”, 1992

How does it compare to coverage?
• Does not replace coverage, because some methods do not generate mutants
• But:
  • Mutants of uncovered methods will survive
  • It provides better insight than coverage
  • Method Coverage fails with long methods/conditions/loops/etc.

MuTalk - Mutation
Testing for Smalltalk
