Besides the obvious tools:
improving your testing with
state-of-the-art techniques
Maurício Aniche
m.f.aniche@tudelft.nl
@mauricioaniche
Photo by Sora Sagano
https://unsplash.com/photos/WA-QRL5wDMw
Content and License
• This presentation can be found at:
http://www.mauricioaniche.com/talks/2018/tad
• You can use it and modify it.
• You must always give credit to the original author.
• You agree not to sell it or profit from it in any way.
Test Automation Day 2018
Jeroen Castelein, Mozhan Soltani, Annibale Panichella,
Joop Aué, Maikel Lobbezoo, Rick Wieman, Sicco Verwer,
Felienne Hermans, Davide Spadini, Alberto Bacchelli,
Arie van Deursen, Kristín Fjóla, Peter Evers, Qianqian Zhu
• First job as a developer in 2004
• First important project in 2016
• First important bug: 2016
• Tests are important!
A little story
Photo by Michael Mims
https://unsplash.com/photos/0ZL0O-eDOpU
TEST ANALYSIS
& TEST DESIGN
clipart by j4p4n, adlerweb
https://openclipart.org/detail/297959/standing-robot
https://openclipart.org/detail/262444/bubble-person
“Testing is different from writing tests.
Developers write tests as a way to give them
space to think and confidence for refactoring.
Testing focuses on finding bugs. Both should
be done.”
https://medium.com/@mauricioaniche/testing-vs-writing-tests-d817bffea6bc
The literature on test oracles has introduced techniques for oracle
automation, including modelling, specifications, contract-driven
development and metamorphic testing. When none of these is
completely adequate, the final source of test oracle information
remains the human, who may be aware of informal specifications,
expectations, norms and domain specific information that provide
informal oracle guidance.
TEST ANALYSIS
& TEST DESIGN
Find systematic and
automated ways to design
and execute tests!
Topics of today
• Structural testing and MC/DC
• Log monitoring and passive learning
• Search-based software testing
• Mutation testing
• Fuzzing
• Property-based testing
• Code review
• Static analysis tools
Who are you?
• Software developers?
• Software testers?
• What are your expectations here today?
• Fill this out: https://bit.ly/tad2018
clipart by GDJ
https://openclipart.org/detail/230150/crowd-of-kids
Structural
Testing
clipart by J_Alves
https://openclipart.org/detail/61405/threonine-amino-acid
Given the points of two
different players, the
program must return the
number of points the one
who wins has!
public int play(int left, int right) {
    int ln = left;
    int rn = right;
    if (ln > 21)
        ln = 0;
    if (rn > 21)
        rn = 0;
    if (ln > rn)
        return rn;
    else
        return ln;
}
First criterion: “going
through all the lines”
If our test suite
exercises all the lines,
we are happy.
T1 = (30, 30)
public int play(int left,
int right) {
1 int ln = left;
2 int rn = right;
3 if(ln > 21)
4 ln = 0;
5 if(rn > 21)
6 rn = 0;
7 if(ln > rn)
8 return rn;
9 else
10 return ln;
}
9 / 10 = 90% line coverage
T2 = (10,9) <-- left player wins
Make it true
10 / 10 = 100% line coverage
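The two inputs can be written down as executable checks. This is a minimal sketch, assuming a hypothetical wrapper class `Game` and a `main` driver that are not in the slides; only the body of `play` is taken from them. With the code exactly as shown, T2 returns 9 rather than the winner's 10 points, so the test that completes line coverage also appears to expose a fault in the last `if`.

```java
// Hypothetical wrapper class; only the body of play() comes from the slides.
final class Game {
    static int play(int left, int right) {
        int ln = left;
        int rn = right;
        if (ln > 21)
            ln = 0;
        if (rn > 21)
            rn = 0;
        if (ln > rn)
            return rn;
        else
            return ln;
    }

    public static void main(String[] args) {
        // T1: both players exceed 21; executes every line except "return rn".
        System.out.println(play(30, 30));
        // T2: left player wins; executes "return rn" and completes line coverage.
        System.out.println(play(10, 9));
    }
}
```

Comparing the printed value for T2 with the specification ("the number of points the one who wins has") is what turns the coverage-driven input into an actual test.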
9/10 = 90%,
5/6 = 83%...
From now on, I’ll write as
many lines as I can!!
clipart by GDJ
https://openclipart.org/detail/230143/female-engineer-9
Given a sentence, you
should count the number
of words that end with
either an “s” or an “r”. A
word ends when a non-letter appears.
int words = 0;
char last = ' ';
for (int i = 0; i < str.length(); i++) {
    // a non-letter ends the current word
    if (!Character.isLetter(str.charAt(i))
            && (last == 's' || last == 'r')) {
        words++;
    }
    last = str.charAt(i);
}
// the sentence may end right after a word
if (last == 's' || last == 'r') {
    words++;
}
return words;
Control-flow graph (CFG)
We should cover
all the branches
(arrows)
“cats|dogs”
“cats|dog”
Branch coverage means
we exercise all the
branches!
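The two inputs can be tried against the counting code. This is a sketch under assumptions: the class name `WordCounter` and the method name `count` are hypothetical, and the loop body is taken to span both the `if` and the assignment to `last`. The first input takes both outcomes of the decision inside the loop and the true outcome of the final decision; the second adds the false outcome of the final decision.

```java
// Hypothetical class and method names; the logic follows the slide.
final class WordCounter {
    // Counts words ending in 's' or 'r'; a word ends at any non-letter.
    static int count(String str) {
        int words = 0;
        char last = ' ';
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isLetter(str.charAt(i)) && (last == 's' || last == 'r')) {
                words++;
            }
            last = str.charAt(i);
        }
        if (last == 's' || last == 'r') {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(count("cats|dogs")); // 2
        System.out.println(count("cats|dog"));  // 1
    }
}
```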
I wonder if that’s
enough…
[Figure: the decision exploded into its conditions, each with its own true/false edges: !Character.isLetter(str.charAt(i)), last == 's', last == 'r'; followed by words++ and last = str.charAt(i)]
If we “explode” the if into
its several conditions, we
have more paths to
explore!
[Figure: control-flow graph of the whole method, with every condition (!Character.isLetter(str.charAt(i)), last == 's', last == 'r') as a separate node with true/false edges]
Ok, condition coverage
seems to cover more
than branch coverage!
If we aim for condition
coverage, are we testing
all the paths?
(A && (B || C))
Test  A  B  C  Outcome
1     T  T  T  T
2     T  T  F  T
3     T  F  T  T
4     T  F  F  F
5     F  T  T  F
6     F  T  F  F
7     F  F  T  F
8     F  F  F  F
Path Coverage
Can we actually achieve
100% path coverage?
• The subpaths through this control flow
can include or exclude each of the
statements Si, so that in total N
branches result in 2^N paths that must
be traversed
• Choosing input data to force execution
of one particular path may be very
difficult, or even impossible if the
conditions are not independent
if (a) {
    S1;
}
if (b) {
    S2;
}
if (c) {
    S3;
}
...
if (x) {
    Sn;
}
The number of paths can
still grow exponentially
Can we test just the
important
combinations?
Modified Condition/
Decision Coverage
(MC/DC)
(A && (B || C))
Test  A  B  C  Outcome
1     T  T  T  T
2     T  T  F  T
3     T  F  T  T
4     T  F  F  F
5     F  T  T  F
6     F  T  F  F
7     F  F  T  F
8     F  F  F  F
A = {1, 5}, {2, 6}, {3, 7}
B = {2, 4}
C = {3, 4}
Final = {2, 3, 4, 6}
They are the same!
We don’t need them all
So, for N conditions, I
always have only N+1
tests! That’s definitely
better than 2^N!!
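The pairs listed for A, B and C can be derived mechanically. The following is a minimal sketch, not a general MC/DC tool: the class `McdcPairs` and its helpers are hypothetical names, and the test numbering assumes the row order of the truth table (test 1 = T T T, test 8 = F F F). For each condition it looks for two tests that differ only in that condition and produce different outcomes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that enumerates MC/DC independence pairs for A && (B || C).
final class McdcPairs {
    static boolean decision(boolean a, boolean b, boolean c) {
        return a && (b || c);
    }

    // Value of condition `index` (0 = A, 1 = B, 2 = C) in test `test` (1..8).
    static boolean value(int test, int index) {
        int bits = 8 - test; // test 1 -> 111, test 8 -> 000
        return ((bits >> (2 - index)) & 1) == 1;
    }

    static boolean outcome(int test) {
        return decision(value(test, 0), value(test, 1), value(test, 2));
    }

    // Pairs of tests that differ only in condition `index` and flip the outcome.
    static List<int[]> independencePairs(int index) {
        List<int[]> pairs = new ArrayList<>();
        for (int t1 = 1; t1 <= 8; t1++) {
            for (int t2 = t1 + 1; t2 <= 8; t2++) {
                boolean onlyIndexDiffers = true;
                for (int k = 0; k < 3; k++) {
                    boolean differs = value(t1, k) != value(t2, k);
                    if (differs != (k == index)) {
                        onlyIndexDiffers = false;
                    }
                }
                if (onlyIndexDiffers && outcome(t1) != outcome(t2)) {
                    pairs.add(new int[] {t1, t2});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C"};
        for (int k = 0; k < 3; k++) {
            StringBuilder sb = new StringBuilder(names[k] + " =");
            for (int[] p : independencePairs(k)) {
                sb.append(" {").append(p[0]).append(", ").append(p[1]).append("}");
            }
            System.out.println(sb);
        }
    }
}
```

Picking the single pair for B, the single pair for C, and any one pair for A that reuses an already chosen test gives a set of four tests, which is N+1 for N = 3.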
McCabe’s Cyclomatic Complexity
• C = |E| - |N| + 2
• C = # decision points + 1
• C = # of decision-statements
+ 1
C > 10: method too complex
[McCabe, 1976]
[ C correlated with #lines of
code ]
McCabe for Testing?
No empirical evidence
that it is better than
just decision coverage.
How many tests?
• Branch: 2 tests
• All paths: 4 tests
• McCabe: 3 tests
McCabe: Easy to count, limited usefulness
as coverage metric
Strategy Subsumption
MC/DC
Branch + Condition
Coverage
Branch
Coverage
Statement
Coverage
• Strategy X subsumes strategy Y if
all elements that Y exercises are
also exercised by X
• No conclusive results on relative
bug-finding effectiveness have
been established.
Path coverage
What do YOU think:
Do we need 100% code coverage?
Don’t worry about
coverage, just write some
good tests.
I am ready to write some
unit tests. What code
coverage should I aim for?
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
clipart by 10_boss, bibbleycheese
https://openclipart.org/detail/202573/my-yoda
https://openclipart.org/detail/248493/pretzel-ninja
How many grains of rice
should I put in that [boiling
water] pot?
I am ready to write some
unit tests. What code
coverage should I aim for?
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
It depends on how many
people you need to feed, how
hungry they are, what other
food you are serving, how
much rice you have available,
and so on.
Exactly!
80% and no less!
I am ready to write some
unit tests. What code
coverage should I aim for?
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
The first programmer is new and just getting started with testing.
Right now he has a lot of code and no tests. He has a long way to
go; focusing on code coverage at this time would be depressing and
quite useless. He’s better off just getting used to writing and
running some tests. He can worry about coverage later.
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
The second programmer, on the other hand, is quite experienced
both at programming and testing. When I replied by asking her
how many grains of rice I should put in a pot, I helped her realize
that the amount of testing necessary depends on a number of
factors, and she knows those factors better than I do – it’s her code
after all. There is no single, simple, answer, and she’s smart enough
to handle the truth and work with that.
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
The third programmer wants only simple
answers – even when there are no simple
answers … and then does not follow them
anyway.
Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
Mutation testing
Gif by h1flosse
https://openclipart.org/detail/190026/mutant
Imagine your code is a small town, where
crimes happen from time to time…
Photo by Jesus in Taiwan
https://unsplash.com/photos/c6aunWXHZZ0
clipart by kolbasun
https://openclipart.org/detail/219619/ninja-cop
Let’s simulate crimes and see
if the cops can catch them!
City -> Program
Crime -> Bugs in code
Police -> Unit testing
Fake crime -> Mutation Testing
// original
public int play(int left, int right) {
    int ln = left;
    int rn = right;
    if (ln > 21)
        ln = 0;
    if (rn > 21)
        rn = 0;
    if (ln > rn)
        return rn;
    else
        return ln;
}
// mutant: ">" replaced by "<" in the second check
public int play(int left, int right) {
    int ln = left;
    int rn = right;
    if (ln > 21)
        ln = 0;
    if (rn < 21)
        rn = 0;
    if (ln > rn)
        return rn;
    else
        return ln;
}
If your test still passes, this is no good!
Common mutants
• Replace arithmetic operator (+, -, *, /, …)
• Replace relational operators (>, >=, <, <=, ==, !=, …)
• Replace constants (a -> a+1)
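To make "killing a mutant" concrete, here is a small sketch that runs the original `play` method and the relational-operator mutant from the slides side by side. The class `MutationDemo` and the `kills` helper are hypothetical, and the sketch simplifies real mutation testing by treating the original's output as the value a test would assert on. A real tool generates and runs the mutants automatically.

```java
// Hypothetical demo class; play() and the mutant follow the slides.
final class MutationDemo {
    static int play(int left, int right) {
        int ln = left, rn = right;
        if (ln > 21) ln = 0;
        if (rn > 21) rn = 0;
        if (ln > rn) return rn; else return ln;
    }

    // Mutant: the relational operator in the second check is replaced (> becomes <).
    static int playMutant(int left, int right) {
        int ln = left, rn = right;
        if (ln > 21) ln = 0;
        if (rn < 21) rn = 0;
        if (ln > rn) return rn; else return ln;
    }

    // An input kills the mutant if the mutant's output differs from the original's.
    static boolean kills(int left, int right) {
        return play(left, right) != playMutant(left, right);
    }

    public static void main(String[] args) {
        System.out.println(kills(30, 30)); // false: the mutant survives this input
        System.out.println(kills(10, 9));  // true: the mutant is killed
    }
}
```

A suite containing only the input (30, 30) would let this mutant survive, which is the signal that the suite needs a stronger test.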
As a research field
• Since the 70s
• Benefits:
• Better fault exposing capability
• A good alternative to real faults
• Limitations:
• High computational cost
• Undecidable Equivalent Mutant Problem
• Mutants for other problems
• SQL
In order to alleviate the computational issues, we
present a diff-based probabilistic approach to
mutation analysis that drastically reduces the number
of mutants by omitting lines of code without
statement coverage and lines that are determined to
be uninteresting
Mutations:
http://pitest.org/quickstart/mutators/
Is (preventive)
testing enough?
Maybe not…
clipart by dani ela
https://openclipart.org/detail/229476/14-flowers
Context:
Payments
Payment
Provider
DEV OPS
Logs are our current bridge!
One Billion Log Lines a Day:
Monitoring using the ELK Stack
• Logstash: Unify different logging sources
• Elasticsearch: Search and filter large log data
• Kibana: Visual interactive dashboard
Image credit: www.neteye-blog.com
Poll: Java Exceptions in a Payment System
Your payment system in production generates 1 billion log lines per day.
How many errors / warnings with exceptions do you expect to see?
A. None. “We have a zero exception policy.”
B. 1 Thousand. “Some exceptions are unavoidable.”
C. 1 Million. “Most exceptions are harmless.”
D. 1 Billion. “We only log errors and exceptions.”
Adyen, Nov 2016:
~1,000,000 per
day
Complex API Integration
• Payment APIs are complex
• Integration faults are easily made
• Merchant needs assistance with API
usage
• Merchant may not notice mistakes
• 2.5M http error responses per month
• What can we learn from them?
11 Common Causes for API Error Responses
Integrators are definitely the main party responsible for API integration problems!
Understand your errors
Payment
Terminals
Payment
Provider
Point of sale terminal variability
• Card brands
• Card entry modes
(chip, swipe, contactless)
• Currency conversion
• Loyalty points
• Validation type (pin, signature)
• Issuer responses
(declined, insufficient balance)
• Cancellations
(shopper, merchant)
Passive learning
Identifying system behavior from observations,
and representing it in the smallest possible model.
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
Rick Wieman, Maurício Aniche, Willem Lobbezoo, Sicco Verwer and Arie van Deursen.
An Experience Report on Applying Passive Learning in a Large-Scale Payment Company. ICSME Industry Track, 2017
https://automatonlearning.net/
DFASAT / FlexFringe
Heule & Verwer, ICGI 2010
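As an illustration of turning observed log lines into a model, the sketch below builds a prefix tree from event sequences. It is deliberately much simpler than DFASAT / FlexFringe and does not reproduce them: state-merging learners typically start from such a tree and then merge compatible states to obtain a small automaton, and that merging step is omitted here. The class `PrefixTree`, its methods, and the "Declined" trace are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a prefix tree over log events, the usual starting point for state merging.
final class PrefixTree {
    // One state of the tree: outgoing transitions labelled with log events.
    static final class Node {
        final Map<String, Node> next = new LinkedHashMap<>();
        int visits = 0; // how many traces passed through this state
    }

    final Node root = new Node();
    private int states = 1;

    // Adds one trace, i.e. the sequence of events of one transaction.
    void add(List<String> trace) {
        Node current = root;
        current.visits++;
        for (String event : trace) {
            Node child = current.next.get(event);
            if (child == null) {
                child = new Node();
                current.next.put(event, child);
                states++;
            }
            child.visits++;
            current = child;
        }
    }

    int stateCount() {
        return states;
    }

    public static void main(String[] args) {
        PrefixTree tree = new PrefixTree();
        tree.add(Arrays.asList("Starting EMV", "EMV started", "Authorizing online", "Final status: Approved"));
        // Made-up second trace that shares a prefix with the first one.
        tree.add(Arrays.asList("Starting EMV", "EMV started", "Authorizing online", "Final status: Declined"));
        System.out.println(tree.stateCount()); // 6: the shared prefix is stored once
    }
}
```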
Use Inferred Models to Analyze:
Bugs in Test Phase
• Terminal asked for PIN
• AND asked for signature
• Domain expert noted this unwanted
behavior in inferred model.
• Fixed before it went into production
Use Inferred Models to Analyze:
Differences Between Card Brands
Twice as many chip errors
Informed
merchant
about issue.
Use Inferred Models to Analyze:
Time out problems
Timeout
Improved
performance under
network instability
by adding targeted
retry mechanism
Can the machine
generate tests for us?
Automated test
generation!
clipart by bingenberg
https://openclipart.org/detail/229476/14-flowers
[Figure: control-flow graph of the code under test, nodes 1 to 10; candidate input (1,2,3)]
@Test
public void test() {
    // Constructor (init)
    Triangle t = new Triangle(1, 2, 3);
    // Method calls
    t.computeTriangleType();
    // Assertions (check)
    String typ = t.getType();
    assertTrue(typ.equals("SCALENE"));
}
Random testing
1. Pick one of the available constructors (with
random input)
2. Pick one or more public methods (with
random input)
3. Generate the assertions by checking the
final state of the object using get methods
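The three steps can be sketched as a tiny generator that emits the text of a test. Everything here is an assumption for illustration: the slides only show how `Triangle` is called, so the `Triangle` implementation, the class `RandomTestGenerator`, and the input range 1 to 10 are hypothetical.

```java
import java.util.Random;

// Hypothetical random test generator following the three steps above.
final class RandomTestGenerator {
    // Stand-in for the class under test; its logic is assumed, not taken from the slides.
    static final class Triangle {
        private final int a, b, c;
        private String type;

        Triangle(int a, int b, int c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        void computeTriangleType() {
            if (a == b && b == c) type = "EQUILATERAL";
            else if (a == b || b == c || a == c) type = "ISOSCELES";
            else type = "SCALENE";
        }

        String getType() {
            return type;
        }
    }

    // Step 1: random constructor input. Step 2: method call.
    // Step 3: an assertion derived from the observed final state.
    static String generate(Random random) {
        int a = 1 + random.nextInt(10);
        int b = 1 + random.nextInt(10);
        int c = 1 + random.nextInt(10);
        Triangle t = new Triangle(a, b, c);
        t.computeTriangleType();
        String observed = t.getType();
        return "Triangle t = new Triangle(" + a + ", " + b + ", " + c + ");\n"
             + "t.computeTriangleType();\n"
             + "assertTrue(t.getType().equals(\"" + observed + "\"));\n";
    }

    public static void main(String[] args) {
        System.out.println(generate(new Random(42)));
    }
}
```

Because the assertion is copied from what the code currently does, a generated test records existing behavior; whether that behavior is correct still has to be judged by a human or a specification.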
clipart by 10binary
https://openclipart.org/detail/175047/february-11-2013
Fuzzing tests in practice
Genetic Algorithm
[Figure: flowchart with the steps Initialization, Fitness Calculations, Terminate? (Yes / No), Selection, Crossover, Mutation, Elitism]
[Figure: control-flow graph with nodes 1 to 10]
(2,2,3) -> <1,2,4>   = 2 + [1/(1+1)] = 2.5
(2,3,3) -> <1,5,7,8> = 0 + [1/(1+1)] = 0.5 <-- better!
Fitness = Approach + Distance
Approach = # of control nodes between the execution and the target.
Distance = The normalized distance for the control node that diverged to “not diverge”: n/(n+1)
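The fitness formula is small enough to write out. A minimal sketch, assuming hypothetical names (`BranchFitness`, `normalize`, `fitness`) and taking the approach level and the raw distance as given inputs; how a tool measures them during execution is not shown.

```java
// Hypothetical helper for the fitness formula: approach + normalized distance.
final class BranchFitness {
    // Normalizes a raw distance n into the range [0, 1) using n / (n + 1).
    static double normalize(double n) {
        return n / (n + 1.0);
    }

    // Lower values are better: the execution is closer to the target.
    static double fitness(int approach, double distance) {
        return approach + normalize(distance);
    }

    public static void main(String[] args) {
        System.out.println(fitness(2, 1)); // 2.5, the (2,2,3) example
        System.out.println(fitness(0, 1)); // 0.5, the (2,3,3) example
    }
}
```

Normalizing keeps the distance term below 1, so it can never outweigh a difference of one whole approach level.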
Fraser, Gordon, and Andrea Arcuri. "Evosuite: automatic test suite generation for object-oriented software." Proceedings of
the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. ACM, 2011.
Testing SQL
Query
SELECT Name
FROM Product
WHERE Price > 20
Test Database
Table: Product
Name  Price
-     19
-     20
-     21
Coverage Criterion
1. False: Price = 19
2. Boundary: Price = 20
3. True: Price = 21
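The three rows follow a simple pattern for an integer comparison. The sketch below is only an illustration of that pattern with hypothetical names (`PriceBoundary`, `selected`, `rowsFor`); it does not use a database and is not how EvoSQL or SQLFpc work internally.

```java
// Hypothetical illustration of the False / Boundary / True rows for "Price > 20".
final class PriceBoundary {
    // The WHERE clause of the query: Price > 20.
    static boolean selected(int price) {
        return price > 20;
    }

    // For an integer predicate "column > k": one value below, one on, one above the boundary.
    static int[] rowsFor(int k) {
        return new int[] {k - 1, k, k + 1};
    }

    public static void main(String[] args) {
        for (int price : rowsFor(20)) {
            System.out.println("Price = " + price + " -> selected: " + selected(price));
        }
    }
}
```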
Testing SQL
Query
SELECT *
FROM `account`
LEFT JOIN `user` AS `assignedUser` ON account.assigned_user_id = assigneduser.id
LEFT JOIN `user` AS `modifiedBy` ON account.modified_by_id = modifiedby.id
LEFT JOIN `user` AS `createdBy` ON account.created_by_id = createdby.id
LEFT JOIN `entity_email_address` AS `emailAddressesMiddle`
ON account.id = emailaddressesmiddle.entity_id
AND emailaddressesmiddle.deleted = '0'
AND emailaddressesmiddle.primary = '1'
AND emailaddressesmiddle.entity_type = 'Account'
LEFT JOIN `email_address` AS `emailAddresses`
ON emailaddresses.id = emailaddressesmiddle.email_address_id
AND emailaddresses.deleted = '0'
LEFT JOIN `entity_phone_number` AS `phoneNumbersMiddle`
ON account.id = phonenumbersmiddle.entity_id
AND phonenumbersmiddle.deleted = '0'
AND phonenumbersmiddle.primary = '1'
AND phonenumbersmiddle.entity_type = 'Account'
LEFT JOIN `phone_number` AS `phoneNumbers`
ON phonenumbers.id = phonenumbersmiddle.phone_number_id
AND phonenumbers.deleted = '0'
WHERE (( account.name LIKE 'Besha%'
OR account.id IN (SELECT entity_id
FROM entity_email_address
JOIN email_address
ON email_address.id =
entity_email_address.email_address_id
WHERE entity_email_address.deleted = 0
AND entity_email_address.entity_type =
'Account'
AND email_address.deleted = 0
AND email_address.name LIKE 'Besha%') ))
AND account.deleted = '0'
x 42 Coverage Rules
EvoSQL
[Figure: EvoSQL pipeline with the elements Query, Database Schema, SQLFpc, Coverage Rules, EvoSQL, Test Data]
Jeroen Castelein, Maurício Aniche, Mozhan Soltani, Annibale Panichella, Arie van Deursen
Search-Based Test Data Generation for SQL Queries. ICSE 2018.
Study Context
2,135 queries / 4 systems:
• Alura, e-learning platform
• EspoCRM, open source software for customer relations
• SuiteCRM, open source software for customer relations
• ERPNext, open source resource planning software for enterprises.
EvoSQL Evaluation Outcomes
• 100% of targets covered for 98% of the queries
• On average 86% covered for the remaining 2%
• Usually within seconds
• Outperforms biased and random alternatives:
• Biased random can handle 90% of simple queries (< 10 rules)
• Biased random often finds no solution for complex queries (10+ rules)
Property-Based Testing
clipart by GDJ
https://openclipart.org/detail/232264/colorful-fleur-de-lis-fractal-3
Alan Turing on Assertions
Assertions Defined
An assertion is a Boolean expression
at a specific point in a program
which will be true
unless there is a bug in the program.
http://wiki.c2.com/?WhatAreAssertions
Assertions in the
program: They hold
for any execution
of that point.
Unlike a test-code
assertion, which
holds for one
execution only.
The Java (C, C++, …) assert Statement
If boolean-expression is true, do nothing.
If it is false, throw an AssertionError,
with the string as message
“assert” boolean-expression [“:” string ]
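A short example of the statement in use. This is a sketch with a hypothetical class `AssertDemo` and method `isqrt`; the first `assert` states what must hold on entry and the second what must hold before returning. In Java, assertions are disabled by default and are only evaluated when the JVM is started with `-ea`.

```java
// Hypothetical example of the assert statement with a message.
final class AssertDemo {
    // Integer square root by linear search.
    static int isqrt(int n) {
        assert n >= 0 : "n must be non-negative, got " + n;
        int r = 0;
        while ((long) (r + 1) * (r + 1) <= n) {
            r++;
        }
        // r is the largest integer whose square does not exceed n.
        assert (long) r * r <= n && (long) (r + 1) * (r + 1) > n : "wrong root " + r;
        return r;
    }

    public static void main(String[] args) {
        System.out.println(isqrt(17)); // 4
    }
}
```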
LLVM Assertion Examples (BitcodeReader.cpp)
assert(BlockAddrFwdRefs.empty() && "Unresolved blockaddress fwd references");
assert(Ty == V->getType() && "Type mismatch in constant table!");
assert((Ty == 0 || Ty == V->getType()) && "Type mismatch in value table!");
assert(It != ResolveConstants.end() && It->first == *I);
assert(isa<ConstantExpr>(UserC) && "Must be a ConstantExpr.");
assert(V->getType()->isMetadataTy() && "Type mismatch in value table!");
assert((!Alignment || isPowerOf2_32(Alignment)) && "Alignment must be a power of two.");
assert((Record[i] == 3 || Record[i] == 4) && "Invalid attribute group entry");
assert(Record[i] == 0 && "Kind string not null terminated");
assert(Record[i] == 0 && "Value string not null terminated");
assert(ResultTy && "Didn't read a type?");
assert(TypeList[NumRecords] == 0 && "Already read type?");
assert(NextBitCode == bitc::METADATA_NAMED_NODE); (void)NextBitCode;
assert((CT != LandingPadInst::Catch || !isa<ArrayType>(Val->getType())) &&
"Catch clause has a invalid type!");
assert((CT != LandingPadInst::Filter || isa<ArrayType>(Val->getType())) &&
"Filter clause has invalid type!");
assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");
assert(DeferredFunctionInfo.count(F) && "No info to read function later?");
assert(M == TheModule && "Can only Materialize the Module this BitcodeReader is attached to.");
https://blog.regehr.org/archives/1091
Thinking in Assertions
• Method preconditions:
• Propositions that must hold before calling the method
• Method postconditions
• Propositions that are guaranteed to hold after the method has finished
• Structural invariants
• Properties over the state of an object throughout the object’s lifetime
• Helps to improve / reason about design
• Can be turned into assertions that can be checked at run time
• Supports the testing process
Formal Specifications via Hoare Triples
• Any execution of A,
• starting in a state where P holds
• will terminate in a state where Q holds
{ P } A { Q }
{ preconditions } Method { postconditions }
Precondition Design
• The “strength” of your preconditions is a design choice.
• The weaker your precondition
• The more situations your method needs to handle
• The less thinking the client needs to do (easier to use)
• However, with weak preconditions:
• The server will always do the checking
• This may be redundant:
checks also done if we’re sure they’ll pass.
Examples: File has been created; Player has been moved;
Points have been added; Resulting tile is never null;
If client invokes a (server) method and meets its preconditions,
the server guarantees the postcondition will hold.
clipart by floEdelmann
https://openclipart.org/detail/260432/beach-chair
If you (as client) invoke a (server) method
without meeting its preconditions, anything can happen.
E.g.: Null pointer
exception
clipart by tzunghaor
https://openclipart.org/detail/166696/nuclear-explosion
Design By Contract
• Contract metaphor:
• Contract: an explicit statement of the rights and obligations
between a client and a server
• Server perspective:
• If you call me and meet my precondition, I ensure that after returning
I deliver a state in which my postcondition holds
• If not, you’re on your own.
Bertrand Meyer, Applying "Design by Contract",
IEEE Computer 25, 10, October 1992, pages 40-51
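A contract can be written down with plain assertions. The sketch below is one possible reading, not Meyer's notation: the class `Score`, its methods, and the chosen conditions are hypothetical, loosely based on the "Points have been added" example. The precondition is the client's obligation, the postcondition and invariant are the server's. The checks only run when assertions are enabled (`java -ea`).

```java
// Hypothetical class with a precondition, a postcondition and a structural invariant.
final class Score {
    private int points = 0;

    // Structural invariant: points never become negative.
    private boolean invariant() {
        return points >= 0;
    }

    // Precondition: amount > 0.
    // Postcondition: points have been added, i.e. points grew by exactly amount.
    void add(int amount) {
        assert amount > 0 : "precondition violated: amount must be positive";
        int before = points;
        points += amount;
        assert points == before + amount : "postcondition violated";
        assert invariant() : "invariant violated";
    }

    int points() {
        return points;
    }

    public static void main(String[] args) {
        Score score = new Score();
        score.add(5);
        score.add(3);
        System.out.println(score.points()); // 8
    }
}
```

If a client calls `add(-1)` with assertions disabled, nothing stops it and the object may end up in a state the invariant forbids, which is the "you're on your own" side of the contract.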
Bertrand Meyer’s
Seven Principles of Software Testing
1. To test a program is to try to make it fail.
2. Tests are no substitute for specifications
3. Any failed execution must yield a test case
4. Determining success or failure of tests must be an automatic
process (4.b: via contracts)
Bertrand Meyer, IEEE Software, 2008. Required Reading!
Seven Principles of Software Testing
5. An effective testing process must include both manually and
automatically produced test cases.
6. Test strategies must be empirically validated
7. A testing strategy’s most important property is the number of faults
it uncovers as a function of time.
Assertions Pro / Con
Great
• Support better testing
• Make debugging easier
(less distance)
• Executable comments
• “Gateway drug to formal
methods”
Less than Great
• Slow down code
• Make programs incorrect when
used improperly
• Might trick some of us lazy
programmers into using them to
implement error handling
• Are commonly misunderstood
http://blog.regehr.org/archives/1091
Required reading
Property-Based Testing
• Think of ”properties” (assertions) for functions
• Let a “generator” produce a series of random input values for the function
• For each random input check the assertions.
Property: length of concatenated strings
equals sum of length of individual strings
Quickcheck:
will generate 100 random strings
to check this property.
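The string-length property can be checked without any library. This is a hand-rolled sketch, not QuickCheck: the class `ConcatProperty`, the generator (printable ASCII strings of length 0 to 20), and the fixed seed are assumptions, and features of real property-based testing tools such as shrinking of failing inputs are left out.

```java
import java.util.Random;

// Hypothetical hand-rolled property check for string concatenation.
final class ConcatProperty {
    // Generator: a random string of length 0..20 over printable ASCII.
    static String randomString(Random random) {
        int length = random.nextInt(21);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (32 + random.nextInt(95)));
        }
        return sb.toString();
    }

    // Property: the length of the concatenation equals the sum of the lengths.
    static boolean holds(String a, String b) {
        return (a + b).length() == a.length() + b.length();
    }

    // Checks the property on `trials` random pairs and returns the number of failures.
    static int check(int trials, long seed) {
        Random random = new Random(seed);
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            if (!holds(randomString(random), randomString(random))) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(check(100, 42L)); // 0 failures expected
    }
}
```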
Can tools help us
find bugs
automatically?
Yes, even without running the code!
clipart by Machovka
https://openclipart.org/detail/2676/lady-bug
Examples of bugs
• Equals checks for incompatible operand
• HE: Class defines equals() but not hashCode()
• RpC: Repeated conditional tests
• FL: Method performs math using floating point precision
• RANGE: Array offset is out of bounds (RANGE_ARRAY_OFFSET)
• Etc etc…
• Full list:
https://spotbugs.readthedocs.io/en/latest/bugDescriptions.html#
Linters are prevalent
• OSS systems have been intensively using linters.
• Tools are highly flexible, and developers have different strategies to
configure them.
• Challenge: false positives.
• You should develop your own!!
• Bugs specific to your context, e.g., config files.
Beller, Moritz, et al. "Analyzing the state of static analysis: A large-scale evaluation in open source software." Software Analysis, Evolution, and Reengineering (SANER),
2016 IEEE 23rd International Conference on. Vol. 1. IEEE, 2016.
Tómasdóttir, K. F., Aniche, M., & Deursen, A. V. (2017, October). Why and how JavaScript developers use linters. In Proceedings of the 32nd IEEE/ACM International Conference on
Automated Software Engineering (pp. 578-589). IEEE Press.
Why Developers Use Linters
Importance of the different rules
1. Stylistic Issues
2. Best Practices
3. Variables
4. Possible Errors
5. Node.js & CommonJS
6. ECMAScript 6
7. Strict Mode
1. Possible Errors 92.5%
2. Best Practices 89%
3. ECMAScript 6 86.7%
4. Variables 86.4%
5. Stylistic Issues 78.2%
6. Node.js & CommonJS 62.6%
7. Strict Mode 57.8%
Code review in test files!
Test files are almost 2 times less likely to be discussed
during code review when reviewed together with
production files!!
Davide Spadini, Maurício Aniche, Magiel Bruntink, Margaret-Anne Storey, Alberto Bacchelli. When Testing Meets Code
Review: Why and How Developers Review Tests. ICSE 2018.
Code review in test files!
Little on
finding more
bugs!
Davide Spadini, Maurício Aniche, Magiel Bruntink, Margaret-Anne Storey, Alberto Bacchelli. When Testing Meets Code
Review: Why and How Developers Review Tests. ICSE 2018.
[Figure: bar chart (x-axis 0% to 30%) with the categories Code improvement, Understanding, Social communication, Defect, Knowledge transfer, Misc]
Learning software
testing is
challenging!
clipart by frankes
https://openclipart.org/detail/190242/comic-girl-tini-at-school
Common mistakes
• Test coverage (20.87%)
• Maintainability of test code (20.42%)
• Understanding test concepts (15.35%)
• Boundary testing (12.95%)
• State-based testing (12.39%)
• Assertions (8.93%)
• Mock Objects (5.87%)
• Tools (4.21%)
Difficult topics
Maurício Aniche, Felienne Hermans, Arie van Deursen. An Exploratory Study on Challenges in Software Testing
Education. TU Delft. In submission.
[Figure: diverging stacked bar chart of survey responses per topic, axis 100 to 0 to 100. Topics, with question number and number of respondents: Minimum set of tests Q18 (80); Avoid flaky tests Q17 (81); Exploratory Testing Q16 (80); Defensive programming Q15 (81); How much to test Q14 (80); Acceptance tests Q13 (81); Design by contracts Q12 (81); TDD Q11 (81); Testability Q10 (81); Best practices Q9 (81); State-based testing Q8 (81); Apply MC/DC Q7 (83); Structural testing Q6 (82); Boundary Testing Q5 (84); Mock Objects Q4 (84); Choose the test level Q3 (84); Arrange-Act-Assert Q2 (81); JUnit tests Q1 (83)]
How to Learn?
Maurício Aniche, Felienne Hermans, Arie van Deursen. An Exploratory Study on Challenges in Software Testing
Education. TU Delft. In submission.
[Figure: diverging stacked bar chart of survey responses per learning activity, axis 100 to 0 to 100. Activities, with question number and number of respondents: Midterm exam Q11 (81); AMA sessions Q10 (82); Related papers Q9 (79); Support from TAs Q8 (82); Labwork Q7 (83); ISTQB book Q6 (81); PragProg book Q5 (80); Interaction Q4 (83); Live coding Q3 (83); Guest lectures Q2 (83); Lectures Q1 (83)]
People do not like books and papers…
The majority of projects and users [from 416
participants and 1,337,872 intervals] do not
practice testing actively.
We should change it.
Moritz Beller, Georgios Gousios, Annibale Panichella, Andy Zaidman. When, How, and Why Developers (Do Not) Test in Their IDEs. FSE 2015.
clipart by laobc
https://openclipart.org/detail/65257/sad-baby
Topics of today
• Structural testing and MC/DC
• Log monitoring and passive learning
• Search-based software testing
• Mutation testing
• Fuzzing
• Property-based testing
• Code review
• Static analysis tools
Maurício Aniche
m.f.aniche@tudelft.nl
@mauricioaniche
http://www.mauricioaniche.com/talks/2018/tad

Test Automation Day 2018

  • 1. Besides the obvious tools: improving your testing with state-of-the-art techniques Maurício Aniche m.f.aniche@tudelft.nl @mauricioaniche Photo by Sora Sagano https://unsplash.com/photos/WA-QRL5wDMw
  • 2. Content and License • This presentation can be found at: http://www.mauricioaniche.com/talks/2018/tad • You can use it and modify it. • You always have to give credits to the original author. • You agree not to sell it or make profit in any way with this.
  • 4. Jeroen Castelein, Mozhan Soltani, Annibale Panichella, Joop Aué, Maikel Lobbezoo, Rick Wieman, Sicco Verwer, Felienne Hermans, Davide Spadini, Alberto Bacchelli, Arie van Deursen, Kristín Fjóla, Peter Evers, Qianqian Zhu
  • 5. • First job as a developer in 2004 • First important project in 2016 • First important bug: 2016 • Tests are important! A little story Photo by Michael Mims https://unsplash.com/photos/0ZL0O-eDOpU
  • 6. TEST ANALYSIS & TEST DESIGN clipart by j4p4n, adlerweb https://openclipart.org/detail/297959/standing-robot https://openclipart.org/detail/262444/bubble-person
  • 7. “Testing is different from writing tests. Developers write tests as a way to give them space to think and confidence for refactoring. Testing focuses on finding bugs. Both should be done.” https://medium.com/@mauricioaniche/testing-vs-writing-tests-d817bffea6bc
  • 8. The literature on test oracles has introduced techniques for oracle automation, including modelling, specifications, contract-driven development and metamorphic testing. When none of these is completely adequate, the final source of test oracle information remains the human, who may be aware of informal specifications, expectations, norms and domain specific information that provide informal oracle guidance.
  • 9. TEST ANALYSIS & TEST DESIGN Find systematic and automated ways to design and execute tests!
  • 10. Topics of today • Structural testing and MC/DC • Log monitoring and passive learning • Search-based software testing • Mutation testing • Fuzzing • Property-based testing • Code review • Static analysis tools
  • 11. Who are you? • Software developers? • Software testers? • What are your expectations here today? • Fill this out: https://bit.ly/tad2018 clipart by GDJ https://openclipart.org/detail/230150/crowd-of-kids
  • 13. Given the points of two different players, the program must return the number of points the one who wins has! public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn > 21) rn = 0; if(ln > rn) return rn; else return ln; }
  • 14. public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn > 21) rn = 0; if(ln > rn) return rn; else return ln; } First criterion: “going through all the lines” If our test suite exercises all the lines, we are happy.
  • 15. public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn > 21) rn = 0; if(ln > rn) return rn; else return ln; } First criterion: “going through all the lines” If our test suite exercises all the lines, we are happy. T1 = (30, 30)
  • 16. public int play(int left, int right) { 1 int ln = left; 2 int rn = right; 3 if(ln > 21) 4 ln = 0; 5 if(rn > 21) 6 rn = 0; 7 if(ln > rn) 8 return rn; 9 else 10 return ln; } First criterion: “going through all the lines” If our test suite exercises all the lines, we are happy. T1 = (30, 30) 9 / 10 = 90% line coverage
  • 17. public int play(int left, int right) { 1 int ln = left; 2 int rn = right; 3 if(ln > 21) 4 ln = 0; 5 if(rn > 21) 6 rn = 0; 7 if(ln > rn) 8 return rn; 9 else 10 return ln; } First criterion: “going through all the lines” If our test suite exercises all the lines, we are happy. T1 = (30, 30) T2 = (10,9) <-- left player wins Make it true
  • 18. public int play(int left, int right) { 1 int ln = left; 2 int rn = right; 3 if(ln > 21) 4 ln = 0; 5 if(rn > 21) 6 rn = 0; 7 if(ln > rn) 8 return rn; 9 else 10 return ln; } First criterion: “going through all the lines” If our test suite exercises all the lines, we are happy. T1 = (30, 30) T2 = (10,9) <-- left player wins 10 / 10 = 100% line coverage
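The two inputs from the slides can be run directly. The sketch below copies `play` as it appears in the transcript and only adds a `main` that executes T1 and T2; the class name `LineCoverageDemo` is an assumption made for illustration.

```java
// The play method as shown on the slides, exercised with T1 and T2.
final class LineCoverageDemo {

    static int play(int left, int right) {
        int ln = left;
        int rn = right;
        if (ln > 21)
            ln = 0;
        if (rn > 21)
            rn = 0;
        if (ln > rn)
            return rn;
        else
            return ln;
    }

    public static void main(String[] args) {
        // T1 runs every line except "return rn".
        System.out.println("T1 (30, 30) -> " + play(30, 30));
        // T2 makes "ln > rn" true, so "return rn" is executed as well.
        System.out.println("T2 (10, 9) -> " + play(10, 9));
    }
}
```

Note that T2 returns 9 although the left player has 10 points. If the requirement is read as "return the winner's points", executing every line is not enough: the suite only reveals the problem once it also asserts on the returned value.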
  • 19. 9/10 = 90%, 5/6 = 83%... From now on, I’ll write as many lines as I can!! X clipart by GDJ https://openclipart.org/detail/230143/female-engineer-9
  • 22. Given a sentence, you should count the number of words that end with either an “s” or an “r”. A word ends when a non- letter appears.
  • 23. int words = 0; char last = ' '; for (int i = 0; i < str.length(); i++) { if (!Character.isLetter(str.charAt(i)) && (last == 's' || last == 'r')) words++; last = str.charAt(i); } if (last == 's' || last == 'r') words++; return words; Control-flow graph (CFG), with true/false labels on the arrows. We should cover all the branches (arrows)
  • 24. int words = 0; char last = ' '; for (int i = 0; i < str.length(); i++) { if (!Character.isLetter(str.charAt(i)) && (last == 's' || last == 'r')) words++; last = str.charAt(i); } if (last == 's' || last == 'r') words++; return words; Test input: “cats|dogs”
  • 25. int words = 0; char last = ' '; for (int i = 0; i < str.length(); i++) { if (!Character.isLetter(str.charAt(i)) && (last == 's' || last == 'r')) words++; last = str.charAt(i); } if (last == 's' || last == 'r') words++; return words; Test input: “cats|dog”
  • 26. Branch coverage means we exercise all the branches!
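As a runnable version of the word-counting example, the sketch below wraps the code from the slides in a method and runs the two inputs shown there. The class and method names (`WordCounter`, `count`) and the explicit braces are assumptions, since the transcript only shows the flattened statements.

```java
// Counts words that end in 's' or 'r'; a word ends when a non-letter appears.
final class WordCounter {

    static int count(String str) {
        int words = 0;
        char last = ' ';
        for (int i = 0; i < str.length(); i++) {
            // A non-letter closes the current word; count it if it ended in 's' or 'r'.
            if (!Character.isLetter(str.charAt(i)) && (last == 's' || last == 'r')) {
                words++;
            }
            last = str.charAt(i);
        }
        // The last word has no trailing non-letter, so it is checked separately.
        if (last == 's' || last == 'r') {
            words++;
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(count("cats|dogs")); // prints 2
        System.out.println(count("cats|dog"));  // prints 1
    }
}
```

The first input takes the true side of the final `if`; the second input is needed to take its false side, which is why both appear on the slides.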
  • 27. I wonder if that’s enough…
  • 28. [CFG fragment, one node per condition, arrows labeled true/false: !Character.isLetter(str.charAt(i)); last == 's'; last == 'r'; words++; last = str.charAt(i);] If we “explode” the if into its several conditions, we have more paths to explore!
  • 29. [CFG of the whole method, one node per condition, arrows labeled true/false: int words = 0; char last = ' '; for(int i = 0; i<str.length(); i++); !Character.isLetter(str.charAt(i)); last == 's'; last == 'r'; words++; last = str.charAt(i); last == 's'; last == 'r'; words++; return words;]
  • 30. Ok, condition coverage seems to cover more than branch coverage!
  • 31. If we aim for condition coverage, are we testing all the paths?
  • 32. Path Coverage: (A && (B | C))
      Tests  a  b  c  Outcome
      1      T  T  T  T
      2      T  T  F  T
      3      T  F  T  T
      4      T  F  F  F
      5      F  T  T  F
      6      F  T  F  F
      7      F  F  T  F
      8      F  F  F  F
  • 33. Can we actually achieve 100% path coverage?
  • 34. • The subpaths through this control flow can include or exclude each of the statements Si, so that in total N branches result in 2^N paths that must be traversed • Choosing input data to force execution of one particular path may be very difficult, or even impossible if the conditions are not independent if (a) { S1; } if (b) { S2; } if (C) { S3; } ... if (x) { Sn; } The number of paths can still grow exponentially
  • 35. Can we test just the important combinations?
  • 37. (A && (B | C))
      Tests  a  b  c  Outcome
      1      T  T  T  T
      2      T  T  F  T
      3      T  F  T  T
      4      T  F  F  F
      5      F  T  T  F
      6      F  T  F  F
      7      F  F  T  F
      8      F  F  F  F
  • 38. (A && (B | C))
      Tests  a  b  c  Outcome
      1      T  T  T  T
      2      T  T  F  T
      3      T  F  T  T
      4      T  F  F  F
      5      F  T  T  F
      6      F  T  F  F
      7      F  F  T  F
      8      F  F  F  F
      A = {1, 5}, {2, 6}, {3, 7} B = {2, 4} C = {3, 4} Final = {2, 3, 4, 6} They are the same! We don’t need them all
  • 39. So, for N conditions, I always have only N+1 tests! That’s definitely better than 2^N!!
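The independence pairs listed on the slides for A && (B | C) can be recomputed mechanically. The sketch below is one possible way to do it, not a tool from the talk: it enumerates the eight truth-table rows in the same order as the slides and, for each condition, keeps the pairs of rows that differ only in that condition and produce different outcomes. The class name `McdcPairs` and the bit encoding of the rows are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class McdcPairs {

    // Decision under test, taken from the slides.
    static boolean decision(boolean a, boolean b, boolean c) {
        return a && (b || c);
    }

    // Row r (1..8) of the truth table: row 1 = T T T, row 8 = F F F.
    // bit 4 selects A, bit 2 selects B, bit 1 selects C.
    static boolean valueOf(int row, int bit) {
        return ((row - 1) & bit) == 0;
    }

    static boolean outcome(int row) {
        return decision(valueOf(row, 4), valueOf(row, 2), valueOf(row, 1));
    }

    /** Pairs of rows that differ only in the selected condition and flip the outcome. */
    static List<int[]> independencePairs(int bit) {
        List<int[]> pairs = new ArrayList<>();
        for (int r = 1; r <= 8; r++) {
            for (int s = r + 1; s <= 8; s++) {
                boolean onlyThisConditionDiffers = ((r - 1) ^ (s - 1)) == bit;
                if (onlyThisConditionDiffers && outcome(r) != outcome(s)) {
                    pairs.add(new int[] {r, s});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C"};
        int[] bits = {4, 2, 1};
        for (int i = 0; i < names.length; i++) {
            StringBuilder line = new StringBuilder(names[i] + " =");
            for (int[] pair : independencePairs(bits[i])) {
                line.append(' ').append(Arrays.toString(pair));
            }
            System.out.println(line);
        }
    }
}
```

Choosing one pair per condition so that the pairs overlap as much as possible, here {2, 6}, {2, 4} and {3, 4}, yields the four tests {2, 3, 4, 6} from the slides, i.e. N + 1 tests for N = 3 conditions.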
  • 40. McCabe’s Cyclomatic Complexity • C = |E| - |N| + 2 • C = # decision points + 1 • C = # of decision-statements + 1 C > 10: method too complex [McCabe, 1976] [ C correlated with #lines of code ] [CFG figure with nodes 1 to 7]
  • 41. McCabe for Testing? No empirical evidence that it is better than just decision coverage. How many tests? • Branch: 2 tests • All paths: 4 tests • McCabe: 3 tests [CFG figure with nodes 1 to 7] McCabe: Easy to count, limited usefulness as coverage metric
  • 42. Strategy Subsumption MC/DC Branch + Condition Coverage Branch Coverage Statement Coverage • Strategy X subsumes strategy Y if all elements that Y exercises are also exercised by X • No conclusive results on relative bug-finding effectiveness have been established. Path coverage
  • 43. What do YOU think: Do we need 100% code coverage?
  • 44. Don’t worry about coverage, just write some good tests. I am ready to write some unit tests. What code coverage should I aim for? Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677 clipart by 10_boss, bibbleycheese https://openclipart.org/detail/202573/my-yoda https://openclipart.org/detail/248493/pretzel-ninja
  • 45. How many grains of rice should I put in that [boiling water] pot? I am ready to write some unit tests. What code coverage should I aim for? Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677 It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on. Exactly!
  • 46. 80% and no less! I am ready to write some unit tests. What code coverage should I aim for? Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
  • 47. The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later. Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
  • 48. The second programmer, on the other hand, is quite experienced both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple, answer, and she’s smart enough to handle the truth and work with that. Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
  • 49. The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway. Testivus on Code Coverage. Alberto Savoia. https://www.artima.com/weblogs/viewpost.jsp?thread=204677
  • 51. Mutation testing Gif by h1flosse https://openclipart.org/detail/190026/mutant
  • 52. Imagine your code is a small town, where crimes happen from time to time… Photo by Jesus in Taiwan https://unsplash.com/photos/c6aunWXHZZ0
  • 53. Imagine your code is a small town, where crimes happen from time to time… clipart by kolbasun https://openclipart.org/detail/219619/ninja-cop Let’s simulate crimes and see if the cops can get it!
  • 54. City -> Program Crime -> Bugs in code Police -> Unit testing Fake crime -> Mutation Testing
  • 55. public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn > 21) rn = 0; if(ln > rn) return rn; else return ln; } public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn < 21) rn = 0; if(ln > rn) return rn; else return ln; }
  • 56. public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn > 21) rn = 0; if(ln > rn) return rn; else return ln; } public int play(int left, int right) { int ln = left; int rn = right; if(ln > 21) ln = 0; if(rn < 21) rn = 0; if(ln > rn) return rn; else return ln; } If your test still passes, this is no good!
  • 57. Common mutants • Replace arithmetic operator (+, -, *, /, …) • Replace relational operators (>, >=, <, <=, ==, !=, …) • Replace constants (a -> a+1)
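To make the idea of killing a mutant concrete, the sketch below puts the original `play` method and the mutant from the slides (`rn > 21` changed to `rn < 21`) side by side. The helper `kills` is an assumption for illustration: it treats an input as killing the mutant when the two versions return different values, which is what a test asserting on the exact return value would observe.

```java
// Original method and the relational-operator mutant shown on the slides.
final class MutantDemo {

    static int original(int left, int right) {
        int ln = left;
        int rn = right;
        if (ln > 21) ln = 0;
        if (rn > 21) rn = 0;
        if (ln > rn) return rn;
        else return ln;
    }

    static int mutant(int left, int right) {
        int ln = left;
        int rn = right;
        if (ln > 21) ln = 0;
        if (rn < 21) rn = 0; // mutated: '>' replaced by '<'
        if (ln > rn) return rn;
        else return ln;
    }

    /** True if a test asserting the original's return value would fail on the mutant. */
    static boolean kills(int left, int right) {
        return original(left, right) != mutant(left, right);
    }

    public static void main(String[] args) {
        System.out.println("T1 (30, 30) kills mutant: " + kills(30, 30)); // false
        System.out.println("T2 (10, 9) kills mutant: " + kills(10, 9));   // true
    }
}
```

T1 alone lets this mutant survive, because both versions return 0 for (30, 30); T2 kills it, because the original returns 9 and the mutant returns 0.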
  • 58. As a research field • Since the 70s • Benefits: • Better fault exposing capability • A good alternative to real faults • Limitations: • High computational power • Undecidable Equivalent Mutant Problem •Mutants for other problems • SQL
  • 59. In order to alleviate the computational issues, we present a diff-based probabilistic approach to mutation analysis that drastically reduces the number of mutants by omitting lines of code without statement coverage and lines that are determined to be uninteresting
  • 61. Is (preventive) testing enough? Maybe not… clipart by dani ela https://openclipart.org/detail/229476/14-flowers
  • 63. DEV OPS Logs are our current bridge!
  • 64. One Billion Log Lines a Day: Monitoring using the ELK Stack • Logstash: Unify different logging sources • Elastic Search: Search and filter large log data • Kibana: Visual interactive dashboard Image credit: www.neteye-blog.com
  • 65. Poll: Java Exceptions in a Payment System Your payment system in production generates 1 billion log lines per day. How many errors / warnings with exceptions do you expect to see? A. None. “We have a zero exception policy.” B. 1 Thousand. “Some exceptions are unavoidable.” C. 1 Million. “Most exceptions are harmless.” D. 1 Billion. “We only log errors and exceptions.” Adyen, Nov 2016: ~1,000,000 per day
  • 66. Complex API Integration • Payment APIs are complex • Integration faults are easily made • Merchant needs assistance with API usage • Merchant may not notice mistakes • 2.5M http error responses per month • What can we learn from them?
  • 67. 11 Common Causes for API Error Responses Integrators are definitely mainly responsible for API integration problems!
  • 68. 11 Common Causes for API Error Responses Integrators are definitely mainly responsible for API integration problems! Understand your errors
  • 70. Point of sale terminal variability • Card brands • Card entry modes (chip, swipe, contactless) • Currency conversion • Loyalty points • Validation type (pin, signature) • Issuer responses (declined, insufficient balance) • Cancellations (shopper, merchant)
  • 71. Passive learning Identifying system behavior from observations, and representing it in the smallest possible model.
      20170101160001 Adyen version: ******
      20170101160002 Starting TX/amt=10001/currency=978
      20170101160003 Starting EMV
      20170101160004 EMV started
      20170101160005 Magswipe opened
      20170101160006 CTLS started
      20170101160007 Transaction initialised
      20170101160008 Run TX as EMV transaction
      20170101160009 Application selected app:******
      20170101160010 read_application_data succeeded
      20170101160011 data_authentication succeeded
      20170101160012 validate 0
      20170101160013 DCC rejected
      20170101160014 terminal_risk_management succeeded
      20170101160015 verify_card_holder succeeded
      20170101160016 generate_first_ac succeeded
      20170101160017 Authorizing online
      20170101160018 Data returned by the host succeeded
      20170101160019 Transaction authorized by card
      20170101160020 Approved receipt printed
      20170101160021 pos_result_code:APPROVED
      20170101160022 Final status: Approved
      Rick Wieman, Maurício Aniche, Willem Lobbezoo, Sicco Verwer and Arie van Deursen. An Experience Report on Applying Passive Learning in a Large-Scale Payment Company. ICSME Industry Track, 2017. https://automatonlearning.net/ DFASAT / FlexFringe. Heule & Verwer, ICGI 2010.
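A rough idea of how a model can be built from observed logs: state-merging learners commonly start from a prefix tree of the observed event sequences and then merge states to shrink it. The sketch below shows only that first step and is not the DFASAT / FlexFringe algorithm. The `PrefixTree` class is an assumption, the event names are shortened from the log above, and the "Final status: Declined" trace is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Prefix tree of observed event sequences: one state per distinct prefix.
final class PrefixTree {
    private final Map<String, PrefixTree> children = new LinkedHashMap<>();

    /** Adds one observed trace, reusing states for prefixes seen before. */
    void add(List<String> trace) {
        PrefixTree node = this;
        for (String event : trace) {
            node = node.children.computeIfAbsent(event, e -> new PrefixTree());
        }
    }

    /** True if the given sequence is a prefix of at least one observed trace. */
    boolean hasPath(List<String> trace) {
        PrefixTree node = this;
        for (String event : trace) {
            node = node.children.get(event);
            if (node == null) {
                return false;
            }
        }
        return true;
    }

    /** Number of states in the tree, including the root. */
    int states() {
        int total = 1;
        for (PrefixTree child : children.values()) {
            total += child.states();
        }
        return total;
    }

    public static void main(String[] args) {
        PrefixTree model = new PrefixTree();
        model.add(Arrays.asList("Starting EMV", "EMV started", "Authorizing online",
                "Final status: Approved"));
        // Hypothetical second trace that diverges only at the last event.
        model.add(Arrays.asList("Starting EMV", "EMV started", "Authorizing online",
                "Final status: Declined"));
        System.out.println(model.states()); // prints 6: root, 3 shared states, 2 final states
    }
}
```

A real learner would go on to merge equivalent states and attach frequencies, so that unexpected paths stand out to a domain expert.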
  • 72. Use Inferred Models to Analyze: Bugs in Test Phase • Terminal asked for PIN • AND asked for signature • Domain expert noted this unwanted behavior in inferred model. • Fixed before it went into production
  • 73. Use Inferred Models to Analyze: Differences Between Card Brands Twice as many chip errors Informed merchant about issue.
  • 74. Use Inferred Models to Analyze: Time out problems Timeout Improved performance under network instability by adding targeted retry mechanism
  • 76. Can the machine generate tests for us? Automated test generation! clipart by bingenberg https://openclipart.org/detail/229476/14-flowers
  • 77. [CFG figure with nodes 1 to 10]
  • 78. [CFG figure with nodes 1 to 10] (1,2,3)
  • 79. [CFG figure with nodes 1 to 10] @Test public void test(){ // Constructor (init) // Method Calls // Assertions (check) }
  • 80. [CFG figure with nodes 1 to 10] @Test public void test(){ Triangle t = new Triangle (1,2,3); // Method Calls // Assertions (check) }
  • 81. [CFG figure with nodes 1 to 10] @Test public void test(){ Triangle t = new Triangle (1,2,3); t.computeTriangleType(); // Assertions (check) }
  • 82. [CFG figure with nodes 1 to 10] @Test public void test(){ Triangle t = new Triangle (1,2,3); t.computeTriangleType(); String typ = t.getType(); assertTrue(typ.equals("SCALENE")); }
  • 83. Random testing 1. Pick one of the available constructors (with random input) 2. Pick one or more public methods (with random input) 3. Generate the assertions by checking the final state of the object using get methods clipart by 10binary https://openclipart.org/detail/175047/february-11-2013
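The three steps above can be sketched in a few lines. The `Triangle` class below is a stand-in: the slides only show its constructor, `computeTriangleType()` and `getType()`, so the body (classification by side equality only) is an assumption, as are `RandomTestGenerator`, the fixed seed and the range of side lengths. The generated "tests" are plain strings to keep the sketch free of a test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Stand-in for the class under test; the implementation is assumed.
final class Triangle {
    private final int a, b, c;
    private String type = "UNKNOWN";

    Triangle(int a, int b, int c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    void computeTriangleType() {
        if (a == b && b == c) {
            type = "EQUILATERAL";
        } else if (a == b || b == c || a == c) {
            type = "ISOSCELES";
        } else {
            type = "SCALENE";
        }
    }

    String getType() {
        return type;
    }
}

final class RandomTestGenerator {

    /** Generates test descriptions: random constructor input, method call, observed state as assertion. */
    static List<String> generate(long seed, int howMany) {
        Random random = new Random(seed); // fixed seed keeps the output reproducible
        List<String> tests = new ArrayList<>();
        for (int i = 0; i < howMany; i++) {
            int a = 1 + random.nextInt(5);
            int b = 1 + random.nextInt(5);
            int c = 1 + random.nextInt(5);
            Triangle t = new Triangle(a, b, c); // step 1: constructor with random input
            t.computeTriangleType();            // step 2: public method call
            String observed = t.getType();      // step 3: final state becomes the assertion
            tests.add("new Triangle(" + a + ", " + b + ", " + c + "); "
                    + "assertEquals(\"" + observed + "\", t.getType());");
        }
        return tests;
    }

    public static void main(String[] args) {
        generate(42L, 3).forEach(System.out::println);
    }
}
```

Because the assertion is derived from what the code currently does, such tests capture existing behavior, bugs included; they are mainly useful as regression tests.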
  • 85. Fuzzing tests in practice
  • 87. [CFG figure with nodes 1 to 10] (2,2,3) -> <1,2,4> (2,3,3) -> <1,5,7,8>
  • 88. [CFG figure with nodes 1 to 10] (2,2,3) -> <1,2,4> (2,3,3) -> <1,5,7,8> Fitness = Approach + Distance Approach = # of control nodes between the execution and the target. Distance = The normalized distance for the control node that diverged to “not diverge”. n/(n+1)
  • 89. [CFG figure with nodes 1 to 10] (2,2,3) -> <1,2,4> = 2 + [1/(1+1)] = 2.5 (2,3,3) -> <1,5,7,8> = 0 + [1/(1+1)] = 0.5 Fitness = Approach + Distance Approach = # of control nodes between the execution and the target. Distance = The normalized distance for the control node that diverged to “not diverge”. n/(n+1)
  • 90. [CFG figure with nodes 1 to 10] (2,2,3) -> <1,2,4> = 2 + [1/(1+1)] = 2.5 (2,3,3) -> <1,5,7,8> = 0 + [1/(1+1)] = 0.5 <-- better! Fitness = Approach + Distance Approach = # of control nodes between the execution and the target. Distance = The normalized distance for the control node that diverged to “not diverge”. n/(n+1)
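The fitness arithmetic on these slides is small enough to write out. The sketch below implements Fitness = Approach + Distance with the n/(n+1) normalization and reproduces the two values shown; the class and method names are assumptions, and the approach levels and raw distances are taken from the slides rather than computed from a real control-flow graph.

```java
// Fitness = approach level + normalized branch distance, lower is better.
final class Fitness {

    /** Maps a raw distance n >= 0 into [0, 1) using n / (n + 1). */
    static double normalize(double distance) {
        return distance / (distance + 1.0);
    }

    static double fitness(int approachLevel, double branchDistance) {
        return approachLevel + normalize(branchDistance);
    }

    public static void main(String[] args) {
        System.out.println(fitness(2, 1.0)); // (2,2,3): prints 2.5
        System.out.println(fitness(0, 1.0)); // (2,3,3): prints 0.5, the better candidate
    }
}
```

A search-based tool would keep and mutate the candidate with the lower fitness, here (2,3,3), since it is closer to the target.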
  • 92. Fraser, Gordon, and Andrea Arcuri. "Evosuite: automatic test suite generation for object-oriented software." Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. ACM, 2011.
  • 97. Testing SQL Query: SELECT Name FROM Product WHERE Price > 20. Test database, table Product (Name, Price): three rows with Price 19, 20 and 21. Coverage criterion: 1. False: Price = 19 2. Boundary: Price = 20 3. True: Price = 21
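The three coverage targets for `Price > 20` follow the usual boundary pattern: one value just below, one on, and one just above the threshold. The sketch below is an in-memory stand-in for the WHERE clause, not a database test and not how EvoSQL works; the class and method names are assumptions.

```java
// In-memory stand-in for the predicate "Price > 20" and its three test rows.
final class PriceBoundary {
    static final int THRESHOLD = 20;

    /** Mirrors the WHERE clause: a row is selected only if its price exceeds the threshold. */
    static boolean selected(int price) {
        return price > THRESHOLD;
    }

    /** One price per coverage target: false, boundary, true. */
    static int[] boundaryPrices() {
        return new int[] {THRESHOLD - 1, THRESHOLD, THRESHOLD + 1};
    }

    public static void main(String[] args) {
        for (int price : boundaryPrices()) {
            System.out.println("Price = " + price + " -> selected: " + selected(price));
        }
    }
}
```

The boundary row (Price = 20) is the one that distinguishes `>` from `>=`, which is why it gets its own target.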
  • 98. Testing SQL Query SELECT * FROM `account` LEFT JOIN `user` AS `assignedUser` ON account.assigned_user_id = assigneduser.id LEFT JOIN `user` AS `modifiedBy` ON account.modified_by_id = modifiedby.id LEFT JOIN `user` AS `createdBy` ON account.created_by_id = createdby.id LEFT JOIN `entity_email_address` AS `emailAddressesMiddle` ON account.id = emailaddressesmiddle.entity_id AND emailaddressesmiddle.deleted = '0' AND emailaddressesmiddle.primary = '1' AND emailaddressesmiddle.entity_type = 'Account' LEFT JOIN `email_address` AS `emailAddresses` ON emailaddresses.id = emailaddressesmiddle.email_address_id AND emailaddresses.deleted = '0' LEFT JOIN `entity_phone_number` AS `phoneNumbersMiddle` ON account.id = phonenumbersmiddle.entity_id AND phonenumbersmiddle.deleted = '0' AND phonenumbersmiddle.primary = '1' AND phonenumbersmiddle.entity_type = 'Account' LEFT JOIN `phone_number` AS `phoneNumbers` ON phonenumbers.id = phonenumbersmiddle.phone_number_id AND phonenumbers.deleted = '0' WHERE (( account.name LIKE 'Besha%' OR account.id IN (SELECT entity_id FROM entity_email_address JOIN email_address ON email_address.id = entity_email_address.email_address_id WHERE entity_email_address.deleted = 0 AND entity_email_address.entity_type = 'Account' AND email_address.deleted = 0 AND email_address.name LIKE 'Besha%') )) AND account.deleted = '0' 42 Coverage Rules
  • 99. EvoSQL [Diagram labels: Query, Database Schema, SQLFpc, Coverage Rules, EvoSQL, Test Data] Jeroen Castelein, Maurício Aniche, Mozhan Soltani, Annibale Panichella, Arie van Deursen. Search-Based Test Data Generation for SQL Queries. ICSE 2018.
  • 100. Study Context 2,135 queries / 4 systems: • Alura, e-learning platform • EspoCRM, open source software for customer relations • SuiteCRM, open source software for customer relations • ERPNext, open source resource planning software for enterprises.
  • 101. EvoSQL Evaluation Outcomes • 100% of targets covered for 98% of the queries • On average 86% covered for the remaining 2% • Usually within seconds • Outperforms biased and random alternatives: • Biased random can handle 90% of simple queries (< 10 rules) • Biased random often finds no solution for complex queries (10+ rules)
  • 103. Property-Based Testing clipart by GDJ https://openclipart.org/detail/232264/colorful-fleur-de-lis-fractal-3
  • 104. Alan Turing on Assertions (wo)
  • 105. Assertions Defined An assertion is a Boolean expression at a specific point in a program which will be true unless there is a bug in the program. http://wiki.c2.com/?WhatAreAssertions Assertions in the program: They hold for any execution of that point. Unlike a test code assertion, which holds for one execution only.
  • 106. The Java (C, C++, …) assert Statement: “assert” boolean-expression [“:” string ] If boolean-expression is true, do nothing. If it is false, throw an AssertionError, with the string as message.
  • 107. LLVM Assertion Examples (BitcodeReader.cpp) https://blog.regehr.org/archives/1091
      assert(BlockAddrFwdRefs.empty() && "Unresolved blockaddress fwd references");
      assert(Ty == V->getType() && "Type mismatch in constant table!");
      assert((Ty == 0 || Ty == V->getType()) && "Type mismatch in value table!");
      assert(It != ResolveConstants.end() && It->first == *I);
      assert(isa<ConstantExpr>(UserC) && "Must be a ConstantExpr.");
      assert(V->getType()->isMetadataTy() && "Type mismatch in value table!");
      assert((!Alignment || isPowerOf2_32(Alignment)) && "Alignment must be a power of two.");
      assert((Record[i] == 3 || Record[i] == 4) && "Invalid attribute group entry");
      assert(Record[i] == 0 && "Kind string not null terminated");
      assert(Record[i] == 0 && "Value string not null terminated");
      assert(ResultTy && "Didn't read a type?");
      assert(TypeList[NumRecords] == 0 && "Already read type?");
      assert(NextBitCode == bitc::METADATA_NAMED_NODE); (void)NextBitCode;
      assert((CT != LandingPadInst::Catch || !isa<ArrayType>(Val->getType())) && "Catch clause has a invalid type!");
      assert((CT != LandingPadInst::Filter || isa<ArrayType>(Val->getType())) && "Filter clause has invalid type!");
      assert(DFII != DeferredFunctionInfo.end() && "Deferred function not found!");
      assert(DeferredFunctionInfo.count(F) && "No info to read function later?");
      assert(M == TheModule && "Can only Materialize the Module this BitcodeReader is attached to.");
  • 108. Thinking in Assertions • Method preconditions: • Propositions that must hold before calling the method • Method postconditions • Propositions that are guaranteed to hold after the method has finished • Structural invariants • Properties over the state of an object throughout the object’s lifetime • Helps to improve / reason about design • Can be turned into assertions that can be checked at run time • Supports the testing process
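The three kinds of propositions listed on this slide can be turned into run-time checks roughly as follows. This is a minimal sketch with a hypothetical `Account` class invented for illustration; it uses Python's `assert`, which, like Java's, can be switched off (Python removes assertions under the `-O` flag).

```python
class Account:
    """Hypothetical example class; not taken from the talk."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Structural invariant: holds throughout the object's lifetime.
        assert self.balance >= 0, "invariant: balance is never negative"

    def withdraw(self, amount: int) -> None:
        # Precondition: must hold before the method is called.
        assert 0 < amount <= self.balance, "precondition: 0 < amount <= balance"
        old_balance = self.balance

        self.balance -= amount

        # Postcondition: guaranteed to hold after the method has finished.
        assert self.balance == old_balance - amount, "postcondition: balance reduced by amount"
        self._check_invariant()


account = Account(10)
account.withdraw(4)
```

Calling `account.withdraw(100)` at this point violates the precondition and raises an `AssertionError` close to the cause, which is the "less distance" benefit for debugging mentioned later in the deck.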
  • 109. Formal Specifications via Hoare Triples: { P } A { Q } • Any execution of A, starting in a state where P holds, will terminate in a state where Q holds • { preconditions } Method { postconditions }
  • 110. Precondition Design • The “strength” of your preconditions is a design choice. • The weaker your precondition • The more situations your method needs to handle • The less thinking the client needs to do (easier to use) • However, with weak preconditions: • The server will always do the checking • This may be redundant: checks also done if we’re sure they’ll pass.
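The trade-off between strong and weak preconditions can be illustrated with two variants of the same function. Both functions below are hypothetical examples written for this purpose, under the assumption that "handling" an empty input means returning `None`.

```python
from typing import Optional, Sequence


def mean_strong(values: Sequence[float]) -> float:
    # Strong precondition: the client must guarantee a non-empty sequence.
    assert len(values) > 0, "precondition: values must not be empty"
    return sum(values) / len(values)


def mean_weak(values: Sequence[float]) -> Optional[float]:
    # Weak precondition: any sequence is accepted, so the server checks
    # for the empty case on every call, even when the client already knows
    # that the sequence is not empty.
    if len(values) == 0:
        return None
    return sum(values) / len(values)


strong_result = mean_strong([1, 2, 3])
weak_result_empty = mean_weak([])
weak_result = mean_weak([2, 4])
```

The weak variant is easier to call but pushes a new obligation onto the client, who now has to deal with a possible `None` result.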
  • 111. If a client invokes a (server) method and meets its preconditions, the server guarantees the postcondition will hold. Examples: file has been created; player has been moved; points have been added; resulting tile is never null. clipart by floEdelmann https://openclipart.org/detail/260432/beach-chair
  • 112. If you (as client) invoke a (server) method without meeting its preconditions, anything can happen. E.g.: Null pointer exception clipart by tzunghaor https://openclipart.org/detail/166696/nuclear-explosion
  • 113. Design By Contract • Contract metaphor: • Contract: an explicit statement of the rights and obligations between a client and a server • Server perspective: • If you call me and meet my precondition, I ensure that after returning I deliver a state in which my postcondition holds • If not, you’re on your own. Bertrand Meyer, Applying "Design by Contract", IEEE Computer 25, 10, October 1992, pages 40-51
  • 114. Bertrand Meyer’s Seven Principles of Software Testing 1. To test a program is to try to make it fail. 2. Tests are no substitute for specifications 3. Any failed execution must yield a test case 4. Determining success or failure of tests must be an automatic process (4.b: via contracts) Bertrand Meyer, IEEE Software, 2008. Required Reading!
  • 115. Seven Principles of Software Testing 5. An effective testing process must include both manually and automatically produced test cases. 6. Test strategies must be empirically validated 7. A testing strategy’s most important property is the number of faults it uncovers as a function of time.
  • 116. Assertions Pro / Con Great • Support better testing • Make debugging easier (less distance) • Executable comments • “Gateway drug to formal methods” Less than Great • Slow down code • Make programs incorrect when used improperly • Might trick some of us lazy programmers into using them to implement error handling • Are commonly misunderstood http://blog.regehr.org/archives/1091 Required reading
  • 117. Property-Based Testing • Think of “properties” (assertions) for functions • Let a “generator” produce a series of random input values for the function • For each random input, check the assertions.
  • 118. Property: the length of concatenated strings equals the sum of the lengths of the individual strings. QuickCheck will generate 100 random strings to check this property.
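The string-concatenation property can be checked with a small hand-rolled generator loop. The sketch below is a stand-in using only Python's standard library, not QuickCheck's API; the helper names, the fixed seed, and the maximum string length are assumptions made for the illustration.

```python
import random
import string


def random_string(rng: random.Random, max_length: int = 20) -> str:
    """Generator: produce a random string of printable characters."""
    length = rng.randint(0, max_length)
    return "".join(rng.choice(string.printable) for _ in range(length))


def concat_length_property(a: str, b: str) -> bool:
    """Property: len(a + b) equals len(a) + len(b)."""
    return len(a + b) == len(a) + len(b)


def check_property(prop, runs: int = 100, seed: int = 0):
    """Run the property on random inputs; return a counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        a, b = random_string(rng), random_string(rng)
        if not prop(a, b):
            return (a, b)
    return None


counterexample = check_property(concat_length_property)
```

Dedicated property-based testing libraries add features this loop lacks, most notably shrinking a failing input down to a minimal counterexample.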
  • 119. Can tools help us find bugs automatically? Yes, even without running the code! clipart by Machovka https://openclipart.org/detail/2676/lady-bug
  • 121. Examples of bugs • Equals checks for incompatible operand • HE: Class defines equals() but not hashCode() • RpC: Repeated conditional tests • FL: Method performs math using floating point precision • RANGE: Array offset is out of bounds (RANGE_ARRAY_OFFSET) • Etc etc… • Full list: https://spotbugs.readthedocs.io/en/latest/bugDescriptions.html#
  • 123. Linters are prevalent • OSS systems have been intensively using linters. • Tools are highly flexible, and developers have different strategies to configure them. • Challenge: false positives. • You should develop your own!! • Bugs specific to your context, e.g., config files. Beller, Moritz, et al. "Analyzing the state of static analysis: A large-scale evaluation in open source software." Software Analysis, Evolution, and Reengineering (SANER), 2016 IEEE 23rd International Conference on. Vol. 1. IEEE, 2016. Tómasdóttir, K. F., Aniche, M., & Deursen, A. V. (2017, October). Why and how JavaScript developers use linters. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (pp. 578-589). IEEE Press.
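A context-specific check for config files does not need much machinery. The sketch below is purely illustrative: the two rules (a positive `timeout`, an `https` URL), the option names, and the sample file are all hypothetical, chosen only to show what a home-grown linter could look like.

```python
import configparser
from typing import List

# Hypothetical config file with two problems in the [database] section.
SAMPLE = """
[database]
timeout = 0
url = http://db.example.com

[cache]
timeout = 30
url = https://cache.example.com
"""


def lint_config(text: str) -> List[str]:
    """Return one warning per violated (hypothetical) project rule."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    warnings = []
    for section in parser.sections():
        options = parser[section]
        # Rule 1: a timeout, when present, must be a positive integer.
        if "timeout" in options:
            value = options["timeout"]
            if not value.isdigit() or int(value) == 0:
                warnings.append(f"[{section}] timeout must be a positive integer, got {value!r}")
        # Rule 2: URLs must not use plain http.
        if options.get("url", "").startswith("http://"):
            warnings.append(f"[{section}] url should use https")
    return warnings


found = lint_config(SAMPLE)
```

Keeping each rule this narrow is one way to limit the false positives that the slide names as the main challenge.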
  • 124. Why Developers Use Linters
  • 125. Importance of the different rules. First list: 1. Stylistic Issues; 2. Best Practices; 3. Variables; 4. Possible Errors; 5. Node.js & CommonJS; 6. ECMAScript 6; 7. Strict Mode. Second list: 1. Possible Errors 92.5%; 2. Best Practices 89%; 3. ECMAScript 6 86.7%; 4. Variables 86.4%; 5. Stylistic Issues 78.2%; 6. Node.js & CommonJS 62.6%; 7. Strict Mode 57.8%.
  • 126. Code review in test files! Test files are almost 2 times less likely to be discussed during code review when reviewed together with production files!! Davide Spadini, Maurício Aniche, Magiel Bruntink, Margaret-Anne Storey, Alberto Bacchelli. When Testing Meets Code Review: Why and How Developers Review Tests. ICSE 2018.
  • 127. Code review in test files! Little on finding more bugs! [Figure: bar charts with axes from 0% to 30%; categories: Code improvement, Understanding, Social communication, Defect, Knowledge transfer, Misc] Davide Spadini, Maurício Aniche, Magiel Bruntink, Margaret-Anne Storey, Alberto Bacchelli. When Testing Meets Code Review: Why and How Developers Review Tests. ICSE 2018.
  • 128. Learning software testing is challenging! clipart by frankes https://openclipart.org/detail/190242/comic-girl-tini-at-school
  • 129. Common mistakes • Test coverage (20.87%) • Maintainability of test code (20.42%) • Understanding test concepts (15.35%) • Boundary testing (12.95%) • State-based testing (12.39%) • Assertions (8.93%) • Mock Objects (5.87%) • Tools (4.21%)
  • 130. Difficult topics [Figure: diverging stacked bar chart of survey responses per topic, axis from 100 to 0 to 100. Topics, with question number and number of respondents: Minimum set of tests Q18 (80); Avoid flaky tests Q17 (81); Exploratory Testing Q16 (80); Defensive programming Q15 (81); How much to test Q14 (80); Acceptance tests Q13 (81); Design by contracts Q12 (81); TDD Q11 (81); Testability Q10 (81); Best practices Q9 (81); State-based testing Q8 (81); Apply MC/DC Q7 (83); Structural testing Q6 (82); Boundary Testing Q5 (84); Mock Objects Q4 (84); Choose the test level Q3 (84); Arrange-Act-Assert Q2 (81); JUnit tests Q1 (83)] Maurício Aniche, Felienne Hermans, Arie van Deursen. An Exploratory Study on Challenges in Software Testing Education. TU Delft. In submission.
  • 131. How to Learn? People do not like books and papers… [Figure: diverging stacked bar chart of survey responses per learning resource, axis from 100 to 0 to 100. Resources, with question number and number of respondents: Midterm exam Q11 (81); AMA sessions Q10 (82); Related papers Q9 (79); Support from TAs Q8 (82); Labwork Q7 (83); ISTQB book Q6 (81); PragProg book Q5 (80); Interaction Q4 (83); Live coding Q3 (83); Guest lectures Q2 (83); Lectures Q1 (83)] Maurício Aniche, Felienne Hermans, Arie van Deursen. An Exploratory Study on Challenges in Software Testing Education. TU Delft. In submission.
  • 132. The majority of projects and users [from 416 participants and 1,337,872 intervals] do not practice testing actively. We should change it. Moritz Beller, Georgios Gousios, Annibale Panichella, Andy Zaidman. When, How, and Why Developers (Do Not) Test in Their IDEs. FSE 2015. clipart by laobc https://openclipart.org/detail/65257/sad-baby
  • 133. Topics of today • Structural testing and MC/DC • Log monitoring and passive learning • Search-based software testing • Mutation testing • Fuzzing • Property-based testing • Code review • Static analysis tools Maurício Aniche m.f.aniche@tudelft.nl @mauricioaniche http://www.mauricioaniche.com/talks/2018/tad