Test Automation at the useR Interface Level
TESTAR (Test*)

Tanja E. J. Vos
Software Testing and Quality Group
Research Center for Software Production Methods (PROS)
Universidad Politecnica de Valencia, Spain
Contents
• Testing at the UI level: what it is and the state of the art
• TESTAR and how it works
• How it has been used
Was developed under FITTEST
• Future Internet Testing
• September 2010 – February 2014
• Total costs: 5.845.000 euros
• Partners:
  – Universidad Politecnica de Valencia (Spain)
  – University College London (United Kingdom)
  – Berner & Mattner (Germany)
  – IBM (Israel)
  – Fondazione Bruno Kessler (Italy)
  – Universiteit Utrecht (The Netherlands)
  – Softeam (France)
• http://www.pros.upv.es/fittest/
Testing at the UI Level
• The UI is where all functionality comes together
  – Integration / system testing
• Most applications have UIs
  – Computers, tablets, smartphones…
• Faults that arise at the UI level are important
  – These are what your client finds -> test from their perspective!
• No need for source code
  – But if we have it, even better ;-)
State of the art in UI testing
• Capture & Replay
  – The tool captures user interaction with the UI and records a script that can be automatically replayed during regression testing
  – When the UI changes (at development time and at run time), the automated regression tests break
  – Huge maintenance problem
• Visual Testing
• Model-based Testing
State of the art in UI testing
• Capture & Replay
• Visual testing
  – Based on image recognition
  – Easy to understand, no programming skills needed
  – Solves most of the maintenance problem
  – Introduces additional problems:
    • Performance of image processing
    • False positives and false negatives
      – the ambiguity associated with image locators
      – imprecision of image recognition feeds into the oracles
• Model-based Testing
State of the art in UI testing
• Capture & Replay
• Visual testing
• Model-based testing – TESTAR
  – Based on an automatically inferred tree model of the UI: a widget tree whose nodes carry properties such as type: TButton, title: "Button", enabled: false, hasFocus: true, rect: [15, 25, 65, 55], or type: TMenuItem, title: "File"
  – Test sequences are derived automatically from the model
  – Executed sequences can be replayed
  – If the UI changes, so does the model (and so do the tests) -> no maintenance of the tests
  – Programming skills are needed to define powerful oracles
    • It needs to be investigated more whether this is really a problem…
    • Do we want testers to have programming skills?
How it works
[TESTAR loop diagram: START SUT -> SCAN GUI + OBTAIN WIDGET TREE -> DERIVE SET OF USER ACTIONS -> SELECT ACTION -> EXECUTE ACTION -> ORACLE: FAULT? -> more actions? -> more sequences? -> STOP SUT. Domain experts supply the Action Definitions and the Oracle Definition; erroneous sequences are saved as replayable sequences; the SUT can optionally be instrumented; the fitness of each test sequence is calculated along the way.]
START SUT:
– Run the executable / command
– Bring the SUT into a dedicated start state (delete or restore configuration files)
– Wait until the SUT is fully loaded
[TESTAR loop diagram, highlighting START SUT]
SCAN GUI + OBTAIN WIDGET TREE:
Obtain the state (widget tree) through the Accessibility API
[TESTAR loop diagram, highlighting SCAN GUI + OBTAIN WIDGET TREE]
Widget Trees
[Figure: two example widget trees. A Window contains Button, Text, Menu and Slider widgets; the Menu contains menu items (MI). Each node carries properties, e.g. type: TButton, title: "Button", enabled: false, hasFocus: true, rect: [15, 25, 65, 55], or type: TMenuItem, title: "File".]
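To make the data structure concrete, here is a minimal sketch of the kind of node such a tree could be built from (the WidgetNode type is hypothetical; in TESTAR the state exposes these as tagged properties obtained via the Accessibility API):

import java.util.List;

// Hypothetical minimal widget-tree node, for illustration only.
record WidgetNode(String type, String title, boolean enabled,
                  int[] rect, List<WidgetNode> children) {}

class WidgetTreeDemo {
    // Example: a Window holding the disabled Button from the slide.
    static final WidgetNode TREE = new WidgetNode(
            "TWindow", "Window", true, new int[]{0, 0, 800, 600},
            List.of(new WidgetNode("TButton", "Button", false,
                    new int[]{15, 25, 65, 55}, List.of())));
}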
Active Widget Tree
[Figure: the active widget tree of a running SWT application – a Display with a Shell containing Composites, CoolBars/ToolBars with ToolItems, a StatusLine with a CLabel and ProgressIndicator, a Canvas, PageBooks, Trees with TreeColumns, CTabFolders with CTabItems and ViewForms, Sashes, a Form, a Search field and a MainMenu (disabled).]
Active Widget Tree
[Figure: the same widget tree after opening the main menu – the Menu node now contains many MenuItems and a DropDownMenu.]
Active Widget Tree
[Figure: the same widget tree after opening a dialog – a new Shell appears containing Composites with Labels, Text fields and buttons.]
DERIVE SET OF USER ACTIONS:
– Use the information in the widget tree to derive a set of “sensible” actions
– Click on enabled buttons, type into text boxes…
[TESTAR loop diagram, highlighting DERIVE SET OF USER ACTIONS]
SELECT ACTION:
– Select one of the actions from the action set
– Various possible strategies: random, coverage metrics, search-based…
[TESTAR loop diagram, highlighting SELECT ACTION]
EXECUTE ACTION:
Execute and record the selected action
[TESTAR loop diagram, highlighting EXECUTE ACTION]
ORACLE:
Check whether the state is erroneous
[TESTAR loop diagram, highlighting ORACLE]
more actions? – Stopping criteria:
– After X actions
– After Y hours
– After some state occurred
– etc. …
Did we find a fault?
[TESTAR loop diagram, highlighting the “more actions?” decision]
Save the sequence if it contained errors
[TESTAR loop diagram, highlighting the replayable erroneous sequences]
more sequences? – Various stopping criteria:
– X sequences
– Y hours
– …
[TESTAR loop diagram, highlighting the “more sequences?” decision]
TESTAR tool: READY
Set the path to the SUT
TESTAR tool: SET
Filter:
1) Undesirable actions, e.g. closing the application all the time
2) Undesirable processes, for example help panes in Acrobat, etc.
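Such a filter can be as simple as a deny-list regular expression over the titles of the widgets an action targets. A minimal sketch under that assumption (the Action record and its title accessor are hypothetical, not the TESTAR API – real TESTAR actions reference their target widget):

import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical minimal action type for this sketch.
record Action(String targetTitle) {}

class ActionFilter {
    // Keep only actions whose target title does not match the deny-list regex,
    // e.g. filter(derived, ".*([Cc]lose|[Ee]xit|[Qq]uit|[Hh]elp).*")
    static Set<Action> filter(Set<Action> actions, String denyRegex) {
        return actions.stream()
                .filter(a -> !a.targetTitle().matches(denyRegex))
                .collect(Collectors.toSet());
    }
}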
GO!
See the video at https://www.youtube.com/watch?v=PBs9jF_pLCs
Oracles for free
• What can we easily detect?
• Crashes
• Program freezes
Cheap Oracles
• Critical message boxes
• Suspicious stdout / stderr
Specifying Cheap Oracles
• Simply with regular expressions
• For example:
  .*NullPointerException
  .*|[Ee]rror|[Pp]roblem
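As an illustration, a standalone sketch (plain Java, not the TESTAR API) of applying such an expression to widget titles; the combined pattern below is an assumption that merges the slide’s example expressions into one:

import java.util.List;
import java.util.regex.Pattern;

class CheapOracle {
    // Combined suspicious-title pattern (merges the slide's examples).
    private static final Pattern SUSPICIOUS =
            Pattern.compile(".*(NullPointerException|[Ee]rror|[Pp]roblem).*");

    // Returns the first suspicious title, or null if none matches.
    static String findSuspicious(List<String> widgetTitles) {
        for (String title : widgetTitles) {
            if (SUSPICIOUS.matcher(title).matches()) {
                return title;
            }
        }
        return null;
    }
}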
More sophistication needs work
• Actions
  – Action detection
  – Action selection
  – Sometimes a trial-and-error process
• Random selection = like a child, just much faster
  – Printing, file copying / moving / deleting
  – Starting other processes
  – Rights management, dedicated user accounts, disallowed actions
• Oracles that need programming
All kinds of processes can start…
How? Edit the protocol
The protocol editor
[TESTAR loop diagram, highlighting START SUT]

protected SUT startSystem()
        throws SystemStartException
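A minimal sketch of what an implementation could do, using plain Java process launching; the file paths, the wrap helper that turns a Process into a SUT handle, and the exception constructor are assumptions, not the deck’s actual code:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

protected SUT startSystem() throws SystemStartException {
    try {
        // Restore a clean configuration so every run starts from the same state
        Files.copy(Paths.get("config.bak"), Paths.get("config.ini"),
                   StandardCopyOption.REPLACE_EXISTING);
        Process p = new ProcessBuilder("C:\\sut\\app.exe").start();
        Thread.sleep(5000);   // crude wait until the SUT is fully loaded
        return wrap(p);       // hypothetical helper producing the SUT handle
    } catch (Exception e) {
        throw new SystemStartException(e.getMessage()); // constructor assumed
    }
}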
[TESTAR loop diagram, highlighting SCAN GUI + OBTAIN WIDGET TREE]

protected State getState(SUT system)
        throws StateBuildException
[TESTAR loop diagram, highlighting DERIVE SET OF USER ACTIONS]

protected Set<Action> deriveActions(SUT system, State state)
        throws ActionBuildException
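A minimal sketch of a derivation, following the “click on enabled buttons” idea from earlier; isEnabled, isButton and clickAt are placeholders for the real tag queries and action compiler, whose exact names the deck does not show:

protected Set<Action> deriveActions(SUT system, State state)
        throws ActionBuildException {
    Set<Action> actions = new HashSet<>();
    for (Widget w : state) {                // the state is iterable over widgets
        if (isEnabled(w) && isButton(w)) {  // placeholder predicates
            actions.add(clickAt(w));        // placeholder action compiler call
        }
    }
    return actions;
}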
[TESTAR loop diagram, highlighting SELECT ACTION]

protected Action selectAction(State state,
                              Set<Action> actions);

// Here you can implement any selection strategy;
// by default this is a random selection from the actions
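A minimal sketch of that default, uniformly random strategy:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

protected Action selectAction(State state, Set<Action> actions) {
    List<Action> list = new ArrayList<>(actions);
    return list.get(new Random().nextInt(list.size())); // uniform pick
}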
[TESTAR loop diagram, highlighting EXECUTE ACTION]

protected boolean executeAction(SUT system,
                                State state,
                                Action action);
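A minimal sketch; the action.run call and its duration parameter are assumptions about the action interface, which the deck does not show:

protected boolean executeAction(SUT system, State state, Action action) {
    try {
        action.run(system, state, 0.1);  // assumed signature: run for ~0.1 s
        return true;                     // executed and recorded successfully
    } catch (Exception e) {
        return false;                    // action could not be executed
    }
}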
[TESTAR loop diagram, highlighting ORACLE]

protected Verdict getVerdict(State state);
getVerdict

The Verdict class:

public final class Verdict {
    private final String info;
    private final double severity;
    private final Visualizer visualizer;

    public Verdict(double severity, String info) { … }
    public Verdict(double severity, String info, Visualizer v) { … }
}

protected Verdict getVerdict(State state){
    Assert.notNull(state);

    //-------------------
    // ORACLES FOR FREE
    //-------------------

    // if the SUT is not running, we assume it crashed
    if(!state.get(IsRunning, false))
        return new Verdict(1.0, "System is offline! I assume it crashed!");

    // if the SUT does not respond within a given amount of time, we assume it crashed
    if(state.get(NotResponding, false))
        return new Verdict(0.8, "System is unresponsive! I assume something is wrong!");
getVerdict (continued)

    //------------------------
    // ORACLES ALMOST FOR FREE
    //------------------------
    String titleRegEx = settings().get(SuspiciousTitles);

    // search all widgets for suspicious titles
    for(Widget w : state){
        String title = w.get(Title, "");
        if(title.matches(titleRegEx)){
            // visualize the problematic widget by marking it with a red box
            Visualizer visualizer = Util.NullVisualizer;
            if(w.get(Tags.Shape, null) != null){
                Pen redPen = Pen.newPen().setColor(Color.Red).(…).build();
                visualizer = new ShapeVisualizer(redPen, ….., "Suspicious Title", 0.5, 0.5);
            }
            return new Verdict(1.0, "Discovered suspicious widget title: '" + title + "'.", visualizer);
        }
    }
getVerdict (continued)

    //--------------------------------------------------
    // MORE SOPHISTICATED ORACLES CAN BE PROGRAMMED HERE
    //--------------------------------------------------
    // The sky is the limit ;-)

    // if everything was ok...
    return new Verdict(0.0, "No problem detected.", Util.NullVisualizer);
}
The remaining protocol methods:

protected boolean moreActions(State state);
protected void finishSequence(File recordedSequence);
protected boolean moreSequences();

[TESTAR loop diagram]
How has it been used?
MS Office
• Subject application: Microsoft Word 2011
• Robustness test: random action selection
• 18-hour run
• 672 sequences of 200 actions each
• 9 crashes, 6 of them reproducible
• Effort was approx. 1 hour for:
  – System setup (location, configuration files)
  – Augmenting the action set (drag sources, drop targets, clicks, double clicks, right clicks, text to type, …)
  – Configuring cheap oracles (crashes, timeouts, evident error messages)
CTE XL Professional
• CTE XL Professional is a commercial tool for test case design
• Draw a combinatorial tree modeling the test-relevant aspects
• Generate a set of abstract test cases
• Java application – Eclipse Rich Client Platform (RCP) using the Standard Widget Toolkit (SWT)
• Developed and commercialized by Berner & Mattner
• TESTAR was used to test it
Do experiments with more sophisticated action selection
• What is a “good” test sequence? -> One that generates lots of Maximum Call Stacks (MCS)
• MCS: a root-to-leaf path through the call tree
• Intuition: the more MCSs a sequence generates, the more aspects of the SUT are tested (McMaster et al.)
• #MCS = number of leaves
• Obtainable through bytecode instrumentation (no source code needed)
[Figure: example call tree with its maximum call stacks marked]
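Since #MCS is just the number of leaves, counting it over an instrumented call tree is a short recursion. A sketch with a hypothetical minimal node type (the real tree comes from the bytecode instrumentation):

import java.util.List;

// Hypothetical minimal call-tree node, for illustration only.
record CallNode(String method, List<CallNode> children) {

    // #MCS = number of leaves: every maximum call stack is a root-to-leaf path.
    static int countMCS(CallNode node) {
        if (node.children().isEmpty()) return 1; // a leaf terminates one MCS
        int sum = 0;
        for (CallNode child : node.children()) {
            sum += countMCS(child);
        }
        return sum;
    }
}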
Do experiments with more sophisticated action selection
• Select actions in such a way that sequences are formed that generate large amounts of “Maximum Call Stacks” within the system under test (SUT)
• Optimization algorithm used: Ant Colony Optimization
Ant Colony Optimization
• C = component set (here: C = the set of feasible actions)
• The likelihood that a component c_i ∈ C is chosen is determined by its pheromone value p_i
• Generate trails (sequences) by selecting components according to the pheromone values p_i
• Assess the fitness of the trails (i.e. their #MCS)
• Reward components c_i that appear in “good” trails by increasing their pheromones p_i
• (Upon construction of subsequent trails, prefer components with high pheromone values)
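A minimal sketch of the pheromone bookkeeping this describes (plain Java; the use of action-id strings, the initial pheromone of 1.0 and the reward rate are assumptions, not the study’s actual parameters):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// ACO-style selection: roulette-wheel choice by pheromone value, then a
// reward for every component of a high-fitness trail.
class Pheromones {
    private final Map<String, Double> p = new HashMap<>(); // action id -> pheromone
    private final Random rnd = new Random();

    String select(List<String> feasible) {
        double total = feasible.stream()
                .mapToDouble(a -> p.getOrDefault(a, 1.0)).sum();
        double r = rnd.nextDouble() * total;
        for (String a : feasible) {
            r -= p.getOrDefault(a, 1.0);
            if (r <= 0) return a;       // picked with probability p_i / total
        }
        return feasible.get(feasible.size() - 1);
    }

    // Reward every component of a trail proportionally to its fitness (#MCS).
    void reward(List<String> trail, double fitness) {
        for (String a : trail) {
            p.merge(a, 0.001 * fitness, Double::sum); // reward rate assumed
        }
    }
}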
Initial experiment results
[Figure: two plots of #MCS per sequence over 6000 generated sequences – an ACO run and a random run, with #MCS values up to about 160,000.]
• Fixed stopping criterion -> 6000 generated sequences
Conclusion
• Implementation works
  – Better than random
  – Solutions improve over time
  – Letting it run until …
• Efficiency
  – Sequence generation is expensive -> parallelization
  – Frequent restarts of the SUT -> might not be suitable for large applications with a significant startup time, e.g. Eclipse
  – Is ACO a good choice?
• Fault sensitivity? -> Empirical evaluation needed
Clave Informática
• We met this company at a local testing event in Valencia
• Clavei is a private software vendor from Alicante
• Specialized for over 26 years in the development of Enterprise Resource Planning (ERP) systems for SMEs
• Its main product is ClaveiCon, a software solution for SME accounting and financing control
• Current testing is done manually
• The number of faults found by clients is too high
• Testing needs to be improved
Objectives of the study
• Can our tool be useful for Clave Informática?
• Can it help them be more effective in finding faults?
• Can this be done in an efficient way, i.e. without taking too much time?
• Restrictions:
  – Clavei had no budget to apply the tool themselves
  – So we, the tool-developing researchers, did that
ClaveiCon
• Written in Visual Basic
• Microsoft SQL Server 2008 database
• Targets the Windows operating systems
• Stores data about product planning, cost, development and manufacturing
• Provides a real-time view of a company’s processes and enables control of inventory management, shipping and payment as well as marketing and sales
Case Study Procedure
1) Planning phase:
   a) Implementation of the test environment
   b) Error definition: anticipate and identify potential fault patterns
2) Implementation phase:
   a) Oracle implementation: implement the detection of the errors defined in the previous step
   b) Action definition implementation
   c) Implementation of the stopping criteria
3) Testing phase: run the test
4) Evaluation phase:
   a) Identify the most severe problems encountered during the run
   b) Use the collected information to refine the setup for the next iteration
Results
• The pre-testing activities – developing the actions, oracles and stopping criteria to set up TESTAR – take some initial effort (in our case approximately 26 hours), but this pays off the more often the test is run.
• The manual labor associated with post-testing – inspection of log files, reproduction and comprehension of errors – is only a tiny fraction of the overall testing time: we spent 1.5 hours of manual intervention during and after the tests, compared to over 91 hours of actual unattended testing.
• TESTAR detected 10 previously unknown critical faults – a surprisingly positive result that supports the belief that TESTAR can be a valuable and resource-efficient supplement to manual testing.
See a video here:
http://www.pros.upv.es/index.php/es/videos/item/1398-testar-rogue-user
Softeam
• FITTEST partner from France
• Big software company
• SUT selected for evaluating TESTAR: Modelio SaaS
• Modelio SaaS:
  – PHP web application
  – For the transparent configuration of distributed environments that run projects created with SOFTEAM’s Modelio modeling tool
  – Administrators use this application to manage servers and projects that run in virtual environments on different cloud platforms
• Current testing is done manually
Case Study Procedure
[Figure: study design – an introductory course (level 1: reaction, evaluated with a course-quality questionnaire), a hands-on learning/training phase (level 2) with learnability questionnaires A and B, working diaries and trainer-rated protocol evolutions, performance exams (level 3), and then a testing phase: setting up a working test environment, consolidating the final protocol, running it on the SUT and evaluating the test results, followed by a satisfaction interview.]
We measured:
• Learnability (questionnaires, work diaries, performance evaluations)
• Effectiveness
  – 17 faults were re-injected for the evaluation
  – Code coverage
• Efficiency
  – Time for setting up, designing and developing the tests
  – Time for running the tests
Results
• Some difficulties / resistance / misunderstanding during the learning of programming for powerful oracles
• The testing artifacts produced increased in quality
  – Red = Oracle
  – Green = Action Set
  – Blue = Stopping Criteria
[Figure 3: Evolution of artifact quality as rated by the trainer]
[Table 2: time reported (in minutes) on the hands-on training activities by subjects S1 and S2 and in pairs; totals 2350, 90 and 105 minutes]

Table 3: Comparison between the manual test suite (TSSoft) and the TESTAR test suite (TSTestar)

Description                  | TSSoft         | TSTestar
Faults discovered            | 14 + 1         | 10 + 1
Did not find fault IDs       | 1, 9, 12       | 1, 4, 8, 12, 14, 15, 16
Code coverage                | 86.63%         | 70.02%
Time spent on development    | 40h            | 36h
Run time                     | manual, 1h 10m | automated, 77h 26m
Fault diagnosis and report   | 2h             | 3h 30m
Faults reproducible          | 100%           | 91.76%
Number of test cases         | 51             | dynamic

From the underlying study report:
• The effectiveness and efficiency of the automated tests generated with TESTAR can compete with SOFTEAM’s manual tests. The subjects felt confident that, by investing a bit more time in customizing the action selection and the oracles, the TESTAR tests would do as well as or better than their manual test suite w.r.t. coverage and fault-finding capability, which could save them the manual execution of the test suite in the future.
• The SOFTEAM subjects found the investment in learning the tool and spending effort on writing Java for powerful oracles worthwhile, since they were sure it would pay off the more often the tests are run in an automated way. They were satisfied with the experience and keen to show their peer colleagues. Persuading management to invest more in the tool (for example, follow-up studies to research how good the automated tests can get and how reusable they are across versions of the SUT) was perceived as difficult; nevertheless, enthusiasm to try was definitely detected.
• Despite criticism regarding the documentation and the installation process of the tool, the testers’ reactions, the statements encountered during the interviews and the satisfaction questionnaire indicate that they were satisfied with the overall experience. We came to a similar conclusion regarding the tool’s learnability: although the trainer reported difficulties with the action set definition, the constant progress and increase of artifact quality during the study point to an ease of learnability. These items will be improved in future work to enhance the tool.
• The learnability questions were taken from a framework for evaluating CASE-tool learnability [9]; they are divided into 7 categories and consist of 18 items on a 5-point Likert scale. Satisfaction was also rated on 1–7 scales (“Would you recommend the tool to your colleagues?” / “Could you persuade your management to invest?”, where 1 represented “Not at all” and 7 “Very much”).

References cited on this slide:
[8] P. M. Kruse, N. Condori-Fernandez, T. E. J. Vos, A. Bagnato, and E. Brosse. Combinatorial testing tool learnability in an industrial environment. In ESEM 2013, pages 304–312, Oct 2013.
[9] M. Senapathi. A framework for the evaluation of CASE tool learnability in educational environments. Journal of Information Technology Education: Research, 4(1):61–84, January 2005.
[10] A. Zendler, E. Horn, H. Schwartzel, and E. Plodereder. Demonstrating the usage of single-case designs in experimental software engineering. Information and Software Technology, 43(12):681–691, 2001.
Student course
• Course: 1st-year Master, “Developing Quality Software”
• 34 students working in groups of 2
• Introduction: 10 minutes
• Going through the user manual (10 pages) while doing a small exercise on a calculator: 50 minutes
• After 1 hour the students were setting up tests for MS Paint
Future Work
• There is still a lot that needs to be done!
• The Accessibility API works if the UI has been programmed “well”
• Research more search-based approaches for action selection
• Research the integration of other test case generation techniques (model-based, combinatorial) for action selection
• Design a test specification language that makes it possible to specify actions and oracles without programming Java
• Do more industrial evaluations to compare the maintenance costs during regression testing with our tool against capture/replay or visual testing tools
• Extend the tool beyond PC applications (for now we have Mac and Windows plug-ins) to mobile platforms
• Tanja E. J. Vos
• email: tvos@pros.upv.es
• skype: tanja_vos
• web: http://tanvopol.webs.upv.es/
• telephone: +34 690 917 971

More Related Content

PDF
Testar
PDF
Esem2014 presentation
PDF
Microsoft Testing Tour - Functional and Automated Testing
PDF
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
PPTX
Getting Started with Coded UI Testing: Building Your First Automated Test
PPTX
Testing the User Interface - Coded UI Tests with Visual Studio 2010
ODP
Alexandre.iline rit 2010 java_fxui_extra
PPTX
Getting Started with Visual Studio’s Coded UI Testing: Building Your First Au...
Testar
Esem2014 presentation
Microsoft Testing Tour - Functional and Automated Testing
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
Getting Started with Coded UI Testing: Building Your First Automated Test
Testing the User Interface - Coded UI Tests with Visual Studio 2010
Alexandre.iline rit 2010 java_fxui_extra
Getting Started with Visual Studio’s Coded UI Testing: Building Your First Au...

Similar to Testar2014 presentation (20)

PDF
The future of Test Automation
PPT
20051019 automating regression testing for evolving gui software
PPTX
User interface testing By Priyanka Chauhan
PDF
Getting Started With Coded UI testing: Building Your First Automated Test
PDF
Functional tests with the FEST framework
PPT
2.5 gui
PPT
Testing
PPT
12 Rational Solo Pruebas 2009
PDF
Pekka_Aho_Complementing GUI Testing Scripts - Testing Assembly 2022.pdf
PPTX
Advanced Coded UI Testing
PPTX
Deep Dive Modern Apps Lifecycle with Visual Studio 2012: How to create cross ...
PPTX
Coded ui - lesson 1 - overview
PDF
Geoff & Emily Bache - Specification By Example With GUI Tests-How Could That ...
PPTX
User Interface Testing Presentation spm.pptx
PPTX
Coded ui - lesson 3 - case study - calculator
PDF
Information hiding based on optimization technique for Encrypted Images
PPTX
Coded ui in a nutshell
ODP
Alexandre Iline Rit 2010 Java Fxui
PDF
User guide of VectorCast 2024 ADA testing tool for safety critical software
PPT
Using GUI Ripping for Automated Testing of Android Apps
The future of Test Automation
20051019 automating regression testing for evolving gui software
User interface testing By Priyanka Chauhan
Getting Started With Coded UI testing: Building Your First Automated Test
Functional tests with the FEST framework
2.5 gui
Testing
12 Rational Solo Pruebas 2009
Pekka_Aho_Complementing GUI Testing Scripts - Testing Assembly 2022.pdf
Advanced Coded UI Testing
Deep Dive Modern Apps Lifecycle with Visual Studio 2012: How to create cross ...
Coded ui - lesson 1 - overview
Geoff & Emily Bache - Specification By Example With GUI Tests-How Could That ...
User Interface Testing Presentation spm.pptx
Coded ui - lesson 3 - case study - calculator
Information hiding based on optimization technique for Encrypted Images
Coded ui in a nutshell
Alexandre Iline Rit 2010 Java Fxui
User guide of VectorCast 2024 ADA testing tool for safety critical software
Using GUI Ripping for Automated Testing of Android Apps
Ad

More from Tanja Vos (6)

PDF
Impress project: Goals and Achievements @ cseet2020
PDF
FormalZ @ cseet2020
PDF
A-TEST2017
PDF
SBST 2015 - 3rd Tool Competition for Java Junit test Tools
PDF
Software Testing Innovation Alliance
PDF
TAROT summerschool slides 2013 - Italy
Impress project: Goals and Achievements @ cseet2020
FormalZ @ cseet2020
A-TEST2017
SBST 2015 - 3rd Tool Competition for Java Junit test Tools
Software Testing Innovation Alliance
TAROT summerschool slides 2013 - Italy
Ad

Recently uploaded (20)

PDF
Navsoft: AI-Powered Business Solutions & Custom Software Development
PPTX
history of c programming in notes for students .pptx
PDF
Softaken Excel to vCard Converter Software.pdf
PPTX
CHAPTER 2 - PM Management and IT Context
PDF
EN-Survey-Report-SAP-LeanIX-EA-Insights-2025.pdf
PDF
iTop VPN Free 5.6.0.5262 Crack latest version 2025
PPTX
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
PPTX
L1 - Introduction to python Backend.pptx
PDF
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
PDF
How to Choose the Right IT Partner for Your Business in Malaysia
PDF
Design an Analysis of Algorithms II-SECS-1021-03
PDF
Cost to Outsource Software Development in 2025
PPTX
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
PPTX
Why Generative AI is the Future of Content, Code & Creativity?
PPTX
Transform Your Business with a Software ERP System
PDF
Digital Systems & Binary Numbers (comprehensive )
PDF
Odoo Companies in India – Driving Business Transformation.pdf
PDF
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
PDF
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
PDF
Designing Intelligence for the Shop Floor.pdf
Navsoft: AI-Powered Business Solutions & Custom Software Development
history of c programming in notes for students .pptx
Softaken Excel to vCard Converter Software.pdf
CHAPTER 2 - PM Management and IT Context
EN-Survey-Report-SAP-LeanIX-EA-Insights-2025.pdf
iTop VPN Free 5.6.0.5262 Crack latest version 2025
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
L1 - Introduction to python Backend.pptx
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
How to Choose the Right IT Partner for Your Business in Malaysia
Design an Analysis of Algorithms II-SECS-1021-03
Cost to Outsource Software Development in 2025
Agentic AI : A Practical Guide. Undersating, Implementing and Scaling Autono...
Why Generative AI is the Future of Content, Code & Creativity?
Transform Your Business with a Software ERP System
Digital Systems & Binary Numbers (comprehensive )
Odoo Companies in India – Driving Business Transformation.pdf
Adobe Premiere Pro 2025 (v24.5.0.057) Crack free
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
Designing Intelligence for the Shop Floor.pdf

Testar2014 presentation

  • 1. Test Automa+on at the useR interface level Test * Tanja E. J. Vos So#ware Tes+ng and Quality Group Research center for So#ware Produc+on Methods (PROS) Universidad Politecnica de Valencia Spain
  • 2. Contents • Tes+ng at the UI level: what and state-­‐of-­‐the-­‐art • TESTAR and how it works • How it has been used 2
  • 3. Was developed under FITTEST • Future Internet Testing • September 2010 – February 2014 • Total costs: 5.845.000 euros • Partners: – Universidad Politecnica de Valencia (Spain) – University College London (United Kingdom) – Berner & MaTner (Germany) – IBM (Israel) – Fondazione Bruno Kessler (Italy) – Universiteit Utrecht (The Netherlands) – So#team (France) • hTp://www.pros.upv.es/fiTest/
  • 4. Tes+ng at the UI Level • UI is where all func+onality comes together – Integra+on / System Tes+ng • Most applica+ons have UIs – Computers, tables, smartphones…. • Faults that arise at UI level are important – These are what your client finds -­‐> test from their perspec+ve! • No need for source code – But if we have it even beTer ;-­‐)
  • 5. State of the art in UI tes+ng • Capture Replay – The tool captures user interac+on with the UI and records a script that can be automa+cally replayed during regression tes+ng – UI change (at development +me & at run +me) – Automated regression tests break – Huge maintenance problem • Visual Tes+ng • Model-­‐based Tes+ng
  • 6. State of the art in UI tes+ng • Capture Replay • Visual tes6ng – Based on image recogni+on – Easy to understand, no programming skills needed – Solves most of maintenance problem – Introduces addi+onal problems: • Performance of image processing • False posi+ves and false nega+ves – the ambiguity associated with image locators – imprecision of image recogni+on feeds into oracles • Model-­‐based Tes+ng
  • 7. State of the art in UI tes+ng • Capture Replay • Visual tes+ng • (ui) title: "Button" enabled: false hasFocus: true rect: [15, 25, 65, 55] title: "File" Model-­‐based tes6ng -­‐-­‐ TESTAR – Based title: "Button" enabled: false hasFocus: true rect: [15, 25, 65, 55] on automa+cally inferred tree model of the UI – Tests sequences are derived automa+cally from the model – Executed sequences can be replayed – If title: "File" UI changes so does the model/tests -­‐> no maintenance of the tests – Programming skills are needed to define powerful oracles • It needs to be inves+gated more if this is really a problem…. • Do we want testers to have programming skills? type: TButton ... Window Button Text Menu Slider MI MI MI MI type: TMenuItem ... ABC type: TButton ... MI MI MI MI type: TMenuItem ... ABC Test *
  • 8. 8 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION How it works.. Test *
  • 9. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION -­‐ Run executable / command -­‐ Bring SUT into dedicated start state (delete or restore configura+on files) -­‐ Wait un+l SUT fully loaded Test *
  • 10. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE SELECT ACTION Obtain state (Widget Tree) through Accessibility API more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? Test *
  • 11. Button Widget Trees Window title: "Button" enabled: false hasFocus: true rect: [15, 25, 65, 55] Button Text Menu Slider MI MI MI MI type: TButton title: "Button" enabled: false hasFocus: true title: "File" rect: [15, 25, 65, 55] ... Window Button Text Menu Slider MI MI MI MI type: TMenuItem title: "File" ... ABC type: TButton ... Text Menu Slider MI MI MI MI type: TMenuItem ... ABC
  • 12. 12 12 !"#$%&' ()*%% +,&--*. +/0$/#"1* +//%,&. 2//%,&. 2//%31*0 2//%31*0 2//%,&. 2//%,&. 2//%31*0 2//%,&. 2//%,&. +//%31*0 +//%31*0 +/0$/#"1* +//%,&. +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 +//%31*0 (1&14#5"-* +5&6*% 2//%,&. 2//%31*0 +/0$/#"1* 7./8.*##3-9":&1/. 7./8.*##;*8"/-<= +&->&# +/0$/#"1* +/0$/#"1* +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 2//%31*0 2//%31*0 +/0$/#"1* +/0$/#"1* 7&8*,//? 2.** 2.**+/%40- 2.**+/%40- +/0$/#"1* +/0$/#"1* 7&8*,//? +/0$/#"1* 5&6*% +/0$/#"1* +2&6@/%9*. 2//%,&. 2//%31*0 +2&6@/%9*. A"*B@/.0 A"*B@/.0 +2&6@/%9*. +2&631*0 2//%,&. 2//%31*0 A"*B@/.0 +2&631*0 (&#) (&#) 2."0+/00/-C3D&-9%* +//%,&. +/0$/#"1* +//%31*0 E*-4 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 @/.0 E&"-FE*-4 G9"#&6%*9H (*&.:) Active Widget Tree
  • 13. 13 13 !"#$%&' ()*%% +,&--*. +/0$/#"1* +//%,&. 2//%,&. 2//%31*0 2//%31*0 2//%,&. 2//%,&. 2//%31*0 2//%,&. 2//%,&. +//%31*0 +//%31*0 +/0$/#"1* +//%,&. +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 +//%31*0 (1&14#5"-* +5&6*% 2//%,&. 2//%31*0 +/0$/#"1* 7./8.*##3-9":&1/. 7./8.*##;*8"/-<= +&->&# +/0$/#"1* +/0$/#"1* +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 2//%31*0 2//%31*0 +/0$/#"1* +/0$/#"1* 7&8*,//? 2.** 2.**+/%40- 2.**+/%40- +/0$/#"1* +/0$/#"1* 7&8*,//? +/0$/#"1* 5&6*% +/0$/#"1* +2&6@/%9*. 2//%,&. 2//%31*0 +2&6@/%9*. A"*B@/.0 A"*B@/.0 +2&6@/%9*. +2&631*0 2//%,&. 2//%31*0 A"*B@/.0 +2&631*0 (&#) (&#) 2."0+/00/-C3D&-9%* +//%,&. +/0$/#"1* +//%31*0 E*-4 E*-431*0 E*-4 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 !./$!/B-E*- 4 Active Widget Tree
  • 14. 14 14 !"#$%&' ()*%% +,&--*. +/0$/#"1* +//%,&. 2//%,&. 2//%31*0 2//%31*0 2//%,&. 2//%,&. 2//%31*0 2//%,&. 2//%,&. +//%31*0 +//%31*0 +/0$/#"1* +//%,&. +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 +//%31*0 (1&14#5"-* +5&6*% 2//%,&. 2//%31*0 +/0$/#"1* 7./8.*##3-9":&1/. 7./8.*##;*8"/-<= +&->&# +/0$/#"1* +/0$/#"1* +/0$/#"1* 2//%,&. 2//%31*0 2//%31*0 2//%31*0 2//%31*0 +/0$/#"1* +/0$/#"1* 7&8*,//? 2.** 2.**+/%40- 2.**+/%40- +/0$/#"1* +/0$/#"1* 7&8*,//? +/0$/#"1* 5&6*% +/0$/#"1* +2&6@/%9*. 2//%,&. 2//%31*0 +2&6@/%9*. A"*B@/.0 A"*B@/.0 +2&6@/%9*. +2&631*0 2//%,&. 2//%31*0 A"*B@/.0 +2&631*0 (&#) (&#) 2."0+/00/-C3D&-9%* +//%,&. +/0$/#"1* +//%31*0 E*-4 E*-431*0 E*-431*0 E*-431*0 E*-431*0 E*-431*0 ()*%% +/0$/#"1* +/0$/#"1* +/0$/#"1* 5&6*% +/0$/#"1* +/0$/#"1* 5&6*% 2*F1 5&6*% ,411/- +/0$/#"1* +/0$/#"1* ,411/- 5&6*% ,411/- 5&6*% 2*F1 !"&%/8
  • 15. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE SELECT ACTION -­‐ Use informa+on in Widget Tree to derive a set of “sensible” ac+ons -­‐ Click on enabled BuTons, Type into Text Boxes… more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? Test *
  • 16. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION -­‐ Select one of the ac+ons from the ac+on set -­‐ Various possible strategies: -­‐ Random, -­‐ Coverage Metrics, -­‐ Search-­‐based… Test *
  • 17. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Execute and record selected ac+on Test *
  • 18. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Oracle -­‐> Check whether state is erroneous Test *
  • 19. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Stopping Criteria: -­‐ A#er X ac+ons -­‐ A#er Y hours -­‐ A#er some state occurred -­‐ etc …. Did we find a fault? Test *
  • 20. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Save sequence if it contained errors Test *
  • 21. START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT Various stopping criteriaA:C TION -­‐ X sequences -­‐ Y hours -­‐ … Test *
  • 22. TESTAR tool READY 22 Set path the SUT
  • 23. TESTAR tool SET 23 Filter: 1) undesirable ac+ons, i.e. closing the applica+on al the +me 2) Undesirable processes, for example help panes in acrobat, etc…….
  • 24. 24 GO! See video at hTps://www.youtube.com/watch?v=PBs9jF_pLCs
  • 25. Oracles for free • What can we easily detect? • Crashes • Program freezes
  • 26. 26 Cheap Oracles • Cri+cal message boxes • Suspicious stdout / stderr
  • 27. Specifying Cheap Oracles • Simply with regular Expressions • For example: .*NullPointerExcep+on .*|[Ee]rror|[Pp]roblem
  • 28. More sophis+ca+on needs work • Ac+ons – Ac+on detec+on – Ac+on selec+on – Some+mes a trial/error process • Random selec+on = like a child, just much faster • Prin+ng, file copying / moving / dele+ng • Starts other Processes • Rights management, dedicated user accounts, disallow ac+ons • Oracles that need programming
  • 29. All kinds of processes can start……
  • 30. How? Edit the protocol 30
  • 32. 32 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * protected SUT startSystem() ! throws SystemStartException!
  • 33. 33 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * protected State getState(SUT system) ! throws StateBuildException!
  • 34. 34 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * protected Set<Action> deriveActions(SUT system, ! State state) ! throws ActionBuildException!
  • 35. 35 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * protected Action selectAction(State state,! Set<Action> actions);! !! // Here you can implement any selection strategy! // per defaults this is random selection from actions!
  • 36. 36 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * ! protected boolean executeAction(SUT system, ! State state, ! Action action);! !!
  • 37. 37 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test * protected Verdict getVerdict(State state);!
  • 38. getVerdict! Verdict …. private final String info; private final double severity; private final Visualizer visualizer; (double severity, String info) public Verdict(double severity, String info, Visualizer v) 38 protected Verdict getVerdict(State state){ Assert.notNull(state); //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ // ORACLES FOR FREE //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ // if the SUT is not running, we assume it crashed if(!state.get(IsRunning, false)) return new public final class public Verdict Verdict(1.0, "System is offline! I assume it crashed!"); { // if the SUT does not respond within a given amount of +me, we assume it crashed if(state.get(NotResponding, false)) return new Verdict(0.8, "System is unresponsive! I assume something is wrong!");
  • 39. getVerdict! 39 //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ // ORACLES ALMOST FOR FREE //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ String +tleRegEx = sezngs().get(SuspiciousTitles); // search all widgets for suspicious +tles for(Widget w : state){ String +tle = w.get(Title, ""); if(+tle.matches(+tleRegEx)){ // visualize the problema+c widget, by marking it with a red box Visualizer visualizer = U+l.NullVisualizer; if(w.get(Tags.Shape, null) != null){ Pen redPen = Pen.newPen().setColor(Color.Red).(…).build(); visualizer = new ShapeVisualizer(redPen, ….., "Suspicious Title", 0.5, 0.5); } return new Verdict(1.0, "Discovered suspicious widget +tle: '" + +tle + "'.", visualizer); } }
  • 40. getVerdict! 40 //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ // MORE SOPHISTICATED ORACLES CAN BE PROGRAMMED HERE //-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐ The sky is the limit ;-­‐) // if everything was ok... return new Verdict(0.0, "No problem detected.", U+l.NullVisualizer);; }
  • 41. protected boolean moreActions(State state);! protected void finishSequence(File recordedSequence)! protected boolean moreSequences();! 41 START SUT Domain Experts SCAN GUI + OBTAIN WIDGET TREE more actions? DERIVE SET OF USER ACTIONS EXECUTE ACTION calculate fitness of test sequence No Yes Action Definitions Oracle Definition STOP SUT SUT optional instrumentation Replayable Erroneous Sequences FAULT? ORACLE Yes No more sequences? SELECT ACTION Test *
  • 42. How has it been used? 42
  • 43. MS Office • Subject applica+on: Microso# Word 2011 • Robustness test: random ac+on selec+on • 18 hour run • 672 sequences à 200 ac+ons • 9 crashes • 6 reproducable crashes • Effort was approx 1 hour to: – System setup (loca+on, configura+on files) – Augment Ac+on Set (Drag Sources, Drop Targets, Clicks, Double Clicks, Right Clicks, Text to type, …) – Configure cheap oracle (crashes, +meouts, evident error messages)
  • 44. CTE XL Profesional • CTE XL Professional is a commercial tool for test case design • Draw a combinatorial tree modeling test relevant aspects • Generate a set of abstract test cases • Java applica+on -­‐ Eclipse Rich Client Plaorm (RCP) using Standard Widget Toolkit (SWT) • Developed and commercialized by Berner&MaTner • TESTAR was used to test it. 44
  • 45. Do experiments with more sophis+cated ac+on selec+on 45 • What is a “good” test sequence? è One that generates lots of Maximum Call Stacks (MCS) • MCS: root-leaf-path through call tree • Intuition: the more MCSs a sequence generates, the more aspects of the SUT are tested (McMaster et al.) • #MCS = number of leaves • Obtainable through bytecode instrumentation (no source code needed) !"#$%& !'%& !(%& !)%& !*%& !*%& !'%& !*%& +,#$-.$%& / / /
  • 46. Do experiments with more sophis+cated ac+on selec+on • Select actions in such a way that sequences are formed that generate large amounts of “Maximum Call Stacks” within the system under test (SUT) • Optimization algorithm used: – Ant Colony Optimization 46
  • 47. 47 Ant Colony Optimization • C = component set (here: C = set of feasible actions) • The likelihood that c∈C is chosen is determined by its pheromone value pci i • Generate trails (sequences) by selecting components according to € pheromone values pi • Assess fitness of trails (i.e. MSC) • Reward components ci that appear in “good” trails by increasing their pheromones pi (Upon construction of subsequent trails, prefer components with high pheromone values)
  • 48. Initial experiment results ACO Run 160000 140000 120000 100000 80000 60000 40000 20000 0 0 1000 2000 3000 4000 5000 6000 #MCS Sequence 160000 140000 120000 100000 80000 60000 40000 20000 0 0 1000 2000 3000 4000 5000 6000 #MCS Sequence Random Run • Fixed stopping criteria -> 6000 generated sequences
• 49. 49 Conclusion
• Implementation works
– Better than random
– Solutions improve over time
– Letting it run until…
• Efficiency
– Sequence generation is expensive -> parallelization
– Frequent restarts of the SUT -> might not be suitable for large applications with a significant startup time, e.g. Eclipse
– Is ACO a good choice?
• Fault sensitivity? -> Empirical evaluation needed
• 50. Clave Informática
• We met this company at a local testing event in Valencia
• Clavei is a private software vendor from Alicante that has specialized for over 26 years in the development of Enterprise Resource Planning (ERP) systems for SMEs
• Its main product is ClaveiCon, a software solution for accounting and financial control for SMEs
• Current testing is done manually
• The number of faults found by clients is too high
• Testing needs to be improved
• 51. Objectives of the study
• Can our tool be useful for Clave Informática?
• Can it help them be more effective in finding faults?
• Can this be done in an efficient way, i.e. without taking too much time?
• Restrictions:
– Clavei had no budget to apply the tool themselves
– So we, the researchers developing the tool, did that 51
• 52. ClaveiCon
• Written in Visual Basic
• Microsoft SQL Server 2008 database
• Targets the Windows operating system
• Stores data about product planning, cost, development and manufacturing
• Provides a real-time view of a company's processes and enables control of inventory management, shipping and payment as well as marketing and sales
• 53. Case Study Procedure
1) Planning Phase:
a) Implementation of the test environment
b) Error definition: anticipate and identify potential fault patterns
2) Implementation Phase:
a) Oracle implementation: implement the detection of the errors defined in the previous step
b) Action definition implementation
c) Implementation of stopping criteria
3) Testing Phase: run the test
4) Evaluation Phase:
a) Identify the most severe problems encountered during the run
b) The collected information is used to refine the setup for the next iteration
53
• 54. Results
• The pre-testing activities:
– developing the actions, oracles and stopping criteria to set up TESTAR takes some initial effort (in our case approximately 26 hours), but this pays off the more often the test is run.
• The manual labor associated with post-testing:
– inspection of log files,
– reproduction and comprehension of errors
is only a tiny fraction of the overall testing time (we spent 1.5 hours on manual intervention during and after the tests, compared to over 91 hours of actual unattended testing).
• TESTAR detected 10 previously unknown critical faults, a surprisingly positive result that supports the belief that TESTAR can be a valuable and resource-efficient supplement to manual testing. 54
• 55. See a video here: http://www.pros.upv.es/index.php/es/videos/item/1398-testar-rogue-user 55
• 56. Softeam
• FITTEST partner from France
• Large software company
• SUT selected for evaluating TESTAR: Modelio SaaS
• Modelio SaaS:
– PHP web application
– For the transparent configuration of distributed environments that run projects created with SOFTEAM's Modelio modeling tool
– Administrators use this application to manage servers and projects that run in virtual environments on different cloud platforms
• Current testing is done manually
• 57. Case Study Procedure 57
[Diagram: case study protocol. Training phase: introductory course, hands-on learning (level 2) with user manual, example SUTs and working diaries, performance exams (level 3); course quality evaluated with a questionnaire (level 1, reaction) and learnability questionnaires at points A and B. Testing phase: SOFTEAM installs the tool, sets up a working test environment, consolidates and runs the final protocol, and evaluates the test results, with protocol evolutions until sufficient quality is reached; closed by a satisfaction interview.]
We measured:
• Learnability (questionnaires, work diaries, performance evaluations)
• Effectiveness
– 17 faults were re-injected to evaluate fault finding
– Code coverage
• Efficiency
– Time for set-up, design and development
– Time for running the tests
• 58. Results
• Some difficulties/resistance/misunderstanding during the learning of programming for powerful oracles
• The testing artifacts produced (oracle, action set, stopping criteria) increased in quality during the study
– Red = oracle
– Green = action set
– Blue = stopping criteria
(Figure 3: evolution of artifact quality as rated by the trainer)
• Comparison between the manual (TSSoft) and the TESTAR (TSTestar) test suites (Table 3):

Description                   TSSoft              TSTestar
Faults discovered             14 + 1              10 + 1
Did not find fault IDs        1, 9, 12            1, 4, 8, 12, 14, 15, 16
Code coverage                 86.63%              70.02%
Time spent on development     40 h                36 h
Run time                      1 h 10 m (manual)   77 h 26 m (automated)
Fault diagnosis and report    2 h                 3 h 30 m
Faults reproducible           100%                91.76%
Number of test cases          51                  dynamic

• The effectiveness and efficiency of the automated tests generated with TESTAR can compete with SOFTEAM's manual tests. The subjects were confident that, by investing a bit more time in customizing the action selection and the oracles, TESTAR would perform as well as or better than their manual test suite w.r.t. coverage and fault-finding capability, which could save them the manual execution of the test suite in the future.
• The SOFTEAM subjects found the investment in learning TESTAR and the effort of writing Java for powerful oracles worthwhile, since it pays off the more often the tests are run in an automated way. They were satisfied with the experience and keen to show their peer colleagues. Persuading management to invest more in the tool (for example, follow-up studies on how good the automated tests can get and how reusable they are across versions of the SUT) was perceived as difficult, but enthusiasm to try was definitely detected.
• Learnability was analyzed at three levels (reaction, learning, performance) using questionnaires taken from the CASE-tool learnability literature (18 items on a 5-point Likert scale, divided into 7 categories), working diaries and exams. Despite criticism of the tool's documentation and installation process, and initial difficulties with the action set definition, the constant progress and increase in artifact quality point to ease of learnability. The training material and user manual need improvement, with more examples and guidance on sophisticated oracles, and a wizard that customizes the protocol without programming may be worth developing.
• 59. Student course
• Course: 1st-year Master's, "Developing Quality Software"
• 34 students working in groups of 2
• Introduction: 10 minutes
• Going through the user manual (10 pages) while doing a small exercise on a calculator: 50 minutes
• After 1 hour the students were setting up tests for MS Paint
• 60. Future Work
• There is still a lot to be done!
• The Accessibility API works only if the UI has been programmed "well"
• Research more search-based approaches for action selection
• Research the integration of other test case generation techniques (model-based, combinatorial) for action selection
• Design a test specification language that makes it possible to specify actions and oracles without programming Java
• Do more industrial evaluations to compare maintenance costs during regression testing between our tool and capture/replay or visual testing tools
• Extend the tool beyond PC applications (for now we have Mac and Windows plug-ins) to mobile platforms 60
• 61. • Tanja E. J. Vos • email: tvos@pros.upv.es • skype: tanja_vos • web: http://tanvopol.webs.upv.es/ • telephone: +34 690 917 971