Testing the UI of Mobile Apps
Marco Torchiano
marco.torchiano_@_polito.it
https://softeng.polito.it/torchiano
@mtorchiano
3rd International Genoa Software Engineering PhD School on
Automated Functional and Security Testing of
Web and Mobile Applications
May 13-16, Genova, Italy
v.1.2.0
© Marco Torchiano, 2019
Outline
• Testing mobile apps is hard!
• E2E testing of mobile apps
• How much testing is performed?
• Mobile test fragility
• Causes of fragility
• Visual vs. Layout
• Combined approach
Testing mobile Apps is hard!
Mobile market
Developers perspective
• Series of interviews with actual developers from 7 companies
• Are mobile applications tested? How? To what extent?
• What are the most peculiar properties to test in mobile applications?
• What aspects of mobile applications discourage developers from testing
them?
• What are the main challenges of mobile application testing?
Developers perspective - Companies
A. Distributor of testing tools for various types of applications.
B. Test factory for third party applications and test consulting.
C. Insurance company: web and mobile apps for insurers and
customers.
D. Insurance company: platform for insurance management.
E. Test factory for third party applications and test consulting.
F. Full-stack development of mobile applications for multiple
platforms.
G. Test factory providing test consulting and test management for banking
applications.
How mobile testing is performed (companies A–G)
Manual testing ● ● ● ● ● ● ○
Random-monkey testing ● ○ ○ ○ ○ ○ ○
Capture & Replay ● ● ● ● ● ○ ○
Scripted testing ● ● ● ● ● ○ ○
Model-based testing ● ○ ○ ○ ○ ○ ○
Tool adoption
Tool | Category | n (companies)
MicroFocus UFT / QTP Regression and functional testing 4
Selenium Scripted web-based app testing 4
Appium Multi-platform mobile app automated testing 3
PerfectoMobile Scripted cloud-based app testing 3
JMeter Load and performance testing 3
Silk Mobile Capture and Replay testing 2
JUnit Java unit testing 2
AppliTools AI-based Visual Testing 1
Monkey Random testing 1
Qualitia Scripted testing and GUI modeling 1
TestComplete Capture and Replay testing 1
Peculiarities of Mobile Testing
• Procedures and instruments vary significantly for:
• Native
• Web-based
• Hybrid
• Huge device and OS diversity
Device diversity and form factor are the fundamental variables to take
into account, much more than for web application testing, for which it
is sufficient to test the main browsers
Limitations to Mobile Test Adoption
• Clients want the app published ASAP in any case
• Quality and features are reduced
• User feedback partly replaces testing
• Business-oriented departments are in charge of testing
• Focus is often on back-end as (valuable) service provider
Only companies creating apps that manage sensitive and
economically critical data tend to adopt automated testing
• Company B
Challenges
• Test fragility leading to high maintenance cost
• Company A had to re-record all tests with new version
• Companies B and C estimate that ~30% of test effort goes into test script
maintenance, due to lack of flexibility
[..] problem that is perceived and that has to be fought on
a daily basis: test suites must be maintained daily
• Company D
End-to-End (E2E) Testing
E2E testing verifies that the flow of an application,
from start to finish,
behaves as expected
Testing Pyramid
[Pyramid diagram, adapted from M. Fowler: E2E tests at the top (few, €€€),
Integration tests in the middle, Unit tests at the base (many, €cent)]
E2E Critical factors
• Cost: E2E tests are expensive (hard) to write
• E.g. through Capture & Replay
• Time: E2E tests are slow
• UI interactions take time
• Integration: E2E tests are difficult to integrate in CI
• Hard to run in headless mode
• Fragility: E2E tests are fragile
• Even minor UI changes can break tests
E2E Critical factors worsen in mobile
• Cost: E2E tests are expensive (hard) to write
• E.g. through Capture & Replay
• Time: E2E tests are slow
• UI interactions take time
• Integration: E2E tests are difficult to integrate in CI
• Hard to run in headless mode
• Fragility: E2E tests are fragile
• Even minor UI changes can break tests
Rich user interactions
Wide variety of devices + quick release cycles
Device test factories or emulators
Different devices may behave slightly differently
Critical Factors – Mobile specific
• Wide variety of devices
• Versions of OS
• Screen aspect ratio
• Display HW vs. SW buttons
• Rich user interactions
• Gestures, e.g., swipe, pinch, etc.
• SW and HW buttons
• Quick release cycle:
• apps typically follow a weekly/bi-weekly release plan
Diversity
OS Versions
[Bar chart (0–30%): market share of Android versions Gingerbread, Ice Cream
Sandwich, Jelly Bean, KitKat, Lollipop, Marshmallow, Nougat, Oreo, Pie]
Data refer to May 2019
Relative screen sizes of Android devices
(source: https://www.xda-developers.com)
Screen Size and Density
ldpi mdpi tvdpi hdpi xhdpi xxhdpi Total
Small 0.4% 0.1% 0.1% 0.6%
Normal 0.9% 0.3% 24.0% 37.7% 23.6% 86.5%
Large 2.4% 1.9% 0.6% 1.6% 1.7% 8.2%
Xlarge 3.1% 1.3% 0.6% 5.0%
Total 0.4% 6.4% 2.2% 25.9% 40.0% 25.4%
How to handle diversity
• Device portfolio representative of your target users
• Include both OS version and Device
• Prioritized portfolio, e.g.,
• Gold:
• Main supported devices + OS
• Full test suite executed
• Silver:
• Additional devices + OS
• Only a subset of test cases executed
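A prioritized portfolio like this can be encoded directly in the test harness. A minimal plain-Java sketch (the tier names and the idea of taking the top-N cases are illustrative assumptions, not from a specific tool):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class DevicePortfolio {

    // Gold devices run the full suite; silver devices run only a
    // prioritized subset of the test cases.
    public static List<String> testsFor(String tier, List<String> fullSuite, int silverSubsetSize) {
        if (tier.equals("gold")) {
            return fullSuite;
        }
        // Assumes the suite is already sorted by priority: keep the top N cases.
        return fullSuite.stream().limit(silverSubsetSize).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> suite = IntStream.rangeClosed(1, 1349)
                .mapToObj(i -> "TC" + i).collect(Collectors.toList());
        System.out.println(testsFor("gold", suite, 67).size());   // 1349
        System.out.println(testsFor("silver", suite, 67).size()); // 67
    }
}
```

With the case-study numbers used later in this deck (1349 test cases, a 67-case subset), gold devices get the whole suite and silver devices only the prioritized subset.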
A real-world case
• Company performing tests for third parties
• Case study test process:
• Conducted in 2017
• Outsourcing
• On mobile banking app
• 14 test areas
Test suite
[Bar chart: number of test cases per area (0–180)]
MIGRAZIONE TOKEN
ASSICURAZIONE
ASSISTENZA
RATING
MESSAGGI
TOUCH ID
FINANZIAMENTI
IMPOSTAZIONI
POSTLOGIN
NAVIGAZIONE
PROFILO
CONTO
INVESTIMENTI
PRELOGIN
STRONG AUTHENTICATION
RISPARMIO
FLUSSI DISPOSITIVI
CARTE
1349 Test cases
Device Portfolio
Gold
IOS 9.3.2 APPLE 6s/6 Plus
ANDROID 6.0.1 SAMSUNG S6 /S7
Silver
IOS 9.3.2 APPLE 5s
IOS 9.3.2 APPLE 4s
ANDROID 6.0.1 SAMSUNG S5
ANDROID 5.0.1 SAMSUNG S4
ANDROID 4.4.4 SAMSUNG S3
IOS 9.3.2 APPLE 6
ANDROID 6.0.1 HUAWEI P9
IOS 9.3.2 APPLE 5c
ANDROID 4.4.4 SAMSUNG Note 3
ANDROID 6.0.1 LG NEXUS 5
ANDROID 4.1.2 SAMSUNG A3
ANDROID 5.0.0 ASUS Zenfone 2
ANDROID 4.2.2 SAMSUNG S2
ANDROID 5.1.1 LG G4
ANDROID 4.4.4 Xiaomi Mi Note
ANDROID 4.4.2 ACER E3
ANDROID 4.4.2 HTC SENSE 5
ANDROID 4.4.4 HUAWEI Ascend G700
Test executions
• Run just ~15% of the test runs needed for full coverage of the whole portfolio
• 3904 test runs vs. 26980
           Test cases  Devices  Test runs
Test suite 1349        2        2698
Subset     67          18       1206
Total                           3904
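The saving is simple arithmetic: the full suite on the 2 gold devices plus the 67-case subset on the 18 silver devices yields 3904 runs, roughly 15% of the 26980 runs that full coverage of all 20 devices would require. A back-of-the-envelope check:

```java
public class PortfolioSavings {

    // Runs needed to execute every test case on every device.
    static int fullCoverage(int testCases, int devices) {
        return testCases * devices;
    }

    // Runs with the prioritized portfolio: full suite on gold devices,
    // prioritized subset on silver devices.
    static int prioritizedRuns(int testCases, int goldDevices, int subset, int silverDevices) {
        return testCases * goldDevices + subset * silverDevices;
    }

    public static void main(String[] args) {
        int full = fullCoverage(1349, 20);             // 26980
        int actual = prioritizedRuns(1349, 2, 67, 18); // 2698 + 1206 = 3904
        System.out.printf("%d of %d runs (%.1f%%)%n", actual, full, 100.0 * actual / full);
    }
}
```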
End-2-End Testing Techniques
• Manual
• Scripted
  • Coordinate-based : 1st generation
  • Layout-based : 2nd generation
  • Visual / Image Recognition : 3rd generation
Layout-based testing
Test scripts leverage the UI structure to
• Identify and detect UI components
• Interact with them
Layout-based testing
Elements are identified through layout properties
(e.g., IDs, text, content description, widget type)
Layout-based mobile testing tools
Espresso
Espresso – Test Script
@Test
public void testCreateNote() {
    onView(withId(R.id.fab_expand_menu_button)).perform(click());
    onView(withId(R.id.fab_note)).perform(click());
    onView(withId(R.id.detail_title)).perform(typeTextIntoFocusedView("Ciao"));
    onView(withContentDescription("drawer open")).perform(click());
    onView(withText("Ciao")).check(matches(isDisplayed()));
}
Espresso + Recorder
Espresso + Recorder – Test Script
@Test
public void recordedTest() {
    ViewInteraction viewInteraction = onView(
        allOf(withId(R.id.fab_expand_menu_button),
            childAtPosition(
                allOf(withId(R.id.fab),
                    childAtPosition(
                        withClassName(is("android.widget.FrameLayout")), 2)),
                3),
            isDisplayed()));
    viewInteraction.perform(click());
This is just for the first click!
Espresso + Recorder – Test Script
// Added a sleep statement to match the app's execution delay.
// The recommended way to handle such scenarios is to use Espresso idling resources:
// https://google.github.io/android-testing-support-library/docs/espresso/idling-resource/index.html
try {
    Thread.sleep(150);
} catch (InterruptedException e) {
    e.printStackTrace();
}

A number of sleeps are added even when not required
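Instead of the blind `Thread.sleep` calls the recorder emits, synchronization should wait on a condition, which is the idea behind Espresso idling resources. A minimal plain-Java sketch of condition polling (illustrative only, this is not the Espresso API):

```java
import java.util.function.BooleanSupplier;

public class WaitFor {

    // Poll a readiness condition instead of sleeping a fixed amount:
    // the test proceeds as soon as the app is idle and fails fast on timeout.
    public static void condition(BooleanSupplier ready, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!ready.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("Condition not met within " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(20); // short poll interval, not a blind delay
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        condition(() -> System.currentTimeMillis() - start >= 50, 1000);
        System.out.println("Condition met, test can proceed");
    }
}
```

In a real Espresso test the equivalent is registering an `IdlingResource` (e.g., a `CountingIdlingResource`) so the framework itself knows when the app is busy.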
Espresso + Recorder – Test Script
ViewInteraction viewInteraction = onView(
    allOf(withId(R.id.fab_expand_menu_button),
        childAtPosition(
            allOf(withId(R.id.fab),
                childAtPosition(
                    withClassName(is("android.widget.FrameLayout")), 2)),
            3),
        isDisplayed()));
viewInteraction.perform(click());

onView(withId(R.id.fab_expand_menu_button)).perform(click());
Visual testing
• Image recognition techniques are used to identify elements of the
user interface to interact with.
• Ease of definition of test cases with no tech knowledge required
• only screen captures are needed
• Can be applied seamlessly to any kind of software provided with an
(emulated) user interface
• Very high fragility to even minor changes in the GUI
• In-depth testing of application functionality is difficult.
Visual testing
Image recognition of screen captures on an emulated Android Virtual Device
Visual GUI Testing of Android Apps
Visual testing tools
Eye Automate
EyeAutomate Script
Click "{ImageFolder}/1557782216425.png"
Click "{ImageFolder}/1557782221958.png"
Click "{ImageFolder}/1557782228188.png"
Type "Ciao"
Click "{ImageFolder}/1557782244581.png"
Check "{ImageFolder}/1557782252007.png"
Sikuli
Script
click("1548927253493_cropped.png")
sleep(1)
click("1548927255427_cropped.png")
sleep(1)
click("1548927257822_cropped.png")
type("Test1")
sleep(1)
click("1548927259893_cropped.png")
sleep(1)
find("1548927262321_cropped.png")
sleep(1)
find("1548927264020_cropped.png")
sleep(1)
How much is E2E testing used?
And which tools are used, how much?
Project Selection:
Tool Diffusion
Selendroid     0.01%
Appium         0.08%
UI Automator   0.33%
Robotium       0.84%
Espresso       2.43%
Robolectric    4.12%
JUnit         20.00%
Results from Cruz et al., 2019
How much test code?
TLR = Test LOC / Project LOC
Appium         2.8%
UI Automator   7.4%
Robotium       6.2%
Espresso       7.6%
Robolectric   13.5%
Test Fragility
And its causes
Fragility
A test case is fragile if it requires interventions
when the application evolves, i.e., whenever modifications
are applied to the Application Under Test (AUT).
Fragility related drawbacks
• When a fragility manifests, a test fails
• Then extra effort is required to:
• Verify that no regression has occurred
• Modify the failing test to adapt it to the changed UI
• Re-run the test
Visual fragility
Graphic changes: invalidation of Visual test scripts.
App: K9 Mail
Layout-based fragility
Properties change: invalidation of Layout-based test scripts.
How much fragility?
• Test Suite Volatility (TSV)
• Proportion of releases that required any test code modification
• Modified Test Classes Ratio (MCR)
• Average proportion of test classes modified on each release
• Modified Classes With Modified Methods ratio
• Discards new methods and other cosmetic changes
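Given the count of modified test classes per release, the first two metrics are straightforward ratios. A small plain-Java sketch (the input data in the comments are illustrative, not the study's):

```java
import java.util.Arrays;

public class FragilityMetrics {

    // TSV: proportion of releases in which at least one test class was modified.
    public static double testSuiteVolatility(int[] modifiedClassesPerRelease) {
        long volatileReleases = Arrays.stream(modifiedClassesPerRelease)
                .filter(m -> m > 0).count();
        return (double) volatileReleases / modifiedClassesPerRelease.length;
    }

    // MCR: average proportion of test classes modified per release
    // (test suite size assumed constant here, for simplicity).
    public static double modifiedTestClassesRatio(int[] modifiedClassesPerRelease, int totalTestClasses) {
        return Arrays.stream(modifiedClassesPerRelease).average().orElse(0)
                / totalTestClasses;
    }

    public static void main(String[] args) {
        int[] modified = {0, 2, 0, 1}; // illustrative: classes modified over 4 releases
        System.out.println("TSV = " + testSuiteVolatility(modified));           // 0.5
        System.out.println("MCR = " + modifiedTestClassesRatio(modified, 10));  // 0.075
    }
}
```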
How much fragility?
[Bar chart: Test Suite Volatility (0–30%) for Espresso, UI Automator, Robotium, Robolectric]
How much fragility?
[Bar chart: Modified Class Ratio (0–20%) for Espresso, UI Automator, Robotium, Robolectric]
How much fragility?
[Bar chart: Modified Classes With Modified Methods (0–70%) for Espresso, UI Automator, Robotium, Robolectric]
Why do tests change?
Taxonomy of change causes (top-level)
Taxonomy of change causes
Modifications of test code without connection to production code.
(e.g., changes in checked assertions, refactoring of test code, addition or
removal of log instructions or screen captures, syntax corrections)
- assertThat(activity, notNullValue());
- assertThat(activity.toolbar, notNullValue());
- assertThat(activity.presenter, notNullValue());
+ assertThat(activity.drawerToggle, notNullValue());
Taxonomy of change causes
Modifications in test code due to changes in production code that are unrelated to
the graphic appearance of the app (e.g., changes in Activity methods and startup,
changes in the application data structures, application code refactoring)
+ Intent intent = new Intent();
+ intent.putExtra(Judo.JUDO_OPTIONS, getJudo().build());
- activityTestRule.launchActivity(getIntent());
+ activityTestRule.launchActivity(intent);
Taxonomy of change causes
Changes in the time needed by the app to perform operations (e.g.,
changes in View transition duration, or long-running activity tasks)
- Thread.sleep(500);
+ Thread.sleep(1000);
Taxonomy of change causes
Adaptations of test classes to guarantee compatibility with different
versions of the Android OS (e.g., changed classes for the same Widget
or OS version check)
- rotateToPortrait( this );
+ if (VERSION.SDK_INT >= VERSION_CODES.JELLY_BEAN_MR2) {
+     rotateToPortrait( this );
+ }
Taxonomy of change causes
Modifications in the operations that can be performed over existing widgets of the
user interface (e.g., changes in the navigation, in keyboard opening methods, in
checked properties of widgets)
- onView(withId(R.id.fitnessProgramButton)).
      perform(ViewActions.scrollTo());
+ onView(withId(R.id.fitnessProgramButton)).
      perform(ViewActions.scrollTo(), click());
Taxonomy of change causes
Modifications in the number and type of elements in the visual hierarchy of the
tested activities (e.g., addition, removal or substitution of widgets)
expectVisible(viewThat(
-   hasAncestorThat(withId(
-       R.id.attribute_symptoms_onset_days)),
+   hasAncestorThat(withId(R.id.attribute_weight)),
    hasText(" ")));
Taxonomy of change causes
Changes in the way the widgets are identified in test code (e.g., changes in unique
IDs or text content of views)
- onView(withId(R.id.morse_input_text_card))
+ onView(withId(R.id.morse_input_text_container))
.perform(click());
Taxonomy of change causes
Changed ways to access resources that are used as oracles by the test code (e.g.,
changes in retrieving of text or graphical resources)
- onView(withText("Coupon")).perform(click());
+ onView(withText(R.string.category_coupon))
.perform(click());
Taxonomy of change causes
Modification in the actual appearance of widgets (e.g., changes in
animations, transparencies, themes, absolute coordinates, sizes)
Taxonomy of change causes (full)
Fragility changes
[Bar chart (0–90%) for Espresso, UI Automator, Robotium, Robolectric:
changed test files with modifications linked to AUT changes]
Non-Fragility changes
[Bar chart (0–90%) for Espresso, UI Automator, Robotium, Robolectric:
changed test files with modifications NOT linked to AUT changes]
GUI related changes
[Bar chart (0–90%) for Espresso, UI Automator, Robotium, Robolectric:
AUT related changes linked to GUI changes]
Visual vs. Layout-Based
Cost of test development
Experiment Goal
• Analyze the E2E testing process
• to understand which technique
  • Layout-based
  • Visual
• yields
  • higher productivity
  • higher quality of tests
Context
• App under test: Omni-Notes, an Android native app
• Subjects: MSc students in a Software Engineering course
Research Questions
RQ1 Productivity: What is the productivity in test script production of
graduate students with Visual and Layout-based testing tools for
Android App testing?
RQ2 Quality: What is the quality of the test suites produced by
graduate students using Visual and Layout-based testing tools for
Android App testing?
RQ3 Obstacles: What are the perceived difficulties in applying Visual
and Layout-based testing tools for Android App testing?
Experiment Design
[Diagram: two sessions; in each, participants receive scenarios and write test
scripts with EyeAutomate or Espresso (→ RQ1 productivity), test results are
collected (→ RQ2 quality), and a feedback questionnaire is filled in (→ RQ3)]
Scenarios
Open Info Screen: Open AboutActivity and verify that App icon and copyright
notice are shown;
Add a note: Insert a note with custom title and content, and verify that it is
shown in the note list;
Search a note: Input a note's text in the search bar, and verify that such note
is shown;
Check available languages: Open Language option in SettingsActivity and verify
that the English, Italian and French languages are available;
Delete a note: Add and then delete a note, and check that it is no longer shown
in the MainActivity;
Restore a note: Delete and then restore a note from the TrashActivity, then
check that it is shown again in the MainActivity;
Add note category: Add a note category with custom name and color, and check
that it is shown among the available ones in the DrawerMenu.
Feedback Questionnaire
• Implementing the test suite with EyeAutomate / Espresso was easy
and intuitive
• The EyeStudio / Android Studio IDE was helpful in the creation of test
scripts
• It was easy to identify elements with the EyeAutomate / Espresso
technique
• What were the main issues in identifying elements of the screen
using the EyeAutomate / Espresso testing approach? (Open)
• Which tool would you choose if you had to perform E2E testing in the
future? And why? (Open)
Results: productivity
[Boxplots: productivity (test cases), Layout-based vs. Visual]
⚖️ No significant difference ( p = 0.625 )
Results: productivity
[Boxplots: productivity (test cases) within the Layout-based group,
split by Test Recorder usage]
Within Espresso, students using Test Recorder (average 6.7) and those who
didn't (average 4.13) show a significant difference (p < 0.001)
Results: quality
[Boxplots: test case quality, Layout-based vs. Visual]
👍 Significant difference ( p = 0.025, Cliff's δ = 0.18 )
Results: quality
[Boxplots: test case quality within the Layout-based group,
split by Test Recorder usage]
Within Espresso, students using Test Recorder (average 0.57) and those who
didn't (average 0.71) show a significant difference (p = 0.021)
Results: perception
[Likert charts (1–5), Layout-based vs. Visual, for:
(a) Implementing the test suite was easy and intuitive
(b) The IDE was helpful in the creation of test scripts
(c) Ease of identifying elements of the screen]
Participants' perceived easiness in finding properties to discriminate GUI
components (Visual vs. Layout-based approach): significant, p = 0.0181,
null hypothesis rejected
Ease of use 👍
IDE support 👍
Element identification 👍
Results: visual obstacles
• Issues with image recognition
• Capture size
• Click position
• Other issues (e.g. small images)
• Resolution / Portability
Results: layout-based obstacles
• Identifying widgets
• Missing / ambiguous properties
• Layout hierarchy
• Recording tool
Results
• RQ1: the learnability of the two tools is similar, for non-professional
developers approaching them.
• RQ2: the test suites developed with EyeAutomate possess a higher
quality than those developed with Espresso.
• Better quality has been achieved by the participants when manually writing
test scripts, as opposed to leveraging the Capture & Replay approach
implemented by Espresso Test Recorder.
Results
• RQ3: developing a test suite with EyeAutomate (w/EyeStudio IDE) is
slightly easier than with Espresso (w/Android Studio IDE)
• The main obstacles are:
• imprecision of the image recognition library
(especially with small elements and very simple patterns)
• difficulty of finding unambiguous layout properties
• Preference towards the visual approach
• higher intuitiveness and ease of use in building test scripts through screen
captures.
Visual vs. Layout-Based Summary
• Visual testing tools enable testers to deliver similar productivity but a
higher quality when compared to layout-based tools
• Visual testing tools exhibit problems in image capture and recognition
• Especially with small items
• Writing layout-based test scripts is difficult due to the incomplete or
ambiguous definition of GUI components or layouts
Take away
• If you are an app developer, then
• consider adopting “testability” guidelines
• If you are a tester, then
• think about visual tools
• If you are a visual tool developer, then
• improve image recognition
• If you are a layout-based tool developer, then
• improve script capturing
Combining Visual and Layout
Combined Layout-Visual Approach
• Translation from Layout-based test cases to Visual test cases,
and vice versa;
• From 2nd generation properties to 3rd generation screen captures;
• From 3rd generation screen captures to 2nd generation properties.
Combined Layout-Visual Approach: 2nd → 3rd
Static analysis and enhancement of original layout files;
Addition of unique IDs to widgets of the Activities;
Addition of instructions for screen capturing to test code.
Combined Layout-Visual Approach: 2nd → 3rd
Execution of the test script on an Android Virtual Device;
Runtime extraction of (Element, Interaction) pairs
through the use of screen capturing libraries on the AVD.
Combined Layout-Visual Approach: 2nd → 3rd
Generation of the Visual test script for the selected Visual
testing tool.
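The final generation step is essentially a serialization of the extracted (element capture, interaction) pairs into the target tool's syntax. A minimal sketch targeting an EyeAutomate-like script (the pair representation and method names are illustrative assumptions, not the approach's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class VisualScriptGenerator {

    // Serialize (screen-capture file, interaction) pairs, in order, into
    // EyeAutomate-style commands such as: Click "{ImageFolder}/x.png"
    public static String generate(Map<String, String> captureToInteraction) {
        return captureToInteraction.entrySet().stream()
                .map(e -> e.getValue() + " \"{ImageFolder}/" + e.getKey() + "\"")
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        Map<String, String> steps = new LinkedHashMap<>(); // preserves step order
        steps.put("fab_expand_menu_button.png", "Click");
        steps.put("detail_title.png", "Check");
        System.out.println(generate(steps));
    }
}
```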
Combined Layout-Visual Approach: 3rd → 2nd
Instrumentation of the app with callbacks for operations on any on-screen
widget, allowing logging of the operations that are performed.
Combined Layout-Visual Approach: 3rd → 2nd
Execution of the 3rd generation script on the instrumented Android app.
Combined Layout-Visual Approach: 3rd → 2nd
Extraction of layout properties from the log generated during the execution,
obtaining (property, interaction) pairs.
Combined Layout-Visual Approach: 3rd → 2nd
Translation of the sequence of steps into the desired layout-based scripting
syntax.
Combined Approach: advantages
• Automated generation of layout-based (visual) test cases reusing
existing visual (layout-based) test cases;
• Automated porting of existing visual scripts to other devices / screen
sizes / resolutions;
• Reduced maintenance of test cases;
• Reduced impact of fragilities on test cases.
In Summary
[Recap: tool diffusion — Selendroid 0.01%, Appium 0.08%, UI Automator 0.33%,
Robotium 0.84%, Espresso 2.43%, Robolectric 4.12%, JUnit 20.00%]
[Recap: testing pyramid — E2E (€€€), Integration, Unit]
[Recap: fragility — changed test files with modifications linked to AUT
changes, for Espresso, UI Automator, Robotium, Robolectric]
References
• Android Distribution Dashboard:
https://developer.android.com/about/dashboards/
• L. Cruz, R. Abreu, D. Lo. «To the Attention of Mobile Software
Developers: Guess What, Test your App!» Empirical Software
Engineering, 2019
https://luiscruz.github.io/papers/cruz2019attention.pdf
• R. Coppola, M. Morisio, M. Torchiano, L. Ardito. «Scripted GUI Testing
of Android Open-Source Apps: Evolution of Test Code and Fragility
Causes» Empirical Software Engineering, 2019
References
• M. Fowler. “TestPyramid”, 2012
https://martinfowler.com/bliki/TestPyramid.html
• Nayebi, M., Adams, B., & Ruhe, G. «Mobile App Releases – A Survey Research
on Developers and Users Perception», SANER 2015
More Related Content

PPTX
Mobile Testing Types and Basic Process
PDF
Mobile App Testing Strategy by RapidValue Solutions
PDF
Mobile application testing
PPTX
Mobile applications testing
PPTX
Mobile test automation perfecto star east
PPTX
Mobile Application Testing
PDF
Mobile Software Testing Challenges
PPTX
Basic Guide For Mobile Application Testing
Mobile Testing Types and Basic Process
Mobile App Testing Strategy by RapidValue Solutions
Mobile application testing
Mobile applications testing
Mobile test automation perfecto star east
Mobile Application Testing
Mobile Software Testing Challenges
Basic Guide For Mobile Application Testing

What's hot (20)

PPTX
Software Testing of Mobile Applications: Challenges and Future Research Direc...
PDF
Testing Mobile Apps
PPTX
1.0 introduction to mobile application testing
PDF
Webinar learn how to test any mobile app style from within eclipse using real...
PPTX
Mobile Application Testing
PDF
Test Automation for Mobile Applications: A Practical Guide
PPTX
Mobile application testing
PDF
Hp perfecto webinar - UFT Mobile
PPT
Mobile applications and automation testing
PDF
Test Automation for Mobile Applications
PPTX
Mobile application testing
PDF
Mobile App Testing
PDF
Cross Platform Mobile Test Automation using Selenium WebDriver by Perfecto Mo...
POT
Mobile Test Coverage- Israel 4th meetup
PDF
Achieving 100% mobile test coverage perfecto mobile
PPTX
Mobile Application Testing Training Presentation
PPTX
Mobile App Quality Roadmap for DevTest Teams
PDF
The ultimate guide to mobile app testing with appium
PPTX
Testing Apps for Wearables
PDF
Boston meetup blaze_meter_feb2017
Software Testing of Mobile Applications: Challenges and Future Research Direc...
Testing Mobile Apps
1.0 introduction to mobile application testing
Webinar learn how to test any mobile app style from within eclipse using real...
Mobile Application Testing
Test Automation for Mobile Applications: A Practical Guide
Mobile application testing
Hp perfecto webinar - UFT Mobile
Mobile applications and automation testing
Test Automation for Mobile Applications
Mobile application testing
Mobile App Testing
Cross Platform Mobile Test Automation using Selenium WebDriver by Perfecto Mo...
Mobile Test Coverage- Israel 4th meetup
Achieving 100% mobile test coverage perfecto mobile
Mobile Application Testing Training Presentation
Mobile App Quality Roadmap for DevTest Teams
The ultimate guide to mobile app testing with appium
Testing Apps for Wearables
Boston meetup blaze_meter_feb2017
Ad

Similar to Testing the UI of Mobile Applications (20)

PDF
Experitest-Infosys Co-Webinar on Mobile Continuous Integration
PDF
GUI, Performance, Load and API testing with Test Studio
DOCX
Mobile App Testing: Importance, Strategies, and Best Practices
PPTX
Mobile Application testing
PPT
SynapseIndia mobile apps
DOC
Raji_new_July_2015
PPTX
Mobile Application Testing
DOC
Kasi Viswanath
PDF
Addressing Mobile App Testing Challenges
PPTX
Appmotives - Software Testing As Service
DOC
Raji_QA
PPTX
Mobile app testing
PPTX
HienVo_Mobile Testing_v.1.2
PDF
Mobile Test Automation
DOC
AshishShrivastava_Capgemini
PPTX
Mobile Testing
PPTX
Launch Better Apps, Faster - Perfecto & Orasi Joint Webinar Sldies
PDF
How to Create a Risk Based Testing Strategy With Simulators, Emulators, and R...
PPTX
Zen Test Labs Mobile Application Testing
PDF
03 - Membangun Aplikasi Mobile Berkualitas (Herman Tolle)
Experitest-Infosys Co-Webinar on Mobile Continuous Integration
GUI, Performance, Load and API testing with Test Studio
Mobile App Testing: Importance, Strategies, and Best Practices
Mobile Application testing
SynapseIndia mobile apps
Raji_new_July_2015
Mobile Application Testing
Kasi Viswanath
Addressing Mobile App Testing Challenges
Appmotives - Software Testing As Service
Raji_QA
Mobile app testing
HienVo_Mobile Testing_v.1.2
Mobile Test Automation
AshishShrivastava_Capgemini
Mobile Testing
Launch Better Apps, Faster - Perfecto & Orasi Joint Webinar Sldies
How to Create a Risk Based Testing Strategy With Simulators, Emulators, and R...
Zen Test Labs Mobile Application Testing
03 - Membangun Aplikasi Mobile Berkualitas (Herman Tolle)
Ad

More from Marco Torchiano (14)

PPTX
Software Engineering II Course at Politecnico di Torino
PPTX
Espresso vs. EyeAutomate: comparing two generations of Android GUI testing tools
PDF
Research Activities: past, present, and future.
PPTX
Data Quality - Standards e Applicazioni
PPTX
Data Quality - Standards and Application to Open Data
PPTX
Data Visualization
PDF
Riflessioni su Riforma Costituzionale "Renzi-Boschi"
PDF
Relevance, Benefits, and Barriers of Software Modelling and Model Driven Tech...
PDF
Energy Consumption Analysis
 of Image Encoding and Decoding Algorithms
PPT
Relevance, Benefits, and Problems of Software Modelling and Model-Driven Tech...
PPT
A Model-Based Approach to Language Integration
PPTX
On the computation of Truck Factor
PPTX
Language Interaction and Quality Issues: An Exploratory Study
PPT
The impact of process maturity on defect density
Software Engineering II Course at Politecnico di Torino
Espresso vs. EyeAutomate: comparing two generations of Android GUI testing tools
Research Activities: past, present, and future.
Data Quality - Standards e Applicazioni
Data Quality - Standards and Application to Open Data
Data Visualization
Riflessioni su Riforma Costituzionale "Renzi-Boschi"
Relevance, Benefits, and Barriers of Software Modelling and Model Driven Tech...
Energy Consumption Analysis
 of Image Encoding and Decoding Algorithms
Relevance, Benefits, and Problems of Software Modelling and Model-Driven Tech...
A Model-Based Approach to Language Integration

Testing the UI of Mobile Applications

  • 1. Testing the UI of Mobile Apps Marco Torchiano marco.torchiano_@_polito.it https://softeng.polito.it/torchiano @mtorchiano 3rd International Genoa Software Engineering PhD School on Automated Functional and Security Testing of Web and Mobile Applications May 13-16, Genova, Italy v.1.2.0 © Marco Torchiano, 2019
  • 2. Outline • Testing mobile apps is hard! • E2E testing of mobile apps • How much testing is performed? • Mobile test fragility • Causes of fragility • Visual vs. Layout • Combined approach
  • 6. Developers perspective • Series of interviews with actual developers from 7 companies • Are mobile applications tested? How? To what extent? • What are the most peculiar properties to test in mobile applications? • What aspects of mobile applications discourage developers from testing them? • What are the main challenges of mobile application testing?
  • 7. Developers perspective - Companies
    A. Distributor of testing tools for various types of applications.
    B. Test factory for third-party applications and test consulting.
    C. Insurance company: web and mobile apps for insurers and customers.
    D. Insurance company: platform for insurance management.
    E. Test factory for third-party applications and test consulting.
    F. Full-stack development of mobile applications for multiple platforms.
    G. Test factory providing test consulting and test management for banking applications.
  • 8. How mobile testing is performed (one mark per company, A–G)
    Manual testing          ● ● ● ● ● ● ○
    Random-monkey testing   ● ○ ○ ○ ○ ○ ○
    Capture & Replay        ● ● ● ● ● ○ ○
    Scripted testing        ● ● ● ● ● ○ ○
    Model-based testing     ● ○ ○ ○ ○ ○ ○
  • 9. Tool adoption
    Tool                   Category                                      n
    MicroFocus UFT / QTP   Regression and functional testing             4
    Selenium               Scripted web-based app testing                4
    Appium                 Multi-platform mobile app automated testing   3
    PerfectoMobile         Scripted cloud-based app testing              3
    JMeter                 Load and performance testing                  3
    Silk Mobile            Capture and Replay testing                    2
    JUnit                  Java unit testing                             2
    AppliTools             AI-based Visual Testing                       1
    Monkey                 Random testing                                1
    Qualitia               Scripted testing and GUI modeling             1
    TestComplete           Capture and Replay testing                    1
  • 10. Peculiarities of Mobile Testing • Procedures and instruments vary significantly for: • Native • Web-based • Hybrid • Huge device and OS diversity — "Device diversity and form factor are the fundamental variables to take into account, much more than for web application testing, for which it is sufficient to test the main browsers"
  • 11. Limitations to Mobile Test Adoption • Clients want apps published ASAP in any case • Quality and features are reduced • User feedback partly replaces testing • Business-oriented departments are in charge of testing • Focus is often on the back-end as (valuable) service provider — "Only companies creating apps that manage sensitive and economically critical data tend to adopt automated testing" (Company B)
  • 12. Challenges • Test fragility leading to high maintenance cost • Company A had to re-record all tests with a new version • Companies B and C estimate ~30% of test effort is due to test script maintenance caused by lack of flexibility — "[..] problem that is perceived and that has to be fought on a daily basis: test suites must be maintained daily" (Company D)
  • 13. End-to-End (E2E) Testing E2E testing verifies that the flow of an application, from start to finish, behaves as expected
  • 15. E2E Critical factors • Cost: E2E tests are expensive (hard) to write • E.g. through Capture & Replay • Time: E2E tests are slow • UI interactions take time • Integration: E2E tests are difficult to integrate in CI • Hard to run in headless mode • Fragility: E2E tests are fragile • Even minor UI changes can break tests
  • 16. E2E Critical factors worsen in mobile • Cost: E2E tests are expensive (hard) to write • E.g. through Capture & Replay • Time: E2E tests are slow • UI interactions take time • Integration: E2E tests are difficult to integrate in CI • Hard to run in headless mode • Fragility: E2E tests are fragile • Even minor UI changes can break tests Rich user interactions Wide variety of devices + Quick release cycles Device test factories or emulators Different devices may behave slightly differently
  • 17. Critical Factors – Mobile specific • Wide variety of devices • Versions of OS • Screen aspect ratio • Display, HW vs. SW buttons • Rich user interactions • Gestures, e.g., swipe, pinch, etc. • SW and HW buttons • Quick release cycle: • apps typically follow a weekly/bi-weekly release plan
  • 19. OS Versions • Distribution of Android versions from Gingerbread to Pie (chart; data refer to May 2019)
  • 20. Relative screen sizes of Android devices (source: https://www.xda-developers.com)
  • 21. Screen Size and Density
            ldpi   mdpi   tvdpi   hdpi    xhdpi   xxhdpi   Total
    Small   0.4%                          0.1%    0.1%     0.6%
    Normal         0.9%   0.3%   24.0%   37.7%   23.6%    86.5%
    Large          2.4%   1.9%    0.6%    1.6%    1.7%     8.2%
    Xlarge         3.1%           1.3%    0.6%             5.0%
    Total   0.4%   6.4%   2.2%   25.9%   40.0%   25.4%
  • 22. How to handle diversity • Device portfolio representative of your target users • Include both OS version and device • Prioritized portfolio, e.g., • Gold: • Main supported devices + OS • Full test suite executed • Silver: • Additional devices + OS • Only a subset of test cases executed
  • 23. A real-world case • Company performing test for third parties • Case study test process: • Conducted in 2017 • Outsourcing • On mobile banking app • 14 test areas
  • 24. Test suite • 1349 test cases across the areas: MIGRAZIONE, TOKEN, ASSICURAZIONE, ASSISTENZA, RATING, MESSAGGI, TOUCH ID, FINANZIAMENTI, IMPOSTAZIONI, POSTLOGIN, NAVIGAZIONE, PROFILO, CONTO, INVESTIMENTI, PRELOGIN, STRONG AUTHENTICATION, RISPARMIO, FLUSSI DISPOSITIVI, CARTE (chart)
  • 26. Device Portfolio
    Gold:   IOS 9.3.2 APPLE 6s/6 Plus · ANDROID 6.0.1 SAMSUNG S6/S7
    Silver: IOS 9.3.2 APPLE 5s · IOS 9.3.2 APPLE 4s · ANDROID 6.0.1 SAMSUNG S5 ·
            ANDROID 5.0.1 SAMSUNG S4 · ANDROID 4.4.4 SAMSUNG S3 · IOS 9.3.2 APPLE 6 ·
            ANDROID 6.0.1 HUAWEI P9 · IOS 9.3.2 APPLE 5c · ANDROID 4.4.4 SAMSUNG Note 3 ·
            ANDROID 6.0.1 LG NEXUS 5 · ANDROID 4.1.2 SAMSUNG A3 · ANDROID 5.0.0 ASUS Zenfone 2 ·
            ANDROID 4.2.2 SAMSUNG S2 · ANDROID 5.1.1 LG G4 · ANDROID 4.4.4 Xiaomi Mi Note ·
            ANDROID 4.4.2 ACER E3 · ANDROID 4.4.2 HTC SENSE 5 · ANDROID 4.4.4 HUAWEI Ascend G700
  • 27. Test executions • Run just ~15% of the test runs w.r.t. full coverage of the whole portfolio • 3904 test runs vs. 26980
                 Test cases   Devices   Test runs
    Test suite   1349         2         2698
    Subset       67           18        1206
    Total                               3904
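The arithmetic behind the ~15% figure can be reproduced directly; every number below (1349 test cases, the 67-test subset, 2 gold and 18 silver devices) is taken from the slides above:

```java
public class TestRunBudget {
    public static void main(String[] args) {
        int fullSuite = 1349;     // complete test suite
        int subset = 67;          // reduced subset run on silver devices
        int goldDevices = 2;      // gold devices execute the full suite
        int silverDevices = 18;   // silver devices execute only the subset

        int actualRuns = fullSuite * goldDevices + subset * silverDevices;
        int exhaustiveRuns = fullSuite * (goldDevices + silverDevices);

        System.out.println(actualRuns);      // 3904
        System.out.println(exhaustiveRuns);  // 26980
        System.out.printf("%.1f%%%n", 100.0 * actualRuns / exhaustiveRuns); // 14.5%
    }
}
```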
  • 28. End-2-End Testing Techniques • Manual • Scripted
  • 29. End-2-End Testing Techniques
    • Manual
    • Scripted:
      • Coordinate-based
      • Layout-based
      • Visual / Image Recognition
  • 30. End-2-End Testing Techniques • Manual • Coordinate-based : 1st generation • Layout-based : 2nd generation • Visual / Image Recognition : 3rd generation
  • 31. Layout-based testing Test scripts leverage the UI structure to • Identify and detect UI components • Interact with them
  • 32. Layout-based testing Elements are identified through layout properties (e.g., IDs, text, content description, widget type)
  • 36. Espresso – Test Script
    @Test
    public void testCreateNote() {
        onView(withId(R.id.fab_expand_menu_button)).perform(click());
        onView(withId(R.id.fab_note)).perform(click());
        onView(withId(R.id.detail_title)).perform(typeTextIntoFocusedView("Ciao"));
        onView(withContentDescription("drawer open")).perform(click());
        onView(withText("Ciao")).check(matches(isDisplayed()));
    }
  • 41. Espresso + Recorder – Test Script
    @Test
    public void recordedTest() {
        ViewInteraction viewInteraction = onView(
            allOf(withId(R.id.fab_expand_menu_button),
                childAtPosition(allOf(withId(R.id.fab),
                    childAtPosition(
                        withClassName(is("android.widget.FrameLayout")), 2)), 3),
                isDisplayed()));
        viewInteraction.perform(click());
    This is just for the first click!
  • 42. Espresso + Recorder – Test Script
    // Added a sleep statement to match the app's execution delay.
    // The recommended way to handle such scenarios is to use Espresso idling resources:
    // https://google.github.io/android-testing-support-library/docs/espresso/idling-resource/index.html
    try {
        Thread.sleep(150);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    A number of sleeps are added even when not required
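A lighter alternative to fixed sleeps is an explicit wait that polls a condition and returns as soon as it holds. This plain-Java helper only sketches the polling idea; in real Espresso tests the proper mechanism is the IdlingResource API referenced in the comment above:

```java
import java.util.function.BooleanSupplier;

public class WaitUtil {
    /** Polls the condition every 50 ms until it holds or the timeout expires. */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;          // condition met: no fixed delay wasted
            }
            Thread.sleep(50);
        }
        return condition.getAsBoolean(); // last check before reporting failure
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // A condition that becomes true after ~200 ms, standing in for "view displayed".
        boolean ok = waitUntil(() -> System.currentTimeMillis() - start > 200, 2000);
        System.out.println(ok); // true, after roughly 200-250 ms instead of a fixed sleep
    }
}
```

Unlike a hard-coded `Thread.sleep(150)`, the wait adapts to the app's actual delay and only fails after the full timeout.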
  • 43. Espresso + Recorder – Test Script
    ViewInteraction viewInteraction = onView(
        allOf(withId(R.id.fab_expand_menu_button),
            childAtPosition(allOf(withId(R.id.fab),
                childAtPosition(
                    withClassName(is("android.widget.FrameLayout")), 2)), 3),
            isDisplayed()));
    viewInteraction.perform(click());
    vs. the hand-written equivalent:
    onView(withId(R.id.fab_expand_menu_button)).perform(click());
  • 44. Visual testing • Image recognition techniques are used to identify elements of the user interface to interact with. • Ease of definition of test cases with no tech knowledge required • only screen captures are needed • Can be applied seamlessly to any kind of software provided with an (emulated) user interface • Very high fragility to even minor changes in the GUI • Difficult in-depth testing of application functionalities.
  • 45. Visual testing: image recognition of screen captures on an emulated Android Virtual Device (Visual GUI Testing of Android Apps)
  • 48. Script EyeAutomate
    Click "{ImageFolder}/1557782216425.png"
    Click "{ImageFolder}/1557782221958.png"
    Click "{ImageFolder}/1557782228188.png"
    Type "Ciao"
    Click "{ImageFolder}/1557782244581.png"
    Check "{ImageFolder}/1557782252007.png"
  • 51. How much is E2E testing used? And which tools are used, how much?
  • 55. Tool Diffusion • Adoption of Selendroid, Appium, UI Automator, Robotium, Espresso, Robolectric (chart, 0–5% scale; detailed values on the next slide)
  • 56. Tool Diffusion • Selendroid 0.01% • Appium 0.08% • UI Automator 0.33% • Robotium 0.84% • Espresso 2.43% • Robolectric 4.12% • JUnit 20.00%
  • 57. Tools diffusion Results from Cruz et al., 2019
  • 58. How much test code? TLR = Test LOC / Project LOC • Appium 2.8% • UI Automator 7.4% • Robotium 6.2% • Espresso 7.6% • Robolectric 13.5%
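TLR itself is a simple ratio; the sketch below computes it from per-file line counts, classifying files as test code by path convention (a `src/test` directory or a `*Test.java` suffix). Both the convention and the choice that Project LOC includes the test code are assumptions here, not the study's formal definition:

```java
import java.util.Map;

public class TestLocRatio {
    // Heuristic: conventional Gradle/Maven test locations and naming (assumption).
    static boolean isTestFile(String path) {
        return path.contains("/src/test/") || path.endsWith("Test.java");
    }

    // TLR = Test LOC / Project LOC, with Project LOC counting every source file.
    static double tlr(Map<String, Integer> locPerFile) {
        int testLoc = 0, projectLoc = 0;
        for (Map.Entry<String, Integer> e : locPerFile.entrySet()) {
            projectLoc += e.getValue();
            if (isTestFile(e.getKey())) testLoc += e.getValue();
        }
        return projectLoc == 0 ? 0.0 : (double) testLoc / projectLoc;
    }

    public static void main(String[] args) {
        // Hypothetical file names, invented for illustration.
        Map<String, Integer> loc = Map.of(
                "app/src/main/java/NoteActivity.java", 900,
                "app/src/test/java/NoteActivityTest.java", 76);
        System.out.printf("TLR = %.1f%%%n", 100 * tlr(loc)); // 7.8%
    }
}
```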
  • 60. Fragility A test case is fragile if it requires interventions when the application evolves due to any modification applied to the Application Under Test.
  • 61. Fragility related drawbacks • When a fragility manifests, a test fails • Then extra effort is required to: • Verify that no regression has occurred • Modify the failing test to adapt it to the changed UI • Re-run the test
  • 62. Visual fragility • Graphic changes: invalidation of Visual test scripts (App: K9 Mail)
  • 63. Layout-based fragility • Property changes: invalidation of Layout-based test scripts.
  • 64. How much fragility? • Test Suite Volatility (TSV) • Proportion of releases that required any test code modification • Modified Test Classes Ratio (MCR) • Average proportion of test classes modified on each release • Modified Classes With Modified Methods ratio • Discards new methods and other cosmetic changes
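Given per-release counts of total and modified test classes, the first two metrics follow directly from the definitions above (the sample numbers are invented for illustration):

```java
public class FragilityMetrics {
    // TSV: proportion of releases that required any test code modification.
    static double testSuiteVolatility(int[] modifiedPerRelease) {
        int releasesWithChanges = 0;
        for (int m : modifiedPerRelease) {
            if (m > 0) releasesWithChanges++;
        }
        return (double) releasesWithChanges / modifiedPerRelease.length;
    }

    // MCR: average proportion of test classes modified on each release.
    static double modifiedClassRatio(int[] modified, int[] total) {
        double sum = 0;
        for (int i = 0; i < modified.length; i++) {
            sum += (double) modified[i] / total[i];
        }
        return sum / modified.length;
    }

    public static void main(String[] args) {
        int[] modified = {0, 3, 0, 1};   // modified test classes per release
        int[] total = {10, 10, 12, 12};  // test classes present at each release
        System.out.println(testSuiteVolatility(modified));       // 0.5
        System.out.println(modifiedClassRatio(modified, total)); // ~0.0958
    }
}
```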
  • 65. How much fragility? • Test Suite Volatility by tool (chart: Espresso, UI Automator, Robotium, Robolectric; scale 0–30%)
  • 66. How much fragility? • Modified Class Ratio by tool (chart: Espresso, UI Automator, Robotium, Robolectric; scale 0–20%)
  • 67. How much fragility? • Modified Classes With Modified Methods by tool (chart: Espresso, UI Automator, Robotium, Robolectric; scale 0–70%)
  • 69. Taxonomy of change causes (top-level)
  • 71. Taxonomy of change causes • Modifications of test code without connection to production code (e.g., changes in checked assertions, refactoring of test code, addition or removal of log instructions or screen captures, syntax corrections)
    - assertThat(activity, notNullValue());
    - assertThat(activity.toolbar, notNullValue());
    - assertThat(activity.presenter, notNullValue());
    + assertThat(activity.drawerToggle, notNullValue());
  • 72. Taxonomy of change causes • Modifications in test code due to changes in production code that are unrelated to the graphic appearance of the app (e.g., changes in Activity methods and startup, changes in the application data structures, application code refactoring)
    + Intent intent = new Intent();
    + intent.putExtra(Judo.JUDO_OPTIONS, getJudo().build());
    - activityTestRule.launchActivity(getIntent());
    + activityTestRule.launchActivity(intent);
  • 73. Taxonomy of change causes Changes in the time needed by the app to perform operations (e.g., changes in View transition duration, or long-running activity tasks) - Thread.sleep(500); + Thread.sleep(1000);
  • 74. Taxonomy of change causes • Adaptations of test classes to guarantee compatibility with different versions of the Android OS (e.g., changed classes for the same widget, or OS version checks)
    - rotateToPortrait( this );
    + if (VERSION.SDK_INT >= VERSION_CODES.JELLY_BEAN_MR2) {
    +     rotateToPortrait( this );
    + }
  • 75. Taxonomy of change causes • Modifications in the operations that can be performed over existing widgets of the user interface (e.g., changes in the navigation, in keyboard opening methods, in checked properties of widgets)
    - onView(withId(R.id.fitnessProgramButton)).perform(ViewActions.scrollTo());
    + onView(withId(R.id.fitnessProgramButton)).perform(ViewActions.scrollTo(), click());
  • 76. Taxonomy of change causes • Modifications in the number and type of elements in the visual hierarchy of the tested activities (e.g., addition, removal or substitution of widgets)
    expectVisible(viewThat(
    -   hasAncestorThat(withId(R.id.attribute_symptoms_onset_days)),
    +   hasAncestorThat(withId(R.id.attribute_weight)),
        hasText(" ")));
  • 77. Taxonomy of change causes Changes in the way the widgets are identified in test code (e.g., changes in unique IDs or text content of views) - onView(withId(R.id.morse_input_text_card)) + onView(withId(R.id.morse_input_text_container)) .perform(click());
  • 78. Taxonomy of change causes Changed ways to access resources that are used as oracles by the test code (e.g., changes in retrieving of text or graphical resources) - onView(withText("Coupon")).perform(click()) + onView(withText(R.string.category_coupon)) .perform(click());
  • 79. Taxonomy of change causes • Modifications in the actual appearance of widgets (e.g., changes in animations, transparencies, themes, absolute coordinates, sizes)
  • 80. Taxonomy of change causes (full)
  • 81. Fragility changes • Changed test files with modifications linked to AUT changes (chart by tool: Espresso, UI Automator, Robotium, Robolectric; scale 0–90%)
  • 82. Non-Fragility changes • Changed test files with modifications NOT linked to AUT changes (chart by tool: Espresso, UI Automator, Robotium, Robolectric; scale 0–90%)
  • 83. GUI related changes • AUT-related changes linked to GUI changes (chart by tool: Espresso, UI Automator, Robotium, Robolectric; scale 0–90%)
  • 84. Visual vs. Layout-Based Cost of test development
  • 85. Experiment Goal • Analyze the E2E testing process • to understand what technique • Layout-based • Visual • yields • higher productivity • high quality of tests
  • 86. Experiment Goal • Analyze the E2E testing process • to understand what technique • Layout-based • Visual • yields • higher productivity • higher quality of tests • Context: Omni-Notes, a native Android app; participants: MSc students in a Software Engineering course
  • 87. Research Questions RQ1 Productivity: What is the productivity in test script production of graduate students with Visual and Layout-based testing tools for Android App testing? RQ2 Quality: What is the quality of the test suites produced by graduate students using Visual and Layout-based testing tools for Android App testing? RQ3 Obstacles: What are the perceived difficulties in applying Visual and Layout-based testing tools for Android App testing?
  • 88. Experiment Design Session 1 Session 2 Scenarios Eye Automate Test scripts Test scripts Feedback Test results Test results Feedback RQ3 RQ1 RQ2
  • 89. Scenarios
    Open Info Screen — Open AboutActivity and verify that App icon and copyright notice are shown;
    Add a note — Insert a note with custom title and content, and verify that it is shown in the note list;
    Search a note — Input a note's text in the search bar, and verify that such note is shown;
    Check available languages — Open the Language option in SettingsActivity and verify that the English, Italian and French languages are available;
    Delete a note — Add and then delete a note, and check that it is no longer shown in the MainActivity;
    Restore a note — Delete and then restore a note from the TrashActivity, then check that it is shown again in the MainActivity;
    Add note category — Add a note category with custom name and color, and check that it is shown among the available ones in the DrawerMenu.
  • 90. Feedback Questionnaire • Implementing the test suite with EyeAutomate / Espresso was easy and intuitive • The EyeStudio / Android Studio IDE was helpful in the creation of test scripts • It was easy to identify elements with the EyeAutomate / Espresso technique • What were the main issues in identifying elements of the screen using the EyeAutomate / Espresso testing approach? (Open) • Which tool would you choose if you had to perform E2E testing in the future? And why? (Open)
  • 91. Results: productivity ⚖️ • No significant difference between Visual and Layout-based (p = 0.625) (boxplots of productivity in test cases)
  • 92. Results: productivity • Within Espresso, students using the Test Recorder (average 6.7 test cases) and those who didn't (average 4.13) show a significant difference (p < 0.001)
  • 93. Results: quality 👍 • Significant difference (p = 0.025, Cliff's δ = 0.18) (boxplots of test case quality, Visual vs. Layout-based)
  • 94. Results: quality • Within Espresso, students using the Test Recorder (average 0.57) and those who didn't (average 0.71) show a significant difference (p = 0.021)
  • 95. Results: perception (Likert 1–5 boxplots, Visual vs. Layout-based) • Ease of use ("Implementing the test suite was easy and intuitive") 👍 Visual • IDE support ("The IDE was helpful in the creation of test scripts") 👍 Visual • Element identification (participants' perceived easiness in finding properties to discriminate GUI components) 👍 Visual, significant difference (p = 0.0181)
  • 96. Results: visual obstacles • Issues with image recognition • Capture size • Click position • Other issues (e.g. small images) • Resolution / Portability
  • 97. Results: layout-based obstacles • Identifying widgets • Missing / ambiguous properties • Layout hierarchy • Recording tool
  • 98. Results • RQ1: the learnability of the two tools is similar, for non-professional developers approaching them. • RQ2: the test suites developed with EyeAutomate possess a higher quality than those developed with Espresso. • Better quality has been achieved by the participants when manually writing test scripts, as opposed to leveraging the Capture & Replay approach implemented by Espresso Test Recorder.
  • 99. Results • RQ3: developing a test suite with EyeAutomate (w/EyeStudio IDE) is slightly easier than with Espresso (w/Android Studio IDE) • The main obstacles are: • imprecision of the image recognition library (especially with small elements and very simple patterns) • difficulty of finding unambiguous layout properties • Preference towards the visual approach • higher intuitiveness and ease of use in building test scripts through screen captures.
  • 100. Visual vs. Layout-Based Summary • Visual testing tools enable testers to deliver similar productivity but higher quality when compared to layout-based tools • Visual testing tools exhibit problems in image capture and recognition • Especially with small items • Writing layout-based test scripts is difficult due to the incomplete or ambiguous definition of GUI components or layouts
  • 101. Take away • If you are an app developer, then • consider adopting “testability” guidelines • If you are a tester, then • think about visual tools • If you are a visual tool developer, then • improve image recognition • If you are a layout-based tool developer, then • improve script capturing
  • 103. Combined Layout-Visual Approach • Translation from Layout-based test cases to Visual test cases, and vice versa; • From 2nd generation properties to 3rd generation screen captures; • From 3rd generation screen captures to 2nd generation properties.
  • 104. Combined Layout-Visual Approach: 2nd → 3rd • Static analysis and enhancement of the original layout files; • Addition of unique IDs to the widgets of the Activities; • Addition of instructions for screen capturing to the test code.
  • 105. Combined Layout-Visual Approach: 2nd → 3rd • Execution of the test script on an Android Virtual Device; • Runtime extraction of (Element, Interaction) pairs through the use of screen capturing libraries on the AVD.
  • 106. Combined Layout-Visual Approach: 2nd → 3rd • Generation of the Visual test script for the selected Visual testing tool.
  • 107. Combined Layout-Visual Approach: 3rd → 2nd • Adds callbacks for operations on any on-screen widget, allowing logging of the operations that are performed.
  • 108. Combined Layout-Visual Approach: 3rd → 2nd • Execution of the 3rd generation script on the instrumented Android app.
  • 109. Combined Layout-Visual Approach: 3rd → 2nd • Extracts layout properties from the log generated during the execution, obtaining (property, interaction) pairs.
  • 110. Combined Layout-Visual Approach: 3rd → 2nd • Translates the sequence of steps into the desired layout-based scripting syntax.
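The final translation step can be sketched as a template mapping from logged (property, interaction) pairs to layout-based statements; the pair format and the emitted Espresso calls below are illustrative assumptions, not the tool's actual output:

```java
import java.util.List;

public class LayoutScriptGenerator {
    // One logged step: which property identified the widget, and the interaction performed.
    record Step(String property, String value, String interaction) {}

    static String toEspresso(Step s) {
        String matcher = switch (s.property()) {
            case "id" -> "withId(R.id." + s.value() + ")";
            case "text" -> "withText(\"" + s.value() + "\")";
            default -> "withContentDescription(\"" + s.value() + "\")";
        };
        String action = switch (s.interaction()) {
            case "click" -> "click()";
            case "type" -> "typeText(...)";  // hypothetical: payload handling elided
            default -> s.interaction() + "()";
        };
        return "onView(" + matcher + ").perform(" + action + ");";
    }

    public static void main(String[] args) {
        List<Step> log = List.of(
                new Step("id", "fab_note", "click"),
                new Step("text", "Ciao", "click"));
        log.forEach(s -> System.out.println(toEspresso(s)));
    }
}
```

Targeting a different 2nd generation framework (e.g., UI Automator) would only change the templates, not the pipeline.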
  • 111. Combined Approach: advantages • Automated generation of layout-based (visual) test cases reusing existing visual (layout-based) test cases; • Automated porting of existing visual scripts to other devices / screen sizes / resolutions; • Reduced maintenance of test cases; • Reduced impact of fragilities on test cases.
  • 117. References • Android Distribution Dashboard: https://developer.android.com/about/dashboards/ • L. Cruz, R. Abreu, D. Lo. «To the Attention of Mobile Software Developers: Guess What, Test your App!» Empirical Software Engineering, 2019 https://luiscruz.github.io/papers/cruz2019attention.pdf • R. Coppola, M. Morisio, M. Torchiano, L. Ardito. «Scripted GUI Testing of Android Open-Source Apps: Evolution of Test Code and Fragility Causes» Empirical Software Engineering, 2019
  • 118. References • M. Fowler. “TestPyramid”, 2012 https://martinfowler.com/bliki/TestPyramid.html • Nayebi, M., Adams, B., & Ruhe, G. (Dec 2015). “Mobile App Releases – A Survey Research on Developers and Users Perception”, SANER 2015

Editor's Notes

  • #22: Density buckets:
    TVDPI (medium-high density): ≈160–213 dots per inch
    HDPI or HiDPI (high density): ≈213–240 dots per inch
    XHDPI (extra-high density): ≈240–320 dots per inch
    XXHDPI (extra-extra-high density): ≈320–480 dots per inch
  • #28: Impact on data preparation: many users
  • #34: The officially recommended tool for testing android apps
  • #84: Robolectric is mostly a unit-test tool therefore is less affected by fragility
  • #85: Robolectric is apparently lower level, therefore it requires more maintenance Espresso requires relatively less maintenance
  • #86: Robolectric is not a GUI testing tool; it is used ALSO for that purpose
  • #119: Testing mobile apps is hard; diversity is one of the reasons
  • #120: The diversity adds to the general test-pyramid considerations about testing economics; that explains the limited adoption of mobile UI testing
  • #121: The leading cause is fragility; this applies to both 2nd and 3rd generation tools
  • #122: A possible way out is a combined approach, though tools (both) need to improve