Escaping Automated Test Hell - One Year Later


Wojciech Seliga

About me
• Coding for 30 years
• Agile Practices (incl. TDD) since 2003
• Dev Nerd, Tech Leader, Agile Coach, Speaker
• 5+ years with Atlassian (JIRA Development Team Lead)
• Spartez Co-founder

18 000 tests on all levels

Very slow and fragile
feedback loop

Serious performance and
reliability issues

Feedback Speed
Test Quality

Respect Restructure
Design Share
Prune

Test Code is Not Trash
Refactor
Maintain
Review
Discuss

Optimum Balance

Isolation Speed Coverage Level Access Effort

Dangerous to tamper with

Quality / Determinism Maintainability

Splitting the codebase is a key aspect of a short test feedback loop

Build Tiers and Policy

Tier A1 - green soon after all commits
unit tests and functional* tests

Tier A2 - green at the end of the day
WebDriver and bundled plugins tests

Tier A3 - green at the end of the iteration
supported platforms tests, compatibility tests

Wallboards: Constant Awareness

Training
• assertThat over assertTrue/False and assertEquals
• avoiding races - Atlassian Selenium with its TimedElement
• Unit tests over functional tests
• Brownbags, blogs, code reviews

Re-run failed tests and see if they pass

Automatic Flakiness Detection
Quarantine
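
The re-run idea above can be sketched as follows. This is only an illustration under stated assumptions: the class `FlakinessDetector`, the `Verdict` values and the `classify` method are hypothetical names, not the detection logic actually used in the JIRA builds. A test that fails and then passes on a re-run, with no code change in between, is reported as flaky and becomes a quarantine candidate.

```java
import java.util.function.BooleanSupplier;

public class FlakinessDetector {
    public enum Verdict { PASSED, FLAKY, FAILED }

    /**
     * Runs the test once; if it fails, re-runs it up to maxReruns times.
     * The supplier returns true when the test passes.
     */
    public static Verdict classify(BooleanSupplier test, int maxReruns) {
        if (test.getAsBoolean()) {
            return Verdict.PASSED;
        }
        for (int i = 0; i < maxReruns; i++) {
            if (test.getAsBoolean()) {
                // Failed first, passed on a re-run: non-deterministic, quarantine candidate
                return Verdict.FLAKY;
            }
        }
        return Verdict.FAILED; // failed consistently: most likely a real regression
    }

    public static void main(String[] args) {
        int[] calls = {0};
        BooleanSupplier failsOnlyOnFirstRun = () -> ++calls[0] > 1;
        System.out.println(classify(failsOnlyOnFirstRun, 2)); // FLAKY
        System.out.println(classify(() -> true, 2));          // PASSED
        System.out.println(classify(() -> false, 2));         // FAILED
    }
}
```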

Selenium ditching
Sky did not fall in

Ditching - benefits

• Freed build agents - better system throughput
• Boosted morale
• Gazillions of developer hours saved
• Money saved on infrastructure

Ditching - due diligence

• conducting the audit - analysis of the coverage we lost
• determining which tests need to be rewritten (e.g. security related)
• rewriting the tests

Flaky Browser-based Tests
Races between test code and asynchronous page logic

Playing with "loading" CSS class does not really help

Races Removal with Tracing
// in the browser:
function mySearchClickHandler() {
    doSomeXhr().always(function() {
        // This executes when the XHR has completed (either success or failure)
        JIRA.trace("search.completed");
    });
}
// In production code JIRA.trace is a no-op

// in my page object:
@Inject
TraceContext traceContext;

public SearchResults doASearch() {
    Tracer snapshot = traceContext.checkpoint();
    getSearchButton().click(); // causes mySearchClickHandler to be invoked
    // This waits until the "search.completed" event has been
    // emitted, *after* the previous snapshot
    traceContext.waitFor(snapshot, "search.completed");
    return pageBinder.bind(SearchResults.class);
}
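
Why taking the checkpoint before the click removes the race can be illustrated with a toy trace context. This is a hedged sketch, not Atlassian's `TraceContext`: the class `SimpleTraceContext` and its method signatures are assumptions made for illustration. Events emitted before the checkpoint are ignored, and events emitted between the click and the call to `waitFor` are not lost, because they are recorded rather than merely signalled.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleTraceContext {
    private final List<String> events = new ArrayList<>();

    /** Records a trace key; in a real setup the page would report it to the test. */
    public synchronized void trace(String key) {
        events.add(key);
        notifyAll();
    }

    /** Remembers how many events have been recorded so far. */
    public synchronized int checkpoint() {
        return events.size();
    }

    /** Blocks until key is recorded after the checkpoint; returns false on timeout. */
    public synchronized boolean waitFor(int checkpoint, String key, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!events.subList(checkpoint, events.size()).contains(key)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleTraceContext ctx = new SimpleTraceContext();
        ctx.trace("search.completed");           // stale event from an earlier search
        int snapshot = ctx.checkpoint();         // taken before the click
        new Thread(() -> ctx.trace("search.completed")).start(); // the XHR completing
        System.out.println(ctx.waitFor(snapshot, "search.completed", 2000));
    }
}
```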

Speed

Can we halve our build times?

Parallel Execution - Theory
[Diagram: Batches, from Start of Build to End of Build]

Parallel Execution - Reality Bites
[Diagram: Batches, Agent availability]


Dynamic Test Execution Dispatch - Hallelujah
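
The difference between pre-assigned batches and dynamic dispatch can be shown with a small simulation. It is a sketch under assumptions: the class `DispatchSimulation`, the test durations and the agent start times are invented for illustration and do not come from the JIRA builds. With static batches a late agent still has to run its whole batch; with dynamic dispatch whichever agent is free takes the next test.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class DispatchSimulation {
    /**
     * Static batching: test i is pre-assigned to agent i % agents.
     * Assumes there are at least as many tests as agents.
     */
    public static int staticMakespan(int[] durations, int[] agentStart) {
        int[] finish = agentStart.clone();
        for (int i = 0; i < durations.length; i++) {
            finish[i % finish.length] += durations[i];
        }
        return Arrays.stream(finish).max().getAsInt();
    }

    /** Dynamic dispatch: the agent that becomes free first takes the next test. */
    public static int dynamicMakespan(int[] durations, int[] agentStart) {
        PriorityQueue<Integer> freeAt = new PriorityQueue<>();
        for (int start : agentStart) {
            freeAt.add(start);
        }
        int makespan = 0;
        for (int duration : durations) {
            int end = freeAt.poll() + duration;
            freeAt.add(end);
            makespan = Math.max(makespan, end);
        }
        return makespan;
    }

    public static void main(String[] args) {
        int[] durations = {5, 5, 5, 5, 5, 5}; // six tests of 5 minutes each
        int[] agentStart = {0, 0, 10};        // third agent becomes available after 10 minutes
        System.out.println(staticMakespan(durations, agentStart));  // 20
        System.out.println(dynamicMakespan(durations, agentStart)); // 15
    }
}
```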

"You can't manage what
you can't measure."
W. Edwards Deming

[The same quote repeated, overlaid with a partly illegible note beginning "If you believe ..."]

You can't improve something if you can't measure it
Profiler, Build statistics, Logs, statsd → Graphite
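
As an illustration of the statsd → Graphite route, a build phase duration can be reported as a statsd timer, which is a plain-text line of the form `<metric>:<millis>|ms` sent over UDP. The sketch below is an assumption about how that could look: the class `BuildPhaseMetrics`, the metric name and the host are hypothetical (8125 is statsd's default port).

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class BuildPhaseMetrics {
    /** Formats a statsd timer line: <metric>:<millis>|ms */
    public static String timerLine(String metric, long millis) {
        return metric + ":" + millis + "|ms";
    }

    /** Sends one metric line to a statsd daemon over UDP (fire and forget). */
    public static void send(String host, int port, String line) throws Exception {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(host), port));
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        Thread.sleep(50); // stand-in for a build phase such as compilation
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Metric name, host and port are illustrative assumptions
        send("localhost", 8125, timerLine("build.unit.compilation", elapsedMs));
    }
}
```

Because UDP is fire and forget, the build does not fail or slow down when no statsd daemon is listening.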

Anatomy of Build*

Agent Availability/Setup
SCM Update
Fetching Dependencies
Compilation
Packaging
Executing Tests
Publishing Results

*Any resemblance to maven build is entirely accidental

JIRA Unit Tests Build

Agent Availability/Setup (mean 10min)
SCM Update (2min)
Fetching Dependencies (1.5min)
Compilation (7min)
Packaging (0min)
Executing Tests (7min)
Publishing Results (1min)


Decreasing Test Execution Time to ZERO alone would not let us achieve our goal!
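
The claim can be checked with the phase timings from the JIRA Unit Tests Build slide and the stated goal of halving build times. The class `BuildTimeBudget` below is only a worked-arithmetic sketch; the phase durations are the ones shown on the slides, with agent availability taken as its mean of 10 minutes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BuildTimeBudget {
    /** Phase durations in minutes, as shown on the "JIRA Unit Tests Build" slide. */
    public static Map<String, Double> phases() {
        Map<String, Double> minutes = new LinkedHashMap<>();
        minutes.put("Agent Availability/Setup", 10.0); // mean value
        minutes.put("SCM Update", 2.0);
        minutes.put("Fetching Dependencies", 1.5);
        minutes.put("Compilation", 7.0);
        minutes.put("Packaging", 0.0);
        minutes.put("Executing Tests", 7.0);
        minutes.put("Publishing Results", 1.0);
        return minutes;
    }

    public static double total(Map<String, Double> phases) {
        return phases.values().stream().mapToDouble(Double::doubleValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Double> phases = phases();
        double before = total(phases);        // 28.5 minutes
        double target = before / 2;           // 14.25 minutes
        phases.put("Executing Tests", 0.0);   // pretend tests take no time at all
        double withoutTests = total(phases);  // 21.5 minutes, still above the target
        System.out.println(before + " " + target + " " + withoutTests);
    }
}
```

Even with test execution at zero, the remaining 21.5 minutes exceed the 14.25 minute target, so the other phases have to shrink as well.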


Agent Availability/Setup

• starved builds due to busy agents building very long builds
• time synchronization issue - NTPD problem

SCM Update - Checkout time

• Proximity of SCM repo
• shallow git clones are not so fast and lightweight + generating extra git server CPU load
• git clone per agent/plan + git pull + git clone per build (hard links!)
• Stash was thankful (queue)

2 min → 5 seconds

Fetching Dependencies

• Fix Predator
• Sandboxing/isolation agent trade-off:
rm -rf $HOME/.m2/repository/com/atlassian/*
into
find $HOME/.m2/repository/com/atlassian/ -name "*SNAPSHOT*" | xargs rm
• Network hardware failure found (dropping packets)

1.5 min → 10 seconds

Compilation

• Restructuring multi-pom maven project and dependencies
• Maven 3 parallel compilation FTW: -T 1.5C*
*optimal factor thanks to scientific trial and error research

7 min → 1 min

Unit Test Execution
• Splitting unit tests into 2 buckets: good and legacy (much longer)
• Maven 3 parallel test execution (-T 1.5C)
3000 poor tests (5min), 11000 good tests (1.5min)

7 min → 5 min

Functional Tests
• Selenium 1 removal did help
• Faster reset/restore (avoid unnecessary stuff, intercepting SQL operations for debug purposes - building stacktraces is costly)
• Restoring via Backdoor REST API
• Using REST API for common setup/teardown operations

Publishing Results

• Server log allocation per test → now using Backdoor REST API (was Selenium)
• Bamboo DB performance degradation for rich build history - to be addressed

1 min → 40 s

Unexpected Problem

• Stability Issues with our CI server
• The bottleneck changed from I/O to CPU
• Too many agents per physical machine

JIRA Unit Tests Build Improved

Agent Availability/Setup (3min)*
SCM Update (5sec)
Fetching Dependencies (10sec)
Compilation (1min)
Packaging (0min)
Executing Tests (5min)
Publishing Results (40sec)


Improvements Summary

Tests              Before     After     Improvement %
Unit tests         29 min     17 min    41%
Functional tests   56 min     34 min    39%
WebDriver tests    39 min     21 min    46%
Overall            124 min    72 min    42%

* Additional ca. 5% improvement expected once the new git clone strategy is consistently rolled out everywhere

But that's still bad

We want a CI feedback loop of a few minutes at most

Resistance against splitting
The last attempt: Magic Machine

Decide with high confidence (e.g. > 95%) which subset of tests to run based on the committed changes

Magic Machine
• Looking at Bamboo history (analysing correlation between changes and failures)
• Matching: package test/prod code and transitive imports
• Code instrumentation (Clover, Emma, AspectJ)
• Run most often failing first
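
Two of the ideas above, selecting tests from coverage data and running the most often failing ones first, can be combined in a small sketch. Everything here is an assumption for illustration: the class `TestSelector`, the shape of the coverage and failure maps, and the test names are hypothetical and are not taken from the actual Magic Machine.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class TestSelector {
    /**
     * @param coverage     test name -> production classes it exercises (e.g. from instrumentation)
     * @param failureCount test name -> number of historical failures (e.g. from build history)
     * @param changed      production classes touched by the commit
     * @return tests covering any changed class, most often failing first
     */
    public static List<String> select(Map<String, Set<String>> coverage,
                                      Map<String, Integer> failureCount,
                                      Set<String> changed) {
        return coverage.entrySet().stream()
                .filter(e -> !Collections.disjoint(e.getValue(), changed))
                .map(Map.Entry::getKey)
                .sorted((a, b) -> {
                    int byFailures = Integer.compare(
                            failureCount.getOrDefault(b, 0),
                            failureCount.getOrDefault(a, 0));
                    return byFailures != 0 ? byFailures : a.compareTo(b);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Set<String>> coverage = Map.of(
                "IssueViewTest", Set.of("IssueView"),
                "SearchTest", Set.of("Search", "IssueView"),
                "ImportTest", Set.of("Importer"));
        Map<String, Integer> failures = Map.of("SearchTest", 7, "IssueViewTest", 2);
        System.out.println(select(coverage, failures, Set.of("IssueView")));
        // [SearchTest, IssueViewTest]
    }
}
```

The hard part, which this sketch does not address, is getting coverage data trustworthy enough to reach the confidence level the slides ask for.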

Inevitable Split - Fears

• Organizational concerns - understanding, managing, integrating, releasing
• Mindset change - if something worked for 10 years, why change it?
• We damned ourselves with big buckets for all tests - where do they belong?

Magic Machine strikes back

With heavy use of brain, common sense and expert judgement

Splitting code base
• Step 0 - JIRA Importers Plugin (3 years ago)
• Step 1 - New Issue View and Navigator (JIRA 6.0)

We are still escaping hell.
Hell sucks in your soul.

Conclusions
• Visibility and problem awareness help
• Maintaining a huge testbed is difficult and costly
• Measure the problem
• No prejudice - no sacred cows
• Automated tests are not a one-off investment; it's a continuous journey
• Performance is a damn important feature

Do you want to help?
We are hiring in Gdańsk
• Principal Java Developer
• Development Team Lead
• Java and Scala Developers
• UX Designer
• Front-End Developer
• QA Engineer
Visit us at the booth or apply at
http://www.atlassian.com/company/careers

Images - Credits

• Turtle - by Jonathan Zander, CC-BY-SA-3.0
• Loading - by MatthewJ13, CC-SA-3.0
• Magic Potion - by Koolmann1, CC-BY-SA-2.0
• Merlin Tool - by L. Mahin, CC-BY-SA-3.0
• Choose Pills - by *rockysprings, CC-BY-SA-3.0
