Release & Iterate Faster: Stop Manual Testing
Drew Hannay (@drewhannay)
Release Schedule (Ideal)
[Weekly calendar: RC cut at the start of the cycle, manual testing through the week, then the release and an oncall handoff]
Repeat 12x yearly
Release Schedule (Reality)
[Weekly calendar: rush to check in half-finished code before the RC cut; manual testing interrupted by a hotfix for a last-minute critical feature; delayed release; more testing; a bug found in the last-minute feature forces another hotfix before the release and oncall handoff]
Repeat 12x yearly; cry
Nobody is happy
● Devs aren’t happy
○ Mad rush to check in code before an RC cut
○ Dread being oncall during release week
● Product managers aren’t happy
○ Only 12 chances per year to release new features
○ Hard to iterate on member feedback; need to “get it right” the first time
● Testing team isn’t happy
○ Code quality drops immediately before RC as code is rushed in
○ Manual testing is time-consuming and tedious
● Leadership isn’t happy
○ Unhappy employees (see above ^)
○ Improvements to product / business delayed by release schedule
Project Voyager
● Rewrite of the LinkedIn app for Android, iOS, and mobile web
○ Brand new codebase
○ Brand new frontend API server
○ Brand new product designs
● Product goals:
○ Weekly releases
○ Faster iteration
○ Easier experiments
● Huge company focus on mobile
○ Mobile was becoming more and more important to the business
○ We needed to get better at it
Easy Engineering Answer
[Weekly calendar: the same RC cut → manual testing → release → oncall handoff cycle, now crammed into every single week]
It was a week the whole time! #problemsolved
Better Engineering Answer
● The release schedule should be a product decision
○ Shouldn’t be restricted by engineering problems
● We should always be able to take the latest build and give it to members
○ But we still need frequent and fast builds!
● We needed a mindset shift
○ And a catchy slogan
3x3
Release three times per day, no more than three hours from code commit to member availability
Why three hours?
● Not enough time for manual testing steps
● Not enough time to test everything
○ The goal isn’t 100% automation, it’s faster iterations
○ More tests → more time fixing tests when product changes → slower iterations
● Easy to emphasize craftsmanship
○ Devs can take the extra time to write quality code when the next release is three hours away
● Easy to fit three releases in an eight-hour workday :)
○ 10am, 1pm, 4pm
Commit pipeline
Feature Development → Code Review → Static Analysis → Unit Tests → Layout Tests → Scenario Tests → Build Release Artifacts → Alpha Release → Beta Release → Production Release
Static Analysis
● Java Checkstyle
● Android Lint
○ ~300 checks provided by Google
○ ~50 custom checks for LinkedIn-specific patterns and libraries
● Compile-time contract with frontend API server
○ Uses LinkedIn’s open source Rest.li REST framework to define API models
○ Provides static analysis to guarantee no backwards incompatible changes to production models
○ Client model classes are code-generated to ensure correctness
● Experimental: Auto-format code using IntelliJ’s CLI formatter
○ No more time in code reviews on nitpicky style comments
Building the code
● Initially not that bad
○ CI build for debug + release → ~5 minutes
● More features → more code → slower builds
○ Today: over 1 million lines of code in the Android app
○ CI build for debug alone → ~8 minutes
● Today: most of the code is in only two Gradle modules (plus libraries)
○ Eagerly awaiting Android Gradle Plugin 3.0 to start modularizing
APK Splits
● Releasing frequently means frequent updates for members
○ Need to be considerate of their data and keep the app small
● Release builds take advantage of APK splits
○ Separate APK built for each combination of screen density and CPU architecture
○ Total of 30 APKs published for each commit
● If you thought ONE release build was slow…
○ CI build for 30 APK splits → ~35 minutes
○ Almost 20% of our 3 hour “budget”!
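The "almost 20%" is straightforward arithmetic against the three-hour budget:

```python
# 35-minute release build vs. the 3x3 budget of 3 hours (180 minutes)
split_build_fraction = 35 / 180  # roughly 0.194, i.e. almost 20%
```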
Distributed Builds
● Each build uses two CI machines
○ First node builds debug binary and runs tests
○ Second node builds release binaries (APK splits)
● Build time is gated by whichever job is longer
○ Currently release builds :(
● Faster, but requires twice as much CI hardware
○ At one point we were using six machines per build...more on that later
Build Speed: Looking Forward
● Android Gradle Plugin 3.0 brings lots of speed improvements
○ Seeing 40% faster clean builds on the latest beta, with no code changes
○ Modularizing the app code should help even more
● Google Play App Signing
○ Let Google generate the APK splits
○ We only need to build the universal release binary
Testing: How do we test?
● Unit tests
○ Exactly what you think
● Layout tests
○ Unit tests for views
○ Load a layout in a dummy activity, with dummy data
○ Use Espresso ViewAssertions to check for overlaps, RTL layout, etc.
● Scenario tests
○ Validate that key business metric flows are working properly
○ Usually flows that span multiple screens of the app
○ App gets mock data from an on-device fixture server
○ NOT an exhaustive suite
Testing: How do we measure coverage?
● Class/method/line code coverage is explicitly NOT measured
○ We don’t want every line covered by an automated test
○ We want high-value, low-maintenance tests
● For each feature, figure out flows that have the highest impact on the business
○ Then cover the happy paths of those flows with scenario tests
○ Teams agree on what flows must be tested and coverage is measured by # of those tests running
● For example:
○ Sharing a post to the LinkedIn feed should succeed → scenario test
■ Large impact on the business if this breaks
○ Sharing a post over 10k characters shows the correct error message → no scenario test
■ Not the end of the world if this is broken
■ Could be covered by unit tests (up to the team)
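Measuring coverage as "# of agreed flows with a running test" rather than line coverage can be sketched as follows (the function and the flow names are illustrative, not LinkedIn's tooling):

```python
def flow_coverage(agreed_flows, flows_with_running_tests):
    """Coverage = fraction of the business-agreed flows that currently
    have a scenario test running (not lines or methods covered)."""
    covered = set(agreed_flows) & set(flows_with_running_tests)
    return len(covered) / len(agreed_flows)

# Hypothetical flows a team agreed must be tested
coverage = flow_coverage(
    agreed_flows={"share_post", "like_post", "load_feed"},
    flows_with_running_tests={"share_post", "load_feed"},
)
```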
Testing: Need for Speed
● Currently running over 6.5k tests per build
○ Twice! Once before the commit is checked in, once after
○ Any change that doesn’t pass all tests gets auto-reverted
● Initial approach:
○ One emulator per CI machine
○ Six CI machines per build (!!!)
● Current approach:
○ Custom Gradle-based test harness to optimally shard tests
○ 16 emulators on one CI machine
○ Custom HTML + JUnit report
■ Logcat data for each test
■ Screenshots for failing tests
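The "optimally shard tests" step can be approximated with the classic greedy longest-processing-time heuristic: assign each test, longest first, to the currently least-loaded emulator. This is a sketch under that assumption, not LinkedIn's actual harness; the test names and durations are hypothetical:

```python
import heapq

def shard_tests(durations, num_shards):
    """Greedy LPT sharding: walk the tests from longest to shortest,
    always placing the next test on the least-loaded shard.
    durations: {test_name: seconds}."""
    # Min-heap of (total_seconds_assigned, shard_index)
    heap = [(0.0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]
    for name, secs in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + secs, idx))
    return shards

# Hypothetical timings: spread six tests across two emulators
shards = shard_tests(
    {"a": 30, "b": 25, "c": 20, "d": 15, "e": 10, "f": 5}, 2)
```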
Multi-emulator test run
Testing: Stability
● Stable Test Environment
○ Custom code for creating and starting emulators
○ Test Butler
● Stable Test Framework
○ Run one test per instrumentation command
■ Similar to the new Android Test Orchestrator
○ Clear all app data between each test
○ Auto-recover and retry test if it fails because of an emulator issue
● Stable Test Suite
○ If a test passes once, it should pass always (with no code changes)
Testing: Stability - Quiz
● If we have 1000 tests that are each 99.9% reliable,
what’s the overall reliability of our test suite?
a. 99%
b. 95%
c. 90%
d. 80%
e. 50%
36.7%
● Loss of confidence in tests
● Unhappy developers
● Realization: Flaky tests are worse than no tests
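The quiz answer follows directly from compounding per-test reliability: the suite is green only if every single test passes.

```python
# 1000 independent tests, each passing 99.9% of the time
suite_reliability = 0.999 ** 1000  # about 0.367: barely one green run in three
```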
Testing: Trunk Guardian
● Detect & disable flaky tests
● Continuously run all tests against the last known good build
○ Auto-disable a test if it fails
○ File a Jira ticket for the owner of the test with logs and screenshots
● Daily report to leadership on % of disabled tests for each team
○ Auto-block new commits for teams with more than 10% disabled tests
● Re-enabling a test requires SOME code change
○ Find and fix the root cause of the flakiness
○ If you can’t, add more logs so you have more data next time it fails
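A minimal sketch of the Trunk Guardian idea (function names and the 10% threshold are as described above, but the code is illustrative, not LinkedIn's implementation): rerun tests against the last known good build, so any failure must be flakiness, since the code has not changed.

```python
def trunk_guardian(run_test, tests, disabled):
    """Rerun every enabled test against the last known good build.
    That build already passed these tests once, so any failure now is
    flakiness: auto-disable the test (and file a ticket with logs)."""
    for name in tests:
        if name not in disabled and not run_test(name):
            disabled.add(name)
    return disabled

def team_blocked(num_disabled, num_total, threshold=0.10):
    # Teams with more than 10% of their tests disabled get new
    # commits auto-blocked until the flakiness is fixed
    return num_disabled / num_total > threshold
```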
{Drew catches his breath}
Distribution: Alpha
● AKA “latest successful build”
● Employees have a button in the app to get the latest build
○ Usually PMs and execs who want to see the latest code ASAP
● Initially we used Google Play’s Alpha channel
○ Auto-uploaded every three hours for “true 3x3”
○ But people who wanted the latest code wanted it FASTER
Distribution: Beta
● Google Play public beta program
○ Open membership
○ Maxes out at a number that won’t have material business impact if something goes terribly wrong
● Three beta releases per week
○ Wednesday / Monday / Friday
● Public beta users can (and do!) send feedback through Google Play
○ Most significant issues that get past our automated tests are reported within four hours of beta release
Distribution: Production
● Once per week, Wednesday
○ Promote the newest beta build without any blocking issues
○ If all three betas are bad, skip the release and hold a post-mortem
● Google Play staged rollout
○ Ramp the build slowly throughout the day
○ Monitor adoption rate and crash rate
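The staged-rollout logic amounts to a guard on crash rate while ramping. A sketch, with hypothetical rollout stages and crash-rate threshold (Google Play lets you choose the percentages; these numbers are not LinkedIn's):

```python
def next_rollout_step(current_pct, crash_rate, max_crash_rate=0.005,
                      steps=(1, 5, 10, 25, 50, 100)):
    """Advance the staged rollout one stage, but hold at the current
    percentage if the crash rate looks unhealthy."""
    if crash_rate > max_crash_rate:
        return current_pct          # hold the ramp and investigate
    for step in steps:
        if step > current_pct:
            return step             # ramp up to the next stage
    return 100                      # already fully rolled out
```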
Release Schedule (Current)
[Weekly calendar: beta releases on Monday and Friday, the production + beta release on Wednesday, with the oncall handoff following the production release]
~130 releases / year (no release on holidays or “InDay”)
Minimizing Risk & Enabling Experiments
● Take advantage of LinkedIn’s existing A/B testing infrastructure
● New features are developed behind feature flags
○ Code can be ramped dynamically to different groups of members
○ Performance of new features or changes can be monitored
● Dynamic configuration
● Server-controlled kill switch
○ Crashing or buggy code can often be disabled without a new build
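The kill-switch pattern boils down to gating code paths on a server-controlled flag. A generic sketch; the class, flag names, and fetch mechanism are hypothetical, not LinkedIn's A/B testing infrastructure:

```python
class ServerFlags:
    """Feature flags fetched periodically from the server. Flipping a
    flag off server-side disables the gated code path on the next
    fetch, with no new build or store release required."""
    def __init__(self):
        self._flags = {}

    def refresh(self, server_response):
        self._flags = dict(server_response)  # periodic fetch

    def enabled(self, name, default=False):
        return self._flags.get(name, default)

flags = ServerFlags()
flags.refresh({"new_share_flow": True})   # feature ramped on
if flags.enabled("new_share_flow"):
    pass  # new code path, ramped dynamically to member segments
# Crash reports spike; the server flips the kill switch:
flags.refresh({"new_share_flow": False})  # path disabled remotely
```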
But you must have SOME manual tests?
● Nope, not really
● Push notifications and deeplinks are manually tested periodically
○ Doesn’t block releases
● Teams do manual bug bashes and testing on new features
○ Also doesn’t block releases
○ No manual testing for regressions, only to ensure new features are ready to ramp
Commit pipeline
Feature Development → Code Review → Static Analysis → Unit Tests → Layout Tests → Scenario Tests → Build Release Artifacts → Alpha Release → Beta Release → Production Release
3x3: What’s next?
● Constantly looking for ways to reduce commit-to-publish time
○ C2P time is a metric tracked at the VP level at LinkedIn
○ Google Play App Signing
○ Gradle modularization + build cache
○ Emulator pools to scale up testing capacity
● Automated performance testing
○ We can sample some types of app performance in production
○ But no great way of catching issues before release
● Automated push notification and deeplink testing
● Automated monitoring and alerting on Google Play reviews
○ Find out more quickly when there are problems in production
“That’s great for LinkedIn, but…”
Takeaways
● Keep in mind we built this over 2.5 years
○ This presentation is intended to show some of the challenges, but also that 3x3 IS possible
● Leadership buy-in is crucial
○ 3x3 is a significant mindset shift for everyone in the organization
○ Driving it top-down makes things much easier
● Start with the simpler pieces
○ Anyone can run static analysis, or set up automated tests in CI builds
Questions?
3x3: Blogs, Videos, & Code
● 3x3: Speeding up mobile releases
● Consistent Android Testing Environments with Gradle (slides)
● Open Sourcing Test Butler (github)
● Test Butler: Reliable Android Testing, at Your Service (slides)
● Open Sourcing Dex Test Parser (github)
