mabl’s ML Implementation
Testing @ Scale with Google Cloud Platform
Joseph Lust, mabl engineer, @lustcoder
About Me: Joseph Lust
Engineer
■ Building the web for two decades
■ Cloud native since 2011
■ Currently building mabl ML cloud on GCP
■ Co-organizer GDG Cloud Boston Meetup
Agenda
■ mabl’s problem statement
■ ML on Google Cloud
■ Cloud best practices at mabl
■ Q & A
the problem statement
Developing Quality Software is Slow
Image Source: IBM SAGE
■ Software is complex
■ Testing software is slow
■ Humans have low clock speeds
■ Humans are expensive
Continuous Delivery breaks QA
■ Change happens infrequently
■ Weeks or months to write tests
■ Weeks to execute tests
■ Change is constant
■ Hours to write tests
■ Minutes to execute tests
The Doughnut Hole: Continuous Testing (CT)
Source: Google Trends
Hudson Released
Facebook CD Article
Continuous Testing Roadmap
■ Highly parallel testing (fail fast)
■ Automatically heal broken scripts
■ Automatically generate test scripts
Manual Testing
Manually Scripted Testing
Playback Testing
Auto Test Generation
Auto-healing Testing
✔ ✔ ✔ ✔
the ML @ mabl
Machine Learning Areas @ mabl
■ Anomaly detection
▲ Timing anomalies
▲ Visual change anomalies
■ Auto-healing tests
▲ Alternative element candidate election
▲ Multi-branch evaluation
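To make "alternative element candidate election" concrete, here is a minimal sketch of one way it could work: score each candidate element against the attributes recorded for the original target, then elect the best match if it clears a threshold. The attribute names, weights, threshold, and the `score` and `elect` functions are all assumptions for illustration; the slides do not describe how mabl actually elects candidates.

```python
from typing import Optional

# Hypothetical attribute weights: how much each recorded attribute counts
# toward a match. These values are illustrative, not taken from the talk.
WEIGHTS = {"id": 3.0, "text": 2.0, "tag": 1.0, "css_class": 1.0}


def score(recorded: dict, candidate: dict) -> float:
    """Return the weighted fraction of attributes on which the candidate agrees."""
    total = sum(WEIGHTS.values())
    matched = sum(
        weight
        for attr, weight in WEIGHTS.items()
        if recorded.get(attr) is not None and recorded.get(attr) == candidate.get(attr)
    )
    return matched / total


def elect(recorded: dict, candidates: list, threshold: float = 0.5) -> Optional[dict]:
    """Pick the highest-scoring candidate, or None if nothing clears the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: score(recorded, c))
    return best if score(recorded, best) >= threshold else None


# The recorded element's id changed from "submit-btn" to "send-btn" in a new release.
recorded = {"id": "submit-btn", "text": "Submit", "tag": "button", "css_class": "primary"}
candidates = [
    {"id": "cancel-btn", "text": "Cancel", "tag": "button", "css_class": "secondary"},
    {"id": "send-btn", "text": "Submit", "tag": "button", "css_class": "primary"},
]
healed = elect(recorded, candidates)
```

Returning `None` when no candidate is convincing lets the test fail loudly instead of silently clicking the wrong element.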
mabl's Machine Learning Implementation on Google Cloud Platform
Why anomaly detection isn’t quite that easy
■ Domain variability
one app’s anomaly is another’s normal
■ Temporal variability
today’s anomaly is tomorrow’s normal
■ Detection quality
not all anomalies are bad (or important)
■ Initial quality
what can we detect with minimal app-specific data
Approach
■ Modeling the app
▲ domain variability, quality
■ Incremental learning
▲ temporal variability, initial quality
■ User feedback
▲ use case variability, quality
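The incremental learning idea can be sketched with a running baseline for a timing signal. The example below is a simplified assumption, not mabl's model: it uses Welford's online mean and variance with a z-score test, and the class name, `min_samples`, and `z_threshold` are hypothetical. It stays silent until it has a few samples (initial quality) and keeps folding every new value into the baseline (temporal variability).

```python
import math


class RunningBaseline:
    """Incrementally tracked mean and variance (Welford's algorithm)."""

    def __init__(self, min_samples: int = 5, z_threshold: float = 3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.min_samples = min_samples
        self.z_threshold = z_threshold

    def is_anomaly(self, value: float) -> bool:
        """Compare a value against the baseline learned so far."""
        if self.n < self.min_samples:
            return False  # too little app-specific data to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        if std == 0.0:
            return value != self.mean
        return abs(value - self.mean) / std > self.z_threshold

    def update(self, value: float) -> None:
        """Fold a new observation into the baseline."""
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)


baseline = RunningBaseline()
# Hypothetical page load times in ms: a stable period, then a lasting shift.
load_times_ms = [200, 210, 190, 205, 195, 900, 890, 910, 905, 895, 900, 910]
flags = []
for t in load_times_ms:
    flags.append(baseline.is_anomaly(t))
    baseline.update(t)
```

With this data only the first jump to 900 ms is flagged; once the slower timing has been absorbed into the baseline, later values near 900 ms are treated as normal, which mirrors "today's anomaly is tomorrow's normal".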
Analysis pipeline
Observations
▲ Raw data/events
Measurements
▲ Calculated features
State detections
▲ Aggregate measurements
▲ Apply detection models (abstraction)
▲ Integrate detections (higher abstraction)
Insights
▲ Recognize significant state changes
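The four pipeline stages above can be sketched as plain functions. Everything specific here is an assumption made for illustration: the event tuple layout, the 1000 ms slowness cutoff, the "healthy"/"degraded" state names, and the function names are not from the talk.

```python
# Observations: raw events, here hypothetical (run_id, page, load_ms, error_count) tuples.
observations = [
    (1, "/login", 210, 0), (1, "/home", 480, 0),
    (2, "/login", 220, 0), (2, "/home", 510, 0),
    (3, "/login", 230, 0), (3, "/home", 1900, 2),
]


def measure(obs):
    """Measurements: calculated features per raw event."""
    run_id, page, load_ms, errors = obs
    return {"run": run_id, "page": page, "slow": load_ms > 1000, "has_errors": errors > 0}


def detect_states(measurements):
    """State detections: aggregate per run, apply simple detectors, integrate them."""
    states = {}
    for run in sorted({m["run"] for m in measurements}):
        rows = [m for m in measurements if m["run"] == run]
        slow = any(m["slow"] for m in rows)            # performance detector
        erroring = any(m["has_errors"] for m in rows)  # error detector
        states[run] = "degraded" if (slow or erroring) else "healthy"
    return states


def insights(states):
    """Insights: report only runs whose state differs from the previous run."""
    runs = sorted(states)
    return [
        (run, states[prev], states[run])
        for prev, run in zip(runs, runs[1:])
        if states[prev] != states[run]
    ]


measurements = [measure(o) for o in observations]
states = detect_states(measurements)
changes = insights(states)
```

Each stage consumes only the output of the stage before it, so a detector can be swapped for a learned model without touching the stages around it.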
Signals for anomaly detection
■ Visual appearance
■ Performance
■ Failures
■ Errors (non-fatal)
■ Page structure changes
visual anomalies
benign common changes
anomalous changes
model training and feedback
Pieces of the puzzle
■ Cloud ML Engine
▲ TensorFlow machine learning models
▲ Managed model training
▲ Online prediction service
■ Dataflow
▲ Streaming data processing
▲ I/O connectors for Pub/Sub, etc.
■ Datastore
▲ Augment trained model with mutable data
▲ Fast query by type and filter (e.g., by key)
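One reading of "augment trained model with mutable data" is that a trained model stays fixed between training runs while user feedback is stored separately and consulted at prediction time. The sketch below illustrates that reading only. The dictionary stands in for a key-value store such as Datastore, and the `(app_id, signal)` key scheme, `model_score` placeholder, and threshold are all hypothetical.

```python
# In-memory stand-in for a key-value store such as Datastore (an assumption
# made so the sketch runs without any cloud dependency).
feedback_store = {}


def feedback_key(app_id: str, signal: str) -> tuple:
    """Build a lookup key; the (app, signal) key scheme is hypothetical."""
    return (app_id, signal)


def record_feedback(app_id: str, signal: str, is_expected: bool) -> None:
    """Store a user's verdict that a kind of change is expected for this app."""
    feedback_store[feedback_key(app_id, signal)] = is_expected


def model_score(change_magnitude: float) -> float:
    """Placeholder for a trained model's anomaly score, clamped to [0, 1]."""
    return max(0.0, min(1.0, change_magnitude))


def is_anomaly(app_id: str, signal: str, change_magnitude: float,
               threshold: float = 0.7) -> bool:
    """Combine the fixed model output with mutable, per-app feedback."""
    if feedback_store.get(feedback_key(app_id, signal)) is True:
        return False  # a user marked this change as normal for this app
    return model_score(change_magnitude) >= threshold


before = is_anomaly("shop", "banner-image", 0.9)
record_feedback("shop", "banner-image", is_expected=True)
after = is_anomaly("shop", "banner-image", 0.9)
other_app = is_anomaly("blog", "banner-image", 0.9)
```

Feedback takes effect on the next prediction without retraining, and it is scoped to one app, so the same change can still be flagged elsewhere.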
Architecture of model feedback loops
harnessing google cloud for ML
mabl cloud design tenets
■ Serverless
■ Decoupled subsystems
■ Event-driven architecture
Serverless just works
■ No Provisioning
■ Transparent Scaling
■ Event Driven
■ Pay only for Use
e.g. Kubernetes Engine
■ NoOps - developers’ containers in prod
■ Language agnostic
■ Massive scale
▲ running millions of containers a month
■ “Set it and forget it!”
Decoupled Architecture
■ Message passing
■ Polyglot development
■ Independent deployment
■ Minimize blast radius
Cloud Pub/Sub
Cloud Functions
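Message passing and a small blast radius can be illustrated with a tiny in-process publish/subscribe registry. This is only an analogy: a managed service like Cloud Pub/Sub decouples separate processes and deployments, while the `subscribe` and `publish` functions, topic name, and handlers below are hypothetical and run in a single process.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # topic name -> list of handler callables


def subscribe(topic: str, handler) -> None:
    """Register a handler; publishers never reference handlers directly."""
    subscribers[topic].append(handler)


def publish(topic: str, message: dict) -> list:
    """Deliver to every subscriber; one failing handler does not stop the rest."""
    failures = []
    for handler in subscribers[topic]:
        try:
            handler(message)
        except Exception as exc:  # isolate the failure to this subscriber
            failures.append((handler.__name__, str(exc)))
    return failures


screenshots = []


def store_screenshot(message: dict) -> None:
    screenshots.append(message["run_id"])


def broken_notifier(message: dict) -> None:
    raise RuntimeError("notifier is down")


subscribe("test-run-finished", broken_notifier)
subscribe("test-run-finished", store_screenshot)
failures = publish("test-run-finished", {"run_id": 42})
```

The publisher knows only the topic name, so subscribers can be written in any language and deployed independently, and the broken notifier does not prevent the screenshot from being stored.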
Event Driven Architecture
■ No “cron jobs” or batch processes
■ Continuous pipelines
■ Surge buffers
■ Hundreds of ms to 1-2 s end-to-end handling
Cloud Dataflow
Cloud Functions
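The "surge buffer" idea can be sketched with Python's standard `queue` and `threading` modules: a producer enqueues a burst of events immediately, and a consumer reacts to each event as it arrives rather than waking up on a schedule. The event shape, the `handle` function, and the sentinel-based shutdown are assumptions for illustration; in the architecture described here the buffering role would be played by managed services, not an in-process queue.

```python
import queue
import threading

events = queue.Queue()   # buffer that absorbs bursts between producer and consumer
handled = []


def handle(event: dict) -> None:
    """Hypothetical handler: record that a page event was processed."""
    handled.append(event["page_id"])


def worker() -> None:
    """React to each event as it arrives instead of polling on a schedule."""
    while True:
        event = events.get()
        if event is None:      # sentinel: shut the worker down
            events.task_done()
            break
        handle(event)
        events.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# A sudden burst of 100 events is queued immediately; the worker drains it
# at its own pace.
for page_id in range(100):
    events.put({"page_id": page_id})

events.put(None)
events.join()       # block until every queued item has been processed
consumer.join()
```

Because the producer never waits for the consumer, a spike in traffic lengthens the queue instead of dropping events or blocking the sender.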
Google Cloud’s Impact on mabl
■ From zero to alpha product in ~6 months with 8 developers
▲ Processing ~100M pages per month
■ Systems designed for scale on day one
■ No dreaded “rewrite”
■ Product families work very well together out of the box
Questions?
