Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS SageMaker for Enterprise AI Scenarios
Yong Liu
Principal Data Scientist
Outreach Corporation
Andrew Brooks
Senior Data Scientist
Outreach Corporation
Presentation Outline
➢ Introduction and Background
➢ Challenges in Enterprise AI Implementation
➢ Full Life Cycle ML Experience at Outreach
➢ Conclusion and Future Work
Introduction and Background
4,000+ Customers
Sales Engagement Platform (SEP)
▪ SEP encodes and automates sales activities into workflows
▪ Enables reps to perform one-on-one personalized outreach to up to 10x
[Figure: workflow sequence: Day 1 Phone Call → Day 1 Email → Day 3 LinkedIn → Day 5 Phone → Day 5 Email]
ML/NLP/AI Roles in Enterprise Sales Scenarios
▪ Continuous learning from data (emails, phone calls, engagement logs, etc.)
▪ Reasoning from knowledge to create a flywheel for the continual success of reps
Challenges in Enterprise AI Implementation
Implementation Challenges: the Digital Divide
▪ Challenge 1: Dev-Prod Divide
▪ Challenge 2: Dev-Prod Differences
▪ Challenge 3: Arbitrary Uniqueness
▪ Challenge 4: Provenance
Challenge 1: Dev-Prod Divide
▪ can’t test on “live” data
▪ can’t verify that the model is invoked correctly
▪ can’t reproduce bugs or issues reported by users
▪ can’t reuse prod code for model development
Isolated prod environment
Source: Winderresearch
Challenge 2: Dev-Prod Differences
▪ training data != prod data
▪ production scoring requires logic not used during model training
When training and prod pipelines are, and need to be, different
Source: MoneyUser.com
Challenge 3: Arbitrary Uniqueness
When the “whole” is not greater than the “sum of its parts”.
▪ deploying each model feels like a “special case”
▪ gates and deploy mechanisms are ad hoc
▪ pipeline maintenance is costly
Source: Rowperfect UK
Challenge 4: Provenance
▪ don’t know exactly what is running in prod
▪ unable to reproduce and debug model issues reported by customers
▪ model/pipeline changes = 😬
▪ undocumented model/code changes confound the tracking of metric drift
▪ model experiments are wasted
Provenance from models to source code and data.
Source: Slane Cartoons
Full Life Cycle ML Implementation Experience at Outreach
A Use Case: Guided Engagement
powered by an intent classification model
▪ The ML model predicts the intent of a prospect’s email reply and then recommends the right template for the response.
Six Stages of ML Full Life Cycle
Model Development and Offline Experimentation
MLflow tracking server to log all offline experiments
Tracking experiments
Creating a transformer flavor model
A new MLflow model flavor (transformer) & TransformerClassifier (sklearn pipeline)
[Figure: code for the Transformer MLflow model flavor and for TransformerClassifier]
Saving and Loading Transformer Artifacts
An example of a fully saved and reloadable MLflow “Transformer”-flavor Model
Load:
mlflow.pyfunc.load_model(model_uri)
Save:
mlflow_transformer.log_model(
    transformer_classifier=trained_model,
    artifact_path="transformer_classifier",
    conda_env=CONDA_ENV,
)
Productionizing Code and Git Repos
▪ MLproject, conda.yml
IDE dev environment, MLflow MLproject, GitHub repo structure, flake8
Flexible Execution Mode
MLProject allows using code and execution environments either locally or remotely
mlflow run ./ -e train
mlflow run git+ssh://git@github.com/model-repo -e train --version 1.1
mlflow run ./ -e train.py --backend databricks --backend-config gpu_cluster_type.json
mlflow run git+ssh://git@github.com/model-repo -e train.py --version 1.1 --backend databricks --backend-config gpu_cluster_type.json
Models: trained, wrapped, private-wheeled
To support deployment-specific logic and environments, we create three progressively evolved models for final deployment on a host (SageMaker):
▪ Fine-tuned trained Transformer classifier
▪ Wrapped sklearn pipeline model, which adds a pre-score filter and a post-score filter
▪ Private-wheeled model, which has no need to access GitHub
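The wrapping step can be pictured with a small sketch. It is only an illustration under stated assumptions: `StubIntentClassifier`, `WrappedModel`, `non_empty`, and `drop_other` are hypothetical names that do not come from the talk, the stub replaces the real fine-tuned transformer, and the filter rules are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


class StubIntentClassifier:
    """Hypothetical stand-in for the fine-tuned transformer classifier."""

    def predict(self, texts: Sequence[str]) -> List[str]:
        # Toy rule so the sketch runs without any ML dependencies.
        return ["positive" if "yes" in t.lower() else "other" for t in texts]


@dataclass
class WrappedModel:
    """Adds deployment-only pre-score and post-score logic around a model."""

    model: StubIntentClassifier
    pre_filter: Callable[[str], bool]            # True means "score this text"
    post_filter: Callable[[str], Optional[str]]  # may suppress a label

    def predict(self, texts: Sequence[str]) -> List[Optional[str]]:
        results: List[Optional[str]] = []
        for text in texts:
            if not self.pre_filter(text):
                results.append(None)  # skipped before scoring
                continue
            label = self.model.predict([text])[0]
            results.append(self.post_filter(label))
        return results


def non_empty(text: str) -> bool:
    """Assumed pre-score rule: ignore blank inputs."""
    return bool(text.strip())


def drop_other(label: str) -> Optional[str]:
    """Assumed post-score rule: suppress the catch-all label."""
    return None if label == "other" else label


wrapped = WrappedModel(StubIntentClassifier(), non_empty, drop_other)
predictions = wrapped.predict(["Yes, let's talk", "   ", "Unsubscribe me"])
```

Keeping the filters outside the trained model lets the same trained artifact be reused while the production-only logic changes independently.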
Continuous Integration through CircleCI
Continuous Delivery/Rollback through Concourse
Gated by two human gates: 1) start the full model deployment; 2) promote the model from staging to production; and one regression test gate: accuracy must not be lower than that of the previous version
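The regression test gate can be sketched as a simple comparison. This is a minimal illustration, not the actual Concourse job: the function names `accuracy` and `passes_regression_gate` and the toy labels are assumptions made for the example.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def passes_regression_gate(candidate_accuracy: float, previous_accuracy: float) -> bool:
    """The candidate passes only if it is not less accurate than the previous version."""
    return candidate_accuracy >= previous_accuracy


# Toy evaluation set (hypothetical labels, for illustration only).
labels = ["a", "b", "a", "b"]
previous_acc = accuracy(["a", "b", "b", "b"], labels)
candidate_acc = accuracy(["a", "b", "a", "b"], labels)
gate_open = passes_regression_gate(candidate_acc, previous_acc)
```

A pipeline step could fail the build whenever `gate_open` is false, which leaves the previous version in place.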
Model Registry to Track Deployed Model Provenance
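One way to picture the provenance link from a deployed model back to source code and data is a small record like the one below. It is a hedged sketch, not the MLflow Model Registry API: `ProvenanceRecord`, `fingerprint`, the model name, and the placeholder commit value are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """Ties a deployed model version to the code and data that produced it."""

    model_name: str
    model_version: str
    git_commit: str
    training_data_sha256: str


def fingerprint(data: bytes) -> str:
    """Content hash, so a change in the training data changes the record."""
    return hashlib.sha256(data).hexdigest()


record = ProvenanceRecord(
    model_name="intent-classifier",  # hypothetical model name
    model_version="1.1",
    git_commit="0000000",            # placeholder, not a real commit
    training_data_sha256=fingerprint(b"example training data"),
)

# Serialized form that could be stored alongside the registered model.
record_json = json.dumps(asdict(record), sort_keys=True)
```

With such a record attached to each registered version, a question like "what exactly is running in prod" reduces to a lookup of the commit and data fingerprint.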
How Well Did We Do?
[Figure: scorecard for the four challenges (Dev/Prod Divide, Dev/Prod Differences, Arbitrary Uniqueness, Provenance); extracted marks: + + _]
Conclusions and Future Work
▪ We highlight four typical enterprise AI implementation challenges and how we solve them with MLflow, SageMaker, and CI/CD tools
▪ Our intent classification model has been deployed with this framework and is operating in production
▪ Next steps:
  ▪ Incorporating an in-production model feedback loop into the annotation and model dev cycle
  ▪ Further improving the annotation pipeline to support seamless human-in-the-loop active learning and model validation
Acknowledgements
Data Science Group | Outreach.io
Contact:
yong.liu@outreach.io
andrew.brooks@outreach.io
