www.verta.ai Confidential
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Presented by:
Manasi Vartak
CEO, Verta.ai
Michael Liu
Software Engineer, Verta.ai
Slack (Q&A): http://bit.ly/modeldb-mlops
#webinars
About
• Open-core MLOps Platform for the full model lifecycle
• Model versioning, deployment & ops, monitoring
• Built for data science; able to run at large scale
Manasi Vartak, CEO, Verta.ai
• MIT CSAIL Ph.D.
• Creator of ModelDB, the first OSS model management and versioning system
Michael Liu, Software Engineer, Verta.ai
• UCSD, Cognitive Science
• Neural-network based audio analysis, everything about Python, Verta client libs
Agenda
• Part I: Intro to MLOps (15 mins)
• Part II: Building an MLOps Pipeline (30 mins)
• Part III: Questions (10 mins)
Models have become Easy to Build
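from fastai.vision import *
from fastai.metrics import accuracy
# `data` is assumed to be an ImageDataBunch built earlier (not shown on the slide)
data.normalize(imagenet_stats)
# create_cnn is the fastai v1 name; later v1 releases call it cnn_learner
learner = create_cnn(data, models.resnet18, metrics=[accuracy], callback_fns=ShowGraph)
# Train for 8 epochs with the one-cycle policy
learner.fit_one_cycle(8, max_lr=slice(1e-3, 1e-2))
learner.save('stage-1')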
Delivery and Operations of ML-Products is Broken
"It used to take us 20+ weeks to bring a new version of the model into production."
"A predictive readmission model that was trained, optimized and deployed at a hospital would start sharply degrading within two to three months."
Why is ML Delivery and Ops so Hard?
Challenge 1. Model Development is empirical & ad-hoc
Model 1
Accuracy: 62%
Model 3
Accuracy: 76%
val udf1: (Int => Int) = (delayed..)
df.withColumn("timesDelayed", udf1)
RandomForestClassifier
Model 5
Accuracy: 68%
val udf1: (Int => Int) = (delayed..)
df.withColumn("timesDelayed", udf1)
RandomForestClassifier
credit-default-clean.csv
val lrGrid = new ParamGridBuilder()
.addGrid(rf.maxDepth, Array(5, 10, 15))
.addGrid(rf.numTrees, Array(50, 100))
Model 50
Accuracy: 82%
val udf1: (Int => Int) = (delayed..)
df.withColumn("timesDelayed", udf1)
RandomForestClassifier
credit-default-clean.csv
val lrGrid = new ParamGridBuilder()
.addGrid(rf.maxDepth, Array(5, 10, 15))
.addGrid(rf.numTrees, Array(50, 100))
val labelIndexer1 = new LabelIndexer()
val labelIndexer2 = new LabelIndexer()
…
val udf1: (Int => Int) = (delayed..)
val udf2: (String, Int) = …
df.withColumn("timesDelayed", udf1)
.withColumn("percentPaid", udf2)
.withColumn("creditUsed", udf3)
val scaler = new StandardScaler()
.setInputCol("features") …
Challenge 2. DS/ML vs. Software are Different worlds
DS/ML
• Flexibility
• Prototyping
• Bespoke code
Software
• Robustness
• Scale
• Generalization
Challenge 3. Existing Tools are not ML-Aware
• Data Drift
• Resource Utilization
• Optimizations
• Interdependencies
MLOps: DevOps for ML
DevOps: Deliver Software Products Faster, More Reliably
MLOps: Deliver ML Products Faster, More Reliably
Cross-Validation
LIME
Shapley
??
This talk!
Building an MLOps Pipeline with open-source: Docker, Jenkins, Prometheus
https://github.com/VertaAI/modeldb/tree/master/demos/webinar-2020-5-6
Running Example: TweetTrader
AI company using social media analytics to make $$$
[Diagram labels: NLP (three boxes), Trader, DOW, NASDAQ]
Let’s help TweetTrader do MLOps
• Step 1: Trained Model (Tweet Model)
• Step 2: Package (Docker Container)
• Step 3: Release (Jenkins)
• Step 4: Monitor (Prometheus + Logs)
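For the monitoring step, Prometheus scrapes a plain-text metrics page from each service. The sketch below is a standard-library-only illustration of what a model server might expose; it is not the code from the webinar demo. The class `PredictionMetrics` and the metric name `tweet_predictions_total` are hypothetical, and a real service would more likely use an official Prometheus client library.

```python
from collections import Counter


class PredictionMetrics:
    """In-memory prediction counter rendered in the Prometheus text format."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name            # hypothetical metric name
        self.help_text = help_text
        self.counts = Counter()

    def observe(self, label: str) -> None:
        """Record one prediction with the given predicted label."""
        self.counts[label] += 1

    @staticmethod
    def _escape(value: str) -> str:
        # Label values must escape backslashes, double quotes and newlines.
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    def render(self) -> str:
        """Return the page body that a /metrics endpoint would serve."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for label, value in sorted(self.counts.items()):
            lines.append(f'{self.name}{{label="{self._escape(label)}"}} {value}')
        return "\n".join(lines) + "\n"


metrics = PredictionMetrics("tweet_predictions_total", "Predictions served, by label.")
for predicted in ["negative", "positive", "negative"]:
    metrics.observe(predicted)
print(metrics.render())
```

Counting predictions per label is one simple signal: a sudden swing toward a single label can hint that the inputs have changed even when the service itself looks healthy.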
What happens in the wild?
Scenario: All our new traffic is from Germany
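One way to notice a shift like this is to compare the language mix of recent requests against the mix seen in training. The following is a minimal sketch under stated assumptions: each tweet carries a language tag, the sample data is made up, and `DRIFT_THRESHOLD` is an illustrative cutoff rather than a recommended value. It uses total variation distance, which is one of several reasonable drift measures.

```python
from collections import Counter
from typing import Dict, Iterable


def distribution(values: Iterable[str]) -> Dict[str, float]:
    """Return the share of each distinct value, e.g. {"en": 0.95, "de": 0.05}."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot build a distribution from an empty sample")
    return {key: count / total for key, count in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance: 0.0 for identical mixes, 1.0 for disjoint ones."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical language tags for training tweets and for recent live requests.
training_langs = ["en"] * 95 + ["de"] * 5
live_langs = ["en"] * 10 + ["de"] * 90

DRIFT_THRESHOLD = 0.3  # illustrative cutoff, not a recommended value

drift = total_variation(distribution(training_langs), distribution(live_langs))
if drift > DRIFT_THRESHOLD:
    print(f"language mix drifted (distance {drift:.2f}); consider retraining")
```

Detecting the shift is only half of the problem: retraining also requires knowing exactly which code, data, and configuration produced the model that is currently deployed.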
Scenario: My colleague has an even better model
What’s missing?
Cross-Validation
LIME
Shapley
I was kidding; we haven’t solved the ML part
DevOps: Deliver Software Products Faster, More Reliably
In code, every change that we make is tracked
java.lang.NullPointerException: null
...
...
WebBackend
SHA: ed05334
What about models?
Guten Nacht: Negative
Guten Morgen: Negative
...
...
NLPModel
s3://models/final-bert-March12
??
Code?
Data?
Config?
Env?
What’s missing is ML-specific model versioning
• Uniquely identifies a model
• Enables the user to go back in time and fully recreate a model
  • Code
  • Data
  • Config
  • Environment
• Allows branching, merging, diffs, etc.
• Versioning that integrates into the ML workflow (e.g., library vs. CLI)
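The four ingredients above can be combined into a single identifier for a model. The sketch below illustrates the idea only and is not how ModelDB implements versioning: `ModelVersion`, `sha256_of`, and every value passed to them are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ModelVersion:
    """Everything needed to recreate a model: code, data, config, environment."""

    code_sha: str                      # commit of the training code
    data_sha256: str                   # content hash of the training data
    config: Dict[str, object] = field(default_factory=dict)   # hyperparameters
    environment: List[str] = field(default_factory=list)      # pinned packages

    def fingerprint(self) -> str:
        """Short, deterministic ID that changes when any ingredient changes."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def sha256_of(data: bytes) -> str:
    """Content hash used here as a stand-in for hashing a dataset file."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical versions: same code and config, different training data.
v1 = ModelVersion(
    code_sha="abc1234",
    data_sha256=sha256_of(b"english tweets"),
    config={"max_len": 128, "lr": 3e-5},
    environment=["python==3.7", "torch==1.4.0"],
)
v2 = ModelVersion(
    code_sha="abc1234",
    data_sha256=sha256_of(b"english and german tweets"),
    config={"max_len": 128, "lr": 3e-5},
    environment=["python==3.7", "torch==1.4.0"],
)
print(v1.fingerprint(), v2.fingerprint())
```

Because the fingerprint is derived from all four ingredients, a path such as s3://models/final-bert-March12 could be replaced by an ID that answers the Code? Data? Config? Env? questions directly.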
ModelDB: open-source model versioning
ModelDB 2.0: https://github.com/VertaAI/modeldb
• Code
• Data
• Config
• Env
Let’s fix the pipeline
Revised MLOps Pipeline
• Step 1: Trained Model (Tweet Model + ModelDB)
• Step 2: Package (Docker Container)
• Step 3: Release (Jenkins)
• Step 4: Monitor (Prometheus + Logs)
Step 1: Train a Tweet Classification model + use ModelDB for versioning
Scenario: All our new traffic is from Germany
Scenario: My colleague has an even better model
TweetTrader lives!
Summary
• Part I: Intro to MLOps
• Part II: Building an MLOps Pipeline
  • Basic pipeline: Docker, Jenkins, Prometheus
  • Real-world simulations
  • Pipeline with versioning: ModelDB, Docker, Jenkins, Prometheus
3 Takeaways
• MLOps is DevOps for ML: it helps you ship ML products faster
• Model Versioning : MLOps :: Git : DevOps
• Robust OSS MLOps: ModelDB + Docker + Jenkins + Prometheus
Thanks!
https://github.com/VertaAI/modeldb | Today’s talk: modeldb/demos/ | Slack: http://bit.ly/modeldb-mlops
Pre-register for our MLOps Salon happening in June!
https://info.verta.ai/ml-ops-event
