Jordan Edwards
Senior Program Manager, ML Platform
Continuous Integration (CI)
• Blend together the work of individual engineers in a repository.
• Each time you commit code, it's automatically built and tested, and bugs are detected faster.
Continuous Deployment (CD)
• Automate the entire process from code commit to production (if your CI/CD tests are successful).
Continuous Learning & Monitoring
• Safely deliver features to your customers as soon as they're ready.
• Monitor your features in production and know when they aren't behaving as expected.
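The CI/CD gate described above can be sketched in a few lines; this is an illustrative toy, not a real CI system's API (all names are hypothetical):

```python
# Minimal sketch of a CI/CD gate: every commit is built and tested,
# and deployment only proceeds when all checks pass.

def run_checks(commit, checks):
    """Run each named check against the commit; collect the names that fail."""
    return [name for name, check in checks.items() if not check(commit)]

def ci_pipeline(commit, checks, deploy):
    """CI: run all checks on commit. CD: deploy only if everything passed."""
    failures = run_checks(commit, checks)
    if failures:
        return {"status": "failed", "failures": failures}
    deploy(commit)
    return {"status": "deployed", "failures": []}

checks = {
    "unit_tests": lambda c: c["tests_pass"],
    "lint": lambda c: c["lint_clean"],
}
deployed = []
result = ci_pipeline({"tests_pass": True, "lint_clean": True},
                     checks, deployed.append)
```

A failing check short-circuits the deploy, which is the essential property of the gate.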
DevOps for Machine Learning overview en-us
ML DevOps lifecycle
• Experiment: Business Understanding → Data Acquisition → Initial Modeling
• Develop: Modeling, Experiment + Testing, Continuous Integration
• Operate: Continuous Deployment, Continuous Delivery, Data Feedback Loop, System + Model Monitoring
Overcome the pattern where data science teams own only the experiments, instead of being responsible for the end-to-end flow from experiment to production to operational support of AI.
Benefits:
• Continuous delivery of value (data insights, models) to end users.
• End-to-end ownership of the analytics lifecycle by data science teams.
• A consistent, enforced approach to building and deploying AI.
• Extending data science with software-engineering practices to increase delivery quality and cadence.
• A framework for continuous learning, lineage, auditability and regulatory compliance.
• Improved team collaboration through standardized delivery practices.
Produce repeatable experiments:
• Use leaderboards, side-by-side run comparison and model selection.
• Capture run metrics, intermediate outputs, output logs and models.
• Use well-defined pipelines to capture the end-to-end (E2E) model training process.
• Track model versions & metadata with a centralized model registry
• Leverage containers to capture runtime dependencies for inference
• Leverage an orchestrator like Kubernetes to provide scalable inference
• Capture model telemetry – health, performance, inputs / outputs
• Encapsulate each step in the lifecycle to enable CI/CD and DevOps
• Automatically optimize models to take advantage of hardware acceleration
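A minimal in-memory sketch of the model-registry idea from the first bullet; field names are illustrative, and a real registry (such as Azure ML's) adds storage, authentication and lineage on top:

```python
# Toy model registry: each registered model gets a monotonically
# increasing version plus free-form metadata (e.g. the git commit).

class ModelRegistry:
    def __init__(self):
        self._models = {}  # model name -> list of versioned entries

    def register(self, name, artifact, metadata=None):
        """Store a new version of `name`; return its version number."""
        versions = self._models.setdefault(name, [])
        entry = {"version": len(versions) + 1,
                 "artifact": artifact,
                 "metadata": metadata or {}}
        versions.append(entry)
        return entry["version"]

    def get(self, name, version=None):
        """Fetch a specific version, or the latest if none is given."""
        versions = self._models[name]
        return versions[-1] if version is None else versions[version - 1]

reg = ModelRegistry()
reg.register("cat-detector", b"weights-v1", {"git_commit": "abc123"})
v2 = reg.register("cat-detector", b"weights-v2", {"git_commit": "def456"})
```

Keying each version to the commit that produced it is what later gives the audit trail from model back to code.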
Data science workflow: Prepare (Prepare Data, Feature engineering) → Experiment (Model training & testing) → Deploy (Register & Manage Model, Package & Validate Model, Deploy Service, Monitor Model) …
Evolving the workflow (diagram, built up in four stages):
1. App Developer (IDE) and Data Scientist (IDE): Publish Model → Consume Model → DevOps Pipeline → Validate App → Update Application → Deploy Application → Predict (e.g. [ { "cat": 0.99218, "feline": 0.81242 }]).
2. A Model Store is added between Publish Model and Consume Model.
3. Validate Model and Validate Model + App steps are added to the pipeline.
4. Collect Feedback, Retrain Model and A/B Test close the loop.
Full picture: the App Developer and Data Scientist (each in their IDE) publish and customize models through a Model Store and a DevOps Pipeline; the model + app is validated & flighted; the application is deployed to Cloud Services, Apps and Edge Devices to serve predictions (e.g. [ { "cat": 0.99218, "feline": 0.81242 }]); collected feedback and model telemetry drive model retraining.
MODEL CI/CD (Machine Learning as a Service + DevOps) – Azure DevOps, Azure Machine Learning, Azure Data Factory
TRAIN MODEL
• Source Code → DevOps Pipeline: unit test code; a code change triggers CI, and a new training job is started whenever source code is pushed.
• Training Pipeline: Data Movement → Data Prep → Model Training. The ML pipeline handles data prep, training and evaluation, and certifies that the model is of high quality.
• Register Model → Model Store: a newly registered model triggers a release.
DEPLOY MODEL
• DevOps Pipeline: Package Model → Validate Model → Get Human Approval → DevTest → Deploy to PROD. New inference code also triggers a release.
DATA
• Data Lake → Data Cooking Pipeline → Data Warehouse; Inference Data feeds the Data Preparation Services (Labeling, Feedback, Drift). New data triggers CI.
Continuous Integration and Delivery: Build Model (app) (testing + validation) → Deploy Resources → Deploy Model (app) → Logging & Monitoring.
• Training: Azure ML Experiments, packaged with Docker + Conda environments.
• Real-time serving: Azure Kubernetes Service; batch scoring: Azure ML Pipelines.
• Monitoring: Application Performance Monitoring, Model / Data Monitoring, Data Collection.
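The real-time serving path typically wraps the model in a scoring script with an init()/run() pattern: load once at container start, then handle each request. A self-contained sketch of that pattern (the "model" here is a stand-in dict of scores, not a real loaded model):

```python
# init()/run() scoring-script pattern for real-time inference.
import json

model = None

def init():
    """Load the model once when the container starts."""
    global model
    model = {"cat": 0.99218, "feline": 0.81242}  # stand-in for a real model load

def run(raw_data):
    """Handle one scoring request; always return JSON, never raise to the caller."""
    try:
        payload = json.loads(raw_data)  # a real script would score `payload`
        return json.dumps([model])
    except Exception as exc:
        return json.dumps({"error": str(exc)})

init()
response = run('{"image_id": "42"}')
```

Catching exceptions inside run() keeps a malformed request from crashing the service, which matters once the endpoint sits behind an orchestrator like Kubernetes.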
Capture for each training run:
• Training data
• Featurization code (w/ tests)
• Training pipeline
• Training environment
• Evidence chain
• Model config
• Training job info
• Sample data
• Data profile
Use repeatable pipelines for your ML workflow – they can get complicated.
Source Control
• Track changes in code (and configuration) over time to integrate work and enable reproducibility and collaboration.
Dataset Versioning
• Training data plays an important role in the quality of the software build, so versioning of data is required for reproducibility.
Model Versioning
• Version trained models in relation to code and training data for traceability.
Experiment Tracking
• Version model experiment runs to understand which code, data and (e.g.) selected features led to what output and performance, and to allow for reproducibility.
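A minimal sketch of experiment tracking that ties metrics back to code and data versions; the field names are illustrative, not any specific tool's API:

```python
# Toy experiment tracker: each run records code commit, dataset version,
# parameters and metrics, so any result can be traced back to its inputs.
import hashlib

class ExperimentTracker:
    def __init__(self):
        self.runs = []

    def log_run(self, code_commit, dataset_version, params, metrics):
        """Record one run; derive a stable run id from its inputs."""
        key = f"{code_commit}:{dataset_version}:{sorted(params.items())}"
        run_id = hashlib.sha1(key.encode()).hexdigest()[:8]
        self.runs.append({"run_id": run_id,
                          "code_commit": code_commit,
                          "dataset_version": dataset_version,
                          "params": params,
                          "metrics": metrics})
        return run_id

    def best(self, metric):
        """Return the run with the highest value of `metric`."""
        return max(self.runs, key=lambda r: r["metrics"][metric])

tracker = ExperimentTracker()
tracker.log_run("abc123", "v1", {"lr": 0.1}, {"accuracy": 0.80})
tracker.log_run("def456", "v2", {"lr": 0.01}, {"accuracy": 0.90})
best = tracker.best("accuracy")
```

Because the best run carries its commit and dataset version, the winning model is reproducible by construction.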
Edge cases
• The model response on a given record is not the expected one.
• Investigate the training set and detect potential bias.
• Ensure that the preprocessing is not clipping any values, etc.
• Document these corner cases & add them to the validation process.
Null values / unknown categories
• These bugs concern the resiliency of the model to missing values, and how well it handles unseen categorical values.
Input issues
• An input stream may stop producing data, causing unexpected responses from the model.
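The bug classes above can be guarded against in preprocessing. A minimal sketch, with illustrative field names and thresholds: missing values get an explicit default, unseen categories map to a dedicated bucket, and out-of-range values are flagged rather than silently clipped:

```python
# Defensive preprocessing for common ML serving bugs.

KNOWN_CATEGORIES = {"cat", "dog", "bird"}  # categories seen at training time

def preprocess(record):
    out = dict(record)
    # Null values: impute an explicit default instead of crashing.
    if out.get("weight") is None:
        out["weight"] = 0.0
    # Unknown categories: map to a bucket the model was trained to expect.
    if out.get("species") not in KNOWN_CATEGORIES:
        out["species"] = "<unknown>"
    # Edge cases: flag out-of-range values rather than silently clipping them.
    out["suspect"] = not (0.0 <= out["weight"] <= 500.0)
    return out
```

Flagged records can be logged and added to the validation suite, closing the "document these corner cases" loop.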
Test Type              Data Scientist   App Dev / Ops
Unit Tests                              X
Data Integrity Tests   X
Model Performance      X
Model Validation       X
Integration Tests      X                X
Load Tests                              X
Data Monitoring        X
Skew Monitoring        X
Model Monitoring       X                X
What to validate:
• Data (changes to shape / profile)
• Model in isolation (offline A/B)
• Model + app (functional testing)
Progressive rollout:
• Only deploy after initial validation passes
• Ramp up traffic to the new model using A/B experiments
What to monitor:
• Functional behavior
• Performance characteristics
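Ramping traffic to a new model can be sketched with deterministic hash-based routing, so each user is consistently assigned to one variant while the new model's share grows as validation confidence grows (all names are illustrative):

```python
# Deterministic A/B traffic split for a model rollout.
import hashlib

def route(user_id, new_model_share):
    """Send roughly `new_model_share` of users to the new model, sticky per user."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    return "new" if bucket < new_model_share * 100 else "old"

# Ramp schedule might go 5% -> 25% -> 100% as checks pass; here, 25%:
assignments = {uid: route(uid, 0.25) for uid in (f"user{i}" for i in range(1000))}
new_share = sum(1 for v in assignments.values() if v == "new") / len(assignments)
```

Hashing the user id (rather than random assignment per request) keeps each user's experience stable during the experiment.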
• Which data?
• Which experiment / previous model(s)?
• Where's the code / notebook?
• Was it converted / quantized?
• Private / compliant data?
• Focus on ML, not DevOps
• Get telemetry for service health and model behavior
• code-generation
• API specifications / interfaces
• Cloud Services
• Mobile / Embedded Applications
• Edge Devices
• Quantize / optimize models for target platform
• Compliant + Safe
© Microsoft Corporation
DevOps brings together people, processes, and technology, automating software delivery to provide continuous value to your users. Using Azure DevOps, you can deliver software faster and more reliably, no matter how big your IT department is or what tools you're using.
DevOps for ML: Supporting Technologies
Infrastructure as Code:
• Azure Resource Manager Templates
• Azure ML Python SDK & CLI
• Azure SDKs
CI/CD:
• Azure DevOps Pipelines
• Azure ML Training Services
• Azure Repos / GitHub
• Azure Boards
Testing / Release / Monitoring:
• Azure DevOps for automated testing
• R – RUnit and testthat
• Python – PyUnit, pytest, nose, …
• Azure ML Tracking
• Azure Data Prep SDK (analyse/profile)
• Azure ML Model Management (instrumentation, telemetry)
• Azure Monitor for app telemetry
Azure Machine Learning service: a set of Azure cloud services, with a Python SDK, that enables you to:
✓ Prepare Data
✓ Build Models
✓ Train Models
✓ Manage Models
✓ Track Experiments
✓ Deploy Models
Azure Machine Learning – Key Concepts (personas: IT/Ops, ML Scientist, Dev/Ops)
Azure ML service artifact: the Workspace
The workspace is the top-level resource for the Azure Machine Learning service. It provides a centralized place to work with all the artifacts you create when using the Azure Machine Learning service.
The workspace keeps a list of compute targets that can be used to train your model. It also keeps a history of the training runs, including logs, metrics, output, and a snapshot of your scripts. Models are registered with the workspace.
You can create multiple workspaces, and each workspace can be shared by multiple people.
When you create a new workspace, it automatically creates these Azure resources:
• Azure Container Registry – registers Docker containers that are used during training and when deploying a model.
• Azure Storage – used as the default datastore for the workspace.
• Azure Application Insights – stores monitoring information about your models.
• Azure Key Vault – stores secrets used by compute targets and other sensitive information needed by the workspace.
Azure ML service key artifacts center on the Workspace (clone → edit → submit).
ML Pipelines
• Increase experiment velocity, reliability and repeatability.
• Use the technology of your choice for each step.
• Create & manage ML workflows concurrently.
• Define steps to prepare data, train, deploy and evaluate.
• Use diverse languages & run on diverse compute.
• Easy to compose and swap out steps as your workflow evolves.
Features
• Sequencing and parallelization of steps, with declarative data dependencies.
• Unattended execution for long-running pipelines; mixed and diverse (heterogeneous) compute for steps.
• Data management and reusable components: share pipelines, code, intermediate data, and models.
• REST API w/ parameters enables retraining and batch scoring.
• Fine controls for compute provisioning and deprovisioning.
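The pipeline features above (step sequencing, declarative data dependencies, reuse of intermediate outputs) can be illustrated with a tiny in-memory sketch; this is a conceptual toy, not the Azure ML Pipelines API:

```python
# Toy pipeline with declarative dependencies: a step runs only after
# its inputs are ready, and already-computed outputs are reused.

class Pipeline:
    def __init__(self):
        self.steps = {}   # step name -> (function, list of dependency names)
        self.cache = {}   # step name -> computed output (enables reuse)

    def step(self, name, func, depends_on=()):
        self.steps[name] = (func, list(depends_on))

    def run(self, name):
        if name in self.cache:                # reuse an earlier output
            return self.cache[name]
        func, deps = self.steps[name]
        inputs = [self.run(d) for d in deps]  # resolve dependencies first
        self.cache[name] = func(*inputs)
        return self.cache[name]

pipe = Pipeline()
pipe.step("prep", lambda: [1, 2, 3])
pipe.step("train", lambda data: sum(data) / len(data), ["prep"])
pipe.step("eval", lambda model: {"score": model}, ["train"])
result = pipe.run("eval")
```

Running "eval" pulls in "train" and "prep" automatically; a second run of any step hits the cache, which is the reuse behavior the slide describes.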
Azure ML – Models and Model Registry: Model → Model Registry → Model Deployment.
Azure DevOps Pipelines
Cloud-hosted pipelines for Linux, Windows and macOS.
Any language, any platform, any cloud
Build, test, and deploy Node.js, Python, Java, PHP, Ruby, C/C++, .NET, Android, and iOS apps. Run in parallel on Linux, macOS, and Windows. Deploy to Azure, AWS, GCP or on-premises.
Extensible
Explore and implement a wide range of community-built build, test, and deployment tasks, along with hundreds of extensions from Slack to SonarCloud. Support for YAML, reporting and more.
Containers and Kubernetes
Easily build and push images to container registries like Docker Hub and Azure Container Registry. Deploy containers to individual hosts or Kubernetes.
https://azure.com/pipelines
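As a concrete illustration, a minimal Azure Pipelines YAML definition might look like the sketch below; the script lines and file paths are placeholders, not part of any real project:

```yaml
# Minimal Azure Pipelines sketch: build and test on every push to main.
trigger:
- main

pool:
  vmImage: 'ubuntu-latest'

steps:
- script: pip install -r requirements.txt
  displayName: 'Install dependencies'
- script: pytest tests/ --junitxml=results.xml
  displayName: 'Run unit tests'
- task: PublishTestResults@2
  inputs:
    testResultsFiles: 'results.xml'
```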
Editor's Notes
  • #3: Continuous Integration (CI) enables individual developers to collaborate more effectively and blend their work into a code repository; each time you commit code, it's automatically built and tested, and bugs are detected faster. Continuous Delivery (CD) is the process to build, test, configure and deploy from a build to a production environment. Key here is repeatability and consistency: make the process well understood, repeatable by others, and an aid in verifying correctness. CI: increase code coverage; build faster by splitting test and build runs; automatically ensure you don't ship broken code; run tests continually. CD: automatically deploy code to production; ensure deployment targets have the latest code; use tested code from the CI process. More info: https://docs.microsoft.com/en-us/azure/devops/learn/what-is-devops
  • #11: Here is the data scientist’s inner loop of work
  • #16: Make this slide an animation. Developers work in the IDE of their choice on the application code and commit it to the source control of their choice (VSTS has good support for various source controls). Separately, data scientists work on developing their model; once happy, they publish it to a model repository (can be extended with Vienna). A build is kicked off in VSTS based on the commit in GitHub. The VSTS build pipeline pulls the latest model from the blob container (can be extended with the Vienna Model Management Service) and creates a container, then pushes the image to a private image repository in Azure Container Registry. On a set schedule (nightly), the release pipeline is kicked off: the latest image from ACR is pulled and deployed across the Kubernetes cluster on ACS. User requests for the app go through the DNS server, which passes the request to the load balancer, and the response is sent back to the user.
  • #17: Same animation and narration as slide #16.
  • #18: Same animation and narration as slide #16.
  • #19: Same animation and narration as slide #16.
  • #21: Same animation and narration as slide #16.
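  The build-and-release flow these notes narrate (commit triggers a build, the pipeline fetches the latest model, builds and pushes a container image to ACR, and a nightly release rolls it out to Kubernetes) can be sketched as a pipeline definition. This is a minimal, hypothetical Azure Pipelines YAML sketch, not the deck's actual pipeline; the repository, registry connection, and deployment names are assumptions.

  ```yaml
  # Hypothetical sketch of the flow in the notes above: CI build on commit,
  # bundle the latest model into an image, push to ACR, nightly rollout.
  trigger:
    branches:
      include: [ main ]          # CI: fires on each commit to GitHub

  schedules:
    - cron: "0 2 * * *"          # nightly release, as described in the notes
      branches:
        include: [ main ]

  steps:
    - script: |
        # pull the latest model artifact from the Blob container
        az storage blob download \
          --container-name models --name model.pkl --file model.pkl
      displayName: Fetch latest model

    - task: Docker@2
      inputs:
        command: buildAndPush
        repository: myteam/scoring-image     # hypothetical ACR repository
        containerRegistry: myAcrConnection   # hypothetical service connection
        tags: $(Build.BuildId)
      displayName: Build and push image to ACR

    - script: |
        # roll the new image out across the Kubernetes cluster
        kubectl set image deployment/scoring \
          scoring=myregistry.azurecr.io/myteam/scoring-image:$(Build.BuildId)
      displayName: Deploy to Kubernetes
  ```

  Tagging the image with `$(Build.BuildId)` keeps each deployed container traceable back to the build that produced it, which is the audit property the later notes emphasize.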
  • #22: (from the team chat)
  [10:50 AM] Tim Scarfe: Jordan Edwards, thanks for this! Assumptions: 1) the model store is keyed in some way on the build ID and/or the git commit ID? 2) the ML pipeline is calling out to Databricks using the Jobs API with Python sources checked into git, i.e. not calling a mutable notebook.
  [11:23 AM] Jordan Edwards: Tim Scarfe, yes, the model is pinned with the git commit as well as the pipeline/build ID (so you have an audit trail to exactly how it was produced), and yes, the job should submit sources that are in git, not in a magic notebook on the file system. <https://teams.microsoft.com/l/message/19:bfb1b4d771ff441393e2c89c9e80d14c@thread.skype/1547059832334?tenantId=72f988bf-86f1-41af-91ab-2d7cd011db47&groupId=66aa6f64-da6b-491b-b2e3-8e43ae872a7c&parentMessageId=1547054108278&teamName=DevOps for A.I. V-Team&channelName=General&createdTime=1547059832334> Ideally, in my opinion, release will be automated to a staging environment once a new model hits the model store, then integration testing, and then a manual release gate for deployment to production. So I would not have the arrow from the repo with inference-code changes directly triggering a release; changes to inference code should trigger the build pipeline too. Perhaps there is room for triggering a different build pipeline, based on filter conditions (path filters), that follows a separate path other than registering a new model?
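  The audit-trail idea in this exchange, pinning every registered model to the git commit and build ID that produced it, can be sketched in a few lines. This is a toy in-memory sketch; the `ModelRecord` and `ModelRegistry` names are illustrative, not a real model-store SDK.

  ```python
  from dataclasses import dataclass, field
  from datetime import datetime, timezone


  @dataclass(frozen=True)
  class ModelRecord:
      """A registered model pinned to the exact sources that produced it."""
      name: str
      version: int
      git_commit: str  # commit of the training code in source control
      build_id: str    # CI build / ML pipeline run that trained the model
      registered_at: str = field(
          default_factory=lambda: datetime.now(timezone.utc).isoformat())


  class ModelRegistry:
      """Toy in-memory registry; a real one would back onto a durable store."""

      def __init__(self):
          self._models = {}

      def register(self, name, git_commit, build_id):
          # Versions are monotonically increasing per model name.
          version = len(self._models.get(name, [])) + 1
          record = ModelRecord(name, version, git_commit, build_id)
          self._models.setdefault(name, []).append(record)
          return record

      def lineage(self, name, version):
          """Answer the audit question: which commit/build produced this model?"""
          record = self._models[name][version - 1]
          return record.git_commit, record.build_id


  registry = ModelRegistry()
  rec = registry.register("churn-model", git_commit="3f9c2ab", build_id="20190109.4")
  print(registry.lineage("churn-model", rec.version))  # ('3f9c2ab', '20190109.4')
  ```

  With this lineage recorded, a staged release pipeline (staging, integration tests, manual gate to production, as Jordan describes) can always trace a deployed model back to its exact sources.
  
  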
  • #59: The What… Azure Pipelines is our offering for the heart of your DevOps needs: CI/CD, continuous integration and deployment. Azure Pipelines is the perfect launchpad for your code, automating everything from your builds to your deployments, so you spend less time on the nuts and bolts and more time being creative. At Microsoft we do just that: we deploy over 78k times a day with Azure Pipelines. Open & extensible… It's great for any type of application, on any platform or any cloud, with cloud-hosted pools of Linux, Mac, and Windows VMs that we manage for you. You're not restricted to the functionality we provide; Pipelines has rich extensibility, and partners and the community can contribute extensions to our marketplace for everyone. One of my favourite things is when new extensions show up: we have over 500 today, ranging from community-built tools to services from Slack to SonarCloud. If you want to build and test a Node app in a GitHub repo and deploy it via a Docker container to AWS, go for it. Containers / Modern… Containers are more and more becoming the unit of deployment, and Azure Pipelines is great for containers: it can build images, push them to container registries like Docker Hub and Azure Container Registry, and deploy to any container host, including Kubernetes. Transition… Donovan is going to show us Azure Pipelines in action.