DevOps for AI Apps
Richin Jain, Software Engineer (@richinjain)
Vivek Gupta, Data Scientist (@gkeviv)
Agenda
• Background
• DevOps Introduction
• Enterprise AI use case
• Workflows
• Traditional v/s AI App
• Proposed approach
• Pipeline
Background
• Over the course of engaging with various enterprise customers on their AI use cases, DevOps for AI apps has been a common ask.
• Data Science used to be a one-off task that produced a single answer.
• It is now integrated into real-time applications, along with retraining and A/B testing.
• Because of this, we have to take a fresh look at the Data Science process and how it can be integrated with the existing software stack.
• The goal was to look at best practices from software engineering and how they could best be applied here.
What is DevOps?
DevOps brings together people, processes, and technology, automating software delivery to provide continuous value to your users.
Continuous Integration (CI)
• Focuses on blending the work of individual developers together into a repository.
• Each time you commit code, it’s automatically built and tested and bugs are detected faster.
Continuous Deployment (CD)
• Automate the entire process from code commit to production if your CI/CD tests are successful.
Continuous Learning & Monitoring
• Using CI/CD practices, paired with monitoring tools, safely deliver features to your customers as soon as they’re ready.
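The CI/CD idea above (build and test on every commit, and deploy only if the tests succeed) can be sketched as a fail-fast stage runner. This is a minimal illustration, not the pipeline from the talk: the `Stage` class, the `run_pipeline` function, and the three stage names are hypothetical, and each stage is a stand-in callable rather than a real build or test tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One pipeline step; `run` returns True on success (hypothetical type)."""
    name: str
    run: Callable[[], bool]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail fast).

    Returns (success flag, names of the stages that completed).
    """
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            return False, completed
        completed.append(stage.name)
    return True, completed


# Stand-in stages; a real pipeline would call a build tool or a test runner.
stages = [
    Stage("build", lambda: True),
    Stage("test", lambda: True),
    Stage("deploy", lambda: True),
]
ok, done = run_pipeline(stages)
```

Because the runner returns at the first failing stage, a failed test stage means the deploy stage never runs, which is the property that makes automatic deployment safe.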
[Figure: single environment vs. multiple environments]
DevOps Maturity
Configure → Code → Build → Test → Package → Deploy → Monitor
• Infrastructure as Code
• Continuous Integration
• Automated Testing
• Continuous Deployment
• Release Management
• Load Testing & Auto-Scale
• App Performance Monitoring
Data Science
Activities
• Experimentation
• Modeling
• Versioning
• Lineage
• Conversion
• Export
• Quantization
• Inferencing
• Retraining
• A/B Testing
Pain Points
• Need to solve ML problem quickly.
• ML stack might be different from rest of the application stack.
• Lots of glue code.
• Testing accuracy of ML model.
• ML code is not always version controlled.
• Hard to reproduce models
• Integrating model into application can take weeks
• Need to re-write featurizing and scoring code multiple times (in different languages)
• Want to start using customer data to build models
• Hard to track breaking changes
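One way to approach the versioning, lineage, and reproducibility points above is to record a fingerprint of everything that went into a model. The sketch below is an assumption for illustration, not something the slides prescribe: `model_fingerprint` is a hypothetical helper that hashes the training data, a code version string, and the hyperparameters using only the Python standard library.

```python
import hashlib
import json


def model_fingerprint(data_bytes: bytes, code_version: str, params: dict) -> str:
    """Return a SHA-256 fingerprint of a model's inputs (hypothetical helper).

    Two training runs with the same data, code version, and parameters
    produce the same fingerprint, which makes silent changes visible.
    """
    record = {
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "code_version": code_version,
        "params": params,
    }
    # sort_keys gives a canonical form, so dict ordering does not matter.
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Example values are made up for illustration.
fp = model_fingerprint(b"training-data", "abc123", {"lr": 0.01, "epochs": 5})
```

Storing the fingerprint next to the registered model lets a later run check whether the data, code, or parameters changed since the model was built.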
Enterprise AI use case
• Contoso LLC has an image recognition scenario. Its Data Science team develops a state-of-the-art image recognition model.
• Four ways it could be consumed:
  • A user uploads images to Contoso's website and gets instant results.
  • A user uploads several images, or points to a folder, and gets results.
  • Native mobile app
  • Edge devices
API Based model integration
• Real time
• Batch
Embedded models
• Native Apps
• Edge devices
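The real-time and batch options above can share one scoring function, so the scoring code is written once rather than per integration. The sketch below assumes a stand-in `score` function in place of a real image model; `score_realtime` and `score_batch` are hypothetical wrapper names.

```python
from typing import Dict


def score(image_bytes: bytes) -> str:
    """Stand-in for model inference: labels an image by its byte length."""
    return "large" if len(image_bytes) > 4 else "small"


def score_realtime(image_bytes: bytes) -> Dict[str, str]:
    """Real-time path: one image in, one result out."""
    return {"label": score(image_bytes)}


def score_batch(images: Dict[str, bytes]) -> Dict[str, str]:
    """Batch path: many named images in, one label per name out."""
    return {name: score(data) for name, data in images.items()}


single = score_realtime(b"123456")
many = score_batch({"a.png": b"12", "b.png": b"1234567"})
```

Keeping both entry points as thin wrappers around `score` means a model update changes one function, and both consumers pick it up.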
Basic Workflow: Software Engineer
Basic Workflow: Data Scientist
https://ai.google/research/pubs/pub45742
Data Tests
Model Tests
ML Infrastructure Tests
Monitoring Tests
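As a rough illustration of the first two categories, a data test can check incoming rows against an expected schema and a model test can check accuracy against a threshold. Everything named here is an assumption for the sketch: the schema fields, the `validate_rows`, `accuracy`, and `model_passes` helpers, and the 0.9 default threshold are not taken from the slides or the linked paper.

```python
from typing import Dict, List, Sequence

# Hypothetical schema for rows describing labeled images.
EXPECTED_SCHEMA = {"width": int, "height": int, "label": str}


def validate_rows(rows: Sequence[Dict[str, object]]) -> List[str]:
    """Data test: return a list of schema violations (empty if clean)."""
    errors: List[str] = []
    for i, row in enumerate(rows):
        for field, ftype in EXPECTED_SCHEMA.items():
            if field not in row:
                errors.append(f"row {i}: missing {field}")
            elif not isinstance(row[field], ftype):
                errors.append(f"row {i}: {field} is not {ftype.__name__}")
    return errors


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def model_passes(predictions: Sequence[str], labels: Sequence[str],
                 threshold: float = 0.9) -> bool:
    """Model test: pass only if accuracy reaches the threshold."""
    return accuracy(predictions, labels) >= threshold
```

Checks like these can run in the same unit-test step as ordinary code tests, so a bad dataset or a regressed model fails the build in the same way a code bug would.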
Proposed Approach
Data Scientist → Data-Science-Repo
• Get Source Code
• Create Conda Environment
• Install Requirements
• Pylint
• Unit-test
• Code Coverage
• Publish test results
Data Scientist → Data-Science-Repo (Pull Req. Pass)
• Get Source Code
• Create Conda Environment
• Install Requirements
• Unit-test
• Register Model
• Create Docker Image
• Deploy on Test
• Test deployed image
• Model Testing and Validation
Test environment (continuous deployment)
• Build Artifact
• Create Conda Environment
• Install Requirements
• Deploy to Test (create/update)
Staging Environment (nightly, other services test here)
• Dev Artifact
• Create Conda Environment
• Install Requirements
• Deploy to Staging (update)
Prod environment (end of sprint)
• Test Artifact
• Create Conda Environment
• Install Requirements
• Deploy to Prod (update)
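The three environments above form a promotion chain: an artifact moves from Test to Staging to Prod, each on its own schedule. The sketch below models that chain under stated assumptions: the environment names and triggers come from the slides, while the `Environment` class and the `next_environment` and `promote` functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Environment:
    name: str
    trigger: str  # when deployments to this environment happen


# Order and triggers as listed on the slides.
ENVIRONMENTS: List[Environment] = [
    Environment("test", "continuous deployment"),
    Environment("staging", "nightly"),
    Environment("prod", "end of sprint"),
]


def next_environment(current: str) -> Optional[str]:
    """Return the environment after `current`, or None if it is the last."""
    names = [env.name for env in ENVIRONMENTS]
    idx = names.index(current)  # raises ValueError for an unknown name
    return names[idx + 1] if idx + 1 < len(names) else None


def promote(current: str, checks_passed: bool) -> str:
    """Return where the artifact should be after the promotion gate.

    The artifact stays put if checks fail or if it is already in the
    last environment.
    """
    if not checks_passed:
        return current
    nxt = next_environment(current)
    return nxt if nxt is not None else current
```

Making the chain explicit keeps the rule that nothing reaches Prod without having passed through Test and Staging first.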
Convert model to other formats: ONNX, CoreML, WinML (end of sprint, every time there is a new model)
• Get Source
• Create Conda Environment
• Install Requirements
• Convert model to other formats
Retraining Pipeline (every night, or triggered when new data is uploaded to blob)
• Get Source
• Create Conda Environment
• Install Requirements
• Retrain model on new data
• Data Validation
• A/B Testing
• Model Testing and Validation
• Model Management & Promotion
The pipeline runs in a pre-prod environment, so it has access to production data; the retrained model is not promoted unless it passes A/B tests against prod data.
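The promotion rule for a retrained model can be sketched as a comparison between the candidate and the current production model on the same labeled data. This is a simplified offline comparison, not a full A/B test with live traffic and significance checks; `should_promote` and the `min_gain` margin are hypothetical names introduced for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def should_promote(candidate_preds: Sequence[str],
                   production_preds: Sequence[str],
                   labels: Sequence[str],
                   min_gain: float = 0.0) -> bool:
    """Promote only if the candidate beats production by more than min_gain.

    A tie is not enough: the retrained model must be strictly better.
    """
    candidate = accuracy(candidate_preds, labels)
    production = accuracy(production_preds, labels)
    return candidate > production + min_gain


labels = ["cat", "dog", "cat", "dog"]
production_preds = ["cat", "dog", "dog", "cat"]  # 2 of 4 correct
candidate_preds = ["cat", "dog", "cat", "cat"]   # 3 of 4 correct
decision = should_promote(candidate_preds, production_preds, labels)
```

Requiring a strict improvement means a nightly retrain that merely matches production leaves the deployed model untouched.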
• Core features of Azure ML service
exposed through a Python SDK and CLI.
• Easy and simple pip install
• Makes CI/CD much simpler.
DevOps for AI Apps
• Best practices for architecting and managing an enterprise-ready AI application
lifecycle.
• Azure DevOps and Azure ML ease the adoption of DevOps by DS teams.
• Adoption will increase the agility, quality and delivery of DS teams.
Thank you!

Editor's Notes

  • #6: https://blogs.msdn.microsoft.com/visualstudioalmrangers/2017/04/20/set-up-a-cicd-pipeline-to-run-automated-tests-efficiently/
  • #7: Microsoft DevOps site - https://www.microsoft.com/en-us/cloud-platform/development-operations Source - http://www.itproguy.com/devops-practices/
  • #14: Link to Google paper - https://ai.google/research/pubs/pub45742
  • #15: Link to Google paper - https://ai.google/research/pubs/pub45742
  • #21: https://azure.microsoft.com/en-us/blog/what-s-new-in-azure-machine-learning-service/