• PUBLIC 公開
Well Architected ML Platforms for Data Science
Reliable Machine Learning Lifecycle
Battle-tested CRISP-DM Model for ML Implementations
• Goal Definition
• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Evaluation
• Deployment
• Maintenance
Feedback (Re-train): Outliers, Model Regression, Recall
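The maintenance-to-retraining feedback loop on this slide can be sketched as a small decision function. This is a minimal sketch, not the presenter's implementation: it reads the slide's Outliers, Model Regression and Recall labels as the signals that feed the re-train loop (an interpretation of the diagram), and the class name, field names and thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceSignals:
    """Signals observed on a deployed model (all names are hypothetical)."""
    baseline_recall: float   # recall measured at evaluation time
    current_recall: float    # recall measured on recent labelled traffic
    outlier_rate: float      # share of recent inputs flagged as outliers


def retrain_reasons(signals, max_recall_drop=0.05, max_outlier_rate=0.10):
    """Return the reasons to re-train; an empty list means keep serving.

    The two thresholds are illustrative defaults, not values from the deck.
    """
    reasons = []
    if signals.baseline_recall - signals.current_recall > max_recall_drop:
        reasons.append("model regression: recall dropped")
    if signals.outlier_rate > max_outlier_rate:
        reasons.append("outliers: input distribution looks unusual")
    return reasons


if __name__ == "__main__":
    degraded = MaintenanceSignals(baseline_recall=0.90, current_recall=0.80, outlier_rate=0.20)
    print(retrain_reasons(degraded))
```

Returning the reasons rather than a bare boolean keeps the trigger auditable, which fits the governance goals listed later in the deck.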
https://d1.awsstatic.com/whitepapers/architecture/wellarchitected-Machine-Learning-Lens.pdf
Platform Capabilities
1. Autoscaling Cloud-native ML/AI Platform with Identity Access Management and Data Classification
2. Data collection and Curation for Business Analytics Reporting and ML Model Preparation
3. Self-service Training and Experimentation for Forecasting and Simulation
4. Prediction performance and Feedback from drift of forecast
5. ML as a Service Model Deployment
Diagram labels: Decisions, Insights, Data Lake, IAM, PaaS/DBaaS
ML Ops Platform – Technical Architecture 1
ML Ops Platform – Technical Architecture 2
Data Sources
• Marts: Customer, Sales, Risks, Account
• Salesforce, SAS, Click Stream, Speech, Telematics
• Customer, Bus Ops, Risk, Profitability, Click Stream
Data Sourcing & Data Wrangling
• Data Collection
• Data Curation
• Feature Engineering
Data Scientists, ML Engineers
Data and Model Exploration (AWS)
• JupyterHub Notebooks, SparkMagic, Livy REST API
• EMR Spark, EMR Presto
• Spark MLlib, TensorFlow, H2O Sparkling Water
• Git, DVC, MLflow
• Model Registry (Versioning)
• Training, Validation
Advanced Analytics Engine
• Model Repository
• Model Inference
Model Deployment
• ML Ops Deployment pipeline (Jenkins)
• API Release, Predict API
• EKS
Storage: S3
• Heterogeneous data sources
• Interoperability and Integration
• Schema Management using Parquet
• Data discovery Catalog
• Data Lineage and Classification
• Data Governance and Privacy
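Schema management for data landing in the lake can be illustrated with a small compatibility check. Parquet files store their schema in the file metadata; in this sketch plain {column: type} dictionaries stand in for schemas read from that metadata, and the function names, column names and the "additive changes only" rule are assumptions made for illustration rather than anything stated on the slide.

```python
def schema_diff(expected, actual):
    """Compare two {column: type} mappings and report how they differ."""
    shared = set(expected) & set(actual)
    return {
        "added": sorted(set(actual) - set(expected)),
        "removed": sorted(set(expected) - set(actual)),
        "changed": sorted(c for c in shared if expected[c] != actual[c]),
    }


def is_backward_compatible(diff):
    """Assumed policy: new columns are fine, dropped or retyped columns are not."""
    return not diff["removed"] and not diff["changed"]


if __name__ == "__main__":
    # Hypothetical schemas for a customer mart extract.
    registered = {"customer_id": "int64", "segment": "string"}
    incoming = {"customer_id": "int64", "segment": "string", "region": "string"}
    diff = schema_diff(registered, incoming)
    print(diff, is_backward_compatible(diff))
```

Running a check like this before curation lets the catalog record a new schema version instead of silently breaking downstream feature pipelines.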
Plan
• System Appreciation and Discovery
• Gap Analysis
• Component Design
• Team structure and RACI
• Milestones and Timelines
Build
• Infrastructure
• Data Systems
• Security
• Governance
Evangelize
• Socialize
• Demo
• Training
• Benchmark
• Pilot
Adopt
• Transform
• Dual Support
• Bridge
• Decommission Legacy
Run
• Release Management
• Warranty
• Operational Support
• Design Lifecycle
• Iterative Releases
• DOCKERIZED FLASK API
• EKS CONTAINER REGISTRY
• REDUCE API BUILD DEPENDENCIES
• ENABLE FARGATE SERVERLESS GATEWAY
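A prediction endpoint of the kind this slide describes might look roughly like the sketch below. The slide names Flask; to keep the example free of third-party dependencies it uses a plain WSGI callable instead, which is the interface a Flask application also implements. The /predict route, the JSON payload shape and the fixed-weight scoring function are all hypothetical stand-ins for a real model.

```python
import json


def predict(features):
    """Placeholder model: a fixed linear score with hypothetical weights."""
    weights = {"recency": -0.2, "frequency": 0.5}
    return sum(weights.get(name, 0.0) * float(value) for name, value in features.items())


def app(environ, start_response):
    """Minimal WSGI application exposing POST /predict."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        body, status = {"error": "not found"}, "404 Not Found"
    else:
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            payload = json.loads(environ["wsgi.input"].read(length))
            body, status = {"score": predict(payload["features"])}, "200 OK"
        except (KeyError, TypeError, ValueError, AttributeError):
            body = {"error": "expected JSON with a 'features' object"}
            status = "400 Bad Request"
    data = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(data)))])
    return [data]


if __name__ == "__main__":
    # Local smoke test only; a container image would run a production WSGI server.
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, app).serve_forever()
```

Keeping the service to a single module with no extra packages is one way to read the slide's "reduce API build dependencies" point: the smaller the dependency set, the smaller and faster the container build.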
ML Ops Platform Goals
• Streamlined Data collection
• Version controlled Feature Engineering
• Collaborative Discovery of features
• Distributed Training with Validation
• Reliable ML as a service
• Prediction performance
• Drift monitoring
• Model governance and fairness
1. Data Science Teams
• Conceptualization
• Requirements
• Prototype
• Design Review
2. ML Platform Team
• JAD - Joint Application Design
• Design Approval (JIRA)
• Model Development
• Coding
• Training
• Cross Validation
3. API Development Team
• API Requirements
• API Security
• API Catalog
• API Integration
4. DevOps Team
• CI/CD for Data pipelines
• CI/CD for Training
• CI/CD for Model API
5. Infra Team
• EKS
• Network
• Monitoring
6. *Governance Team
• Model Governance
• Model Fairness
• Model Monitoring
7. Data Science Teams
• Model monitoring
• Model feedback
1. Model Conceptualization
• Business need
• Market Research
• Customer Feedback
• Business value
2. Model Requirements
• Business potential - Inputs from Model Conceptualization
• Data sources
• Coordinating Customer and Technical requirements
• Prototype
3. Model Prototype
• Data Collection
• Data Curation
• Feature Engineering
• Training and Prediction
4. Model Design Review
• JAD - Joint Application Design
• Design Approval
5. Model Development
• Operational dependencies
• Integration dependencies
• Coding
• Continuous Training
• Feature Engineering
• Training
• Cross validation
• Tracking
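The cross validation and tracking items under Model Development can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: folds are contiguous and unshuffled, the "model" is a mean-value baseline, and the score is mean absolute error; the function names are hypothetical and the deck does not say which validation scheme or metrics the platform uses.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


def cross_validate_baseline(ys, k):
    """Score a mean-value baseline per fold; the returned list is the tracked result."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(ys), k):
        prediction = mean(ys[i] for i in train_idx)
        scores.append(mean(abs(ys[i] - prediction) for i in val_idx))
    return scores


if __name__ == "__main__":
    targets = [3.0, 4.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0]
    print(cross_validate_baseline(targets, k=3))
```

In a platform like the one described, the per-fold scores would be logged to the experiment tracker alongside the code and data versions so a run can be reproduced.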
1. Model DevOps
• Unit tests
• Code coverage
• CI/CD
2. Model Quality Assurance
• Model performance
• System performance
3. Model Deployment
• Cloud native
• Containerized
• Data Pipeline
• ML Service
• Monitoring and Logging
4. Model Integration
• Service
• Data Feed
• Dashboard and Visualization
5. Model Management
• Model Repository
• Model Catalog
• Model monitoring
• Simulation
6. Model Governance and Fairness
• Audit
• Explanation
• Trace back
7. Model Feedback
• Recall and Bias
• Training Outliers
• Continuous Training
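The "Recall and Bias" feedback item can be made concrete by computing recall separately for each group of interest and looking at the gap. The sketch below is one possible reading, not a method taken from the deck: binary labels, the group attribute and the use of the recall gap as a bias indicator are all assumptions.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Recall (true positives / actual positives) per group, for binary 0/1 labels."""
    positives = defaultdict(int)
    true_positives = defaultdict(int)
    for truth, prediction, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if prediction == 1:
                true_positives[group] += 1
    # Groups with no actual positives are left out: recall is undefined for them.
    return {g: true_positives[g] / positives[g] for g in positives}


def recall_gap(recalls):
    """Largest difference in recall between any two groups."""
    return max(recalls.values()) - min(recalls.values())


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 1]
    groups = ["a", "a", "b", "b", "a", "b"]
    recalls = recall_by_group(y_true, y_pred, groups)
    print(recalls, recall_gap(recalls))
```

A persistent gap between groups is the kind of signal that would be routed back into the continuous training step, together with the outliers found in training data.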
Monitoring
• Inputs and decision outputs
• Model fairness characteristics
• Overfitting, skew, bias
• Concept Drift
• Outliers
• Revised Training
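Drift in model inputs or decision outputs is often quantified by comparing a recent sample against the training-time distribution. The slide does not name a metric; the sketch below uses the Population Stability Index (PSI) as one common choice, with equal-width bins taken from the reference sample. The function name, bin count and the small floor used for empty bins are assumptions.

```python
import math


def population_stability_index(reference, recent, bins=10):
    """PSI = sum((recent_share - ref_share) * ln(recent_share / ref_share)) over bins."""
    if not reference or not recent:
        raise ValueError("both samples must be non-empty")
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # avoid zero width for constant samples

    def shares(values):
        counts = [0] * bins
        for value in values:
            # Values outside the reference range fall into the first or last bin.
            index = min(max(int((value - low) / width), 0), bins - 1)
            counts[index] += 1
        # Floor empty bins so the logarithm stays defined.
        return [max(count / len(values), 1e-6) for count in counts]

    ref_shares, recent_shares = shares(reference), shares(recent)
    return sum((r - e) * math.log(r / e) for e, r in zip(ref_shares, recent_shares))


if __name__ == "__main__":
    training_scores = [float(v) for v in range(100)]
    shifted_scores = [v + 50.0 for v in training_scores]
    print(population_stability_index(training_scores, training_scores))
    print(population_stability_index(training_scores, shifted_scores))
```

A widely used rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but any threshold that triggers revised training should be tuned per model.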


Well architected ML platforms for Enterprise Data Science
