1
Machine Learning Platform for AI
Senior Solutions Architect
임종진
Alibaba Cloud
2022.02
2
Introduction
3
Machine Learning Platform for AI (PAI)
PAI , .
4
Machine Learning Framework (Alink / MPI / PS / Graph / TensorFlow / PyTorch / Caffe…)
Compute Engine (MaxCompute / EMR / Realtime Compute)
PAI-EAS: online prediction
• One-click deployment
• High performance
• Blue-green deployment
• Blade compilation optimization
Basic hardware (CPU, GPU, FPGA, NPU)
Alibaba Cloud Container Service for K8S (ACK)
PAI-Studio: visual modeling
• Nearly 200 algorithm components
• Drag-and-drop experiment building
• Supports many feature samples
PAI-DSW: interactive modeling
• Deep integration with big data engines
• Multiple frameworks: TF, PyTorch
• JupyterLab, WebIDE, Terminal
PAI-DLC: cloud-native deep learning training
• Cloud-native and containerized
• Elasticity
• Out of the box
PAI AutoLearning: automatic learning
• Zero-threshold use
• Transfer learning framework
• One-stop solution
Intelligent ecosystem market (AI Taobao platform)
• AI solutions
• Algorithms & models
• Business application APIs
Intelligent labeling: data collection
• Multi-scenario templates: image detection, segmentation, and comprehensive annotation
• Dataset management
• Active learning *
• Smart pre-labeling
• Elastic scaling
Machine Learning Platform for AI (PAI)
PAI Studio, , ML All-in-One .
5
PAI-Studio
GUI and Distributed Modeling Platform
Data preprocessing, feature engineering, model training, model evaluation
prediction 200
ML
7
90%
PAI Studio PySpark,
Spark
PAI-Studio GUI ML .
6
PAI-DSW
Data Science Workshop
PAI-DSW GPU CPU ,
Deep Learning Lifecycle R&D Cloud AI Notebook .
7
[Architecture diagram]
PAI console: Create, Deploy, Update, Destroy
Arena CLT
Control Panel: Manage DLC Cluster, Manage Jobs
ECS Clusters: VPC / ECS Instance (one per cluster)
DLC Clusters: TF Jobs, PT Jobs, Jupyter, Dashboard
EIP, API Server, Scheduler
Cloud-native
• Fully cloud-based infrastructure
• Kubernetes-native, containerized; supports ACK/ECI
• Semi-hosted and fully hosted
• Elastic resources, dynamic scaling
Linear acceleration
• Data parallelism & model parallelism
• 10 Classification Task
• High cost performance (GN6V, GN5, )
PAI-DLC
Deep Learning Containers: cloud-native deep learning training platform
PAI-DLC Deep Learning .
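The slide lists data parallelism as one of the ways PAI-DLC scales training. The slides do not describe how PAI-DLC implements it, so the following is only a generic, single-process sketch of the idea: each worker computes a gradient on its own shard of the batch, and the gradients are averaged before the shared weight is updated. The toy model (y = w * x), the function names, and the learning rate are all assumptions made for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def grad_mse(w: float, shard: List[Sample]) -> float:
    """Gradient of the mean squared error of y_hat = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def split_evenly(batch: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Give each worker an equally sized slice (any remainder is dropped)."""
    size = len(batch) // num_workers
    return [batch[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_step(w: float, batch: List[Sample], num_workers: int, lr: float) -> float:
    shards = split_evenly(batch, num_workers)
    # On a real cluster each shard gradient is computed on a separate device.
    local_grads = [grad_mse(w, shard) for shard in shards]
    # Averaging stands in for the all-reduce step between workers.
    avg_grad = sum(local_grads) / num_workers
    return w - lr * avg_grad


batch = [(float(x), 3.0 * x) for x in range(1, 9)]  # 8 samples drawn from y = 3x
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, batch, num_workers=4, lr=0.01)
print(round(w, 3))
```

With equally sized shards, the averaged gradient equals the full-batch gradient, which is why adding workers changes throughput but not the result of each step.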
8
PAI-TF
Network layer
AutoML-Tuning: automatic parameter tuning
Transfer learning framework
Business layer Image , , visual
ResNet LeNet VGG GoogLeNet InceptionNet... ...
Zero-threshold use
Small amount of data
One-stop solution
, ,
Out of the box, beginner-friendly
TF .
PAI-AutoLearning
Underlying framework: a transfer learning framework developed on PAI-TF
9
PAI-Tensorflow
Tensorflow
•
• Compilation passes
•
• Pruning
• Code generation
• I/O
PAI-Tensorflow Tensorflow .
10
Cloud-native online services
• , 400,000+ (40W+) QPS
• (traditional learning and deep learning)
• , scaling, blue-green
• Processor SDK
• PAI-Studio, PAI-DSW customer
• PAI-Blade, model compilation
PAI-EAS
Online prediction Elastic Algorithm Service
.
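Blue-green deployment is named among the PAI-EAS features above. As a rough illustration of the concept only (not of how PAI-EAS does it), the sketch below keeps two model versions side by side and shifts a percentage of requests from the old "blue" version to the new "green" one. The `BlueGreenRouter` class, its method names, and the modulo-based bucketing are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Model = Callable[[float], float]


@dataclass
class BlueGreenRouter:
    """Hypothetical router that splits traffic between two model versions."""
    blue: Model             # version currently in production
    green: Model            # candidate version
    green_percent: int = 0  # share of traffic sent to green, 0..100

    def shift(self, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.green_percent = percent

    def route(self, request_id: int) -> str:
        # Deterministic bucketing: the same request id always hits the same version.
        return "green" if request_id % 100 < self.green_percent else "blue"

    def predict(self, request_id: int, x: float) -> float:
        model = self.green if self.route(request_id) == "green" else self.blue
        return model(x)


router = BlueGreenRouter(blue=lambda x: 2.0 * x, green=lambda x: 2.0 * x + 1.0)
router.shift(10)   # send 10% of requests to the new version
green_hits = sum(router.route(i) == "green" for i in range(1000))
print(green_hits)
router.shift(100)  # complete the cut-over; setting 0 would roll back
```

Keeping both versions loaded is what makes rollback cheap: only the traffic percentage changes.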
11
PAI-EAS
Online prediction Elastic Algorithm Service
Cloud Native
ML/DL Model Support
TensorFlow, Caffe, PMML, OfflineModel…
Calculation
Different alarm timepoints
Every task needs an alarm timepoint
RESTful API .
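The slide states that deployed models are exposed through a RESTful API. The request format is not given in the slides, so the following only sketches what a client call could look like using Python's standard library. The endpoint URL, the use of an `Authorization` header for the service token, and the JSON payload shape are all assumptions; the real values come from the service's own details. The request is built but not sent.

```python
import json
import urllib.request


def build_prediction_request(endpoint: str, token: str, features: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for an online prediction service."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": token,           # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


# Placeholder endpoint, token, and payload; none of these come from the slides.
req = build_prediction_request(
    "https://example.invalid/api/predict/demo_service",
    "<service-token>",
    {"inputs": [[0.1, 0.2, 0.3]]},
)
print(req.get_method(), req.full_url)

# Sending would look like this once a real endpoint is available:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     prediction = json.loads(resp.read())
```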
12
:
• CNN (image) TF 4
• CRNN (image TF 1.3
• Segmentation (image class) TF 1.3
• BERT (natural language) TF 2.8
• ASR (voice) TF 2.5
PAI-Blade
One-stop compilation and optimization tool
PAI TF TF 4 .
13
PAI Compilation
Volta GPU Tensorflow .
PAI / 3 .
14
Solution Components
Input REST API ML ,
End-to-End DevOps .
15
Best Practices
16
PAI
• , VOC , , ,
Workflow :
• PAI Studio
• PAI EAS
Product: PAI + MaxCompute + DataWorks
NAS, OSS,…
MaxCompute
Distributed Training
OCR
Machine Translation
NLP
Content Security
DataWorks
DataHub
Image Identification
Risk Control
Brain
Public Opinion
Marketing
Cloud Resources
PAI Studio
PAI Notebook Service (DSW)
Developer
PAI EAS (Elastic Algorithm Service)
AI Security
PAI .
17
PAI
PAI ML AI .
18
PAI
Alibaba .
PAI , RDMA NCCL
.
19
1 ‒
PAI
.
DB Dataworks
PAI Studio .
PAI EAS
Redis
.
20
[Architecture diagram: recommendation system]
Underlying basic data: User data, Material data, Third-party portrait, Comment data
Data sources: RDS: MySQL, DRDS, Nginx, User Behavior Log, Kafka, Flume
Data processing and storage: Data integration (hourly cycle import), MaxCompute DW (User table, Material table), Flink real-time computing (ETL, Statistics business, Real-time features), Kafka
(Offline) User/Material feature engineering: MaxCompute DW (User characteristics, Material characteristics, Behavior characteristics)
Training (offline): PAI-Studio (Sample generation, Recall algorithm), PAI-Studio (Sample generation, Sorting algorithm)
Online recommendation storage: RDS: MySQL (User/material recommendation list), Redis (User/Material Features, User vector)
Model services: PAI-EAS (Inference service), Faiss Server (Vector service), OSS transit (Item vector, Model file)
Online shop service (Online): POLARDB (Sub-Table 1: Reading history, Sub-Table 2: Reading history), K8S
Recommended module: User exposure request, Multi-Channel recall, Exposure deduplication filtering, Sorting, Query K most similar items
Real-time
2 -
PAI
.
RDS, POLAR DB
,
.
PAI
.
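The recommendation diagram names an online flow of multi-channel recall, exposure deduplication filtering, and sorting, with a vector service answering "query K most similar items". The sketch below walks through that flow on toy data. It is an assumption-laden illustration: a brute-force cosine search stands in for the Faiss-backed vector service, cosine similarity stands in for the ranking model served on PAI-EAS, and all function names and sample vectors are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Sequence, Set

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(user_vec: Vector, item_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Brute-force stand-in for the vector service: the K most similar items."""
    ranked = sorted(item_vecs, key=lambda i: cosine(user_vec, item_vecs[i]), reverse=True)
    return ranked[:k]


def recommend(user_vec: Vector, item_vecs: Dict[str, Vector],
              hot_items: Iterable[str], exposed: Set[str], k: int) -> List[str]:
    # Multi-channel recall: a vector channel plus a popularity channel.
    # Every recalled item is assumed to have a vector in item_vecs.
    candidates = top_k_similar(user_vec, item_vecs, k) + list(hot_items)
    # Exposure deduplication filtering: drop repeats and items already shown.
    seen = set(exposed)
    fresh: List[str] = []
    for item in candidates:
        if item not in seen:
            seen.add(item)
            fresh.append(item)
    # Sorting: cosine score is a placeholder for the ranking model's output.
    fresh.sort(key=lambda i: cosine(user_vec, item_vecs[i]), reverse=True)
    return fresh[:k]


item_vecs = {"a": (1.0, 0.0), "b": (0.9, 0.1), "c": (0.0, 1.0), "d": (0.5, 0.5)}
result = recommend((1.0, 0.0), item_vecs, hot_items=["c", "a"], exposed={"a"}, k=2)
print(result)  # ['b', 'c']
```

Item "a" is the closest match but is filtered out because the user has already been shown it, which is the purpose of the exposure history kept in the online store.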
Alibaba Cloud PAI (Machine Learning Platform for AI)
