© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Mastering the AI Cloud - From Data Analysis to Deep Learning (윤석찬 / Channy Yun, AWS Tech Evangelist)
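Database administration is a heavy burden. → Amazon RDS
Relational databases do not scale easily. → Amazon DynamoDB
Hadoop is hard to deploy and manage. → Amazon EMR
Existing data warehouses are complex, expensive, and slow. → Amazon Redshift
Commercial databases are costly and hard to manage and scale. → Amazon Aurora
Real-time data is hard to collect and analyze. → Amazon Kinesis
Can data cleansing (ETL) be made easier? → AWS Glue
I want to build and deploy deep learning models more easily. → Amazon SageMaker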
[Diagram: data sources (Transactions, ERP); Amazon QuickSight; Data analysts]
https://aws.amazon.com/ko/solutions/case-studies/supercell/
[Diagram: data lake. Sources: Transactions, ERP (expdp). Data storage: Amazon S3. Direct query: Amazon Athena. Data warehouse: Amazon Redshift. Users: Data analysts.]
[Diagram: data lake with streaming. Sources: Transactions, ERP, Social media, Events. Data stream capture: Amazon Kinesis. Stream data: Amazon Elasticsearch. Data storage: Amazon S3. Data warehouse: Amazon Redshift. Amazon QuickSight. Users: Business users.]
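The "data stream capture" box in the diagram can be illustrated with a small sketch of how an event might be packaged for Amazon Kinesis Data Streams. This is not from the slides: the stream name `clickstream`, the event fields, and the helpers `build_put_record_args` and `send_event` are hypothetical. Only the `put_record` parameters (`StreamName`, `Data`, `PartitionKey`) belong to the boto3 Kinesis client.

```python
import json


def build_put_record_args(stream_name, event, partition_field="user_id"):
    """Build keyword arguments for the boto3 Kinesis client's put_record call.

    stream_name and partition_field are illustrative assumptions.
    """
    return {
        "StreamName": stream_name,
        # Kinesis stores opaque bytes, so the event is serialized as JSON.
        "Data": json.dumps(event, sort_keys=True).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(event[partition_field]),
    }


def send_event(kinesis_client, stream_name, event):
    """Send one event; kinesis_client would be boto3.client("kinesis")."""
    return kinesis_client.put_record(**build_put_record_args(stream_name, event))


if __name__ == "__main__":
    args = build_put_record_args("clickstream", {"user_id": 42, "page": "/home"})
    print(args["PartitionKey"], args["Data"].decode("utf-8"))
```

Keeping the argument builder separate from the network call makes the record format easy to check without AWS credentials.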
[Diagram: ETL pipeline. Raw data: Amazon S3. ETL (Hadoop): Amazon EMR. Triggered code: AWS Lambda. Staged data (data lake): Amazon S3. ETL and catalog management: AWS Glue. Data warehouse: Amazon Redshift. Triggered code: AWS Lambda.]
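A minimal sketch of the "triggered code" step, assuming the Lambda function is invoked by an S3 event notification when a new raw object arrives. The handler name, the return format, and the idea of reporting the staged objects are assumptions for illustration; the slides do not show the function body.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point; a real function might start an ETL job for each object."""
    staged = extract_s3_objects(event)
    return {"staged_objects": [f"s3://{b}/{k}" for b, k in staged]}
```

In a real pipeline the handler would typically hand each object to the next stage (for example an EMR step or a Glue job) rather than just returning the list.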
[Diagram: sources (Transactions, Connected devices, Web logs / clickstream); Data, Event, Insights; Data lake; ML / Analytics / DL: Amazon SageMaker; users (Data scientists, Business users)]
[Figure: image analysis example. Image classification with Amazon SageMaker: "A QUIET OFFICE". Object identification with Amazon Rekognition Image: CHAIR, LAPTOP, LAMP, DESK, with confidence scores 97%, 95%, 88%, 82%.]
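To make the confidence scores in the figure concrete, here is a small sketch that filters labels by a confidence threshold. The sample dictionary is hand-written in the shape of a Rekognition DetectLabels response (`Labels`, `Name`, `Confidence`); pairing each label with a score in slide order is an assumption, and `labels_above` is a hypothetical helper. In practice the response would come from `boto3.client("rekognition").detect_labels(...)`.

```python
def labels_above(response, min_confidence):
    """Return (name, confidence) pairs at or above min_confidence, highest first."""
    labels = [
        (label["Name"], label["Confidence"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return sorted(labels, key=lambda pair: pair[1], reverse=True)


# Hand-written sample shaped like a DetectLabels response; the pairing of
# names and scores follows the order on the slide and is an assumption.
sample_response = {
    "Labels": [
        {"Name": "Chair", "Confidence": 97.0},
        {"Name": "Laptop", "Confidence": 95.0},
        {"Name": "Lamp", "Confidence": 88.0},
        {"Name": "Desk", "Confidence": 82.0},
    ]
}

if __name__ == "__main__":
    print(labels_above(sample_response, 90.0))
```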
Modern data architecture
Real-time engagement and interactive customer experiences
[Diagram: sources (Transactions, ERP, Connected devices, Social media, Web logs / clickstream); Data, Event, Action, Insights; Data lake; ML / Analytics; Predict / Recommend; AI Services; Automation / events; Engagement platforms; users (Data analysts, Data scientists, Business users)]
AWS DEEP LEARNING AMI
Apache MXNet, TensorFlow, Caffe2, Torch, Keras, CNTK, PyTorch, Gluon, Theano
VISION, LANGUAGE, VR/AR
Amazon Rekognition, Amazon Rekognition Video, Amazon Polly, Amazon Lex, Amazon Transcribe, Amazon Comprehend, Amazon Translate, Alexa for Business, Amazon Sumerian
AWS DeepLens, Amazon SageMaker, Amazon Machine Learning, Amazon EMR & Spark, Mechanical Turk
INSTANCES
GPU (G2/P2/P3), CPU (C5), FPGA (F1)
ALGORITHMS
K-Means Clustering
Principal Component Analysis
Neural Topic Modelling
Factorization Machines
Linear Learner - Regression
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
Linear Learner - Classification
FRAMEWORKS
Apache MXNet, TensorFlow, Caffe2, CNTK, PyTorch, Torch
BUILT-IN ALGORITHMS
(the same ten algorithms listed above)
IM Estimators in Spark
DEEP LEARNING FRAMEWORKS
Caffe2, CNTK, PyTorch, Torch
Bring Your Own Script (IM builds the Container)
BRING YOUR OWN MODEL
ML Training code
Fetch Training data, Save Model Artifacts, Save Inference Image
Amazon ECR, Amazon S3
“In analyzing the experiences of researchers supporting
more than 388 unique projects, Nucleus found that 88
percent of cloud-based TensorFlow projects are
running on Amazon Web Services (AWS).”
https://nucleusresearch.com/research/single/guidebook-tensorflow-aws/
# TensorFlow training job with the SageMaker Python SDK
from sagemaker.tensorflow import TensorFlow

tf_estimator = TensorFlow(
    entry_point="tf-train.py", role="SageMakerRole",
    training_steps=10000, evaluation_steps=100,
    train_instance_count=1, train_instance_type="ml.p2.xlarge")
tf_estimator.fit("s3://bucket/path/to/training/data")

# PyTorch training job on two instances, using the gloo backend
from sagemaker.pytorch import PyTorch

pytorch_estimator = PyTorch(
    entry_point="pt-train.py",
    framework_version="0.4.0", role="SageMakerRole",
    train_instance_type="ml.p2.xlarge", train_instance_count=2,
    hyperparameters={"epochs": 6, "backend": "gloo"})
pytorch_estimator.fit("s3://my_bucket/my_training_data/")
# Deploy the trained TensorFlow model to a hosted endpoint
predictor = tf_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.c4.xlarge")

# MXNet deployment, with argument names as they appear on the slide
predictor = mxnet_estimator.deploy(
    deploy_instance_type="ml.p2.xlarge",
    min_instances=1)

Endpoint invocation URL:
https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/model-name/invocations
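As a hedged sketch of how a client might call the hosted endpoint, the helpers below build the runtime URL in the pattern from the slide and send a JSON payload through the boto3 `sagemaker-runtime` client's `invoke_endpoint` call. The helper names are hypothetical, and whether a given model accepts and returns JSON depends on its container, so treat the content type as an assumption.

```python
import json


def invocation_url(region, endpoint_name):
    """Build the runtime URL for an endpoint in the given region."""
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


def invoke_json(runtime_client, endpoint_name, payload):
    """Send a JSON payload and decode a JSON reply.

    runtime_client would be boto3.client("sagemaker-runtime"); JSON in and
    out is an assumption about the deployed model's container.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    # The response body is a byte stream that must be read before decoding.
    return json.loads(response["Body"].read())
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before any endpoint exists.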
[Diagram: Build, Train, Deploy. SageMaker Notebooks; Training Algorithm; SageMaker Training; Amazon ECR; CodeCommit; CodePipeline; SageMaker Hosting; COCO dataset; Amazon S3; AWS Lambda; API Gateway; Inference requests; static website hosted on S3; web assets on Amazon CloudFront]
import boto3

# training_params, model_name, role, primary_container,
# endpoint_config_name and endpoint_name are defined elsewhere.
sagemaker = boto3.client(service_name="sagemaker")

# 1. Train
sagemaker.create_training_job(**training_params)

# 2. Register the trained model
create_model_response = sagemaker.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role,
    PrimaryContainer=primary_container)

# 3. Describe how the model should be hosted
endpoint_config_response = sagemaker.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        "InstanceType": "ml.m4.xlarge",
        "InitialInstanceCount": 1,
        "ModelName": model_name,
        "VariantName": "AllTraffic"}])

# 4. Create the endpoint
endpoint_response = sagemaker.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
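The slide leaves `training_params` undefined. A minimal sketch of what such a dictionary might contain is shown below; the helper name, the single `train` channel, the default instance settings, the volume size, and the one-hour runtime limit are all assumptions chosen for illustration. The top-level keys follow the CreateTrainingJob request structure.

```python
def build_training_params(job_name, image, role_arn, train_s3_uri, output_s3_uri,
                          instance_type="ml.p2.xlarge", instance_count=1):
    """Assemble a request dictionary for create_training_job (illustrative values)."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image,          # container image URI in Amazon ECR
            "TrainingInputMode": "File",     # copy the data set to the instance first
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,            # assumed size, adjust to the data set
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed limit
    }
```

The result can be passed as `create_training_job(**build_training_params(...))`; hyperparameters, if any, go under an additional `HyperParameters` key with string values.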
[Chart: Speedup (x) versus # GPUs, both axes from 1 to 16, for ResNet-152, Inception V3, AlexNet, and Ideal. P2.16xlarge (8 NVIDIA Tesla K80, 16 GPUs), synchronous SGD (stochastic gradient descent). Annotations: 91% efficiency; 88% efficiency; 16x P2.16xlarge by AWS CloudFormation, mounted on Amazon EFS.]
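The efficiency percentages on the chart can be related to speedup with the usual definition, efficiency = speedup / number of GPUs. The sketch below works through the arithmetic under the assumption that the 91% figure applies to the 16-GPU run; how the chart labels pair up is not recoverable from the text, so treat the numbers as an example only.

```python
def scaling_efficiency(speedup, num_gpus):
    """Fraction of ideal (linear) speedup actually achieved."""
    return speedup / num_gpus


def speedup_at(efficiency, num_gpus):
    """Speedup implied by a given efficiency on num_gpus GPUs."""
    return efficiency * num_gpus


if __name__ == "__main__":
    # Assumed pairing: 91% efficiency on 16 GPUs.
    print(round(speedup_at(0.91, 16), 2))
```

Under that assumption, 16 GPUs would train roughly 14.6 times faster than one, compared with an ideal of 16 times.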
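import mxnet as mx

# softmax (the network symbol), train and val (data iterators),
# and batch_size are defined elsewhere.
num_gpus = 4
gpus = [mx.gpu(i) for i in range(num_gpus)]  # one context per GPU
model = mx.model.FeedForward(
    ctx=gpus,             # data-parallel training across the listed GPUs
    symbol=softmax,
    num_epoch=20,         # called num_round in early MXNet releases
    learning_rate=0.01,
    momentum=0.9,
    wd=0.00001)
model.fit(X=train, eval_data=val,
          batch_end_callback=mx.callback.Speedometer(batch_size=batch_size))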
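The idea behind synchronous SGD across several devices can be sketched without any framework: each worker computes a gradient on its shard of the batch, the gradients are averaged, and one shared update is applied. The toy model (y = w * x with mean squared error), the data, and the function names are assumptions for illustration, not MXNet code.

```python
def gradient(w, xs, ys):
    """Gradient of mean squared error for the model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def synchronous_step(w, xs, ys, num_workers, learning_rate):
    """One synchronous SGD step with the batch split evenly across workers.

    Assumes len(xs) is divisible by num_workers.
    """
    shard = len(xs) // num_workers
    grads = [
        gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)
    ]
    # All workers finish before the averaged gradient updates the shared weight.
    avg_grad = sum(grads) / num_workers
    return w - learning_rate * avg_grad


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w_parallel = synchronous_step(0.0, xs, ys, num_workers=4, learning_rate=0.01)
w_single = 0.0 - 0.01 * gradient(0.0, xs, ys)
```

With equal-sized shards the averaged gradient matches the full-batch gradient, so `w_parallel` and `w_single` agree; the benefit of the parallel version is that each worker only processes a quarter of the batch.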
(…)-based examples
http://mxnet.io/
https://github.com/dmlc/mxnet
http://incubator.apache.org/projects/mxnet.html
http://gluon.mxnet.io
We plan to use Amazon SageMaker to train models
against petabytes of Earth observation imagery datasets
using hosted Jupyter notebooks, so DigitalGlobe's
Geospatial Big Data Platform (GBDX) users can just push a
button, create a model, and deploy it all within one
scalable distributed environment at scale.
- Dr. Walter Scott, CTO of Maxar Technologies and founder of DigitalGlobe
“With Amazon SageMaker, we can accelerate our Artificial Intelligence
initiatives at scale by building and deploying our algorithms on the
platform. We will create novel large-scale machine learning and AI
algorithms and deploy them on this platform to solve complex
problems that can power prosperity for our customers.”
- Ashok Srivastava, Chief Data Officer, Intuit
[Chart: axes labeled $ to $$$$ and Minutes, Hours, Days, Weeks, Months. Series: Single Machine; Distributed, with Strong Machines.]
[Chart: axes labeled $ to $$$$ and Minutes, Hours, Days, Weeks, Months. Series: On-premise; EC2 + AMI; Amazon SageMaker.]
http://bit.ly/awskr-ml-credits