© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Hagay Lupesko
01.25.2018
Model Serving for Deep Learning
Amazon AI
Brief Intro to Deep Learning
AI
Machine Learning
Deep Learning
Brief Intro to Deep Learning – Neural Networks
Input Layer
Hidden Layers
Many More…
Output Layer
• Non-linear
• Hierarchical feature learning
• Scalable architecture
• Computationally intensive
Deep Learning is a Big Deal
It has a growing impact on our lives
• Personalization
• Logistics
• Voice
• Autonomous Vehicles
Deep Learning is a Big Deal
It can outperform other ML approaches and, on some tasks, humans
So what does a deployed model look like?
Model
Model Server
Internet
Mobile
Desktop
IoT
The Undifferentiated Heavy Lifting of Model Serving
• Performance
• Availability
• Networking
• Monitoring
• Model Decoupling
• Cross Framework
• Cross Platform
Model Serving Systems for Deep Learning
• TensorFlow Serving
• Model Server for MXNet
• UC Berkeley Clipper
It’s Demo Time!
• Model Archive
• REST and OpenAPI
• Containerized
• ONNX Support
• Operational Metrics
Model Archive
Inputs to the Model Export CLI:
• Trained Network
• Model Signature
• Custom Code
• Auxiliary Assets
Output: a single Model Archive
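The packaging step on this slide can be sketched in a few lines. This is only an illustration of the idea of bundling the four kinds of artifacts into one file; it assumes the archive is a plain zip, and every file name, the signature fields, and the `build_archive` helper are hypothetical rather than the actual Model Export CLI.

```python
import json
import zipfile
from pathlib import Path


def build_archive(archive_path, files):
    """Bundle a {name-in-archive: bytes} mapping into one zip file."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return Path(archive_path)


def example_files():
    """Return placeholder artifacts; names and contents are illustrative only."""
    # Hypothetical signature describing input and output tensors.
    signature = {
        "inputs": [{"data_name": "data", "data_shape": [1, 3, 224, 224]}],
        "outputs": [{"data_name": "softmax", "data_shape": [1, 1000]}],
    }
    return {
        "model-symbol.json": b"{}",                # trained network: graph definition
        "model-0000.params": b"\x00" * 16,         # trained network: weights (dummy bytes)
        "signature.json": json.dumps(signature).encode("utf-8"),  # model signature
        "custom_service.py": b"# pre/post-processing hooks would go here\n",  # custom code
        "synset.txt": b"cat\ndog\n",               # auxiliary asset: class labels
    }
```

Calling `build_archive("demo.zip", example_files())` writes a single file that can be shipped to a server independently of the code that trained the model, which is the decoupling the slide is after.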
REST and OpenAPI
REST-like endpoint: <model-name>/predict
Endpoint auto-generated from the model’s signature.json
JSON encoding by default
Binary input via request payload
OpenAPI support – client code-gen and tooling
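A client call against the `<model-name>/predict` endpoint might look like the sketch below. It uses only the Python standard library. The base URL, the model name, the input field name, and the choice of the POST method are assumptions made for illustration; the real input names come from the model's signature.json.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, inputs):
    """Build (but do not send) a JSON POST to <base_url>/<model_name>/predict."""
    url = f"{base_url.rstrip('/')}/{model_name}/predict"
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: predictions are requested with POST
    )


def predict(base_url, model_name, inputs, timeout=5.0):
    """Send the request and decode the JSON response body."""
    req = build_predict_request(base_url, model_name, inputs)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `predict("http://localhost:8080", "squeezenet", {"data": [0.1, 0.2]})` would post a JSON body to `http://localhost:8080/squeezenet/predict`, provided a server is listening there. Binary input would be sent as the raw request payload instead of JSON.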
Containerization
MMS Dockerfile: Build, Push, Launch
Container Cluster: multiple MMS Containers
Each MMS Container: NGINX, MXNet Model Server, MXNet
Lightweight virtualization, isolation, runs anywhere
Operational Metrics
Metrics
• Requests
• Latencies
• Resources
Dimensions
• Model Name
• Host Name
Target
• Log / CSV
• AWS CloudWatch
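To make the metric / dimension / target split concrete, here is a minimal sketch of a recorder that writes request and latency samples to a CSV target, tagging each row with the model name and host name dimensions. The `CsvMetrics` class and its column names are hypothetical and are not MMS's own metrics code.

```python
import csv
import io
import socket
import time


class CsvMetrics:
    """Write one CSV row per metric sample, tagged with its dimensions."""

    FIELDS = ["timestamp", "metric", "value", "unit", "model_name", "host_name"]

    def __init__(self, stream, model_name, host_name=None):
        self._writer = csv.DictWriter(stream, fieldnames=self.FIELDS)
        self._writer.writeheader()
        self._model_name = model_name
        self._host_name = host_name or socket.gethostname()

    def record(self, metric, value, unit):
        """Append a single sample with the configured dimensions."""
        self._writer.writerow({
            "timestamp": round(time.time(), 3),
            "metric": metric,
            "value": value,
            "unit": unit,
            "model_name": self._model_name,
            "host_name": self._host_name,
        })

    def timed(self, metric, fn, *args, **kwargs):
        """Run fn, record its latency in milliseconds, and return its result."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.record(metric, round(elapsed_ms, 3), "ms")
```

A handler could wrap inference as `metrics.timed("latency", run_inference, request)` and call `metrics.record("requests", 1, "count")` per request. Swapping the CSV stream for a CloudWatch client would change the target without touching the metrics or dimensions.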
ONNX Support
Many Frameworks: MXNet, Caffe2, PyTorch, TF, CNTK
Many Platforms: CoreML, TensorRT, NGraph, SNPE
O(n²) Pairs
ONNX: Common IR
Supported in MMS v0.2
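The arithmetic behind the "pairs" argument can be worked through in a short sketch. It reads the slide as listing five frameworks and four platforms, which is an interpretation of the flattened diagram, and the function names are made up for illustration.

```python
def direct_converters(num_frameworks, num_platforms):
    """Without a common IR, every framework/platform pair needs its own converter."""
    return num_frameworks * num_platforms


def converters_with_common_ir(num_frameworks, num_platforms):
    """With a common IR, each framework exports once and each platform imports once."""
    return num_frameworks + num_platforms


# Grouping assumed from the slide text.
frameworks = ["MXNet", "Caffe2", "PyTorch", "TF", "CNTK"]
platforms = ["CoreML", "TensorRT", "NGraph", "SNPE"]

pairwise = direct_converters(len(frameworks), len(platforms))
via_ir = converters_with_common_ir(len(frameworks), len(platforms))
print(f"pairwise converters: {pairwise}, via common IR: {via_ir}")
```

With these counts, pairwise conversion needs 20 converters while a shared intermediate representation needs 9, and the gap widens as either list grows.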
Challenges Ahead
Performance
• Batching
• Caching
• JIT Compilation
• Custom code
• Quantization
Platform
• New players
• ONNX
• Plugins
Adoption
• Ease of use
• Internal Amazon dev tools
• Industry partners
Open source – try it out and file issues
github.com/awslabs/mxnet-model-server
mxnet-sdk-team@amazon.com
