Thom Lane
19th September 2018
ONNX & Edge Deployments
Open Neural Network Exchange
Agenda
1. What is ONNX?
2. Creating/finding ONNX models
3. Visualizing ONNX models
4. Deploying ONNX models
5. Optimizing ONNX models
What is ONNX?
Open Neural Network Exchange Format
CoreML
ONNX Partners
Creating/finding ONNX models
Where can you get ONNX models from?
• Model Zoo
• Train Your Own
• Fine-tuning
Fine Tuning by Emma Mitchell, Teacher by Gregor Cresnar, take by Adrien Coquet from the Noun Project
Fine-tuning: Apache MXNet on AWS SageMaker
Which model to choose for edge?
Model              Accuracy on ImageNet (top-1)   Computation (millions of Mult-Adds)   Size (millions of parameters)
VGG16              71.5%                          15300                                 138
1.0 MobileNet-224  70.6%                          569                                   4.2
Depthwise Separable Convolution
Regular Convolution: 3x10x10 -> 3x3(x3) Convolution -> 16x10x10
# of params: 432
# of computations: 43,200
Depthwise Separable Convolution: 3x10x10 -> 3x3(x1) Convolution -> 3x10x10 -> 1x1(x3) Convolution -> 16x10x10
# of params: 27+48 = 75
# of computations: 2,700+4,800 = 7,500
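The counts above can be reproduced with a short calculation. This is a minimal sketch, assuming bias-free convolutions with "same" padding so the 10x10 spatial size is preserved; the helper name conv_cost is hypothetical and not from the slides.

```python
def conv_cost(in_ch, out_ch, k, out_h, out_w, groups=1):
    """Return (params, mult_adds) for a bias-free k x k convolution."""
    # Each output channel sees in_ch / groups input channels.
    params = (in_ch // groups) * out_ch * k * k
    # Every weight is used once per output position.
    return params, params * out_h * out_w

# Regular 3x3 convolution: 3 input channels -> 16 output channels.
regular = conv_cost(3, 16, 3, 10, 10)

# Depthwise 3x3 (one filter per input channel), then pointwise 1x1.
depthwise = conv_cost(3, 3, 3, 10, 10, groups=3)
pointwise = conv_cost(3, 16, 1, 10, 10)
separable = (depthwise[0] + pointwise[0], depthwise[1] + pointwise[1])

print("regular:  ", regular)
print("separable:", separable)
```

For this layer shape the separable version needs roughly 6x fewer parameters and multiply-adds.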
MobileNet Example
MobileNet from ONNX Model Zoo (pretrained on ImageNet)
Fine-tune on Caltech101 with Apache MXNet on AWS SageMaker
Apache MXNet Overview
• Scalable
• Debuggable
• Optimized libraries
• Flexible
• 7 frontend languages
• Portable
Speech Bubble by Weltenraser, Scale by Ben Davis, Bug by Nociconist, Mobile by Rafael Garcia Motta, flexible by AdbA Icons from the Noun Project
Gluon Sample
• Data
• Model
• Loss
• Optimizer & Trainer
• Forward & Backward
• Update Parameters
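The Gluon code shown on the original slides is not part of this transcript. As a stand-in, the sketch below walks through the same steps (data, model, loss, optimizer, forward and backward, parameter update) in plain Python. It does not use the Gluon API; the toy dataset, the single linear unit and the hand-written gradients are all assumptions chosen for illustration.

```python
# Data: samples of y = 2x + 1 (hypothetical toy dataset).
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]

# Model: a single linear unit, y_hat = w * x + b.
params = {"w": 0.0, "b": 0.0}

def model(x):
    return params["w"] * x + params["b"]

# Loss: halved squared error.
def loss_fn(y_hat, y):
    return 0.5 * (y_hat - y) ** 2

# Optimizer: plain SGD with a fixed learning rate.
learning_rate = 0.1

for epoch in range(200):
    for x, y in data:
        # Forward pass.
        y_hat = model(x)
        # Backward pass: gradients of the loss w.r.t. w and b.
        err = y_hat - y
        grad_w = err * x
        grad_b = err
        # Update parameters.
        params["w"] -= learning_rate * grad_w
        params["b"] -= learning_rate * grad_b

final_loss = sum(loss_fn(model(x), y) for x, y in data) / len(data)
print(params, final_loss)
```

In a framework such as Gluon the backward pass and the update would typically be handled by automatic differentiation and a trainer object rather than written by hand.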
AWS SageMaker Overview
Build, Train & Tune, Deploy
Visualizing ONNX models
How to visualize ONNX models?
lutzroeder.github.io/netron
Deploying ONNX models
How to deploy ONNX models?
AWS SageMaker
AWS Fargate
with Model Servers
AWS Greengrass
Custom Deployments
Raspberry Pi by Ben Davis , clouds by Viktor Vorobyev from the Noun Project
AWS Greengrass Overview
AWS Greengrass Group Deployments
• Lambda function
• ML Model Resource
• Device Resource
• Subscription
• Device
Optimizing ONNX models
How to optimize ONNX models?
1. Use half-precision (float16) if possible: e.g. Mali-GPU
2. Use quantization with calibration if possible (experimental)
3. Compile model with TVM Stack
TVM Stack (figure):
• frontends: MXNet, ONNX, CoreML
• TVM Compiler: NNVM, TVM
• backends: CUDA, LLVM, OpenCL
• TVM Compiler -> lib -> TVM Runtime
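To illustrate what "quantization with calibration" means, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not the MXNet or TVM implementation; the max-absolute-value calibration rule, the function names and the sample values are all assumptions.

```python
def calibrate(samples):
    """Pick a scale so the largest calibration value maps to 127."""
    max_abs = max(abs(v) for v in samples)
    return max_abs / 127.0 if max_abs > 0 else 1.0

def quantize(values, scale):
    """Map floats to int8 codes, clamping values outside the calibrated range."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]

# Calibration: representative values observed for a tensor (hypothetical).
calibration_data = [-1.27, -0.5, 0.0, 0.25, 1.0]
scale = calibrate(calibration_data)

weights = [0.5, -1.27, 0.013, 2.0]
codes = quantize(weights, scale)
restored = dequantize(codes, scale)
print(codes, restored)
```

The last value (2.0) lies outside the calibrated range and saturates at 127, which is why the calibration data should be representative of real inputs.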
What type of optimizations?
NNVM: Graph Optimizations
• Pruning
• Fusing
TVM: Tensor Optimizations
• Tiling
• Vectorization
Pruning (figure): Conv and Dropout nodes before pruning, Conv nodes only after.
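A pruning pass of this kind can be sketched on a toy graph. This is an illustrative example, not NNVM code: the graph is a hypothetical list of nodes in topological order, and Dropout is treated as an identity at inference time.

```python
def prune_inference_noops(graph, noop_ops=("Dropout",)):
    """Remove identity-at-inference nodes and rewire their consumers."""
    replaced = {}  # removed node name -> name of the node feeding it
    pruned = []
    for node in graph:  # assumes topological order
        inputs = [replaced.get(name, name) for name in node["inputs"]]
        if node["op"] in noop_ops:
            # Consumers of this node should read from its input instead.
            replaced[node["name"]] = inputs[0]
        else:
            pruned.append({"name": node["name"], "op": node["op"], "inputs": inputs})
    return pruned

graph = [
    {"name": "conv1", "op": "Conv", "inputs": ["data"]},
    {"name": "drop1", "op": "Dropout", "inputs": ["conv1"]},
    {"name": "conv2", "op": "Conv", "inputs": ["drop1"]},
]
pruned = prune_inference_noops(graph)
print([n["name"] for n in pruned])
```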
Fusing (figure): separate Conv and Relu nodes are merged into a single Conv with Relu node.
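Operator fusion can be sketched the same way. The example below assumes the network is a simple linear chain of operator names and merges each Conv followed by a Relu into one node; the fused name "ConvRelu" is hypothetical, and real compilers apply this to general graphs.

```python
def fuse_conv_relu(ops):
    """Merge each adjacent (Conv, Relu) pair in a linear chain into one op."""
    fused = []
    i = 0
    while i < len(ops):
        if ops[i] == "Conv" and i + 1 < len(ops) and ops[i + 1] == "Relu":
            fused.append("ConvRelu")  # one kernel instead of two
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

chain = ["Conv", "Relu", "Conv"]
fused_chain = fuse_conv_relu(chain)
print(fused_chain)
```

Fusing avoids writing the intermediate Conv output to memory and reading it back just to apply the activation.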
Tiling (figure): the data layout changes from N*C*H*W to N*(C/16)*H*W*16.
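The effect of this layout change can be seen by computing flat memory offsets. This is a minimal sketch, assuming row-major storage and a channel count divisible by 16; the function names are hypothetical.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in N*C*H*W layout."""
    return ((n * C + c) * H + h) * W + w

def nchw16c_offset(n, c, h, w, C, H, W, block=16):
    """Flat offset of the same element in N*(C/16)*H*W*16 layout."""
    c_outer, c_inner = divmod(c, block)
    return (((n * (C // block) + c_outer) * H + h) * W + w) * block + c_inner

C, H, W = 32, 2, 2

# In N*C*H*W, neighbouring channels at one pixel are H*W elements apart.
stride_plain = nchw_offset(0, 1, 0, 0, C, H, W) - nchw_offset(0, 0, 0, 0, C, H, W)

# In the tiled layout, 16 channels of one pixel sit next to each other.
tiled = [nchw16c_offset(0, c, 1, 1, C, H, W) for c in range(16)]
print(stride_plain, tiled)
```

Contiguous groups of 16 channels are what allow a single vector instruction to load and process them together.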
Vectorization (figure): four scalar additions (1 + 3 = 4, 2 + 2 = 4, 1 + 0 = 1, 1 + 1 = 2) become one vector addition, [1, 2, 1, 1] + [3, 2, 0, 1] = [4, 4, 1, 2].
Summary
1. Creating/finding ONNX models
• ONNX Model Zoo
• And fine-tune with Apache MXNet and AWS SageMaker
2. Visualizing ONNX models
• Netron
3. Deploying ONNX models
• AWS Greengrass
4. Optimizing ONNX models
• TVM Stack
Thanks!
And don’t forget to check out:
https://medium.com/apache-mxnet




