On how to efficiently implement Deep Learning algorithms on PYNQ platform
Luca Stornaiuolo
Dipartimento di Elettronica Informazione e Bioingegneria (DEIB)
luca.stornaiuolo@polimi.it
Donatella Sciuto, Marco D. Santambrogio
05/25/2018
Xilinx
San Francisco, CA
Context Definition
[1] "Efficient Processing of Deep Neural Networks: A Tutorial and Survey"
Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel Emer
Model
New Data
Prediction
Inference
CONDOR
Smart Embedded Systems
Resource-constrained environment
Technology
Overlay
DDR SDRAM memory
Memory Hierarchy
What we can do
• Binarized Neural Networks (BNNs)
BNN-PYNQ [1]
• Quantized Neural Networks (QNNs)
QNN-MO-PYNQ [1]
• MobileNets [2]
[1] "FINN: A Framework for Fast, Scalable Binarized Neural Network Inference"
Yaman Umuroglu, Nicholas J. Fraser, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre, Kees Vissers
[2] "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"
Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam
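To make the appeal of binarized networks on FPGA fabric concrete, the sketch below shows the usual XNOR-popcount form of a binarized dot product in plain Python. It is an illustration only, not code from BNN-PYNQ or FINN; the helper names (`binarize`, `pack_bits`, `xnor_popcount_dot`) and the sample values are assumptions made for this example.

```python
def binarize(values):
    """Map real values to {-1, +1} by sign (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an int: bit i is 1 when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


# Hypothetical weights and activations, chosen only for illustration.
weights = binarize([0.7, -1.2, 0.1, -0.3, 2.0, -0.5, 0.9, -0.1])
inputs = binarize([1.5, -0.4, -0.8, -2.0, 0.3, 0.6, 0.2, -0.9])

reference = sum(w * x for w, x in zip(weights, inputs))
fast = xnor_popcount_dot(pack_bits(weights), pack_bits(inputs), len(weights))
```

The bitwise form replaces multiply-accumulate with XNOR and a bit count, which is why such networks fit well in a resource-constrained environment.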
BRAM Parallel Cache
Cache
PL
PS
BRAM partitioning
Custom Partitioning
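As a rough illustration of why a custom partitioning may be needed, the sketch below models the standard block and cyclic array partitioning schemes as index-to-bank mappings and checks which access patterns collide on a bank. The function names, array size, and partitioning factor are assumptions for this example; the slides do not give them.

```python
import math


def block_bank(index, size, factor):
    """Bank holding element `index` under block partitioning."""
    return index // math.ceil(size / factor)


def cyclic_bank(index, factor):
    """Bank holding element `index` under cyclic partitioning."""
    return index % factor


def conflict_free(banks):
    """True when every access in the group hits a different bank."""
    return len(set(banks)) == len(banks)


SIZE, FACTOR = 16, 4            # assumed array size and number of banks
consecutive = [4, 5, 6, 7]      # four neighbouring elements
strided = [0, 4, 8, 12]         # one element every FACTOR positions

block_consecutive = [block_bank(i, SIZE, FACTOR) for i in consecutive]
cyclic_consecutive = [cyclic_bank(i, FACTOR) for i in consecutive]
block_strided = [block_bank(i, SIZE, FACTOR) for i in strided]
cyclic_strided = [cyclic_bank(i, FACTOR) for i in strided]
```

In this model each scheme serves one access pattern in parallel and serializes the other, which is the gap a custom scheme tries to close.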
Proposed Solution
A novel implementation of the Polymorphic Register File (PRF) for Vivado HLS that:
• raises the level of abstraction for efficient parallel accesses to on-chip memory;
• improves productivity within the hardware design flow;
• introduces masked methods to avoid overwrites.
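The idea of conflict-free parallel access with masked writes can be sketched in software. The example below uses a classic skewed storage scheme, where element (r, c) goes to bank (r + c) mod N, so that both a full row and a full column touch every bank exactly once. This is a swapped-in textbook scheme chosen for illustration, not the authors' PRF implementation; the class and method names are hypothetical.

```python
class SkewedBankedMemory:
    """N x N matrix spread over N banks; rows and columns are conflict-free.

    Element (r, c) lives in bank (r + c) % N at address r.
    Hypothetical helper for illustration, not the authors' PRF.
    """

    def __init__(self, n):
        self.n = n
        self.banks = [[0] * n for _ in range(n)]

    def _bank(self, r, c):
        return (r + c) % self.n

    def write_row(self, r, values, mask=None):
        """Write a row; lanes whose mask entry is False keep their old value."""
        mask = mask if mask is not None else [True] * self.n
        for c, (v, enabled) in enumerate(zip(values, mask)):
            if enabled:
                self.banks[self._bank(r, c)][r] = v

    def read_row(self, r):
        return [self.banks[self._bank(r, c)][r] for c in range(self.n)]

    def read_col(self, c):
        return [self.banks[self._bank(r, c)][r] for r in range(self.n)]

    def banks_touched_by_row(self, r):
        return sorted(self._bank(r, c) for c in range(self.n))

    def banks_touched_by_col(self, c):
        return sorted(self._bank(r, c) for r in range(self.n))


mem = SkewedBankedMemory(4)
for r in range(4):
    mem.write_row(r, [10 * r + c for c in range(4)])

# Masked write: only the first and last lanes of row 1 are updated.
mem.write_row(1, [-1, -1, -1, -1], mask=[True, False, False, True])
```

Because every row and every column maps to all N banks, either can be fetched in a single parallel access in this model, and the mask lets a wide write leave selected lanes untouched.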
Case Study
Matrix Multiplication (BxC and CxB)
Naïve solutions without PRF:
• Send data twice (data transfer overhead)
• Duplicate data (high BRAM utilization)
• Complete partitioning (high register utilization)
PRF
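One reading of the case study, taking "BxC and CxB" to mean the two products of matrices B and C (an interpretation, since the slide does not spell it out), is that each matrix must be read by rows in one product and by columns in the other. The sketch below records those access patterns; the function name and the 2x2 sample matrices are assumptions for illustration.

```python
def matmul_with_trace(left, right, trace, left_name, right_name):
    """Multiply two square matrices, recording row/column access patterns."""
    n = len(left)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        row = left[i]                               # row access on the left operand
        trace.add((left_name, "row"))
        for j in range(n):
            col = [right[k][j] for k in range(n)]   # column access on the right operand
            trace.add((right_name, "col"))
            result[i][j] = sum(a * b for a, b in zip(row, col))
    return result


# Hypothetical sample data.
B = [[1, 2], [3, 4]]
C = [[5, 6], [7, 8]]

trace = set()
BC = matmul_with_trace(B, C, trace, "B", "C")
CB = matmul_with_trace(C, B, trace, "C", "B")
```

After both products, the trace contains row and column accesses for both B and C. A memory layout tuned for only one direction would force one of the naïve workarounds listed above.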
Case Study
HLS Block Partitioning HLS PRF
Experimental Results
On a Xilinx Virtex-7 VC707 with a clock frequency of 100 MHz,
the PRF achieves a 5x speedup in execution time
Other Schemes
Failures Recovery
https://necst.it/
https://www.slideshare.net/necstlab
