Comparison of deep learning frameworks from a viewpoint of double backpropagation
Preferred Networks, Inc.
Kenta Oono <oono@preferred.jp>
Chainer Meetup #6 @ Preferred Networks
Sep. 30th, 2017
Agenda
• Technology stack of DL frameworks
• Design choices in DL frameworks
• Double backprop primer
• Coding examples of double backprop in Chainer, PyTorch, and TF
Technology stack of a DL framework
name | functions | example
Graphical visualization | - | DIGITS, TensorBoard
Machine learning workflow management | Dataset prep, Save/Load, Training loop | Keras, TF slim
Computational graph (CG) management | Build/Optimize CGs, Forward/Back prop | Theano, TensorFlow, Torch.nn
Multi-dimensional array processing | High-level array manipulation | NumPy, CuPy, Eigen, Torch (core)
Numerical computation | Matrix operation, Convolution | BLAS (OpenBLAS, MKL), cuBLAS, cuDNN, MKL-DNN
Computational device | - | CPU, GPU, TPU, FPGA
Technology stack of Chainer
Stack, from top layer to bottom:
Chainer
NumPy, CuPy
BLAS, cuBLAS, cuRAND, cuDNN
CPU, GPU
Technology stack of TensorFlow
Stack, from top layer to bottom:
TensorBoard
TF slim, Keras
TensorFlow
Eigen::Tensor
BLAS, cuBLAS, cuRAND, cuDNN
CPU, GPU
Technology stack of Theano
Stack, from top layer to bottom:
Keras, Lasagne, Blocks, etc.
Theano
NumPy, libgpuarray
BLAS, CUDA, OpenCL, CUDA Toolkit
CPU, GPU
Technology stack of Keras
Stack, from top layer to bottom:
Keras
TensorFlow, Theano
Technology stack of TF, Technology stack of Theano
Important Design Choices through user's typical workflow
Phases: Coding, Execution, Improvement
• Write NNs (in which language?)
• Compute backprop (how?)
• Update parameters (how to represent? how to update?)
• Run user code (when?)
• Optimize CG (how?)
• Scale up training (how?)
http://bit.ly/aaai-dlif
Neural Network as a Computational Graph
• In most frameworks, an NN is conceptualized as a computational graph (CG).
• The simplest form of CG is a bipartite DAG (Directed Acyclic Graph) consisting of data nodes and operator nodes.
y = x1 * x2
z = y - x3
Graph: (x1, x2) -> mul -> y, (y, x3) -> sub -> z
Data nodes: x1, x2, x3, y, z. Operator nodes: mul, sub.
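The bipartite structure can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how any particular framework stores its graph: the class names `DataNode` and `OpNode`, the `evaluate` helper, and the input values are all assumptions made for this example.

```python
import operator


class DataNode:
    """A data node: holds a value and remembers which operator produced it."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.creator = None  # the OpNode that outputs this node, if any


class OpNode:
    """An operator node: connects input data nodes to one output data node."""

    def __init__(self, name, fn, inputs, output):
        self.name = name
        self.fn = fn
        self.inputs = inputs
        self.output = output
        output.creator = self


def evaluate(node):
    """Compute the value of a data node by recursively evaluating its inputs."""
    if node.creator is not None:
        args = [evaluate(i) for i in node.creator.inputs]
        node.value = node.creator.fn(*args)
    return node.value


# y = x1 * x2, z = y - x3 (input values chosen for illustration)
x1, x2, x3 = DataNode("x1", 2.0), DataNode("x2", 3.0), DataNode("x3", 1.0)
y, z = DataNode("y"), DataNode("z")
mul = OpNode("mul", operator.mul, [x1, x2], y)
sub = OpNode("sub", operator.sub, [y, x3], z)
result = evaluate(z)
```

Edges only ever run from a data node to an operator node or from an operator node to a data node, which is what makes the graph bipartite.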
Multi Layer Perceptron (MLP)
x -> Affine (W1, b1) -> h1 -> ReLU -> a1 -> Affine (W2, b2) -> h2 -> ReLU -> a2 -> Softmax -> prob -> Cross Entropy (with t) -> loss
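The chain of operators in this MLP can be written out as ordinary function calls. The following is a minimal pure-Python sketch of the forward pass only; the helper names, the list-of-rows weight layout, and the numeric values are assumptions chosen for illustration, and a real framework would run these operators on multi-dimensional arrays.

```python
import math


def affine(x, W, b):
    # W is a list of rows, one row per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(h):
    return [max(0.0, v) for v in h]


def softmax(a):
    m = max(a)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in a]
    s = sum(e)
    return [v / s for v in e]


def cross_entropy(prob, t):
    # t is the index of the correct class.
    return -math.log(prob[t])


def mlp_loss(x, t, W1, b1, W2, b2):
    h1 = affine(x, W1, b1)
    a1 = relu(h1)
    h2 = affine(a1, W2, b2)
    a2 = relu(h2)
    prob = softmax(a2)
    return prob, cross_entropy(prob, t)


# Illustrative inputs and parameters.
x = [1.0, -2.0]
W1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.0]
prob, loss = mlp_loss(x, 0, W1, b1, W2, b2)
```

Each intermediate name (h1, a1, h2, a2, prob, loss) corresponds to a data node in the diagram, and each function call to an operator node.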
How to compute backprop
Backprop through graphs
The framework builds graphs only for forward prop and does backprop by backtracking through them.
E.g. Torch.nn, Caffe
Graph: (a, b) -> mul -> y, (y, c) -> sub -> z
Backprop as extended graphs
The framework builds graphs for backprop as well as for forward prop.
E.g. Theano, MXNet, TensorFlow, Chainer, PyTorch
Forward graph: (a, b) -> mul -> y, (y, c) -> sub -> z
Backward graph: gz = ∇z z = 1, gy = id(gz) = ∇y z, gc = neg(gz), ga = mul(gy, b) = ∇a z, gb = mul(gy, a)
How to compute backprop
Backprop through graphs
• Easy and simple to implement: backprop computation need not be defined as graphs.
• Low flexibility: features available for graphs may not apply to backprop computations.
Backprop as extended graphs
• Implementation gets complicated.
• High flexibility: any features available for graphs can also be applied to backprop computations (e.g. backprop of backprop).
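To make the "extended graphs" idea concrete, here is a toy reverse-mode differentiator in plain Python. It is a sketch under stated assumptions, not the implementation of any framework named above: the `Var` class and the `grad` function are hypothetical names, and only scalar `*`, `+`, and `-` are supported. The key point is that each backward rule is written with `Var` operations, so the gradient is itself a graph and can be differentiated again.

```python
class Var:
    """A data node; `parents` and `backward` record how it was computed."""

    def __init__(self, value, parents=(), backward=None):
        self.value = value
        self.parents = parents    # input Vars
        self.backward = backward  # maps the output gradient (a Var) to input gradients (Vars)

    def __mul__(self, other):
        return Var(self.value * other.value, (self, other),
                   lambda g: (g * other, g * self))

    def __add__(self, other):
        return Var(self.value + other.value, (self, other),
                   lambda g: (g, g))

    def __sub__(self, other):
        return Var(self.value - other.value, (self, other),
                   lambda g: (g, g * Var(-1.0)))


def grad(output, wrt):
    """Return d(output)/d(wrt) as a Var, so it can be differentiated again."""
    order, seen = [], set()

    def visit(v):
        # Depth-first search; inputs are appended before their consumers.
        if id(v) in seen:
            return
        seen.add(id(v))
        for p in v.parents:
            visit(p)
        order.append(v)

    visit(output)
    grads = {id(output): Var(1.0)}
    for v in reversed(order):  # consumers are processed before their inputs
        g = grads.get(id(v))
        if g is None or v.backward is None:
            continue
        for p, gp in zip(v.parents, v.backward(g)):
            grads[id(p)] = grads[id(p)] + gp if id(p) in grads else gp
    return grads.get(id(wrt), Var(0.0))


# First-order gradients of z = a * b - c
a, b, c = Var(2.0), Var(3.0), Var(1.0)
z = a * b - c
ga, gb, gc = grad(z, a), grad(z, b), grad(z, c)

# Backprop of backprop: differentiate x^3 twice
x = Var(3.0)
gx = grad(x * x * x, x)  # 3 * x^2
ggx = grad(gx, x)        # 6 * x
```

If the backward rules operated on raw floats instead of `Var` objects, `gx` would be a plain number with no history, and the second call to `grad` would have nothing to traverse. That is the flexibility trade-off described above.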
Double backprop
Graph: (x, y) -> F -> z -> ... -> L
class F(FunctionNode):
    def forward(self, x, y):
        # Operates on NumPy/CuPy arrays.
        return x * x + y
    def backward(self, x, y, gz):
        # Operates on chainer.Variable, so this computation creates a CG.
        return 2 * gz * x, gz
Note: The interface is simplified from the actual implementation.
Double backprop
Forward graph: (x, y) -> F -> z -> ... -> L
Backprop from L, seeded with 1.0 (= ∂L/∂L): gz (= ∂L/∂z) -> Grad F -> gx (= ∂L/∂x), gy (= ∂L/∂y)
Inside Grad F: gx = Mul(x, gz) * 2, gy = gz
Double backprop
Forward graph: (x, y) -> F -> z
Backprop from z itself, seeded with 1.0: 1.0 -> Grad F -> gx (= ∂z/∂x), gy (= ∂z/∂y)
Because Grad F is part of the graph, backprop can be run again from gx, seeded with 1.0: 1.0 -> Double Grad F -> ggx (= ∂²z/∂x²)
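As a worked check for the function F defined earlier (z = x * x + y, with backward returning 2 * gz * x and gz), the quantities in the diagrams can be written out explicitly. The symbols follow the slide labels; setting g_z = 1 corresponds to seeding backprop with 1.0.

```latex
\begin{align*}
z &= x^2 + y, \\
g_x &= g_z \frac{\partial z}{\partial x} = 2\, g_z\, x, \qquad
g_y = g_z \frac{\partial z}{\partial y} = g_z, \\
gg_x &= \left.\frac{\partial g_x}{\partial x}\right|_{g_z = 1} = 2 = \frac{\partial^2 z}{\partial x^2}.
\end{align*}
```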
Double backprop
Double backprop computes the derivative of L = G(f(x), ∇f(x)) with respect to x.
Forward graph: x -> f -> z
Backprop through f, seeded with 1.0 (= gz): 1.0 -> Grad f -> gx (= ∇f(x))
Loss: (z, gx) -> ... -> L
Backprop from L runs back through both Grad f and f. The pass through Grad f is Double Grad f (ggx). The result is ∂L/∂x.
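The derivative described here has two paths: one through f and one through Grad f. A small numeric sketch makes this visible. The concrete choices f(x) = x^3 and G(u, v) = u + v^2 are assumptions made purely for illustration, and the derivatives are written by hand rather than produced by a framework.

```python
def f(x):
    return x ** 3  # illustrative f


def df(x):
    return 3 * x ** 2  # gradient of f: what Grad f computes


def d2f(x):
    return 6 * x  # derivative of the gradient: what Double Grad f computes


def G(u, v):
    return u + v ** 2  # illustrative G


def dG_du(u, v):
    return 1.0


def dG_dv(u, v):
    return 2 * v


def L(x):
    return G(f(x), df(x))


def dL_dx(x):
    u, v = f(x), df(x)
    # Chain rule: one term flows through f, the other through Grad f.
    return dG_du(u, v) * df(x) + dG_dv(u, v) * d2f(x)


x = 1.5
eps = 1e-5
numeric = (L(x + eps) - L(x - eps)) / (2 * eps)  # central finite difference
analytic = dL_dx(x)
```

The second term, which involves d2f, is the part that only frameworks able to backprop through their own backprop can compute automatically.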
Example (Chainer)
http://bit.ly/2wpEzO5
Example (PyTorch)
Example (TensorFlow)
Conclusion
• Several DL frameworks share a similar structure.
• Differences in design choices determine the capabilities of frameworks.
• Introduced double backprop, with toy examples in several frameworks.
