Invertible Residual Networks
박수철

모두의연구소 

풀잎스쿨 Deep Generative Models
Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen
ResNets
Fθ = I + gθ
Kaiming He et al. Deep Residual Learning for Image Recognition. 2015
1. Each layer must be invertible.
Keep the ResNet structure while imposing a Lipschitz constraint on the non-linear residual function.

2. The inverse must be computable.
Approximate it by iteration, based on the Banach fixed-point theorem.

3. The log-determinant of each layer must be cheap to compute.
Express the log-determinant as the trace of a matrix logarithm and approximate that trace.
Conditions required for flow models
1. The layer must be invertible.
Sufficient condition for invertible ResNets
Jens Behrmann et al. Invertible Residual Networks. 2019
[Figure] Lipschitz continuity: points x and x′ in the domain map to f(x) and f(x′) in the image; the Lipschitz constant (Lipschitz norm) bounds the ratio of ‖f(x) − f(x′)‖ to ‖x − x′‖.
Sufficient condition for invertible ResNets
Takeru Miyato et al. Spectral Normalization for Generative Adversarial Networks. 2018
Jens Behrmann et al. Invertible Residual Networks. 2019
Satisfying the Lipschitz Constraint
1. Non-linear activations such as ReLU, ELU, and tanh already satisfy the Lipschitz constraint (they are 1-Lipschitz).

2. Dense layers and convolution layers, which are expressed as matrix multiplications, can be made to satisfy the Lipschitz constraint by normalizing the weight matrix with its largest singular value (see the sketch after this list).
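A minimal sketch of this normalization using an exact SVD in NumPy (the function name and the target constant c are illustrative, not taken from the paper's code):

```python
import numpy as np

def spectral_normalize(W, c=0.9):
    """Rescale W so that its largest singular value is at most c < 1."""
    sigma = np.linalg.svd(W, compute_uv=False)[0]  # largest singular value of W
    if sigma > c:                                  # only shrink, never enlarge
        W = W * (c / sigma)
    return W
```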
https://en.wikipedia.org/wiki/Singular_value_decomposition
Satisfying the Lipschitz Constraint
Singular Value Decomposition: M = U Σ Vᵀ
M: an arbitrary m × n matrix
U: m × m unitary (orthogonal) matrix
Σ: m × n diagonal matrix with non-negative real numbers on the diagonal
V: n × n unitary (orthogonal) matrix
https://en.wikipedia.org/wiki/Singular_value_decomposition
Satisfying the Lipschitz Constraint
Singular Value Decomposition
Satisfying the Lipschitz Constraint
Weight Normalization
Jens Behrmann et al. Invertible Residual Networks. 2019
Finding the largest singular value
Jens Behrmann et al. Invertible Residual Networks. 2019
Performing a full Singular Value Decomposition costs O(D³) operations, but the largest singular value can be approximated with the following algorithm (power iteration).
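A rough NumPy sketch of that power-iteration idea (the function name and iteration count are illustrative):

```python
import numpy as np

def approx_largest_singular_value(W, n_iters=20):
    """Power iteration: each step needs only matrix-vector products,
    avoiding the O(D^3) cost of a full SVD."""
    v = np.random.randn(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    return float(u @ (W @ v))  # sigma_max ≈ u^T W v
```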
2. The inverse must be computable.
Inverse of i-ResNet Layer
Banach fixed-point theorem
Jens Behrmann et al. Invertible Residual Networks. 2019
Inverse of i-ResNet Layer
Jens Behrmann et al. Invertible Residual Networks. 2019
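A sketch of the fixed-point inversion: because Lip(g) < 1, the map x ↦ y − g(x) is a contraction, so by the Banach fixed-point theorem iterating it converges to the unique x with y = x + g(x) (the function name and iteration count below are illustrative):

```python
def invert_residual_layer(y, g, n_iters=100):
    """Invert y = x + g(x) by the fixed-point iteration x_{k+1} = y - g(x_k)."""
    x = y                       # initialize at y, as in the paper's inversion scheme
    for _ in range(n_iters):
        x = y - g(x)
    return x
```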
3. The log-determinant of the layer must be cheap to compute.
Log-determinant of Jacobian
Jens Behrmann et al. Invertible Residual Networks. 2019
For z = F(x) = (I + g)(x):

ln p_x(x) = ln p_z(z) + ln |det J_F(x)|      (change of variables)

ln |det J_F(x)| = ln det J_F(x)              (the Lipschitz constraint keeps det J_F(x) > 0)
               = tr(ln J_F(x))               (Withers & Nadarajah, 2010)
               = tr(ln(I + J_g(x)))          (by definition of F)
               = Σ_{k=1}^{∞} (−1)^{k+1} tr(J_g^k) / k   (power series of the matrix logarithm)
Log-determinant of Jacobian
Hall, B. C. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics 222 (2nd ed.). Springer, 2015.
Complex Logarithm
Matrix Logarithm
Log-determinant of Jacobian
ln det J_F(x) = Σ_{k=1}^{∞} (−1)^{k+1} tr(J_g^k) / k

Problems
1. Computing tr(J_g^k) requires O(d²) operations.
2. The Jacobian matrix J_g^k itself is hard to obtain.
3. The sum of an infinite series has to be computed.
Solutions
1 & 2. Use the automatic differentiation provided by deep learning frameworks to compute the vector-Jacobian product vᵀJ_g, and use it for a stochastic approximation of the matrix trace.
3. Truncate the series at some index n. This makes the estimator biased, but the error is proven to be bounded.
Jens Behrmann et al. Invertible Residual Networks. 2019
Log-determinant of Jacobian
Jens Behrmann et al. Invertible Residual Networks. 2019
Hutchinson's trace estimator:
tr(A) = 𝔼_{p(v)}[vᵀ A v],  where A ∈ ℝ^{d×d}, 𝔼[v] = 0, Cov(v) = I
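A small NumPy sketch of the estimator (standard Gaussian samples satisfy 𝔼[v] = 0 and Cov(v) = I; the function name and sample count are illustrative):

```python
import numpy as np

def hutchinson_trace(A, n_samples=1000):
    """Monte-Carlo estimate of tr(A) via tr(A) = E_v[v^T A v]."""
    d = A.shape[0]
    total = 0.0
    for _ in range(n_samples):
        v = np.random.randn(d)   # E[v] = 0, Cov(v) = I
        total += v @ A @ v
    return total / n_samples
```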
Log-determinant of Jacobian
Implementation of backpropagation
https://github.com/eriklindernoren/ML-From-Scratch/blob/master/mlfromscratch/deep_learning/layers.py
For a dense layer y = Wx + b, the backward pass propagates ∂L/∂y · ∂y/∂x, where ∂y/∂x = W.
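In the spirit of the linked ML-From-Scratch code (a simplified sketch, not the repository's actual class), the dense layer's backward pass just multiplies the incoming gradient by W:

```python
import numpy as np

class Dense:
    """Minimal dense layer: y = x W^T + b."""
    def __init__(self, W, b):
        self.W, self.b = W, b

    def forward_pass(self, x):
        self.x = x
        return x @ self.W.T + self.b

    def backward_pass(self, accum_grad):
        # Since dy/dx = W, the gradient w.r.t. the input is (dL/dy) @ W.
        return accum_grad @ self.W
```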
Log-determinant of Jacobian
Sample a vector v from p(v) and feed it in as the backward-pass input of the desired layer; this yields the vector-Jacobian product vᵀJ_g.

[Diagram] Forward: x → Layer g → y = g(x).  Backward: v ∼ p(v) → Layer g → vᵀJ_g.
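A minimal PyTorch sketch of this trick (the small network g below is only a stand-in for the residual branch):

```python
import torch

g = torch.nn.Sequential(                 # stand-in residual branch g
    torch.nn.Linear(10, 10), torch.nn.ELU(), torch.nn.Linear(10, 10))

x = torch.randn(1, 10, requires_grad=True)
y = g(x)                                 # forward pass
v = torch.randn_like(y)                  # v ~ p(v), here a standard Gaussian

# Feeding v as the backward-pass input ("incoming gradient") yields v^T J_g.
vT_Jg = torch.autograd.grad(y, x, grad_outputs=v)[0]
```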
Log-determinant of Jacobian
The backward input is initialized to vᵀ, and after each backward pass it holds wᵀJ_g (with w the current vector). After k passes it equals vᵀJ_g^k, so the inner product vᵀJ_g^k v approximates tr(J_g^k).
Jens Behrmann et al. Invertible Residual Networks. 2019
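Putting the pieces together, a sketch of the truncated-series log-determinant estimator with a single Hutchinson sample (illustrative names; not the authors' reference implementation):

```python
import torch

def logdet_estimate(g, x, n_terms=5):
    """Estimate ln det(I + J_g(x)) = sum_{k>=1} (-1)^{k+1} tr(J_g^k) / k,
    truncated at n_terms, with each trace replaced by v^T J_g^k v."""
    x = x.requires_grad_(True)
    y = g(x)
    v = torch.randn_like(x)
    w = v                                  # after k backward passes, w = v^T J_g^k
    logdet = torch.zeros(())
    for k in range(1, n_terms + 1):
        # retain_graph reuses the same forward graph for every power k;
        # use create_graph=True if gradients of the estimate are needed for training.
        w = torch.autograd.grad(y, x, grad_outputs=w, retain_graph=True)[0]
        logdet = logdet + ((-1) ** (k + 1) / k) * (w * v).sum()
    return logdet
```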
Log-determinant of Jacobian
Jens Behrmann et al. Invertible Residual Networks. 2019
Comparison
Jens Behrmann et al. Invertible Residual Networks. 2019
Results
Jens Behrmann et al. Invertible Residual Networks. 2019
Results
Jens Behrmann et al. Invertible Residual Networks. 2019
Results
Jens Behrmann et al. Invertible Residual Networks. 2019
The End
