Variational Autoencoders
Spring 2020
11-785 Introduction to Deep Learning
By Lady Yuying Zhu
And Lord Christopher George
Outline
● Autoencoders
● Bake a cake 🤔 TBD, vanilla please
● Probability Refresher
● Generative Models (basic overview)
● Take over the universe
● Discuss our eventual doomed existence and the impact of covid-19 on society
● Variational Autoencoders
Auto-encoder Recap
A neural network whose output is a reconstruction of its own input.
Intuition:
● A good representation should preserve the input's information well (low reconstruction error)
● Depth + nonlinearity can enhance representational power
Slide Credits: Shenlong Wang, Deep Generative Models
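A minimal autoencoder sketch in PyTorch (the layer sizes, the dummy batch, and the MSE reconstruction loss are illustrative assumptions, not taken from the slides):

import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Compress a 784-dim input (e.g. a flattened 28x28 image) to a small code, then reconstruct it."""
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # deep + nonlinear encoder: input -> low-dimensional code
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim),
        )
        # decoder: code -> reconstruction of the input
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Training minimizes the reconstruction error between output and input.
model = AutoEncoder()
x = torch.rand(16, 784)                      # a dummy batch
loss = nn.functional.mse_loss(model(x), x)   # reconstruction error
loss.backward()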
Why auto-encoder? Why not?
Why?
● Map high-dimensional data to low-dimensions for visualization
● Learn the salient features of the data.
Why not?
Encoded data can be decoded without loss if the autoencoder has enough degrees of freedom (this leads to severe overfitting).
An irregular latent space prevents us from using the autoencoder for new content generation.
Without explicit regularisation, some points of the latent space are “meaningless” once decoded.
Probability Refresher
● Continuous vs. Discrete
● Bayes Rule
● Prior/ Posterior
● KL Divergence: measures how close two distributions are (a small numeric check follows this list)
● ELBO (evidence lower bound)
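As a quick sanity check of the KL idea, PyTorch can compute the KL divergence between two Gaussians in closed form (the distribution parameters below are made up for illustration):

import torch
from torch.distributions import Normal, kl_divergence

p = Normal(loc=0.0, scale=1.0)   # a standard normal "prior"
q = Normal(loc=0.5, scale=0.8)   # an "approximate posterior"
print(kl_divergence(q, p))       # small but nonzero: q is close to p
print(kl_divergence(q, q))       # exactly zero: identical distributions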
Generative Models, An Overview
Task: generate new samples that follow the same probability distribution as a given training dataset
Variational Autoencoders from Autoencoders
“In VAEs, the encoder becomes a variational inference network, mapping observed inputs to (approximate)
posterior distributions over latent space, and the decoder becomes a generative network, capable of mapping
arbitrary latent coordinates back to distributions over the original data space.”
Fast Forward Labs, Under the Hood of the Variational Autoencoder (in Prose and Code)
[Figure: an original autoencoder shown alongside a VAE; the VAE encoder is Q(z|x), the decoder is P(x|z)]
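In code, the main change from a plain autoencoder is that the encoder outputs the parameters of Q(z|x) rather than a single code. A minimal sketch with illustrative layer sizes (loosely following the pytorch/examples VAE listed in the references, not a verbatim copy):

import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=20):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, 400)          # shared encoder body
        self.fc_mu = nn.Linear(400, latent_dim)       # mean of Q(z|x)
        self.fc_logvar = nn.Linear(400, latent_dim)   # log-variance of Q(z|x)
        self.fc3 = nn.Linear(latent_dim, 400)         # decoder body
        self.fc4 = nn.Linear(400, input_dim)          # parameters of P(x|z)

    def encode(self, x):
        h = torch.relu(self.fc1(x))
        # Q(z|x) = N(mu, diag(exp(logvar)))
        return self.fc_mu(h), self.fc_logvar(h)

    def decode(self, z):
        h = torch.relu(self.fc3(z))
        # mean of P(x|z); the sigmoid keeps each output value in [0, 1]
        return torch.sigmoid(self.fc4(h))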
Variational Autoencoders, (VAEs)
Big Idea: Maximize the likelihood of seeing our data. “We are aiming to maximize the probability of each X in the training
set under the entire generative process.”
where X is a datapoint, z is the latent variable, and θ are the parameters of the function f(z; θ) that maps z to X:
P(X) = ∫ P(X|z; θ) P(z) dz
Generally in VAEs, P(X|z; θ) = N(X | f(z; θ), σ^2 · I)
and the prior over the latent variable is P(z) = N(0, I).
Note: Why are we doing this? Because variational autoencoders can be interpreted as performing variational Bayesian inference where, in this Bayesian
view, our data is seen as being drawn from some underlying distribution, x ~ p(x).
Tutorial on Variational Autoencoders, Section 1.1 Preliminaries: Latent Variable Models
Fast Forward Labs, Introducing Variational Autoencoders (in Prose and Code)
What could this be?
VAEs, Question 1
How do we define the latent variables z (i.e., decide what information they represent)?
* Samples of z can be drawn from a simple distribution, i.e., N(0, I), where I is the identity matrix.
* Provided powerful function approximators, we can simply learn a function which maps our independent, normally-distributed z values
to whatever latent variables might be needed for the model, and then map those latent variables to X.
A Neural Network!
Tutorial on Variational Autoencoders, Section 2, Variational Autoencoders
VAEs, Question 2
How do we deal with the integral over z?
* VAEs alter the sampling procedure to make it faster!
* The key idea behind the variational autoencoder is to attempt to sample values of z that are likely to have produced X, and compute
P(X) just from those. This means that we need a new function Q(z|X) which can take a value of X and give us a distribution over z
values that are likely to produce X.
* Given this Q(z|X), we can compute E_{z~Q}P(X|z) relatively easily, so…
Tutorial on Variational Autoencoders, Section 2, Variational Autoencoders
VAEs and KL Divergence
So now we need to relate E_{z~Q}P(X|z) and P(X), which we can do by minimizing the KL Divergence:
Minimizing KL Divergence = Maximizing ELBO
Tutorial on Variational Autoencoders, Section 2, Variational Autoencoders
By definition: D[Q(z) || P(z|X)] = E_{z~Q}[log Q(z) - log P(z|X)]
Apply Bayes Rule: D[Q(z) || P(z|X)] = E_{z~Q}[log Q(z) - log P(X|z) - log P(z)] + log P(X)
Rearrange: log P(X) - D[Q(z) || P(z|X)] = E_{z~Q}[log P(X|z)] - D[Q(z) || P(z)]
Reconstruct Q to depend on X: log P(X) - D[Q(z|X) || P(z|X)] = E_{z~Q}[log P(X|z)] - D[Q(z|X) || P(z)]
Take the gradient of the RHS: the right-hand side is the ELBO, which we optimize by stochastic gradient ascent while sampling z ~ Q(z|X).
VAEs and KL Divergence
KLD Loss: has a closed-form solution when Q(z|X) is a diagonal Gaussian and the prior is N(0, I).
Negative Reconstruction Loss: can take a sample of z and treat P(X|z) as an approximation of E_{z~Q}P(X|z), then backpropagate
through the neural network. → Reparameterization trick.
Minimizing KL Divergence = Maximizing ELBO
Tutorial on Variational Autoencoders, Section 2, Variational Autoencoders
Note: ELBO is the Evidence Lower Bound, and can be reformulated as the equation above.
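Both terms map directly onto the training loss. A minimal sketch in the spirit of the pytorch/examples VAE listed in the references (the closed-form KLD term assumes a diagonal Gaussian Q(z|X) and a N(0, I) prior; binary cross-entropy for the reconstruction term assumes values in [0, 1]):

import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    # negative reconstruction term: -E_{z~Q}[log P(X|z)], estimated from one sample of z
    recon = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # KL(Q(z|X) || N(0, I)), closed form for a diagonal Gaussian
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # minimizing (recon + kld) = maximizing the ELBO
    return recon + kld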
VAEs, Reparameterization
Tutorial on Variational Autoencoders, Section 2.2 Optimizing the objective
Deep Generative Models, Section 20.9 Back-Propagation through Random Operations
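A minimal sketch of the trick: instead of sampling z ~ N(mu, sigma^2) directly, which would block gradients, sample eps ~ N(0, I) and express z as a deterministic, differentiable function of mu, logvar, and eps:

import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)   # sigma = exp(logvar / 2)
    eps = torch.randn_like(std)     # eps ~ N(0, I); no gradient flows through the sampling itself
    return mu + eps * std           # z is distributed N(mu, sigma^2) yet differentiable w.r.t. mu and logvar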
VAEs, Example
Fast Forward Labs, Introducing Variational Autoencoders (in Prose and Code)
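After training, new data are generated by drawing z from the prior and decoding it. A short sketch that assumes model is a trained instance of the VAE sketched earlier (latent_dim = 20):

import torch

with torch.no_grad():
    z = torch.randn(64, 20)     # 64 latent codes drawn from the prior N(0, I)
    samples = model.decode(z)   # decode each z into a new 784-dim sample (e.g. a generated digit)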
References
(1) Shenlong Wang, Deep Generative Models
(2) Deep Learning, Chapter 20: Deep Generative Models
(3) Tutorial on Variational Autoencoders
(4) Fast Forward Labs, Under the Hood of the Variational Autoencoder (in Prose and Code)
(5) Fast Forward Labs, Introducing Variational Autoencoders (in Prose and Code)
(6) examples/main.py at master · pytorch/examples
(7) Stanford CS231n
