Large Scale GAN Training for
High Fidelity Natural Image Synthesis
Andrew Brock, Jeff Donahue, Karen Simonyan
2018.10.14
Presented by Young Seok Kim
Under Review for ICLR 2019
https://arxiv.org/abs/1809.11096

https://openreview.net/forum?id=B1xsqj09Fm
BigGAN
Related papers in PR12
• Goodfellow, Ian J. et al. “Generative Adversarial Nets.” NIPS (2014)

• PR-001 : https://youtu.be/L3hz57whyNw

• Chen, Xi et al. “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative
Adversarial Nets.” NIPS (2016)

• PR-022 : https://youtu.be/_4jbgniqt_Q

• Mirza, Mehdi and Simon Osindero. “Conditional Generative Adversarial Nets.” CoRR abs/1411.1784 (2014)

• PR-051 : https://youtu.be/iCgT8G4PkqI

• Miyato, Takeru et al. “Spectral Normalization for Generative Adversarial Networks.” ICLR (2018)

• PR-087 : https://youtu.be/iXSYqohGQhM
Motivation
• Since the advent of GANs, there have been several attempts to stabilize GAN training.

• Although some were quite successful at small scale, generating high-resolution and diverse samples from datasets like ImageNet remains hard

• More analysis of GAN training instabilities is needed
Contributions
• We demonstrate that GANs benefit dramatically from scaling. We introduce two simple,
general architectural changes that improve scalability, and modify a regularization
scheme to improve conditioning, demonstrably boosting performance. 

• Our models become amenable to the “truncation trick,” a simple sampling technique
that allows explicit, fine-grained control of the tradeoff between sample variety and
fidelity.

• We discover instabilities specific to large scale GANs, and characterize them
empirically.
Core Methods
• Shared Embedding

• Hierarchical latent space

• Orthogonal Regularization

• Truncation Trick
Shared embedding
• Introduced in

• Perez, Ethan et al. “FiLM: Visual
Reasoning with a General
Conditioning Layer.” AAAI (2018)

• A single class embedding, shared across layers, is linearly projected to each layer's BatchNorm gains and biases (see the sketch below)

• Reduces computation and memory cost, and improves training speed by 37%
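
A minimal PyTorch sketch of the idea, as a hedged illustration rather than the authors' implementation: one class embedding is created once and reused by every layer, and each layer only learns a small linear projection of it to its own BatchNorm gain and bias. The class count, embedding size, and channel count are illustrative.

import torch
import torch.nn as nn

class SharedClassConditionalBN(nn.Module):
    """Class-conditional BatchNorm whose gain and bias come from a shared class embedding."""
    def __init__(self, num_features, shared_embedding):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False)    # no per-layer learned affine
        self.shared_embedding = shared_embedding                 # one table shared by all layers
        embed_dim = shared_embedding.embedding_dim
        self.to_gain = nn.Linear(embed_dim, num_features)        # per-layer projection to gains
        self.to_bias = nn.Linear(embed_dim, num_features)        # per-layer projection to biases

    def forward(self, x, y):
        e = self.shared_embedding(y)                              # (B, embed_dim)
        gain = 1.0 + self.to_gain(e).unsqueeze(-1).unsqueeze(-1)  # gains centered around 1
        bias = self.to_bias(e).unsqueeze(-1).unsqueeze(-1)
        return self.bn(x) * gain + bias

shared = nn.Embedding(num_embeddings=1000, embedding_dim=128)     # shared across every block
cbn = SharedClassConditionalBN(num_features=256, shared_embedding=shared)
out = cbn(torch.randn(8, 256, 32, 32), torch.randint(0, 1000, (8,)))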
Hierarchical latent space
• Noise vector z is fed into multiple layers of G, instead of only the initial layer

• z is split into one chunk per resolution, and each chunk is concatenated to the conditional vector c, which is then projected to BatchNorm gains and biases (see the sketch below)

• Reduces memory and compute costs

• Improves performance by roughly 4%

• Improves training speed by a further 18%
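
A hedged sketch of the hierarchical latent; the number of blocks and the chunk size are assumptions for illustration, not the paper's exact configuration. Each chunk of z is concatenated with the shared class embedding, and the result is what gets projected to that block's BatchNorm gains and biases.

import torch

num_blocks = 5                                   # assumed number of generator resolution blocks
chunk_dim = 20                                   # assumed per-block slice of z
z = torch.randn(8, num_blocks * chunk_dim)       # full latent vector for a batch of 8
class_embed = torch.randn(8, 128)                # shared class embedding (see previous sketch)

z_chunks = torch.chunk(z, num_blocks, dim=1)     # one chunk per resolution block
per_block_cond = [torch.cat([chunk, class_embed], dim=1) for chunk in z_chunks]
print([t.shape for t in per_block_cond])         # each block gets an (8, 148) conditioning vector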
Truncation Trick
• Truncate the z vector by resampling any values whose magnitude exceeds a chosen threshold

• Improves individual sample quality at the cost of reduced overall sample variety (see the sketch below)
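
A minimal NumPy sketch of the trick, assuming z is drawn from a standard normal; the threshold value is arbitrary.

import numpy as np

def truncated_z(shape, threshold=0.5, seed=0):
    """Sample z ~ N(0, I) and resample any entries whose magnitude exceeds the threshold."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(shape)
    mask = np.abs(z) > threshold
    while mask.any():
        z[mask] = rng.standard_normal(int(mask.sum()))
        mask = np.abs(z) > threshold
    return z

z = truncated_z((4, 128), threshold=0.5)
# Lower thresholds trade variety for fidelity; a very large threshold recovers ordinary sampling.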
Orthogonal Regularization
• Original thoughts
(excerpt from the Spectral Normalization paper)
Orthogonal Regularization
• Original thoughts
• Suggested Method
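
For reference, the two penalties these labels refer to, as given in the BigGAN paper (W is a weight matrix, \beta a hyperparameter, \mathbf{1} a matrix with every element set to 1, and \odot elementwise multiplication):

R_\beta(W) = \beta \, \lVert W^\top W - I \rVert_F^2   (original orthogonal regularization)

R_\beta(W) = \beta \, \lVert W^\top W \odot (\mathbf{1} - I) \rVert_F^2   (suggested modification)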
Orthogonal Regularization
• “The version we find to work best removes the diagonal terms from the regularization,
and aims to minimize the pairwise cosine similarity between filters but does not
constrain their norm”

• Without Orthogonal Regularization -> only 16% of models are amenable to truncation

• With Orthogonal Regularization -> 60% of models are amenable to truncation
• Suggested Method
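
A hedged PyTorch sketch of the suggested penalty; treating rows of the flattened weight as filters is an assumption of this illustration, and beta = 1e-4 is only indicative.

import torch

def ortho_penalty(weight, beta=1e-4):
    """Penalize off-diagonal entries of the filter Gram matrix (pairwise similarity
    between filters) without constraining each filter's norm."""
    W = weight.view(weight.size(0), -1)                       # flatten each filter to a row
    gram = W @ W.t()                                          # (num_filters, num_filters)
    mask = 1.0 - torch.eye(gram.size(0), device=gram.device)  # zero out the diagonal terms
    return beta * ((gram * mask) ** 2).sum()

w = torch.randn(64, 3, 3, 3, requires_grad=True)              # e.g. a conv weight
loss = ortho_penalty(w)                                       # added to the generator loss
loss.backward()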
Review
• Inception Score (IS)

• Fréchet Inception Distance (FID)

• Spectral Normalization
Inception Score (IS)
• Introduced in

• Salimans, Tim et al. “Improved Techniques for Training GANs.” NIPS (2016).
• p(y|x) is evaluated with an Inception model

• If the image x is recognizable as label y, p(y|x) should have low entropy
• p(y) is the marginal distribution over labels, computed across generated samples

• If the model outputs diverse images, p(y) should have high entropy
• TL;DR: higher is better (see the formula below)
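
For reference, the two entropies above combine into the standard Inception Score:

\mathrm{IS}(G) = \exp\Big( \mathbb{E}_{x \sim p_g} \big[ D_{\mathrm{KL}}\big( p(y \mid x) \,\Vert\, p(y) \big) \big] \Big)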
Fréchet Inception Distance (FID)
• Introduced in

• Heusel, Martin et al. “GANs Trained by a Two Time-Scale Update Rule Converge to a
Local Nash Equilibrium.” NIPS (2017).
• Improves on IS by directly comparing the statistics of generated samples to those of real samples

• TL;DR: lower is better (see the formula below)
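
For reference, the standard FID fits a Gaussian to the Inception features of each set and compares the two fits:

\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\big( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \big)

where (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) are the mean and covariance of Inception features for real and generated samples.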
Spectral Normalization Review
Definition of the spectral norm: the spectral norm of a matrix equals its largest singular value
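
A small NumPy sketch of the technique: estimate the spectral norm with power iteration and divide the weight by it. This is a generic illustration, not the implementation used in the paper; matrix sizes are arbitrary.

import numpy as np

def spectral_normalize(W, u, n_iter=1, eps=1e-12):
    """Divide W by an estimate of its largest singular value obtained by power iteration."""
    for _ in range(n_iter):
        v = W.T @ u
        v = v / (np.linalg.norm(v) + eps)
        u = W @ v
        u = u / (np.linalg.norm(u) + eps)
    sigma = u @ W @ v                                    # approx. largest singular value
    return W / sigma, u

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
u = rng.standard_normal(64)
W_sn, u = spectral_normalize(W, u, n_iter=20)
print(np.linalg.svd(W_sn, compute_uv=False)[0])          # ≈ 1.0 after normalization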
Result
(Result tables: batch experiment and ablation experiment)
Result
A typical plot of the first singular value σ0 in the layers of G (a) and D (b) before Spectral Normalization
Result
Class leakage
JFT-300M
Thank you!
References
• https://nealjean.com/ml/frechet-inception-distance/

• Explanation of IS and FID
Appendix
Spectral Norm on CNN
Model details
