Let's paint a Picasso - A Look at Generative Adversarial Networks (GAN) and its Applications - Sreya Ghosh

Let's Paint a Picasso:
A look at Generative Adversarial Networks (GAN)
and its applications.
Sreya Ghosh (Ph.D.)
Data Scientist II, Merchandising
Wayfair

Portrait of Edmond Belamy
Artist: Obvious
Year: 2018
Auction House: Christie’s
Price: $432,500

Generative Adversarial Networks (GANs)
Generative: Generates new data based on learned features.
Adversarial: Trained with a cost function derived from game theory.
Network: Deep neural networks.
Yann LeCun: “Adversarial training is the coolest thing since sliced bread.”
Generative Adversarial Nets.
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron
Courville, Yoshua Bengio
University of Montreal, NeurIPS, 2014

Autoencoders
Encoder: The part of the network that compresses the input into a
latent-space representation. It can be represented by an encoding function h = f(x).
Decoder: The part that reconstructs the input from the latent-space
representation. It can be represented by a decoding function r = g(h).
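
The encoder/decoder pair can be sketched with a toy linear model. This is only an illustration: the weight matrix W below is a made-up, fixed matrix (a real autoencoder learns its weights), and the decoder simply reuses W transposed.

```python
# Toy linear autoencoder: h = f(x) compresses 4 values to 2, r = g(h) reconstructs.
# W is a hypothetical fixed weight matrix; a real autoencoder learns its weights.
W = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]

def f(x):
    """Encoder: project the input onto the rows of W to get the latent code h."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def g(h):
    """Decoder: map the latent code back to input space using W transposed."""
    return [sum(W[k][j] * h[k] for k in range(len(W))) for j in range(len(W[0]))]

def reconstruction_error(x):
    """Squared error between the input x and its reconstruction g(f(x))."""
    r = g(f(x))
    return sum((xi - ri) ** 2 for xi, ri in zip(x, r))

x_easy = [3.0, 3.0, 1.0, 1.0]  # lies in the span of W's rows, so it survives compression
x_hard = [1.0, 0.0, 0.0, 0.0]  # does not, so information is lost in the latent space
```

Inputs that fit the structure captured by the latent space reconstruct exactly, while others lose information, which is why the reconstruction error is a natural training signal.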

Generative Model
● Noise sampled from a simple distribution, such as a Gaussian or uniform distribution,
is used to represent the latent features.
● The semantic meaning of the latent features is learnt via a neural network.
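
A minimal sketch of the sampling step, assuming a standard normal or a uniform prior on [-1, 1]. The function names and the one-layer `toy_generator` with untrained, made-up weights are hypothetical; a real generator is a deep network whose weights are learned.

```python
import math
import random

def sample_latent(batch_size, dim, kind="gaussian", seed=None):
    """Draw a batch of latent vectors z from a simple prior distribution."""
    rng = random.Random(seed)
    if kind == "gaussian":
        return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]
    if kind == "uniform":
        return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(batch_size)]
    raise ValueError(f"unknown prior: {kind}")

def toy_generator(z, weights):
    """Hypothetical one-layer generator: each output is tanh of a weighted sum of z."""
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]

# Untrained, made-up weights mapping a 3-dimensional z to a 2-value "image".
weights = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4]]
batch = sample_latent(batch_size=4, dim=3, kind="gaussian", seed=0)
samples = [toy_generator(z, weights) for z in batch]
```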

Discriminative Model
● A generator alone would only produce
random noise. Conceptually, the
discriminator in a GAN guides the
generator on what to create.
● The discriminator predicts the label of
each image.
○ Fake images produced by the
generator are labeled 0.
○ Real images from the data distribution are labeled 1.
● The goal is to drive the discriminator’s
accuracy down to 50%, the point at which it
can no longer tell real images from generated ones.

The Cost Function
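Discriminator: Cross-entropy, with real images labeled 1 and generated images labeled 0 (as in Goodfellow et al., citation 1):
$$\max_D V(D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
Generator:
$$\min_G V(G) = \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
GAN:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$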
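
A minimal numeric sketch of these cost functions, assuming the standard cross-entropy formulation from Goodfellow et al. (citation 1), where D outputs the probability that an image is real. The function names are illustrative, and the inputs are plain lists of discriminator outputs rather than a real network.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy with real=1, fake=0: -mean log D(x) - mean log(1 - D(G(z)))."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss_minimax(d_fake):
    """Minimax generator objective: mean log(1 - D(G(z))), to be minimized."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

def generator_loss_non_saturating(d_fake):
    """Alternative suggested in the original paper: minimize -mean log D(G(z))."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# When the discriminator is reduced to guessing (D = 0.5 everywhere),
# its loss equals 2 * log(2), i.e. the value function V equals -log(4).
loss_at_equilibrium = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

A discriminator that separates the two classes well gets a lower loss than this equilibrium value, while the generator's loss drops as D(G(z)) moves toward 1.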

Types of GANs
Deep Convolutional GANs (DCGANs): The first major improvement over the original GAN. The
DCGAN work studied convolutional architectures and identified hyper-parameters that train stably.
Conditional GANs (cGANs): GANs that use extra label information.
This yields better-quality images and gives some control over how the
generated images will look.
Wasserstein GANs (WGANs): Change the loss function to use the Wasserstein
distance. As a result, WGAN losses correlate with image
quality. Training stability also improves and depends less on the
architecture.
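
Two of these variants can be sketched in a few lines. The sketch assumes the common cGAN setup where a one-hot label is appended to the noise vector, and the original WGAN setup where a critic outputs unbounded scores and its weights are clipped (the default of 0.01 follows the WGAN paper). Function names are illustrative.

```python
def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_input(z, label, num_classes):
    """cGAN-style conditioning: append a one-hot label to the noise vector."""
    return list(z) + one_hot(label, num_classes)

def mean(values):
    return sum(values) / len(values)

def wgan_critic_loss(critic_real, critic_fake):
    """The critic minimizes mean(f(fake)) - mean(f(real)); scores are not probabilities."""
    return mean(critic_fake) - mean(critic_real)

def wgan_generator_loss(critic_fake):
    """The generator minimizes -mean(f(fake)), i.e. it raises the critic's score on fakes."""
    return -mean(critic_fake)

def clip_weights(weights, c=0.01):
    """Weight clipping, used in the original WGAN to keep the critic roughly Lipschitz."""
    return [max(-c, min(c, w)) for w in weights]
```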

Super Resolution GAN (SRGAN)
● Super-resolution GAN applies a deep
network in combination with an
adversary network to produce higher
resolution images.
● During the training, a high-resolution
image (HR) is downsampled to a low-
resolution image (LR).
● A GAN generator upsamples LR
images to super-resolution images
(SR).
● A discriminator is trained to distinguish
HR images from SR images, and the GAN
loss is backpropagated to train both the
discriminator and the generator.
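
The data preparation step above can be sketched on a tiny grayscale image. This is an illustration under assumptions: average pooling stands in for whatever downsampling filter is actually used, and nearest-neighbour upsampling is only a placeholder for the learned generator.

```python
def downsample(hr, factor=2):
    """Average-pool a 2D grayscale image (list of rows) by an integer factor."""
    h, w = len(hr), len(hr[0])
    return [
        [
            sum(hr[i + di][j + dj] for di in range(factor) for dj in range(factor))
            / factor ** 2
            for j in range(0, w - w % factor, factor)
        ]
        for i in range(0, h - h % factor, factor)
    ]

def upsample_nearest(lr, factor=2):
    """Nearest-neighbour upsampling: a placeholder for the learned SRGAN generator."""
    return [[v for v in row for _ in range(factor)] for row in lr for _ in range(factor)]

def mse(a, b):
    """Mean squared error between two images of the same size."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

hr = [[0, 2], [2, 4]]        # "high-resolution" 2x2 image
lr = downsample(hr)          # 1x1 low-resolution training input
sr = upsample_nearest(lr)    # naive 2x2 "super-resolved" output
```

The gap between `sr` and `hr` is the detail lost in downsampling; the generator is trained to recover it, and the discriminator judges whether the result looks like a real HR image.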

StackGAN: Generative Adversarial Text to Image
Synthesis
● Generates photo-realistic images
conditioned on text descriptions.
● The Stage-I GAN sketches the
primitive shape and colors of the
object based on the given text
description.
● The Stage-II GAN takes Stage-I
results and text descriptions as
inputs, and generates high-
resolution images with photo-
realistic details.
● The Conditioning Augmentation
technique encourages
smoothness in the latent
conditioning manifold.
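
Conditioning Augmentation can be sketched as sampling the conditioning vector from a Gaussian whose mean and variance are computed from the text embedding, with a KL term pulling that Gaussian toward a standard normal. In this sketch `mu` and `log_var` are given directly as hypothetical inputs; in the model they would come from a learned layer on top of the text embedding.

```python
import math
import random

def conditioning_augmentation(mu, log_var, rng):
    """Sample c = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu, log_var = [0.3, -0.1], [0.0, 0.0]   # made-up values for one text description
c1 = conditioning_augmentation(mu, log_var, rng)
c2 = conditioning_augmentation(mu, log_var, rng)
```

The same description yields slightly different conditioning vectors on each draw, so the generator sees a neighbourhood of each text embedding rather than a single point.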

Generative Visual Manipulation on the Natural Image
Manifold
● The approach learns the natural image
manifold with a GAN, then defines a class
of image editing operations and constrains
their output to lie on that learned
manifold at all times.
● The model automatically adjusts
the output keeping all edits as
realistic as possible.
● The presented method can further
be used for changing one image
to look like another, as well as
generating novel imagery from
scratch based on a user’s scribbles.
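
Constraining an edit to the learned manifold can be pictured as finding the latent code whose generated output is closest to the edited image. The sketch below is a toy stand-in, not the paper's implementation: the hypothetical generator `G` maps a one-dimensional latent z to a point on the unit circle, so the circle plays the role of the "image manifold", and plain gradient descent finds the closest point to an off-manifold target.

```python
import math

def G(z):
    """Toy generator: the 'manifold' is the unit circle in 2D."""
    return (math.cos(z), math.sin(z))

def project(target, z0=0.0, lr=0.1, steps=200):
    """Gradient descent on z to minimize ||G(z) - target||^2."""
    z = z0
    for _ in range(steps):
        gx, gy = G(z)
        # d/dz of (gx - tx)^2 + (gy - ty)^2, using dG/dz = (-sin z, cos z)
        grad = 2.0 * (gx - target[0]) * (-gy) + 2.0 * (gy - target[1]) * gx
        z -= lr * grad
    return z

edited = (2.0, 2.0)          # an "edit" that leaves the manifold
z_star = project(edited)
on_manifold = G(z_star)      # closest point that the generator can produce
```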

Attribute2Image: Conditional Image Generation from
Visual Attributes
● Models the image as a composite of
foreground and background.
● Uses a layered generative model
with disentangled latent
variables that can be learned
end-to-end using a
variational auto-encoder.
● Generates realistic and
diverse samples with
disentangled latent
representations.
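
The foreground/background idea can be illustrated with a generic mask-based composite. This is an assumption made for illustration only: the exact gating form used in the paper may differ, and the pixel values and mask below are made up.

```python
def composite(foreground, background, mask):
    """Blend per pixel: mask=1 keeps the foreground, mask=0 keeps the background."""
    return [f * m + b * (1.0 - m) for f, b, m in zip(foreground, background, mask)]

# Three flattened pixels: a foreground object, a plain background, and a soft mask.
foreground = [1.0, 1.0, 1.0]
background = [0.0, 0.0, 0.0]
mask = [1.0, 0.0, 0.5]
image = composite(foreground, background, mask)
```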

Pose Guided Person Image Generation

Trying to paint Modern Art
● Scraped 1,500 images from Google
Images using search terms for paintings
of faces.
● Augmented the dataset with flipped
images.
● All images were resized to
112x112x3.
● DCGAN was used to generate the
images.
● Trained for 20,000 epochs.
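
A small sketch of the flip augmentation, plus the size arithmetic for reaching 112x112. Several things here are assumptions rather than details from the talk: that each image is flipped once horizontally, and that the generator starts from a 7x7 feature map and doubles it with four transposed convolutions (kernel 4, stride 2, padding 1).

```python
def flip_horizontal(image):
    """Mirror an image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]

def augment_with_flips(images):
    """Return the original images followed by their horizontal flips."""
    return images + [flip_horizontal(img) for img in images]

def transposed_conv_output_size(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution without output padding."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_sizes(start, layers):
    """Spatial size after each upsampling layer of a hypothetical generator."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(transposed_conv_output_size(sizes[-1]))
    return sizes

sizes = generator_sizes(start=7, layers=4)   # 7 -> 14 -> 28 -> 56 -> 112
```

Under the one-flip assumption, 1,500 scraped images become 3,000 training images.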

Challenges
● The networks try to take successive steps
to minimize a non-convex objective and
end up in an oscillating process rather
than decreasing the underlying true
objective.
● The generator can collapse to
producing several copies of exactly the
same image (mode collapse).
● GANs can sometimes be far-sighted and
fail to get the number of particular
objects that should occur at a location right.
● GANs are sometimes not capable of
differentiating between front and back
views.
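
The oscillation described in the first bullet shows up even in the simplest two-player game. In this toy sketch (not a GAN), one player minimizes f(x, y) = x * y over x while the other maximizes it over y. The equilibrium is (0, 0), yet simultaneous gradient steps circle around it and drift away instead of converging.

```python
import math

def simultaneous_gradient_steps(x, y, lr=0.1, steps=100):
    """Player x minimizes f(x, y) = x * y while player y maximizes it."""
    trajectory = [(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x                      # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y    # both players update at once
        trajectory.append((x, y))
    return trajectory

trajectory = simultaneous_gradient_steps(1.0, 1.0)
distances = [math.hypot(px, py) for px, py in trajectory]
```

Each step multiplies the distance from the equilibrium by sqrt(1 + lr^2), so the players spiral outward while each coordinate swings between positive and negative values.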

Future Work with GAN
● Find effective solutions to these challenges, such as different cost functions that help
training reach a Nash equilibrium.
● Drug discovery and potential medical applications.
● Applications in NLP, especially language modelling.
● Speech generation.
● Cataloging and image-based search.

Citations:
1. Generative Adversarial Nets. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil
Ozair, Aaron Courville, Yoshua Bengio, University of Montreal, NeurIPS, 2014.
2. GAN — What is Generative Adversary Networks GAN? Jonathan Hui. 06-2018
3. GAN — Some cool applications of GANs. Jonathan Hui. 06-2018
4. Generative Adversarial Networks (GANs) — A Beginner’s Guide. Owen Carey. 2018
5. DCGAN: Generate the images with Deep Convolutional GAN. Keisuke Umezawa. 2018
6. https://github.com/eriklindernoren/Keras-GAN
