International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 5 Issue 4, May-June 2021 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
An Extensive Review on
Generative Adversarial Networks (GAN’s)
Atharva Chitnavis1, Yogeshchandra Puranik2
1PG Student, 2Assistant Professor,
1,2Department of MCA, P.E.S.'s Modern College of Engineering, Pune, Maharashtra, India
ABSTRACT
This paper provides a high-level understanding of Generative Adversarial
Networks. It covers the working of GANs by explaining the background idea of
the framework, the types of GANs used in the industry, their advantages and
disadvantages, the history of how GANs have been developed and enhanced over
time, and some applications in which GANs excel.
How to cite this paper: Atharva Chitnavis | Yogeshchandra Puranik, "An Extensive Review on Generative Adversarial Networks (GAN's)", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5, Issue-4, June 2021, pp. 778-782, URL: www.ijtsrd.com/papers/ijtsrd42357.pdf

Copyright © 2021 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
1. INTRODUCTION
In the past many years there have been enormous enhancements in the field of Deep Learning. Deep learning is a branch of Machine Learning which deploys algorithms that imitate the thinking process of a human brain. This is done by stacking layers of neural networks together to carry out tasks such as human speech recognition, object recognition, etc. Feature extraction from the provided data is the main goal of Deep Learning. Artificial Neural Networks (ANNs) are a sub-branch of Deep Learning. They mimic the structure of neurons in our brain, which are connected to each other and transmit signals based on the input they receive from the previous neuron. In 1951, Marvin Minsky built the first artificial neural network while working at Princeton, and since then there has been a substantial increase in the power of ANNs due to high-performance computational processors. Computer vision uses highly complicated neural networks depending on the framework. Computer vision is a field of neural networks where we have to make the computer see the world, which we perceive as pictures, in the form of vector arrays of independent pixels. Learning feature representations from large sets of unlabelled data has been a highly active area of research, and computer vision has the advantage of large volumes of unlabelled image and video data being available for training.

Generative Adversarial Networks (GANs) are a machine learning framework designed by Ian Goodfellow with the main focus of learning from data and creating new, unseen data from the trained model. GANs use two different neural networks in order to produce their outputs. The first network is the Generator Model and the second is the Discriminator Model. Having these two models work simultaneously has given GANs a certain edge over other frameworks in filtering out fake data from the whole dataset. In order to make a GAN effective, we have to find a balance between both models so that the second model is not masking the output of the first. GANs are nowadays used very widely for computer vision-based tasks for accurate predictions and better results. Applications to which GANs extend include generating photographs of human faces, super resolution, 3D object generation, etc.
2. MODELS IN GAN’s
2.1. GENERATIVE MODELS
Generative models have gained a lot of popularity in recent years. A generative model is used to create fresh new instances and pass them forward. A generative model can create new videos, photos, or other data from noise, and the generated data can also be used to fill in or predict missing data. Frameworks which use generative models are described below. Generative models study the joint probability P(x, y).
2.1.1. Bayesian Network
A Bayesian network is a generative probabilistic model which we can use effectively to represent random variables. The model has two main parts: the structure and the parameters. The structure is a directed acyclic graph, and the parameters consist of the probabilities within each node. Based on these two parts, the model computes final probabilities and generates outputs.
2.1.2. Gaussian Model
This model assumes that all data points are drawn from a finite mixture of Gaussian distributions. These points are then used to predict outcomes in applications such as biometric systems. The two main parts of this model are the data points and the mixture probabilities. A real-life application of this model is clustering the Iris dataset.
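As an illustration, the following sketch fits a Gaussian mixture model to the Iris measurements. The use of scikit-learn's GaussianMixture and the choice of three components (one per species) are assumptions made for this example, not details taken from the paper.

# Hypothetical sketch: clustering the Iris data with a Gaussian mixture model.
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, y = load_iris(return_X_y=True)         # 150 flower measurements, 3 species

# Assume three Gaussian components, one per species (a modelling choice).
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(X)                                # learn means, covariances and mixing weights

clusters = gmm.predict(X)                 # hard cluster assignment per sample
probabilities = gmm.predict_proba(X)      # soft, generative membership probabilities
print(clusters[:5], probabilities[0])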
2.2. Discriminative Model
Discriminative models use techniques such as logistic regression to discriminate between multiple categories. The main task is to train a model such that it can categorize the dataset by the features it receives. These models are heavily used in statistical classification in supervised learning and are also known as conditional models. Examples include support vector machines, decision trees, and random forests. A discriminative model studies the conditional probability P(y|x), i.e. it predicts the probability of the target y when given the input x.
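In symbols, the distinction between the two families can be summarized as follows (a standard formulation, added here for clarity):

Generative:      learn P(x, y) = P(x \mid y)\,P(y), from which the conditional follows by Bayes' rule:
                 P(y \mid x) = \dfrac{P(x, y)}{\sum_{y'} P(x, y')}
Discriminative:  learn P(y \mid x) directly, without modelling how x itself is generated.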
3. GAN IS A TWO-PLAYER GAME
GAN networks stand out from other generative networks because they depend on two neural networks working against each other to produce outputs. The first network is the generative model: it takes random noise and produces large amounts of synthetic data, which can be anything from images and videos to voice. This data, together with real samples, is then used to train the second neural network, the discriminator model. The discriminator model is trained to classify the data it receives into the correct slots: it should be able to tell whether a given sample is real or fake. We have to find a balance between the generator and the discriminator such that the data is successfully classified. A classic example of GANs is the following: assume we have an art thief whose job is to replicate original artwork, acting as the generator model in our framework, and an art inspector who can differentiate between real and fake copies of artwork, acting as the discriminator model. The basic idea behind GANs is the same: two models work together, where one produces real and fake-looking data and the other classifies the data into categories. The larger the dataset, the better the discriminator model keeps growing.

During the training process, weights and biases are adjusted through backpropagation until the discriminator learns to distinguish between real and fake images. The generator receives this feedback from the discriminator and uses it to produce more realistic images. The discriminator model is a convolutional neural network, while the generator is a deconvolutional network.
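Formally, this two-player game is expressed by the minimax objective of Goodfellow et al. [1], where G is the generator, D the discriminator, p_data the real-data distribution, and p_z the noise prior:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

The discriminator maximizes this value by scoring real samples high and generated samples low, while the generator minimizes it by producing samples the discriminator scores as real.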
4. DIFFERENT TYPES OF GAN’S
4.1. Basic GAN’s
This framework has two neural networks, the generator and the discriminator. The generator produces fake images and feeds them, along with real images from the dataset, to the discriminator for classification and further training, while the discriminator uses those images to classify them as real or fake, making the model stronger and more efficient.
4.2. Deep Convolutional GAN’s (DCGAN)
Convolutional networks can take an image as input, extract specific important features from it, and differentiate between them. Rather than applying hand-coded filters to the image, convolutional networks learn those filters automatically and filter out the important aspects of the image. DCGANs are the image-oriented version of GANs. DCGANs use convolutional layers: max pooling is replaced with strided convolutions, transposed convolutions are used for upsampling, ReLU is used in the generator model, and Leaky ReLU in the discriminator model.
4.3. Conditional GAN’s
CGAN is an update over the basic GAN. In a basic GAN, the generator produces images sampled from the learned distribution with no control over their content, which results in very random images that may have no relation to our application. We can make the image generation conditional by supplying a class label, so that a specific type of image is generated.
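A minimal sketch of how this conditioning is commonly implemented in PyTorch, by embedding the class label and concatenating it with the noise vector before the generator; the layer sizes and names here are illustrative assumptions, not taken from the paper.

import torch
import torch.nn as nn

class ConditionalGeneratorInput(nn.Module):
    # Builds the generator input for a CGAN by joining noise with a class label.
    def __init__(self, n_classes=10, latent_dim=100, embed_dim=50):
        super().__init__()
        self.label_embedding = nn.Embedding(n_classes, embed_dim)

    def forward(self, noise, labels):
        # noise: (batch, latent_dim); labels: (batch,) integer class ids
        label_vec = self.label_embedding(labels)       # (batch, embed_dim)
        return torch.cat([noise, label_vec], dim=1)    # (batch, latent_dim + embed_dim)

# The concatenated vector is fed to the generator, so the same class label can
# later be supplied at inference time to request a specific type of image.
z = torch.randn(8, 100)
y = torch.randint(0, 10, (8,))
g_input = ConditionalGeneratorInput()(z, y)            # shape: (8, 150)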
4.4. Stack GAN
The purpose of GANs is to generate output images by learning from images in the dataset. This idea was extended by applying text-to-image conversion. In StackGAN, the user provides a text description of the required output image. The StackGAN then works in two stages. The first stage applies conditional augmentation to the text parameters and generates 64x64 images by upsampling in the generator; these images are sent to the discriminator after downsampling. The result goes to the second stage, where it is passed through residual blocks, resulting in a final generator output of 256x256 images; finally, the last discriminator downsamples it again to obtain the final image output.
Applications of StackGAN include comic creation, art creation, movie creation, high-quality image generation, etc.
4.5. Info GAN
This is an improvement on GANs and also a contrast to CGANs, where labels are used to filter the dataset. InfoGAN is an unsupervised learning technique. InfoGAN brings information theory from statistics into the GAN framework. Information theory suggests that an unlikely event carries more information than a likely event. InfoGAN is able to learn disentangled representations of images in an unsupervised manner. This model is used when the dataset is not labelled and is highly complex.
4.6. Cycle GAN’s
Image-to-image translation is generating a newly synthesised version of a provided image with modified specifications, e.g. turning a horse image into a zebra image. To work well, such models normally need a large dataset of paired examples, which is expensive and difficult to prepare. To overcome this problem, we use CycleGAN, which allows automatic training of image-to-image translation models without needing paired examples. The model receives images from a source domain and a target domain which do not need to be related in any way. Individual images are selected from these two domains and the required features are extracted from them. These features are then combined to produce the translated image.
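The ingredient that removes the need for paired examples is the cycle-consistency loss, which penalizes an image that does not return to itself after being translated to the other domain and back (the standard CycleGAN formulation, with G: X -> Y and F: Y -> X):

\mathcal{L}_{\mathrm{cyc}}(G, F) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]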
4.7. Discover Cross-Domain Relations with GAN (DiscoGAN)
It is easy for humans to recognize relations between things from different domains without the help of external supervision. This is a difficult task for machines, which cannot automatically relate two things. DiscoGAN is very closely related to CycleGAN. DiscoGAN uses two GAN networks inside itself, each mapping one domain to its counterpart domain. A reconstruction loss is used to verify how well an image is translated from Domain 1 to Domain 2 and vice versa. Each domain in DiscoGAN has its own separate reconstruction loss, which makes a huge impact on cross-domain reconstruction.
5. IMPLEMENTING GAN’S
5.1. Variable Settings
As GAN’s are highly used in the application of image
generation we will be using it in a DCGAN. Firstly, we will set
the parameters needed for the GAN such as (dataset root,
threads to load data, batch size, image size, colour channels
to use, length of vector, epoch numbers, setting learning
rate). Next is to set weight. Theweight_initfunctiontakes the
previous model as input and batch normalization,
convolutional-transpose.
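A minimal sketch of this setup, modelled on the PyTorch DCGAN tutorial [4]; the concrete values (batch size 128, 64x64 images, a 100-dimensional latent vector, learning rate 0.0002) are the tutorial's defaults and are assumptions for illustration only.

import torch.nn as nn

# Illustrative hyperparameters (assumed defaults; adjust for your dataset).
dataroot = "data/images"     # dataset root folder (hypothetical path)
workers = 2                  # data-loading threads
batch_size = 128
image_size = 64              # images are resized to 64x64
nc = 3                       # colour channels
nz = 100                     # length of the latent (noise) vector
ngf, ndf = 64, 64            # feature-map sizes in generator / discriminator
num_epochs = 5
lr = 0.0002
beta1 = 0.5                  # Adam beta1

def weights_init(m):
    # Reinitialize conv, conv-transpose and batch-norm layers as in the DCGAN paper.
    classname = m.__class__.__name__
    if classname.find("Conv") != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find("BatchNorm") != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.constant_(m.bias.data, 0)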
5.2. Generator Setup
The generator is designed to map the latent space vector to data space. Since our data consists of images, this means producing an RGB image with the same dimensions as the training images. The output of the generator is passed through a tanh function to normalize it to the output range [-1, 1].
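A compact generator along these lines, again a sketch following [4]; nz, ngf, and nc are the latent size, feature-map size, and channel count assumed above.

import torch.nn as nn

class Generator(nn.Module):
    # Maps a latent vector of shape (nz, 1, 1) to a 64x64 RGB image in [-1, 1].
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),                    # 4x4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),                    # 8x8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),                    # 16x16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),                        # 32x32
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),                                                 # 64x64, range [-1, 1]
        )

    def forward(self, z):
        return self.main(z)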
5.3. Discriminator Setup
The discriminator is a binary classifier network that takes an image as input and outputs a scalar probability telling us whether the input image is real or fake. The input image, whether real or produced by the generator, passes through Conv2d, BatchNorm2d, and Leaky ReLU layers, and the final output is produced through a sigmoid function.
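A matching discriminator sketch, also following [4]: strided convolutions downsample the image, Leaky ReLU is used throughout, and a final sigmoid gives the real-vs-fake probability.

import torch.nn as nn

class Discriminator(nn.Module):
    # Maps a 64x64 RGB image to a scalar probability that it is real.
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),                              # 32x32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),     # 16x16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, inplace=True),     # 8x8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, inplace=True),     # 4x4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),                                                 # scalar probability
        )

    def forward(self, img):
        return self.main(img).view(-1)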
5.4. Loss Functions and Optimizers
Once the generator and discriminator are set up, we can specify how they learn from their losses. We use the Binary Cross Entropy loss function from PyTorch and then optimize the generator and discriminator using the Adam optimizer.
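In code this step is short; the sketch below assumes the Generator and Discriminator classes defined above and the tutorial's default Adam settings [4].

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
netG = Generator().to(device)
netD = Discriminator().to(device)

criterion = nn.BCELoss()                  # binary cross-entropy loss
real_label, fake_label = 1.0, 0.0

# A fixed noise batch lets us watch the generator improve on the same inputs.
fixed_noise = torch.randn(64, 100, 1, 1, device=device)

optimizerD = optim.Adam(netD.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=0.0002, betas=(0.5, 0.999))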
5.5. Training GAN
After both neural nets have been set up, we need to train the model in two parts. Training a GAN is tricky because it can fail due to poorly chosen parameters without giving much of an explanation for the collapse.
5.5.1. Training the discriminator
The goal of the discriminator is to classify the given input as real or fake with high confidence. First, we take a batch of real samples from the training dataset, pass it through the discriminator, and calculate the gradients in a backward pass. Then we do the same with a batch of fake samples produced by the generator. The gradients accumulated from both the real and the fake batches are then used to take a step with the discriminator optimizer.
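One discriminator update can be sketched as follows, assuming the objects defined above and a dataloader that yields batches of real images (real_batch is a hypothetical name for one such batch).

# --- Discriminator step: maximize log(D(x)) + log(1 - D(G(z))) ---
netD.zero_grad()

# (a) Real batch: label as real, compute loss, backpropagate.
real = real_batch.to(device)
b_size = real.size(0)
label = torch.full((b_size,), real_label, device=device)
errD_real = criterion(netD(real), label)
errD_real.backward()

# (b) Fake batch: generate images, label as fake, backpropagate.
noise = torch.randn(b_size, 100, 1, 1, device=device)
fake = netG(noise)
label.fill_(fake_label)
errD_fake = criterion(netD(fake.detach()), label)   # detach so G is not updated here
errD_fake.backward()

# (c) Step the optimizer using gradients accumulated from both batches.
optimizerD.step()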
5.5.2. Training the generator
The job of the generator is to produce high-quality fakes. We achieve this with the help of the gradients received through the discriminator: the generator's fake images are scored by the discriminator and the resulting loss is backpropagated into the generator.
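The matching generator update, continuing the same sketch: the fake images are labelled as real so that the loss, and the gradients flowing back through the discriminator, push the generator toward more convincing fakes.

# --- Generator step: maximize log(D(G(z))) ---
netG.zero_grad()
label.fill_(real_label)            # fakes are labelled "real" for the generator's loss
output = netD(fake)                # re-score the fakes with the freshly updated D
errG = criterion(output, label)
errG.backward()                    # gradients flow through D into G
optimizerG.step()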
5.6. Results
Below is the graph produced by implementing a DCGAN for generating random images from the dataset we provide. We track two different outputs: the discriminator and generator losses, and the generator's output on a fixed noise vector. In the graph we can see that the generator loss keeps decreasing as the number of iterations increases. The lower the loss, the closer the model is to generating realistic images.
6. HISTORY OF GAN’S
Generative Adversarial Networks is a framework developed by Ian Goodfellow in the final year of his PhD in 2014. The idea was extended in 2017 with an enhanced focus on realistic texture rather than pixel accuracy, which helped in generating higher-quality images even at high levels of magnification. In 2017, a GAN generated the first virtual human face, which was displayed in 2018 at the Grand Palais. An innovative solution named the "Creative Adversarial Network", able to generate appealing high-quality abstract paintings, was developed and sold in 2018. In 2019, a GAN produced the first talking video of a human by generating each and every frame on its own, given only a single photo of that person. In 2020, Nvidia taught an AI system (GameGAN) to recreate the complete game of PAC-MAN by simply watching it being played. GANs have developed exponentially in a short time due to the vast range of applications they can be applied to and the idea of combining multiple neural nets in one framework, which gave them an edge, especially in the computer vision field, in generating images efficiently.
7. Why Generative Adversarial Networks
In the 21st century, machine learning has influenced many areas of science, commerce, and the arts. All the way from diagnosing skin diseases and generating abstract art to enhancing credit systems, we are able to apply machine learning algorithms everywhere. One of the most challenging problems is that existing neural network algorithms can be fooled by adding noise data to the original datasets. To tackle the problem of deepfake data we use networks such as GANs. Having two neural nets inside the framework helps to filter noise out of the dataset, easing the problem of deepfakes.
8. APPLICATIONS OF GAN’S
8.1. Image Processing
8.1.1. Image Dataset Generations
The very first use of GANs was generating datasets of images from the available sample data, such as adding glasses to the image of a face that has no glasses. We can generate many such types of datasets using GANs.
8.1.2. Super Resolution
GANs have been used to generate images with a much higher resolution than the original image so that they do not lose detail under magnification.
8.1.3. Face Generations
Generating human face images and videos by learning from the available dataset was one of the milestones achieved with the help of GANs.
8.1.4. Text to image translation
This useful functionality is now possible with the help of GANs. We can input a line of text and the GAN network will generate an image based on the requirement entered by the user.
8.2. Speech Generation
8.2.1. Music Generation
GANs are able to generate melodious audio on their own by learning what kind of music humans like. By analysing data from many music libraries, the network is trained and generates audio guided by the discriminator.
8.2.2. Speech Generation
Complete speech generation from scratch has also been achieved with the help of GANs. We can feed a topic to the network as a label and the model returns a complete speech in return.
9. ADVANTAGES AND DISADVANTAGES OF GANS
9.1. Advantages of GAN’s
A. GANs can generate high volumes of unseen data in the form of images, audio, video, and text.
B. They do not need any kind of labelled data for generation or for the working of the network; they fall under unsupervised learning.
C. GANs are able to learn highly complicated datasets with various data distributions.
9.2. Disadvantages of GAN’s
A. During the backward flow of gradients from the final layer to the first, the gradients keep getting smaller. At times they are so small that the early layers learn very slowly or even stop learning (the vanishing gradient problem).
B. The model can collapse due to badly chosen hyperparameters and does not provide the developer with much of an explanation for solving the problem.
10. Conclusion
This paper presents an extensive view of GANs: the types of GANs, their implementation, history, advantages and disadvantages, and their applications. I believe this paper will help the reader get an in-depth understanding of GANs and their working.
REFERENCES
[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, "Generative Adversarial Networks"
[2] Mehdi Mirza, "Conditional Generative Adversarial Networks"
[3] Alec Radford and Luke Metz, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks"
[4] https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html
[5] https://machinelearningmastery.com/what-are-generative-adversarial-networks-gans/