Generative AI
Faculty Name : Dr J Alamelu Mangai
Designation : Professor
Department : CSE
Subject Code & Subject Name : CSE3348 Generative AI
Students: School of Computer Science and Engineering
What is GenAI?
• Generative AI refers to a set of algorithms that can generate new
content in any medium such as image, text, audio or video.
• This generated content is similar to the content that the algorithm is
trained on.
• A prominent type of generative AI is the large language model (LLM),
which generates natural language texts based on prompts.
• GPT (Generative Pre-trained Transformer) series is a well-known
example of generative AI.
• ChatGPT is a renowned example of LLMs.
Ranjitha P-20213CSE0014 2
What is GenAI?
• GenAI:
• algorithms that generate novel content
• unlike traditional predictive ML, which analyses or acts on
existing data, they produce new content
• GenAI models can generate text, images and
other creative content that is often hard to
distinguish from human-generated content.
Generative vs. Discriminative Modeling [T2 pg. 1 – 5]
• Discriminative Modeling is like supervised learning
What is Generative modeling? [T2 pg. 1 – 5]
• A generative model describes how a data set is generated in terms of
a probabilistic model.
• By sampling this model, new data can be generated.
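As a minimal sketch of this idea, assume the data set is a handful of one-dimensional measurements and the probabilistic model is a single Gaussian. Both choices, and the numbers themselves, are illustrative assumptions rather than anything taken from the textbook. Fitting estimates the mean and standard deviation; sampling then draws new values from the fitted distribution.

```python
import random
import statistics

# Hypothetical training data: heights in cm (made up for illustration).
training_data = [158.0, 162.5, 165.0, 170.2, 171.8, 175.5, 180.1]

def fit_gaussian(observations):
    """Estimate the parameters of a 1-D Gaussian from the observations."""
    mean = statistics.mean(observations)
    std = statistics.stdev(observations)
    return mean, std

def sample(mean, std, n, seed=None):
    """Draw n new observations from the fitted Gaussian."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

mean, std = fit_gaussian(training_data)
new_observations = sample(mean, std, n=5, seed=0)
print(f"fitted mean={mean:.2f}, std={std:.2f}")
print("generated:", [round(x, 1) for x in new_observations])
```

The generated values are not copies of the training data, but they come from the same estimated distribution. Real generative models replace the single Gaussian with a far richer distribution learned by a neural network.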
What is Generative modelling?
• Any generative modeling process has:
• Training data: examples of the entity the model has to generate.
• Observation: one of the examples from the training data.
• Each observation is described by many features.
• Ex: An image of a horse has its individual pixel values as features.
• A generative model has to be probabilistic, not deterministic.
• The model should include some randomness that influences each sample
it generates.
• The model has to approximate the unknown probability distribution that
explains why some images are likely to appear in the training data and
others are not.
• If the model mimics this distribution, by sampling it can generate new
observations that look realistic.
• Discriminative modelling is done on labelled data.
• Generative modeling is usually done on unlabelled data (like
unsupervised learning).
• It can also be used to generate samples of a distinct class in the
training data.
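The points above can be sketched with a toy labelled data set. The animals, the single feature and the Gaussian-per-class model are all hypothetical choices made for illustration. The sketch fits one distribution per class, samples new observations of a chosen class, and shows that the same class models can also answer a classification question, which a discriminative model would instead learn directly from the labels.

```python
import math
import random
import statistics

# Hypothetical labelled data: one feature (body length in cm) per animal.
labelled_data = {
    "cat": [44.0, 46.5, 48.0, 50.5, 47.2],
    "dog": [70.0, 75.5, 82.0, 68.4, 79.1],
}

# Generative view: model how each class produces its observations, p(x | class).
class_models = {
    label: (statistics.mean(xs), statistics.stdev(xs))
    for label, xs in labelled_data.items()
}

def generate(label, n, seed=None):
    """Sample n new observations of the requested class."""
    mean, std = class_models[label]
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def classify(x):
    """Pick the class whose model gives x the highest density.

    Equal class priors are assumed. A discriminative model would learn
    p(class | x) directly instead of modelling each class's data.
    """
    return max(class_models, key=lambda label: gaussian_pdf(x, *class_models[label]))

print("new cats:", [round(x, 1) for x in generate("cat", 3, seed=0)])
print("49.0 cm looks like a", classify(49.0))
```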
Generative Modeling projects- Examples
• StyleGAN by NVIDIA – generates hyper-realistic images of human
faces.
• GPT by OpenAI : given a short introductory passage, the model
completes the given passage.
Generative Modeling projects – Examples[T1 pg 4-
• OpenAI :
• A US-based AI research company that promotes and develops friendly AI
applications.
• Started as a non-profit organisation in 2015.
• In 2019, it added a capped-profit subsidiary to raise investment.
• Significant achievements: the Gym library for training reinforcement learning
algorithms
• More recently: the GPT-n language models and the DALL-E generative models, which
generate images from text.
Generative models? T1 Pg.4
• Artificial Intelligence (AI) : a broad field of CS focussed on creating intelligent
agents that can reason, learn and act autonomously
• Machine Learning(ML): a subset of AI, focussed on developing algorithms that
can learn from data.
• Deep Learning(DL): uses deep neural networks with many layers, as a mechanism
of ML to learn complex patterns from data.
• Generative models are a type of ML model, that can generate new data based on
patterns learnt from the input data.
• Language Models (LMs): statistical models used to predict words in a
sequence of natural language text, e.g. the next word in "The sky is ____".
• Large Language Models (LLMs): language models that use deep learning and are
trained on massive data sets.
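A minimal sketch of the "predict the next word" idea, assuming a tiny made-up corpus and the simplest possible statistical model, a bigram model that looks only at the previous word. The function names are hypothetical. LLMs pursue the same goal with deep neural networks and much longer contexts.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; a real language model is trained on far more text.
corpus = [
    "the sky is blue",
    "the sky is clear",
    "the sky is blue today",
    "the sea is blue",
]

# Count how often each word follows each preceding word (a bigram model).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        next_word_counts[prev][nxt] += 1

def next_word_distribution(prev):
    """Return P(next word | previous word) as a dict of probabilities."""
    counts = next_word_counts[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

def sample_next(prev, seed=None):
    """Sample the next word in proportion to its estimated probability."""
    dist = next_word_distribution(prev)
    rng = random.Random(seed)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

print(next_word_distribution("is"))  # "blue" gets 0.75, "clear" gets 0.25
print("The sky is", sample_next("is", seed=0))
```

This sketch only handles words it has seen as a previous word; sampling after an unseen word would fail because there are no counts to draw from.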
• Generative models :
• a powerful type of AI that can generate new data that resembles the training
data.
• They handle different data modalities, including text, image, music and video.
• They synthesise new data rather than just making predictions/decisions.
• When real data is scarce to train an AI model, generative models can be used
to create synthetic data.
OpenAI’s generative models https://platform.openai.com/docs/models
Evolution of Generative AI
• 1948: Claude Shannon wrote a paper called “A Mathematical Theory
of Communication”. In this paper, he introduced the idea of n-grams,
a statistical model that can generate new text based on existing text.
• 1950: Alan Turing wrote a paper called “Computing Machinery and
Intelligence”. In this paper, he introduced the Turing Test, which is a
way to determine if a machine can behave intelligently like a human.
• 1952: A.L. Hodgkin and A.F. Huxley created a mathematical model
of how neurons generate and propagate electrical signals. Work on
biological neurons of this kind helped inspire artificial neural
networks, which are used in generative AI.
• 1965: Alexey Ivakhnenko and Valentin Lapa developed the first
learning algorithm for feedforward neural networks. This algorithm
enabled the networks to learn complex nonlinear functions from data.
• 1979: Kunihiko Fukushima introduced the neocognitron, a powerful
type of neural network known as a deep convolutional neural
network. It was specifically designed to identify and recognize
handwritten digits and various other patterns.
• 1986: David Rumelhart, Geoffrey Hinton, and Ronald Williams wrote a
paper called “Learning Representations by Back-propagating Errors.”
This paper popularised the backpropagation algorithm, which is
commonly used to train neural networks.
• 1997: Sepp Hochreiter and Jürgen Schmidhuber introduced the long short-term
memory (LSTM) network. It is a type of recurrent neural network that can
learn long-term relationships in sequential data.
• 2001: Yoshua Bengio and his colleagues created a neural network
called the Neural Probabilistic Language Model (NPLM). This model
can learn how words are used in natural language.
• 2014: Diederik Kingma and Max Welling introduced the variational
autoencoder (VAE). It is a type of model that can learn
representations of data and generate new data based on those
learned representations.
• 2014: Ian Goodfellow and his colleagues introduced the generative
adversarial network (GAN). It is a type of generative model that
comprises two neural networks: a generator and a discriminator. The
generator aims to generate realistic data, while the discriminator aims
to differentiate between real and fake data.
• 2015: Jascha Sohl-Dickstein and his colleagues proposed the diffusion model. It is a
generative model that learns to reverse a process that gradually
transforms data into noise.
• 2016: Aaron van den Oord and his team introduced WaveNet, a
powerful neural network that can create lifelike speech and music
waveforms.
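The diffusion entry above describes a process that gradually transforms data into noise. A minimal sketch of that forward process is given here, assuming a constant noise level per step and a short made-up signal as the data; the function name and parameter values are illustrative assumptions. The learned part of a diffusion model, the network that reverses these steps, is not shown.

```python
import math
import random

def forward_diffusion(x0, num_steps=500, beta=0.02, seed=0):
    """Gradually turn a data vector into noise.

    Each step keeps sqrt(1 - beta) of the previous value and mixes in
    Gaussian noise with variance beta. Returns the list of all states.
    """
    rng = random.Random(seed)
    keep = math.sqrt(1.0 - beta)
    noise_scale = math.sqrt(beta)
    states = [list(x0)]
    for _ in range(num_steps):
        prev = states[-1]
        states.append([keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in prev])
    return states

# Hypothetical "data": a short signal with obvious structure.
x0 = [5.0, 5.0, 5.0, 5.0, -5.0, -5.0, -5.0, -5.0]
states = forward_diffusion(x0)

for t in (0, 50, 500):
    print(t, [round(v, 2) for v in states[t]])
```

After enough steps the original structure is almost entirely replaced by noise. Generation runs the other way: start from pure noise and apply the learned reverse steps.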
• 2017: Ashish Vaswani and his team introduced the Transformer, a
neural network design that leverages attention mechanisms to learn
from sequential information, like language or speech.
• 2018: Alec Radford and his team introduced the Generative Pre-trained
Transformer (GPT), a language model that is pre-trained on large amounts of
unlabelled text and uses the Transformer architecture to generate text on many subjects.
• 2018: Jacob Devlin and his team introduced BERT, a model that learns the
meaning of words and sentences from the context on both sides of each
word. It uses the Transformer architecture to learn from large amounts of
text without needing task-specific labels.
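The attention mechanism mentioned in the Transformer entry can be sketched in a few lines. This is a simplified, single-head version in plain Python, with made-up two-dimensional vectors and without the learned projection matrices a real Transformer uses, so treat it as an illustration of the idea rather than the published architecture.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for a single sequence.

    For every query, score it against every key, normalise the scores with
    softmax, and return the weighted average of the value vectors.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Hypothetical 2-dimensional vectors for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: the same vectors serve as queries, keys and values here;
# a real Transformer first applies learned linear projections to each.
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Each output row is a blend of all token vectors, weighted by how strongly that token's query matches every key. This is how a token's representation comes to depend on the rest of the sequence.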
• 2019: Tero Karras and his team
introduced StyleGAN, an enhanced type of GAN (Generative
Adversarial Network) that can create a wide range of detailed and
realistic images, including faces, animals, landscapes, and more.
• 2020: Large Language Models Take Center Stage: OpenAI’s GPT-3
(Generative Pre-trained Transformer 3) with 175 billion parameters
pushes the boundaries of language generation, demonstrating
impressive capabilities in text creation, translation, and code writing.
• 2020: a team led by Alexei Baevski introduced wav2vec 2.0. It is a
model that can learn speech representations directly from raw audio
and achieves excellent performance in speech recognition tasks.
• 2021: Aditya Ramesh and his team created DALL-E, a powerful model
that can create lifelike images based on written descriptions.
• 2021: Focus on Control and Explainability: Researchers grapple with
the “black box” nature of large language models, seeking methods to
improve control over generated outputs and explain the reasoning
behind their creations.
• 2022: Diffusion Models Gain Traction: Diffusion models, known for
their ability to create realistic images, experience a surge in
popularity. Applications in image generation, editing, and inpainting
become prominent.
• 2023: Multimodal Generative AI Takes Shape: Models capable of
generating across different modalities, like text and image
combinations, start to emerge. This opens doors for more interactive
and immersive experiences.
• 2023: Ethical Considerations Mount: Concerns around bias,
misinformation, and potential misuse of generative AI lead to
discussions on responsible development and deployment practices.
• 2024: Focus on Real-World Integration: A growing trend towards
integrating generative AI tools into real-world applications across
various industries like customer service, product design, and
marketing.
Advantages of generative modeling
• Synthetic data generation using generative models reduces the cost of
labelling and improves the training efficiency.
• Microsoft Research trained its LLM phi-1, for basic Python coding, partly on
synthetic data produced by another generative model.
• It is a transformer with 1.3 billion parameters.
• Trained on code from The Stack, Q&A content from StackOverflow, and
synthetic code generated by GPT-3.5.
• “Textbooks Are All You Need”, June 2023:
https://www.microsoft.com/en-us/research/publication/textbooks-are-all-you-need/
Types of generative models[T1 pg 6]
• Different types of generative models for different data modalities:
1) Text-to-text :
• models that generate text from input text, like conversational agents. Ex:
Llama 2, GPT-4, Claude, PaLM 2
• A conversational agent is a program designed to converse with humans in
natural language.
• It can talk to people on phones, computers, and other devices, allowing them
to order food or do other functions through voice, text, or chat.
• It can achieve these using technologies like natural language processing (NLP),
machine learning (ML), speech recognition, text-to-speech synthesis, and
dialog management to interact with people through various mediums.
• Llama 2 is a family of pre-trained and fine-tuned
large language models (LLMs) released by Meta AI in 2023.
• Released free of charge for research and commercial use, Llama 2 AI
models are capable of a variety of
natural language processing (NLP) tasks, from text generation to
programming code.
• GPT-n by OpenAI:
• Generative Pre-trained Transformer 3
(GPT-3) is a large language model
released by OpenAI in 2020.
• It is a decoder-only transformer, a deep
neural network that replaces recurrence
and convolution-based architectures with a
technique known as "attention", and has 175
billion parameters.
2) Text-to-Image:
• Models that generate images from text captions. Ex: Dall-E 2, Stable Diffusion
and Imagen.
• Dall-E 2 : https://openai.com/index/dall-e-2/
• The original DALL·E is a 12-billion parameter version of GPT-3
trained to generate images from text descriptions, using a dataset of text–
image pairs.
3) Text-to-Audio:
• Models that generate audio clips and music from text. Ex: Jukebox, AudioLM and
MusicGen
• Jukebox is a neural network-based tool that uses artificial intelligence to
generate music.
• Developed by OpenAI, Jukebox is a neural network model capable of composing
original songs in different genres and styles.
• Jukebox combines deep learning techniques, namely a hierarchical VQ-VAE that
compresses raw audio into discrete codes and autoregressive Transformers that
model those codes, to create music that is both coherent and creative.
• The main use cases of Jukebox include music generation, song completion, and
music style transfer. It can generate new songs in the style of a given artist or
even complete a song given a short melody.
4) Text-to-video:
• Models that generate video content from text descriptions. Ex:
Phenaki and Emu Video
• Phenaki : A model for generating videos from text, with prompts that
can change over time, and videos that can be as long as multiple
minutes. https://phenaki.video/
5) Text-to-Speech: Models that synthesize speech audio from input
text. Ex: WaveNet and Tacotron
6) Speech-to-text: Models that transcribe speech to text [also called
Automatic Speech Recognition (ASR)]. Ex: Whisper and SpeechGPT
7) Image-to-text: Models that generate image captions from images.
Ex: BLIP and GPT-4V. (CLIP is related, but it scores how well a text matches
an image rather than generating captions.)
8) Image-to-image: Applications –
• data augmentation,
• Neural style transfer (NST) - manipulate digital images,
or videos, in order to adopt the appearance or visual style of
another image.
• generating a new image by combining the content of
one image with the style of another image.
• The goal of style transfer is to create an image that
preserves the content of the original image while
applying the visual style of another image.
• Inpainting: removing defects or filling in missing regions of an image
Ex: Right arm is missing in the original image
9) Text-to-code: models that generate programming code from text.
Ex: Codex and AlphaCode
10) Video-to-audio:
Models that analyse video and generate matching audio.
Ex: Soundify
11) Text-to-Math: models that generate mathematical expressions from text.
• Many other combinations of data modalities exist.
• Text is the common modality.
• OpenAI’s GPT-4V model (September 2023) takes both text and images as input and is
better at OCR, i.e. reading text from images, than earlier versions.
