Video-Language Pre-training based on Transformer Models
Submitted by,
Raghava Devaraje Urs
015135653
Introduction
• Transfer learning (pre-training followed by fine-tuning) is widely used in natural
language processing and computer vision.
• In the pre-training step, the model is trained on a large amount of data.
• Once this is complete, the model is fine-tuned on a smaller dataset. This step
improves performance on the downstream tasks.
• Transformer networks have become popular in deep learning because of their
strong performance. Less built-in model bias and a network structure that is
easy to deepen make the transformer well suited to pre-training and
fine-tuning.
• These properties give transformers an advantage over Multi-Layer Perceptrons
(MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks
(RNNs).
Transformer architecture
Self-Attention
Multi-Head Attention
Position Encoding
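As an illustration of the components listed above, here is a minimal sketch of single-head scaled dot-product self-attention and a position encoding, in plain Python. It rests on several assumptions that are not taken from the slides: the query, key, and value projections are the identity (Q = K = V = x), there is only one head, and the position encoding is the fixed sinusoidal scheme from the original Transformer paper. The function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head self-attention over a sequence of vectors.

    Assumption for illustration: identity projections, so Q = K = V = x.
    Returns the attended vectors and the attention weight matrix.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in x:
        # Scaled dot product of this query with every key in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        )
    return outputs, weights


def sinusoidal_position_encoding(num_positions, dim):
    """Fixed sinusoidal encoding: sin on even indices, cos on odd indices."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(tokens)
positions = sinusoidal_position_encoding(3, 4)
```

Multi-head attention repeats the same computation with several independently learned projections and concatenates the results. This sketch leaves that out to stay short.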
Pre-training and fine-tuning
Proxy tasks
Video-Language Downstream Tasks
Proxy tasks
COMPLETION TASKS
MATCHING TASKS
ORDERING TASKS
Completion tasks
Masked Language Modeling (MLM)
Masked Frame Modeling
Masked Token Modeling
Masked Modal Modeling
Language Reconstruction
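The completion tasks above share one idea: hide part of the input and train the model to recover it. The sketch below shows a simplified version of the masking step for text tokens. It is an assumption-laden illustration, not the procedure of any specific model in these slides. Every selected token is replaced by a `[MASK]` placeholder (BERT-style training also sometimes substitutes a random token or keeps the original), and the function name `mask_tokens` and the 15% default rate are chosen for illustration. The same pattern applies to frame features in masked frame modeling.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly mask tokens and return (masked_tokens, labels).

    labels[i] holds the original token where position i was masked and
    None elsewhere, so a loss would be computed only on masked positions.
    """
    if rng is None:
        rng = random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


caption = "a person slices the tomato".split()
masked_caption, targets = mask_tokens(caption, mask_prob=0.5, rng=random.Random(1))
```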
Video-Language Matching
Sentence Ordering Modeling
Frame Ordering Modeling
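Ordering tasks can be sketched in a similar way: shuffle a sequence of frames (or sentences) and ask the model to predict the original order. The helper below only builds such a training example; it does not model anything. The names `make_ordering_example` and `restore_order` are hypothetical, and real systems typically shuffle only a subset of the positions.

```python
import random


def make_ordering_example(items, rng):
    """Shuffle items and return (shuffled, perm).

    perm[i] is the original index of the item now at position i,
    which is the target a model would be trained to predict.
    """
    perm = list(range(len(items)))
    rng.shuffle(perm)
    shuffled = [items[i] for i in perm]
    return shuffled, perm


def restore_order(shuffled, perm):
    """Undo the shuffle using the (predicted or true) original indices."""
    restored = [None] * len(shuffled)
    for position, original_index in enumerate(perm):
        restored[original_index] = shuffled[position]
    return restored


frames = ["frame_0", "frame_1", "frame_2", "frame_3"]
shuffled_frames, order = make_ordering_example(frames, random.Random(0))
recovered = restore_order(shuffled_frames, order)
```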
Video-Language Downstream Tasks
TEXT-BASED VIDEO RETRIEVAL
ACTION RECOGNITION
ACTION SEGMENTATION
ACTION STEP LOCALIZATION
VIDEO QUESTION ANSWERING
VIDEO CAPTIONING
Video-Language Datasets
LABEL BASED
CAPTION BASED
Video-language Transformer Models
Single-Stream Transformers
VideoBERT
HERO - Hierarchical Encoder for Omni representation
ClipBERT
DeCEMBERT - Dense Captions and Entropy Minimization
VLM - Video Language Model
VATT - Video-Audio-Text Transformer
Video-language Transformer Models
Multi-Stream Transformers
CBT
ActBERT
UniVL
Summary and Conclusion
Transformer block perspective
Word and Video embedding
Model training objectives
Model evaluation
