VISVESVARAYA TECHNOLOGICAL UNIVERSITY
"Jnana Sangama", Belgaum: 590 018
H.K.E Society’s
SIR M VISVESVARAYA COLLEGE OF ENGINEERING
(Affiliated to VTU - Belagavi, Approved by AICTE, Accredited by NAAC)
Yeramarus Camp, Raichur-584135, Karnataka
2023-2024
TECHNICAL SEMINAR PRESENTATION
ON
“MULTIMODAL AI”
UNDER THE GUIDANCE
OF
DR. SHARAN KUMAR
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
POWERING
THE NEXT
CHAPTER IN
GENERATIVE AI
MULTIMODAL AI
PRESENTED
BY
B CHANDANA
3SL20EC003
CONTENTS
• Introduction
• Literature survey
• Block diagram
• Applications
• Future scope
• Benefits and challenges
• Conclusion
• References
technical seminar.pptx on multi model of AI
Introduction
• Multimodal AI is an advanced form of artificial intelligence that can
analyze and interpret multiple modes of data (such as text, images, and
audio) simultaneously, allowing it to generate more accurate and
human-like responses.
Literature survey
• The release of ChatGPT in November 2022, a conversation-focused
model that follows human instructions, further underscored the
feasibility of AGI in practical applications (Liu et al., 2023a). This
development has had a wide-ranging impact across various sectors,
including journalism (Liu et al., 2023c), education (Zhai, 2023; Liu
et al., 2023b), healthcare (Li et al., 2023; Liu et al., [n. d.]; Holmes
et al., 2023), industry (Dou et al., 2023), agriculture (Rezayi
et al., 2023), law (Bubeck et al., 2023), gaming (Bubeck et al., 2023),
and finance (Wu et al., 2023c), catalyzing a popular wave in AI (Liu
et al., 2023a, g, h).
• Rishi Bommasani, Drew A Hudson, Ehsan Adeli, Russ Altman, Simran
Arora, Sydney von Arx, Michael S Bernstein, Jeannette Bohg,
Antoine Bosselut, Emma Brunskill, et al. 2021. On the opportunities
and risks of foundation models. arXiv preprint
arXiv:2108.07258 (2021).
Sensory Inputs
Sensory inputs are the different forms of data that a multimodal AI system
collects and processes, analogous to human senses such as vision, hearing,
touch, and smell.
Data Fusion
Data fusion combines information from multiple modalities, such as text,
images, and video, to improve the accuracy and robustness of AI systems.
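One common way to combine modalities is late fusion, where each modality is scored separately and the results are merged afterwards. The sketch below is only an illustration under stated assumptions: the modality names, raw scores, and weights are hypothetical values, and a real system would obtain the scores from trained models.

```python
from math import exp

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(per_modality_scores, weights):
    """Weighted average of per-modality class probabilities.

    per_modality_scores: modality name -> list of raw class scores
    weights: modality name -> non-negative weight (hypothetical values)
    Only the modalities actually present are used, so the function
    still works when one input (for example, audio) is missing.
    """
    total_weight = sum(weights[m] for m in per_modality_scores)
    n_classes = len(next(iter(per_modality_scores.values())))
    fused = [0.0] * n_classes
    for modality, scores in per_modality_scores.items():
        probs = softmax(scores)
        for i, p in enumerate(probs):
            fused[i] += weights[modality] * p / total_weight
    return fused

# Illustrative (made-up) scores for two classes: [harmless, harmful]
scores = {"text": [2.0, 0.5], "image": [0.2, 1.5], "audio": [1.0, 1.0]}
weights = {"text": 0.5, "image": 0.3, "audio": 0.2}
fused = late_fusion(scores, weights)
print(fused)
```

Because the weights are renormalized over the modalities that are present, the same function can be called with a subset of inputs, which is one simple way the robustness mentioned above can show up in practice.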
Machine Learning Algorithms
Machine learning algorithms play a central role in multimodal AI by
analyzing and interpreting data from multiple sources such as text,
images, and audio.
Natural Language Processing
Natural Language Processing is a crucial component of Multimodal AI
technology, allowing for the analysis and understanding of human
language in combination with other modalities such as images or videos.
Computer Vision
Computer Vision is a key component of Multimodal AI technology, which
allows for the integration of visual data processing with other modes of
information to enhance overall system performance.
Applications
• Social media content moderation: Multimodal AI can be used to analyze text, images, and audio to
identify and moderate harmful content on social media platforms. For instance, it can detect hate
speech, violence, and bullying.
• Virtual assistants: Smart assistants like Google Assistant and Amazon Alexa are powered by
multimodal AI. They can understand and respond to natural language commands, both spoken and
typed.
• Healthcare imaging: In healthcare, multimodal AI can analyze medical images (X-rays, MRIs) along
with text reports and patient history data to improve diagnostics. This can lead to more accurate
diagnoses and better patient outcomes.
• Autonomous vehicles: Self-driving cars rely heavily on multimodal AI. They use a variety of sensors,
including cameras, radar, and LiDAR, to perceive their surroundings and navigate safely.
• E-commerce product recommendations: Many e-commerce websites use multimodal AI to
personalize product recommendations for customers. By considering both the product image and
description, the AI can recommend items that are more likely to interest the customer.
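The e-commerce case above can be sketched as a ranking that mixes image similarity and description similarity. This is a minimal illustration, not a production recommender: the three-number "embeddings", the catalog entries, and the `image_weight` parameter are all hypothetical, and in practice the vectors would come from trained image and text encoders.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(query, catalog, image_weight=0.5, top_k=2):
    """Rank catalog items by a weighted mix of image and text similarity."""
    scored = []
    for name, item in catalog.items():
        score = (image_weight * cosine(query["image"], item["image"])
                 + (1 - image_weight) * cosine(query["text"], item["text"]))
        scored.append((score, name))
    scored.sort(reverse=True)  # highest combined score first
    return [name for _, name in scored[:top_k]]

# Made-up embeddings: "image" captures appearance, "text" the description.
catalog = {
    "red sneakers": {"image": [0.9, 0.1, 0.0], "text": [0.8, 0.2, 0.1]},
    "blue sneakers": {"image": [0.1, 0.9, 0.0], "text": [0.7, 0.3, 0.1]},
    "red handbag": {"image": [0.8, 0.1, 0.3], "text": [0.1, 0.2, 0.9]},
}
viewed = {"image": [1.0, 0.0, 0.0], "text": [0.9, 0.1, 0.0]}

print(recommend(viewed, catalog, image_weight=1.0))  # appearance only
print(recommend(viewed, catalog, image_weight=0.0))  # description only
```

With appearance alone, the second suggestion is the item that looks similar (the red handbag); with the description alone, it is the item described similarly (the blue sneakers). Combining both signals lets the system balance the two, which is the point of using more than one modality.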
Conclusion
• The future of AI is not just about seeing or hearing; it is
about understanding. Multimodal AI can enable a new
level of human-computer interaction, with applications
that bridge communication gaps, enhance our
understanding of the world, and help us solve complex
challenges in new ways. Its potential for positive
impact extends across many fields.
References
• Rania Abdelghani, Yen-Hsiang Wang, Xingdi Yuan, Tong Wang,
Pauline Lucas, Hélène Sauzéon, and Pierre-Yves Oudeyer.
2023. GPT-3-driven pedagogical agents for training children’s
curious question-asking skills. International Journal of Artificial
Intelligence in Education 167, 3 (2023), 102887.
• Hang Bao, Wen Wang, Li Dong, Qianru Liu, Ola K. Mohammed,
Kirti Aggarwal, and Fang Wei. 2022. VLMo: Unified vision-language
pre-training with mixture-of-modality-experts. In Advances in
Neural Information Processing Systems (NeurIPS), Vol. 35.
32897–32912.
