image captioning project using python for diploma
142 GOVERNMENT POLYTECHNIC COLLEGE, KELAMANGALAM
REVIEW-01
DEPARTMENT OF COMPUTER ENGINEERING
IMAGE CAPTIONING USING DEEP LEARNING FOR MULTILANGUAGE
PROJECT MEMBERS
SAHANA S - 23501538
AMUDHA G - 23501506
DHIVYA B - 23501515
MONISHA Y - 23501526
SOWMIYA V - 23501545
VEDHA SHREE M - 23501555
PROJECT GUIDE:
B. SANTHIMEENA, M.E., HOD
ABSTRACT
• The major goal of this project is to produce a model that can predict the caption, or text, that should accompany an image.
• Generating captions for images is a complex and challenging task.
• The main aim is to develop a model that predicts the caption closest to the content of a given image.
INTRODUCTION
The term "image captioning" refers to describing the content of an image in words.
Image captioning is a recent research area that is currently attracting a lot of interest.
The main objective of image captioning is to generate a natural language description for the input image given to the model.
In this work, we present a model that treats captioning as a natural language generation task.
The neural network's components are taken from different architectures, namely CNN and LSTM.
EXISTING SYSTEM
Initially, it was impractical for a computer to describe an image.
Existing systems still have some problems.
Disadvantages:
• The quality of captions needs improvement.
• Captions are not creative.
• Unseen objects are handled poorly.
• Multiple objects and their relationships are difficult to deal with.
PROPOSED SYSTEM
Image Caption Generator Model (CNN-RNN model) = CNN + LSTM
A pre-trained model called Xception is used as the CNN.
CNN – extracts features from the image.
LSTM – generates a description from the extracted image features.
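The division of labour described above (the CNN encodes the image, the LSTM emits the caption one word at a time) can be illustrated with a minimal greedy decoding loop. This is only a sketch under stated assumptions, not the project's code: `next_word_scores` is a hypothetical stand-in for a trained LSTM decoder, the feature list stands in for the Xception output, and the `startseq`/`endseq` markers and `max_length` are illustrative choices.

```python
from typing import Callable, Dict, List, Sequence

START, END = "startseq", "endseq"  # assumed sequence markers


def generate_caption(
    features: Sequence[float],
    next_word_scores: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_length: int = 20,
) -> str:
    """Greedy decoding: repeatedly append the highest-scoring next word."""
    words = [START]
    for _ in range(max_length):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start marker


def toy_decoder(features, words):
    """Hypothetical stand-in for a trained LSTM: follows a fixed script."""
    script = ["a", "dog", "runs", END]
    position = len(words) - 1  # number of words generated so far
    target = script[min(position, len(script) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in set(script)}


caption = generate_caption([0.1, 0.7, 0.2], toy_decoder)
print(caption)
```

In a real system the toy decoder would be replaced by a call to the trained network, which scores every vocabulary word given the image features and the words generated so far.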
MODULES
• Data pre-processing
• VGG16 / Xception
• CNN-LSTM
Advantages:
• Efficiency.
• Generates creative captions.
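As an illustration of what the data pre-processing module might involve, the sketch below cleans raw captions, builds a word-to-id vocabulary, and splits a caption into (prefix, next word) training pairs for the LSTM. The slides do not describe these steps, so everything here is an assumption: the function names, the `startseq`/`endseq` markers, and the cleaning rules are all hypothetical.

```python
import string
from collections import Counter
from typing import Dict, List, Tuple


def clean_caption(text: str) -> str:
    """Lowercase, strip punctuation, keep alphabetic tokens, add markers."""
    table = str.maketrans("", "", string.punctuation)
    tokens = [t for t in text.lower().translate(table).split() if t.isalpha()]
    return "startseq " + " ".join(tokens) + " endseq"


def build_vocabulary(captions: Dict[str, List[str]], min_count: int = 1) -> Dict[str, int]:
    """Map each sufficiently frequent word to an id (0 is left free for padding)."""
    counts = Counter(
        word for caps in captions.values() for cap in caps for word in cap.split()
    )
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(words, start=1)}


def training_pairs(caption: str, vocab: Dict[str, int]) -> List[Tuple[List[int], int]]:
    """Split one caption into (prefix ids, next-word id) pairs."""
    ids = [vocab[w] for w in caption.split() if w in vocab]
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]


# Made-up captions for illustration only.
raw = {"img1.jpg": ["A dog runs, fast!", "Two dogs play."]}
cleaned = {k: [clean_caption(c) for c in v] for k, v in raw.items()}
vocab = build_vocabulary(cleaned)
pairs = training_pairs(cleaned["img1.jpg"][0], vocab)
print(cleaned["img1.jpg"][0])
print(len(pairs), "training pairs")
```

Each pair asks the model to predict one word from the words before it, which is the usual way a caption is turned into supervised examples for a next-word decoder.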
SYSTEM REQUIREMENT SPECIFICATION
HARDWARE REQUIREMENTS
Processor : Intel or Pentium
Speed : 2.4 GHz
Hard Disk : 1 TB HDD, 256 GB SSD
Input : Keyboard, Mouse
RAM : 8 GB
SOFTWARE REQUIREMENTS
Operating System : Windows 10 or higher
Dataset : Flickr8k
Software : Jupyter Notebook
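A small sketch of how the dataset's captions could be loaded in a notebook is given below. It assumes the tab-separated layout commonly distributed with Flickr8k, where each line reads `<image file>#<caption index>`, a tab, then the caption. That layout, the function name, and the sample lines are assumptions for illustration and should be checked against the actual download.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_caption_lines(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group captions by image file name, ignoring blank or malformed lines."""
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue
        image_ref, text = line.split("\t", 1)
        image_id = image_ref.split("#", 1)[0]  # drop the "#<index>" suffix
        captions[image_id].append(text.strip())
    return dict(captions)


# Made-up sample lines in the assumed format (not real dataset entries).
sample = [
    "example_001.jpg#0\tA child climbs a set of stairs .",
    "example_001.jpg#1\tA girl goes into a wooden building .",
    "example_002.jpg#0\tA black dog and a spotted dog are fighting .",
]
by_image = parse_caption_lines(sample)
for image_id, caps in by_image.items():
    print(image_id, len(caps))
```

With a real file, the same function can be given the open file object, for example `parse_caption_lines(open(path, encoding="utf-8"))`, where `path` is wherever the caption file was saved.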