International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 03 Issue: 01 | Jan-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 747
Text Extraction From Image Using GAMMA Correction Method.
1 Gholap Vidya, 2 Jadhav Pranali, 3 Lokhande Manasi, 4 Magare Niyati
1 Student, Computer Engineering, kkwieer, Maharashtra, India
2 Student, Computer Engineering, kkwieer, Maharashtra, India
3 Student, Computer Engineering, kkwieer, Maharashtra, India
4 Student, Computer Engineering, kkwieer, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Text extraction is the task of automatically
extracting structured information from unstructured or
semi-structured machine-readable documents. Scanning a
paper page into a computer produces only an image file, a
photo of the page. The computer cannot interpret the
letters on the page, so the text must be extracted from the
image file. Text extraction from images is one of the more
complicated areas of digital image processing. Text
present in images carries useful information, and both text
and layout information can be extracted from an image
file. Detecting and recognizing text in images is difficult
because of variations in size and gray scale and because of
complex backgrounds. Researchers have used methods such
as morphological filters, comic text extraction, connected
component labeling and mathematical morphology, which
give lower accuracy. The proposed system uses the gamma
method, which may give higher accuracy than the existing
methods. The gamma method suppresses non-text background
details by applying an appropriate gamma value, removes
non-text regions, and then extracts the text from the
image.
Key Words: Text Extraction, Text Recognition, Gamma
method, Image Processing.
1. Introduction
Images can be broadly classified into document images,
caption text images and scene text images. A document
image (Figure 1a) usually contains text. Document images
are acquired by scanning journals, printed documents,
degraded documents, handwritten historical documents,
book covers, etc. The text may appear in a virtually
unlimited number of fonts, styles, alignments, sizes,
shapes and colors. Extracting text from documents with a
complex color background is difficult because of the
complexity of the background and because the colors of the
foreground text mix with the colors of the background.
Caption text is also known as Overlay text or Cut line text.
Caption text (Figure 1b) is artificially superimposed on the
video/image at the time of editing and it usually describes
or identifies the subject of the image/video content. These
types of caption text include moving text, rotating text,
growing text, shrinking text, text of arbitrary orientation,
and text of arbitrary size.
Scene text (Figure 1c) is text that is present in the scene
when the image or video is shot and is captured along with
it by the recording device. Scene text occurs naturally as
part of the scene and contains important semantic
information. It is difficult to detect and extract because
it may appear in a virtually unlimited number of poses,
sizes, shapes and colors, and because of low resolution,
complex backgrounds, non-uniform or uneven lighting,
blurring, shadowing, complex movement and transformation,
unknown layout, and variation in font style, size,
orientation and alignment.
Because of the rapid growth of available multimedia
documents and the growing demand for them, studies in the
field of pattern recognition show great interest in
efficient text extraction, indexing and retrieval from
digital video and document images. Text characters are
difficult to detect and recognize because of variations in
size, font, style, orientation, alignment and contrast, and
because of complex colored or textured backgrounds. Many
researchers have worked intensively on text extraction
from images, and several techniques have been developed.
The proposed methods are based on morphological
operators, wavelet transforms, artificial neural networks,
skeletonization, edge detection, histogram techniques,
etc.
Fig. 1: (a) Document image, (b) Caption image, (c) Scene image
1.1 Literature overview
According to Siddhartha Brahma, text can be extracted from
an image by using shape context matching [1].
According to Ruini Cao and Chew Lim Tan, the separation of
overlapping text from graphics is a challenging problem in
document image analysis. They proposed a method for
detecting and extracting characters that touch graphics.
It is based on the observation that the constituent
strokes of characters are usually short segments compared
with those of graphics. It combines line continuation with
the line width feature to decompose and reconstruct
segments, and it significantly improved both the
percentage of correctly detected text and the accuracy of
character recognition [2].
Q. Yuan and C. L. Tan presented a method that uses edge
information to extract textual blocks from gray scale
document images. It aims at detecting textual regions in
heavily noise-infected newspaper images and separating
them from graphical regions. The algorithm traces the
feature points in different entities and then groups the
edge points of textual regions. By using line
approximation and layout categorization, it can retrieve
directionally placed text blocks. Finally, connected
component merging gathers homogeneous textual regions
together within the scope of their bounding rectangles.
They tested this method on a large group of newspaper
images with multiple page layouts, and the promising
results confirmed the effectiveness of their method [3].
Kohei Arai and Herman Tolle stated that reading digital
comics on mobile phones is now in demand. Instead of
creating new mobile comic content, adapting content from
existing digital comic web portals is valuable. They
proposed an automatic e-comic mobile content adaptation
method that creates mobile comic content from a digital
comic web portal. The adaptation is based on a comic frame
extraction method combined with additional processing to
extract comic balloons and text from a digital comic page.
They describe the method as effective and efficient for
real-time e-comic reading compared with other methods.
Their experimental results showed 100% accuracy for flat
comic frame extraction, 91.48% accuracy for non-flat comic
frame extraction, and processing about 90% faster than
the previous method [4].
Pan et al. [5] proposed a text region detector that
estimates text existence confidence and scale information
in an image pyramid, which helps to segment candidate text
components by local binarization. A conditional random
field (CRF) model with supervised parameter learning,
considering unary component properties and binary
contextual component relationships, was proposed to remove
non-text components. Finally, text components were grouped
into text lines/words with a learning-based energy
minimization method.
Yi and Tian [6] proposed an algorithm that models both
character appearance and structure to generate
representative and discriminative text descriptors.
Gayathri et al. [7] discussed in detail the various
existing schemes for extracting text from an image.
1.2 Text Extraction (Gamma Method Rules)
Rule 1: If the value of Energy >= 0.05, find an instance
in the table where the threshold value is 0.5. If more than
one instance is found, select the instance that has the
maximum value of Contrast and Energy >= 0.05. If no
instance is found, find the instance whose threshold value
is next nearest to 0.5. The gamma value of the selected
instance is the estimated gamma value.
Rule 2: If the value of Energy < 0.05 and the value of
Contrast >= 1000, find an instance in the table, for gamma
values 1 to 10, that has Energy >= 0.1, Contrast >= 1000
and a threshold value of 0.5. If more than one instance is
found, select the instance that has the maximum value of
Energy and Contrast > 1000. If no such instance is found,
find an instance with a gamma value between 0.1 and 0.9
such that
the threshold value is nearest or next nearest to 0.5. The
gamma value of the selected instance is the estimated gamma
value.
Rule 3: If the value of Energy < 0.05 and the value of
Contrast < 1000, find an instance in the table, for gamma
values 1 to 10, that has Energy >= 0.1 and the maximum
value of Contrast, where that maximum must be greater than
100. If no such instance is found, find an instance with a
gamma value between 0.1 and 1 such that the threshold value
is nearest or next nearest to 0.5. The gamma value of the
selected instance is the estimated gamma value.
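The three rules above can be read as a lookup over a table with one row per candidate gamma value. The sketch below is one possible reading, not the authors' implementation. The `Row` fields, the function names and the tolerance `tol` used to test "threshold value is 0.5" are assumptions introduced for illustration, and the paper does not say how Energy, Contrast and the threshold are computed. It is also assumed that the rule conditions are tested on values measured from the input image, and that "next nearest to 0.5" means the row whose threshold is closest to 0.5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One table entry for a candidate gamma value (hypothetical layout)."""
    gamma: float
    energy: float
    contrast: float
    threshold: float


def nearest_to_half(rows):
    """Return the row whose threshold is closest to 0.5."""
    return min(rows, key=lambda r: abs(r.threshold - 0.5))


def estimate_gamma(image_energy, image_contrast, table, tol=0.005):
    """Pick a gamma value from the table using Rules 1 to 3."""
    # Rows whose threshold counts as "0.5" within the assumed tolerance.
    at_half = [r for r in table if abs(r.threshold - 0.5) <= tol]

    if image_energy >= 0.05:
        # Rule 1: threshold 0.5, ties broken by maximum Contrast.
        hits = [r for r in at_half if r.energy >= 0.05]
        if hits:
            chosen = max(hits, key=lambda r: r.contrast)
        else:
            chosen = nearest_to_half(table)
    elif image_contrast >= 1000:
        # Rule 2: gamma 1..10, Energy >= 0.1, Contrast >= 1000, threshold 0.5.
        hits = [r for r in at_half
                if 1 <= r.gamma <= 10 and r.energy >= 0.1 and r.contrast >= 1000]
        if hits:
            chosen = max(hits, key=lambda r: r.energy)
        else:
            chosen = nearest_to_half([r for r in table if 0.1 <= r.gamma <= 0.9])
    else:
        # Rule 3: gamma 1..10, Energy >= 0.1, maximum Contrast above 100.
        hits = [r for r in table if 1 <= r.gamma <= 10 and r.energy >= 0.1]
        best = max(hits, key=lambda r: r.contrast, default=None)
        if best is not None and best.contrast > 100:
            chosen = best
        else:
            chosen = nearest_to_half([r for r in table if 0.1 <= r.gamma <= 1])
    return chosen.gamma
```

In this sketch a fallback search over an empty gamma range raises `ValueError`, so a caller would need a table that covers the gamma values 0.1 to 10 named in the rules.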
2. Experimental Results
1. Original image
2. Open image
3. Gray Scale of image
4. Binarization of image
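The stages listed above (gray scale conversion followed by binarization) can be sketched together with the gamma step as follows. This is a minimal illustration under stated assumptions, not the authors' code: the paper does not give a gamma formula, so the usual power-law mapping on intensities normalized to 0..1 is assumed, the gray conversion uses the common BT.601 luma weights, and the function names and the fixed threshold of 0.5 are hypothetical.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples in 0..255 to gray levels in 0..1."""
    return [[(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
            for row in rgb_image]


def gamma_correct(gray, gamma):
    """Apply the assumed power-law mapping out = in ** gamma to each pixel."""
    return [[p ** gamma for p in row] for row in gray]


def binarize(gray, threshold=0.5):
    """Map each pixel to 1 if it reaches the threshold, otherwise 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


# A tiny 2x2 test image: white, black, mid gray, dark gray.
image = [[(255, 255, 255), (0, 0, 0)],
         [(128, 128, 128), (64, 64, 64)]]

for gamma in (0.5, 1.0, 2.0):
    print(gamma, binarize(gamma_correct(to_gray(image), gamma)))
```

With this mapping, a gamma value above 1 darkens mid-tones so that fewer pixels survive the threshold, while a value below 1 brightens them, which is why the choice of gamma changes what the binarization step keeps.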
3. CONCLUSIONS
There are many applications of text extraction, such as
keyword-based image search, text-based image indexing and
retrieval, document analysis, vehicle license plate
detection and recognition, page segmentation, technical
paper analysis, reading street signs and name plates,
document coding, object identification, text-based video
indexing and video content analysis. The Gamma Correction
approach achieved an average precision of 78% and a recall
of 96%. The Gamma Correction method outperforms the
existing methods. We can therefore retrieve text from an
image, improve quality and accuracy, and maintain a
database of images.
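For readers unfamiliar with the two rates quoted above, the sketch below shows how precision and recall are conventionally computed from detection counts. The counts are hypothetical, chosen only so that the result lands near the reported 78% and 96%. They are not data from the paper, which does not list its counts.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from detection counts."""
    precision = true_pos / (true_pos + false_pos)  # share of detections that are text
    recall = true_pos / (true_pos + false_neg)     # share of true text that was found
    return precision, recall


# Hypothetical counts for illustration only.
p, r = precision_recall(true_pos=96, false_pos=27, false_neg=4)
print(f"precision={p:.2f} recall={r:.2f}")
```

A high recall with a lower precision, as in this illustration, means that most text is found but some non-text regions are also reported as text.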
REFERENCES
[1]. Siddhartha Brahma, “Text Extraction Using Shape
Context Matching,” COS429: Computer Vision, vol. 1, Jan.
12, 2006.
[2]. Ruini Cao and Chew Lim Tan, “Separation of overlapping
text from graphics,” vol. 29, no. 1, pp. 20-31, Jan./Feb.
2009.
[3]. Q. Yuan and C. L. Tan, “Text Extraction from Gray
Scale Document Images Using Edge Information,” Proceedings
of the Sixth International Conference on Document Analysis
and Recognition, pp. 302-306, 2001.
[4]. Kohei Arai and Herman Tolle, “Automatic E-Comic
Content Adaptation,” International Journal of Ubiquitous
Computing (IJUC), vol. 1, no. 1, pp. 1-11, 2010.
[5]. Y. F. Pan, X. Hou, and C. Liu, “A hybrid approach to
detect and localize texts in natural scene images,” IEEE
Trans. on Image Processing, vol. 20, no. 3, pp. 800-813,
2011.
[6]. Chucai Yi and Yingli Tian, “Text Extraction from Scene
Images by Character Appearance and Structure Modeling,”
Computer Vision and Image Understanding, vol. 117, no. 2,
pp. 182-194, Feb. 2013.

