CVIP-WM 2017
Aarushi Agrawal¹, Prerana Mukherjee², Siddharth Srivastava² and Brejesh Lall²
¹ Department of Electrical Engineering, Indian Institute of Technology, Kharagpur
² Department of Electrical Engineering, Indian Institute of Technology, Delhi
Objective
To develop a novel language agnostic text detection method
utilizing edge enhanced Maximally Stable Extremal Regions in
natural scenes by defining strong characterness measures.
Introduction
• Text co-occurring in images and videos serves as a warehouse of
valuable information for describing images.
• A few interesting applications are:
  • Extracting street names, numbers, and textual indications such as
  “diversion ahead”
  • Autonomous vehicles: following traffic rules based on road sign
  interpretation
  • Indexing and tagging of images
Reading such text is trivial for humans, but segregating it from
a challenging background remains a complicated task for machines.
Related Works
• Maximally Stable Extremal Regions (MSERs)
  • With Canny Edge Detector
    • MSER is applied to the image to determine regions with characters
    • Pixels outside the Canny edges are removed
  • With Graph Model
    • Apply MSER to generate blobs
    • Generate a graph model using the position, color, etc. of the blobs
    • Then define cost functions to separate foreground and background regions
• Stroke Width Transform
  • Finds the stroke width for each image pixel
  • A stroke is a contiguous part of an image that forms a band of nearly constant width
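The idea of a band of nearly constant width can be illustrated with a much simpler proxy than the full transform. The sketch below only measures horizontal run lengths of foreground pixels in a binary mask; it is not the Stroke Width Transform of Epshtein et al. [3], which traces rays along gradient directions between edge pixels. The function name and the toy mask are illustrative assumptions.

```python
def horizontal_stroke_widths(mask):
    """Return the length of every horizontal run of foreground (truthy) pixels."""
    widths = []
    for row in mask:
        run = 0
        for pixel in row:
            if pixel:
                run += 1
            elif run:
                widths.append(run)
                run = 0
        if run:  # a run that touches the right border
            widths.append(run)
    return widths


# Toy mask (hypothetical): a vertical bar that is three pixels wide.
bar = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
widths = horizontal_stroke_widths(bar)
```

Every run in this mask has the same length, which is the kind of consistency a stroke-like region is expected to show.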
Related Works
• Feature based techniques
• Histogram of Oriented Gradients
• Gabor based features
• Shape descriptors
• Fourier Transform
• Zernike moments
• Characterness
• Text specific saliency detection method
• Uses saliency cues to accentuate boundary information
Contributions
• We develop a language agnostic text identification framework
using text candidates obtained from edge based MSERs and a
combination of various characterness cues. This is followed by an
entropy assisted non-text region rejection strategy. Finally, the
blobs are refined by combining regions with similar stroke width
variance and distribution of characterness cues in the respective
regions.
• We provide a comprehensive evaluation on popular text datasets
against recent text detection techniques and show that the
proposed technique provides equivalent or better results.
Methodology
Text candidate generation using eMSERs:
• Generate an initial set of text candidates using the edge enhanced Maximally Stable Extremal Regions (eMSERs)
approach.
  • MSER is a blob detection method that extracts covariant regions.
  • It aggregates regions of similar intensity at various thresholds.
  • To handle blur, eMSERs are computed over the gradient amplitude image.
• Two sets of regions are generated: dark and bright. Dark regions have lower intensity than their
surroundings, and bright regions have higher intensity.
• Non-text regions are rejected based on geometric properties such as aspect ratio, number of pixels (to
reject noise) and skeleton length.
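The geometric rejection step can be sketched as a simple filter over candidate regions. This is a minimal illustration, assuming each region has already been summarized by its bounding-box size, pixel count and skeleton length; the `Region` type and all three thresholds are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Region:
    width: int            # bounding-box width in pixels
    height: int           # bounding-box height in pixels
    pixel_count: int      # number of pixels belonging to the region
    skeleton_length: int  # number of pixels in the region's skeleton


# Hypothetical thresholds, chosen only for this example.
MIN_PIXELS = 20
MAX_ASPECT_RATIO = 5.0
MIN_SKELETON_LENGTH = 5


def is_text_candidate(region):
    """Keep a region only if it passes all three geometric checks."""
    if region.pixel_count < MIN_PIXELS:  # too small: likely noise
        return False
    longer = max(region.width, region.height)
    shorter = max(1, min(region.width, region.height))
    if longer / shorter > MAX_ASPECT_RATIO:  # too elongated for a character
        return False
    return region.skeleton_length >= MIN_SKELETON_LENGTH


regions = [
    Region(width=12, height=20, pixel_count=150, skeleton_length=30),    # letter-like
    Region(width=3, height=2, pixel_count=5, skeleton_length=2),         # noise speck
    Region(width=200, height=8, pixel_count=1400, skeleton_length=190),  # long thin line
]
kept = [r for r in regions if is_text_candidate(r)]
```

Only the letter-like region survives here: the speck fails the pixel-count check and the thin line fails the aspect-ratio check.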
[Figure panels: Original Image; Lighter side; Darker Side]
Elimination of non-text regions:
• Text usually appears against a surrounding of distinctive intensity.
• Find the corresponding image patch, $R$, for each eMSER blob. As the patch may contain spurious data, we
obtain a binarized image patch $b_i$ using Otsu's threshold for that region, and the common region $CR_i$
between $b_i$ and $R$. Retain the blob if $b_i \cap R > 90\%$.
• Define various characterness cues:
  • Stroke width variance: for every pixel $p$ in the skeletal image of region $r$, the stroke width $SW(p)$ is measured to the boundary of the
  region. From the resulting $SW$ distribution, the following are evaluated:
  $\frac{\mathrm{var}(SW)}{\mathrm{mean}(SW)^2}$, $\frac{\max(SW) - \min(SW)}{H \times W}$, $\frac{\mathrm{mode}(SW)}{H \times W}$
  • HOG and PHOG: HOG is robust to small local geometric and photometric transformations. PHOG
  provides a spatial layout for the local shape of the image.
  • Entropy: calculated as Shannon's entropy of the common region ($b_i \cap R$), given as
  $H = -\sum_{i=0}^{N-1} p_i \log p_i$
  where $N$ is the number of gray levels and $p_i$ is the probability associated with gray level $i$.
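The stroke width cues and the entropy cue above can be computed from plain lists of numbers. The sketch below is a minimal illustration under stated assumptions: $H \times W$ is read as the height times width of the region's bounding box, the variance is the population variance, and the logarithm is taken base 2. None of these choices is specified on the slide, and the function names are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, mode, pvariance


def stroke_width_cues(stroke_widths, height, width):
    """Return var/mean^2, (max - min)/(H*W) and mode/(H*W) for one region.

    Assumption: height and width describe the region's bounding box.
    """
    area = height * width
    return (
        pvariance(stroke_widths) / mean(stroke_widths) ** 2,
        (max(stroke_widths) - min(stroke_widths)) / area,
        mode(stroke_widths) / area,
    )


def shannon_entropy(gray_values):
    """H = -sum(p_i * log2(p_i)) over the gray levels that actually occur."""
    counts = Counter(gray_values)
    total = len(gray_values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical stroke widths sampled along a skeleton, in a 10 x 10 box.
cues = stroke_width_cues([3, 3, 3, 4, 3], height=10, width=10)
two_level_entropy = shannon_entropy([0, 0, 255, 255])
```

Gray levels with zero probability contribute nothing to the sum, so iterating over the observed levels only is equivalent to summing over all $N$ levels.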
[Figure panels: initial blob; binarised image patch; selected individual letters ‘w’ and ‘n’]
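The binarize-and-compare step can be sketched in pure Python. This is an illustration rather than the authors' implementation: it assumes a flattened 8-bit patch, treats pixels above the Otsu threshold as foreground (bright text), and measures the overlap as the fraction of blob pixels that are also foreground. The function names and the toy data are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    of the two classes {value <= t} and {value > t}."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_low = 0  # number of pixels with value <= t
    sum_low = 0    # sum of those pixel values
    for t in range(levels):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0:
            continue
        if count_high == 0:
            break
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Proportional to the between-class variance.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def retain_blob(patch, blob_mask, min_overlap=0.9):
    """Keep the blob if more than min_overlap of its pixels are foreground
    in the Otsu-binarized patch (assumption: bright pixels are foreground)."""
    t = otsu_threshold(patch)
    foreground = [value > t for value in patch]
    blob_size = sum(blob_mask)
    if blob_size == 0:
        return False
    common = sum(1 for fg, in_blob in zip(foreground, blob_mask) if fg and in_blob)
    return common / blob_size > min_overlap


# Toy patch (hypothetical): four bright pixels and four dark ones.
patch = [200, 210, 205, 30, 20, 25, 220, 35]
tight_blob = [True, True, True, False, False, False, True, False]
loose_blob = [True] * 8

keep_tight = retain_blob(patch, tight_blob)
keep_loose = retain_blob(patch, loose_blob)
```

The tight blob covers exactly the bright pixels and is retained, while the loose blob also covers the dark background, so only half of it overlaps the binarized foreground and it is rejected.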
Bounding Box Refinement:
• The characterness cue distribution is defined by computing the cue values on the ICDAR 2013 dataset.
• Using this distribution together with the stroke width distribution and the stroke width difference, neighboring
candidate regions are combined and aggregated into one larger text region.
• All neighboring regions are combined into a single text candidate.
[Figure panels: smaller regions selected as individual blobs; final result after combining them]
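Combining neighboring candidates can be sketched as a greedy merge of bounding boxes. The slides do not give the exact neighborhood test, so everything below is an assumption made for illustration: text is taken to run horizontally, two boxes count as neighbors when their horizontal gap is small, they overlap vertically, and their (positive) stroke widths are similar. The `Box` type and both thresholds are hypothetical, and the characterness cue distribution is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x0: int
    y0: int
    x1: int
    y1: int
    stroke_width: float  # representative stroke width, assumed positive


# Hypothetical thresholds for this example only.
MAX_GAP = 10         # largest horizontal gap, in pixels
MAX_SW_RATIO = 1.5   # largest allowed ratio between stroke widths


def are_neighbors(a, b):
    gap = max(a.x0, b.x0) - min(a.x1, b.x1)  # negative if the boxes overlap in x
    vertical_overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    ratio = max(a.stroke_width, b.stroke_width) / min(a.stroke_width, b.stroke_width)
    return gap <= MAX_GAP and vertical_overlap > 0 and ratio <= MAX_SW_RATIO


def merge_boxes(boxes):
    """Repeatedly merge any neighboring pair until no pair qualifies."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if are_neighbors(boxes[i], boxes[j]):
                    a, b = boxes[i], boxes.pop(j)
                    boxes[i] = Box(
                        min(a.x0, b.x0), min(a.y0, b.y0),
                        max(a.x1, b.x1), max(a.y1, b.y1),
                        (a.stroke_width + b.stroke_width) / 2,
                    )
                    merged = True
                    break
            if merged:
                break
    return boxes


# Three letter-sized boxes on one line plus a distant, thicker region.
candidates = [
    Box(0, 0, 10, 20, 3.0),
    Box(14, 0, 24, 20, 3.2),
    Box(28, 1, 38, 21, 2.9),
    Box(200, 0, 260, 20, 12.0),
]
result = merge_boxes(candidates)
```

The three letter-sized boxes collapse into one region spanning all of them, while the distant box with a much larger stroke width stays separate.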
Results
Training and Testing:
Training is performed on the ICDAR 2013 dataset, while the test set consists of the MSRATD and KAIST datasets. This
setting makes the evaluation more challenging and allows the generalization ability of
various techniques to be evaluated.
[Figure: Qualitative Results]
Quantitative Results

Method                Precision   Recall   F-Measure
Proposed              0.85        0.33     0.46
Characterness [1]     0.53        0.25     0.31
Blob Detection [2]    0.8         0.47     0.55
Epshtein et al. [3]   0.25        0.25     0.25
Chen et al. [4]       0.05        0.05     0.05
TD-ICDAR [5]          0.53        0.52     0.5
Gomez et al. [6]      0.58        0.54     0.56

Method                Precision   Recall   F-Measure
Proposed              0.8485      0.3299   0.4562
Characterness         0.5299      0.2467   0.3136
Blob Detection        0.8047      0.4716   0.5547

Method                Precision   Recall   F-Measure
Proposed              0.9545      0.3556   0.4994
Characterness         0.7263      0.3209   0.4083
Blob Detection        0.9091      0.5141   0.6269

Method                Precision   Recall   F-Measure
Proposed              0.9702      0.3362   0.4838
Characterness         0.8345      0.3043   0.4053
Blob Detection        0.9218      0.4826   0.5985

Method                Precision   Recall   F-Measure
Proposed              0.9244      0.3407   0.4798
Characterness [1]     0.6969      0.2910   0.3757
Blob Detection [2]    0.8785      0.4898   0.5933
Gomez et al. [6]      0.66        0.78     0.71
Lee et al. [7]        0.69        0.60     0.64

[Table labels on the slide: KAIST - Mixed, KAIST - English, KAIST - Korean, KAIST - All, MSRATD]
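For a single evaluation, the F-measure is the harmonic mean of precision and recall. The listed F-measures are slightly lower than that harmonic mean of the listed precision and recall; for example, 2 × 0.85 × 0.33 / (0.85 + 0.33) ≈ 0.475 against a listed 0.46. One possible explanation, offered only as an assumption since the slides do not describe the protocol, is that the scores are averaged per image. The sketch below uses hypothetical counts to show how the two ways of averaging differ.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from true positive, false positive
    and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical per-image counts (tp, fp, fn); not taken from the evaluation.
per_image_counts = [(9, 1, 11), (4, 1, 16)]
scores = [precision_recall_f(*counts) for counts in per_image_counts]

mean_p = sum(s[0] for s in scores) / len(scores)
mean_r = sum(s[1] for s in scores) / len(scores)
mean_f = sum(s[2] for s in scores) / len(scores)      # average of per-image F
f_of_means = 2 * mean_p * mean_r / (mean_p + mean_r)  # F of the averaged P and R
```

With these counts the average of the per-image F-measures is below the F-measure computed from the averaged precision and recall, the same direction as the gap seen in the tables.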
Conclusion
• We proposed a language agnostic text identification scheme using
text candidates obtained from edge enhanced MSERs (eMSERs).
• Processing steps reject the non-textual blobs and
combine smaller blobs into one larger region by utilizing stronger
characterness measures.
• The effectiveness has been analyzed using precision, recall and
F-measure, showing that the proposed scheme
performs better than the traditional text detection schemes.
References
[1] Li, Yao, Wenjing Jia, Chunhua Shen, and Anton van den Hengel. "Characterness: An indicator of text in the wild." IEEE Transactions on Image Processing 23, no. 4 (2014): 1666-1677.
[2] Jahangiri, Mohammad, and Maria Petrou. "An attention model for extracting components that merit identification." In Image Processing (ICIP), 2009 16th IEEE International Conference on, pp. 965-968. IEEE, 2009.
[3] Epshtein, Boris, Eyal Ofek, and Yonatan Wexler. "Detecting text in natural scenes with stroke width transform." In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2963-2970. IEEE, 2010.
[4] Chen, Xiangrong, and Alan L. Yuille. "Detecting and reading text in natural scenes." In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, vol. 2, pp. II-II. IEEE, 2004.
[5] Yao, Cong, Xiang Bai, Wenyu Liu, Yi Ma, and Zhuowen Tu. "Detecting texts of arbitrary orientations in natural images." In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 1083-1090. IEEE, 2012.
[6] Gomez, Lluis, and Dimosthenis Karatzas. "Multi-script text extraction from natural scenes." In Document Analysis and Recognition (ICDAR), 2013 12th International Conference on, pp. 467-471. IEEE, 2013.
[7] Lee, SeongHun, Min Su Cho, Kyomin Jung, and Jin Hyung Kim. "Scene text extraction with edge constraint and text collinearity." In Pattern Recognition (ICPR), 2010 20th International Conference on, pp. 3983-3986. IEEE, 2010.
Enhanced characterness for text detection in the wild
