Scene Text Detection on Images using Cellular Automata
Konstantinos Zagoris and Ioannis Pratikakis

Image Processing and Multimedia Lab,
Department of Electrical and Computer Engineering,
Democritus University of Thrace, Xanthi, Greece
kzagoris@ee.duth.gr, ipratika@ee.duth.gr
Outline
• Introduction
• State of the Art
• Disadvantages
• Architecture of the proposed method
• Canny Edge Detector
• Coordinating Logic Filters (CLF)
• Proposed Cellular Automata Text Detection Method
• Evaluation and Experimental Results
Introduction
• Textual information in images or video constitutes a very rich source of high-level semantics for retrieval and indexing.
• It can be acquired as scene text, captured by a video or photo camera as part of a scene.
• Text detection in natural scenes is still a hard problem.
• Existing approaches have a very high computational cost.
State of the Art
• Existing methods split into two categories: region-based and texture-based.
• Region-based algorithms group pixels based on common characteristics.
• Texture-based methods scan the image at different scales using a sliding window and classify text areas based on texture information.
• From another perspective, methods can be divided into heuristic-based and machine learning-based.
• Heuristic-based algorithms segment the image into small regions and then group them according to some constraints.
• Machine learning-based methods use directly
Disadvantages
• Many parameters have to be estimated experimentally, which condemns these methods to data dependency and lack of generality.
• When the background is really complex, they become computationally expensive.
• Texture-based techniques cannot satisfactorily detect text that is larger than the sliding window.
• Increasing the window size makes these methods quite costly. In addition, they still use empirical thresholds on specific features, so they lack adaptability.
Proposed Method
• Address the scene text detection problem by modeling texture in a cellular automata (CA) context.
• Replace costly image processing operations with their equivalent cellular operations.
• Eliminate most limitations, such as the empirical thresholds and heavy computational procedures.
Architecture of the proposed method
Original Image
  → Canny Edge Map
  → Cellular Automata
      Coordinating Logic Filters: Logical OR → Logical AND → Logical OR
      Majority State Rule
  → Edge Projection Filtering
  → Final Text
Coordinating Logic Filters (CLF)
• CLF execute coordinate logic operations among the pixels of the image.
• The CLF operations are similar to the morphological operations and achieve similar functionality.
• Morphological dilation corresponds to the logical OR.
• Morphological erosion corresponds to the logical AND.
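The correspondence between CLF and binary morphology can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the 3x3 neighbourhood, the zero value for pixels outside the image, and the names `clf_or` and `clf_and` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]
Offsets = Sequence[Tuple[int, int]]

# Assumed neighbourhood: the cell itself plus its 8 neighbours (3x3 window).
MOORE_3X3: Offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def neighbour_values(img: Grid, y: int, x: int, offsets: Offsets) -> List[int]:
    """Collect neighbourhood values; pixels outside the image read as 0 (assumption)."""
    h, w = len(img), len(img[0])
    values = []
    for dy, dx in offsets:
        ny, nx = y + dy, x + dx
        values.append(img[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0)
    return values


def clf_or(img: Grid, offsets: Offsets = MOORE_3X3) -> Grid:
    """Logical OR over the neighbourhood, which acts like binary dilation."""
    return [[int(any(neighbour_values(img, y, x, offsets)))
             for x in range(len(img[0]))] for y in range(len(img))]


def clf_and(img: Grid, offsets: Offsets = MOORE_3X3) -> Grid:
    """Logical AND over the neighbourhood, which acts like binary erosion."""
    return [[int(all(neighbour_values(img, y, x, offsets)))
             for x in range(len(img[0]))] for y in range(len(img))]


# A single white pixel grows into a 3x3 block under OR and shrinks back under AND.
dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1
grown = clf_or(dot)
restored = clf_and(grown)
```

Because each output pixel depends only on a fixed local window and uses only Boolean operations, the same functions can serve directly as cellular automaton transition rules.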
Canny Edge Detector
• Detects the salient image edges.
• Uses Sobel masks.
• Applies thresholding and non-maxima suppression (low threshold equal to 20, high threshold equal to 100).
• The final edge map is a binarised image with the contour pixels set to one (white) and the remaining pixels set to zero (black).
• This approach exploits the fact that text lines produce strong vertical edges that are horizontally aligned with a high density.
• gives us the opportunity to detect normal or
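The double-threshold stage of the Canny detector can be illustrated with a small hysteresis sketch. It assumes the Sobel gradient magnitude has already been computed and thinned by non-maxima suppression; the `magnitude` grid, the function name, and the use of `>=` comparisons with 8-connectivity are assumptions for the example. Only the threshold values 20 and 100 come from the slides, which do not state the gradient scale they apply to.

```python
from collections import deque
from typing import List

Grid = List[List[int]]


def hysteresis_threshold(magnitude: Grid, low: int = 20, high: int = 100) -> Grid:
    """Return a binary edge map: 1 for edge (white), 0 for background (black)."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[0] * w for _ in range(h)]
    queue = deque()

    # Pixels at or above the high threshold are accepted as strong edges.
    for y in range(h):
        for x in range(w):
            if magnitude[y][x] >= high:
                edges[y][x] = 1
                queue.append((y, x))

    # Weak pixels (at or above the low threshold) are kept only when they are
    # 8-connected to an already accepted edge pixel.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx]
                        and magnitude[ny][nx] >= low):
                    edges[ny][nx] = 1
                    queue.append((ny, nx))
    return edges


# Hypothetical one-row gradient: the weak pixels next to the strong one survive,
# the isolated weak pixel at the end does not.
edge_map = hysteresis_threshold([[0, 30, 120, 30, 0, 30]])
```

The result is the kind of binarised map the slide describes, which is what the cellular automaton takes as its initial state.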
Proposed Cellular Automata
• The proposed CA is a 2-D lattice of cells in which every pixel is represented by a cell.
• The CA grid width and height are defined by the edge image width and height.
• Each cell has two states, since the input image is binary.
• Taking advantage of the flexibility of CA, the transition rules change over time: they are applied in four consecutive steps, resulting in a four-time-step CA evolution.
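A CA whose transition rule changes at each time step can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the authors' code: the 3x3 neighbourhood, the zero boundary, and the names `step`, `evolve`, `rule_or` and `rule_and` are hypothetical.

```python
from typing import Callable, List, Sequence

Grid = List[List[int]]
Rule = Callable[[List[int]], int]  # maps neighbourhood states to the next state


def neighbourhood(grid: Grid, y: int, x: int, radius: int = 1) -> List[int]:
    """States in the window around (y, x); cells off the lattice read as 0."""
    h, w = len(grid), len(grid[0])
    return [grid[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
            for ny in range(y - radius, y + radius + 1)
            for nx in range(x - radius, x + radius + 1)]


def step(grid: Grid, rule: Rule) -> Grid:
    """One synchronous update: every cell reads the old lattice only."""
    return [[rule(neighbourhood(grid, y, x)) for x in range(len(grid[0]))]
            for y in range(len(grid))]


def evolve(grid: Grid, rules: Sequence[Rule]) -> Grid:
    """Apply one rule per time step, in order."""
    for rule in rules:
        grid = step(grid, rule)
    return grid


def rule_or(states: List[int]) -> int:
    return int(any(states))


def rule_and(states: List[int]) -> int:
    return int(all(states))


# One cell per pixel of a (hypothetical) binary edge image, two states per cell.
edge_image = [[0] * 5 for _ in range(5)]
edge_image[2][2] = 1
after_two_steps = evolve(edge_image, [rule_or, rule_and])
```

Passing a list of four rules to `evolve` would give the four-time-step evolution described above; the synchronous update is what distinguishes a CA step from an in-place image scan.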
1st Step – Logical OR
2nd Step – Logical AND
3rd Step – Logical OR
4th Step – Majority State Rule
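The exact rule definitions were on slides whose content did not survive, so the following is only a generic majority state rule under assumptions: a 3x3 neighbourhood, cells outside the lattice counted as 0, and a cell becoming 1 when more than half of its neighbourhood is 1. The function name `majority_step` is hypothetical.

```python
from typing import List

Grid = List[List[int]]


def majority_step(grid: Grid, radius: int = 1) -> Grid:
    """Set each cell to the state held by the majority of its neighbourhood."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ones = 0
            total = 0
            for ny in range(y - radius, y + radius + 1):
                for nx in range(x - radius, x + radius + 1):
                    total += 1  # off-lattice cells count as 0 (assumption)
                    if 0 <= ny < h and 0 <= nx < w:
                        ones += grid[ny][nx]
            out[y][x] = 1 if 2 * ones > total else 0
    return out


# The hole at (1, 1) is filled and the isolated cell at (3, 3) is removed.
candidate = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
smoothed = majority_step(candidate)
```

With an odd window size there are no ties, which is one reason a 3x3 window was chosen for this sketch.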
Edge Projection Filtering
• In images with high edge density, the method produces a number of false positives.
• Post-processing filtering is required in order to remove them.
• Candidate areas are filtered based on their horizontal and vertical projections.
• Areas with mean horizontal and vertical projections below a threshold are discarded.
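The projection test can be sketched as below. The slides give neither the threshold value nor whether one or both means must fall below it, so this example makes assumptions: candidate areas are axis-aligned boxes `(top, left, bottom, right)` with exclusive bottom and right, projections are raw edge-pixel counts per row and per column, and a box is kept only when both means reach the threshold. All names are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def mean_projections(edge_map: Grid, box: Box) -> Tuple[float, float]:
    """Mean horizontal (per-row) and vertical (per-column) edge counts in a box."""
    top, left, bottom, right = box
    rows = [sum(edge_map[y][left:right]) for y in range(top, bottom)]
    cols = [sum(edge_map[y][x] for y in range(top, bottom))
            for x in range(left, right)]
    return sum(rows) / len(rows), sum(cols) / len(cols)


def filter_areas(edge_map: Grid, boxes: List[Box], threshold: float) -> List[Box]:
    """Keep boxes whose mean projections both reach the threshold (assumption)."""
    kept = []
    for box in boxes:
        mean_h, mean_v = mean_projections(edge_map, box)
        if mean_h >= threshold and mean_v >= threshold:
            kept.append(box)
    return kept


# Hypothetical edge map: a dense 3x3 area on the left, a sparse area on the right.
edges = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
dense_box = (0, 0, 3, 3)
sparse_box = (0, 3, 4, 6)
kept_boxes = filter_areas(edges, [dense_box, sparse_box], threshold=1.0)
```

A practical variant might normalise the counts by box width and height so that one threshold works across area sizes; the slides do not say whether the authors did so.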
Examples
Evaluation
1. Wolf, C., Jolion, J.M.: Object count/area graphs for the evaluation of object detection and segmentation algorithms. International Journal on Document Analysis and Recognition 8(4), 280–296 (2006)
Experimental Results
• In order to showcase the advantages of the proposed method, we test it against a machine-learning, edge-based scene text detection system.
• In that baseline, the CLF are replaced with the corresponding morphological operations (dilation and opening) and the majority state rule with a Support Vector Machine (SVM) classifier.

Method                          Recall   Precision   Harmonic Mean
Proposed CA-based method        0.7942   0.7462      0.7652
Machine-learning based method   0.7134   0.5234      0.6038
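The harmonic mean of recall and precision can be checked with a short calculation. The sketch below assumes the standard formula 2PR / (P + R); the slides do not state how the column was computed. With the listed recall and precision it reproduces 0.6038 for the machine-learning baseline, while for the CA-based method it gives about 0.7695 rather than the listed 0.7652. The slides give no explanation for that small difference.

```python
def harmonic_mean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision; 0.0 when both are zero."""
    total = recall + precision
    return 2.0 * recall * precision / total if total else 0.0


# Recall and precision values copied from the results table.
ca_based = harmonic_mean(0.7942, 0.7462)
ml_based = harmonic_mean(0.7134, 0.5234)
print(f"CA-based: {ca_based:.4f}, machine-learning based: {ml_based:.4f}")
```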
Experimental Results
Mean execution time of each method over a set of 15 images on an Intel Core 2 Quad CPU Q9550 (2.83 GHz) machine.

Method                          Mean Execution Time (sec)
Proposed CA-based method        2.75
Machine-learning based method   5.96
Conclusions
• A method based on cellular automata was presented for the detection of scene text in natural images.
• Initially, the Canny edge detector is employed in order to expose the dominant edges in the image.
• Then a CA is used to calculate the candidate text areas. Its rules depend on Coordinating Logic Filters and on the majority state rule.
• A post-processing technique based on edge projection analysis is employed for images with high edge density in order to eliminate the false positives.
Thank You!