Real-Time OCR using Tesseract
12BCE094
SHOBHIT CHITTORA
Brief History of Tesseract
• Open-source OCR engine, sponsored by Google since 2006.
• One of the most accurate open-source OCR engines currently available.
• Originally developed by HP between 1985 and 1994.
• Much of it is written in C and C++.
TessOCR Architecture
Adaptive Thresholding is Essential
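As a rough illustration of the idea (a generic local-mean method, not Tesseract's actual implementation), the sketch below binarizes a grayscale image by comparing each pixel with the mean of its own neighbourhood, so uneven lighting does not need a single global cut-off. The function name `adaptive_threshold`, the window `radius`, and the `offset` are assumptions chosen for this example.

```python
def adaptive_threshold(gray, radius=1, offset=10):
    """Binarize a 2-D list of grayscale values (0-255) using a local mean.

    A pixel becomes 1 (ink) when it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood minus `offset`; otherwise 0.
    """
    height, width = len(gray), len(gray[0])
    # Integral image with a zero border: integral[y][x] = sum of gray[:y][:x].
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    binary = [[0] * width for _ in range(height)]
    for y in range(height):
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            area = (y1 - y0) * (x1 - x0)
            # Sum of the window, read from the integral image in O(1).
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Compare pixel * area with (mean - offset) * area to stay in integers.
            if gray[y][x] * area < total - offset * area:
                binary[y][x] = 1
    return binary
```

The integral image keeps the cost per pixel constant regardless of window size, which matters on the memory- and CPU-limited devices this deck targets.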
Baselines are rarely perfectly straight
Spaces between words are tricky too
• Italics, digits, and punctuation all create special-case, font-dependent spacing.
• Fully justified text in narrow columns can have vastly varying spacing on different lines.
Tesseract Word Recognizer
Outline Approximation
• Polygonal approximation is a double-edged sword.
• It removes noise, but some pertinent information is lost along with it.
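The trade-off can be seen with any polyline simplifier. The sketch below uses the standard Ramer-Douglas-Peucker algorithm as a stand-in; it is not claimed to be the approximation Tesseract itself uses, and the function names and the sample outline are made up for illustration.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the chord joining the two endpoints.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    # Everything within tolerance of the chord is replaced by the chord.
    if farthest <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# Hypothetical outline: a 0.1 noise bump at x=1 and a real spike at x=4.
outline = [(0, 0), (1, 0.1), (2, 0), (3, 0), (4, 2), (5, 0), (6, 0)]
print(simplify(outline, 0.5))  # noise bump removed, spike kept
print(simplify(outline, 3.0))  # tolerance too large: the spike is lost too
```

With a small tolerance only the noise bump disappears; with a large one the genuine spike goes as well, which is the "double-edged sword" in miniature.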
Why is it called Tesseract?
• Features are elements of the polygonal approximation, clustered within a character/font combination.
• Each element is described by four values: x position, y position, direction, and length (as a multiple of feature length).
Character Classifier (Features and Matching)
• The static classifier uses outline fragments as features. Broken characters are easily recognized by a small-to-large matching process in the classifier, in which small features are matched against larger prototypes. (This is slow.)
• The adaptive classifier uses the same technique.
Classifier as Histogram of Gradients
• Quantize the character area into cells.
• Compute gradients within each cell.
• The histograms of gradients map to a fixed-dimension feature vector.
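The three steps above can be sketched as follows. This is a minimal, generic histogram-of-gradients computation in pure Python, assuming a 2x2 cell grid and 4 orientation bins; the function name `hog_features` and those defaults are illustrative choices, not values taken from Tesseract.

```python
import math

def hog_features(image, cells=2, bins=4):
    """Map a 2-D grayscale patch to a vector of cells*cells*bins values."""
    height, width = len(image), len(image[0])
    features = [0.0] * (cells * cells * bins)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradients at an interior pixel.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, pi), quantized into `bins` buckets.
            angle = math.atan2(gy, gx) % math.pi
            b = min(int(angle / math.pi * bins), bins - 1)
            # Quantize the character area into a cells x cells grid.
            cy = y * cells // height
            cx = x * cells // width
            features[(cy * cells + cx) * bins + b] += magnitude
    # Normalize so that patches with different contrast stay comparable.
    total = sum(features)
    return [f / total for f in features] if total else features
```

Whatever the size of the input patch, the result always has `cells * cells * bins` entries, which is what lets a classifier compare characters of different sizes.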
Character Segmentation
• Segmentation graphs
OCR using Tesseract
Rating and Certainty
• Rating = Distance * Outline length
○ The total rating over a word (or a line, if you prefer) is normalized.
○ Transcriptions of different lengths are fairly comparable.
• Certainty = -20 * Distance
○ Measures the absolute classification confidence.
○ Acts as a surrogate for log probability and is used to decide what needs more work.
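A small worked example of the two formulas, assuming each character is given as a (distance, outline length) pair. Dividing the total rating by the total outline length, and taking the worst character certainty as the word certainty, are assumptions made for this sketch; the distances and lengths are invented numbers.

```python
def rating(distance, outline_length):
    """Rating = Distance * Outline length (lower is better)."""
    return distance * outline_length

def certainty(distance):
    """Certainty = -20 * Distance (closer to zero means more confident)."""
    return -20.0 * distance

def word_scores(blobs):
    """Combine per-character (distance, outline_length) pairs for one word.

    Returns (normalized_rating, worst_certainty). Both the normalization by
    total outline length and the use of min() are assumptions of this sketch.
    """
    total_rating = sum(rating(d, n) for d, n in blobs)
    total_length = sum(n for _, n in blobs)
    worst = min(certainty(d) for d, _ in blobs)
    return total_rating / total_length, worst

# Two hypothetical segmentations of the same ink: one blob "m" or two blobs "rn".
as_m = [(0.30, 60)]
as_rn = [(0.10, 25), (0.15, 35)]
print(word_scores(as_m))
print(word_scores(as_rn))
```

Because the rating is weighted by outline length, the one-character and two-character readings cover the same total outline and can be compared directly, while the certainty flags which reading still has a weak character.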
Tesseract Training
Implementation using Tess-two (Tesseract port for Android)
• The Tess-two library is an open-source port of the Tesseract engine for Android.
• Only the most basic and popular functionality is ported.
• Features such as deep neural nets are not ported.
• A lot of tweaking is required to produce the desired results.
DEMO
Implementing Real-Time OCR and Challenges
• Image processing on memory-limited devices is difficult.
• Clock speeds are limited for processing huge matrices.
• The camera SurfaceHolder runs on the main UI thread, while preprocessing and OCR run on user (background) threads.
• Huge Bitmaps must be maintained for preprocessing and passed to multiple threads.
• Important preprocessed data must be kept from being garbage-collected.
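The threading pattern behind these points can be sketched independently of Android. The Python example below is a language-neutral illustration, not the app's actual code: the main loop plays the role of the camera/UI thread, a single background thread does the slow work, and a bounded queue holds at most one pending frame so memory stays flat. `fake_ocr` is a hypothetical stand-in for preprocessing plus a Tesseract call.

```python
import queue
import threading

def fake_ocr(frame):
    """Hypothetical stand-in for preprocessing plus OCR on one frame."""
    return f"text-from-frame-{frame}"

def ocr_worker(frames, results):
    """Background thread: process frames until a None sentinel arrives."""
    while True:
        frame = frames.get()
        if frame is None:
            break
        results.append(fake_ocr(frame))

def run_pipeline(camera_frames):
    """Feed frames from the 'UI' thread; drop frames while the worker is busy."""
    frames = queue.Queue(maxsize=1)   # bounded: at most one pending frame
    results = []
    dropped = 0
    worker = threading.Thread(target=ocr_worker, args=(frames, results))
    worker.start()
    for frame in camera_frames:
        try:
            frames.put_nowait(frame)  # never block the camera/UI thread
        except queue.Full:
            dropped += 1              # skip this frame instead of queueing it
    frames.put(None)                  # blocking put: tell the worker to stop
    worker.join()
    return results, dropped

if __name__ == "__main__":
    texts, skipped = run_pipeline(range(100))
    print(len(texts), "frames recognized,", skipped, "frames dropped")
```

Dropping frames is a deliberate design choice here: for a live preview, recognizing the most recent frame matters more than recognizing every frame, and a frame held in the queue stays referenced until the worker has finished with it.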
Thank You
