Binarization of Degraded Document Images Based on Hierarchical Deep Supervised Network
Quang Nhat Vo, Soo Hyung Kim, Hyung Jeong Yang, and Gueesang Lee
Pattern Recognition 74 (2018) 568–586
Presented by:
Tarik Reza Toha
#1017052013
Outline
• Problem definition
– What is the research problem?
• Motivation
– Why have the authors done the research?
• Solution approach
– How have the authors solved the problem?
– Be detailed on this.
• Subsequent advancements
– What are the subsequent research studies, and how have they further advanced the solution of the problem?
Digital Archiving
• Historical documents are valuable cultural heritage that needs to be protected and preserved
• Automatic analysis of historical-document images involves:
– Layout analysis
– Text-line and word segmentation
– Optical character recognition (OCR)
Binary Degraded Document Images
• A binary image representation is preferred for document analysis
– Each pixel is labeled as “text” (1) or “background” (0)
• Binarization of degraded document images is complicated by
– Non-uniform intensity
– Complex backgrounds
– Bleed-through
• Existing solutions use unsupervised approaches and low-level features
– These make it difficult to differentiate text from non-text components
Existing Binarization Methods
• Global binarization algorithms
– A single labeling rule is extracted and applied to the entire document image
• Otsu's method computes a global threshold that
– minimizes the within-class variance
– equivalently, maximizes the between-class variance
• Clustering-based approaches separate the text by learning unsupervised models
• These methods work well with simple backgrounds and uniform intensity
• They cannot be used for degraded documents
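As an illustration of the global thresholding described above, here is a minimal NumPy sketch of Otsu's method. It assumes an 8-bit grayscale image with dark text on a lighter background; the function names `otsu_threshold` and `binarize_global` are hypothetical and are not taken from the slides or the paper.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the 8-bit threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega0 = np.cumsum(prob)            # weight of the class with intensity <= t
    mu_cum = np.cumsum(prob * levels)   # cumulative first moment up to t
    mu_total = mu_cum[-1]               # mean intensity of the whole image
    denom = omega0 * (1.0 - omega0)

    # Between-class variance for every candidate t; thresholds that leave
    # one class empty are skipped (their variance stays at zero).
    sigma_b = np.zeros(256)
    valid = denom > 1e-12
    sigma_b[valid] = (mu_total * omega0[valid] - mu_cum[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def binarize_global(gray: np.ndarray, threshold: int) -> np.ndarray:
    """Label pixels as text (1) or background (0), assuming dark text."""
    return (gray <= threshold).astype(np.uint8)
```

Because one threshold is applied to the whole page, a sketch like this is only expected to behave well when the background is simple and the illumination is uniform, which matches the limitation stated on the slide.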
Existing Binarization Methods (contd.)
• Local binarization algorithms
– Classify each pixel based on its neighborhood information
• Image binarization can be treated as a classification problem
– Unsupervised classification algorithms
– Supervised learning-based approaches (parameter-free; no need for pre- or post-processing)
– Deep neural network-based approaches
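To make the idea of neighborhood-based prediction concrete, the following is a small sketch of a Sauvola-style local threshold, one well-known local binarization rule. The slides do not say which local methods were compared, so this choice is an assumption; the window size, `k`, and `r` values are illustrative defaults, and the function names are hypothetical.

```python
import numpy as np


def local_mean_std(gray: np.ndarray, window: int):
    """Mean and standard deviation over a window x window neighborhood."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    h, w = gray.shape
    g = np.pad(gray.astype(np.float64), pad, mode="edge")

    def integral(a):
        # Integral image with a leading row and column of zeros.
        return np.pad(a, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

    def box_sum(s):
        # Sum over each window, read from four corners of the integral image.
        return (s[window:window + h, window:window + w]
                - s[:h, window:window + w]
                - s[window:window + h, :w]
                + s[:h, :w])

    n = float(window * window)
    mean = box_sum(integral(g)) / n
    var = box_sum(integral(g * g)) / n - mean ** 2
    return mean, np.sqrt(np.clip(var, 0.0, None))


def sauvola_binarize(gray, window=15, k=0.2, r=128.0):
    """Per-pixel threshold T = m * (1 + k * (s / r - 1)); dark text -> 1."""
    mean, std = local_mean_std(gray, window)
    threshold = mean * (1.0 + k * (std / r - 1.0))
    return (gray <= threshold).astype(np.uint8)
```

Each pixel gets its own threshold from the local mean and contrast, so text on a slowly varying background can be separated where a single global threshold would fail. Such rules can still misfire at sharp background transitions or on bleed-through.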
Existing Binarization Methods (contd.)
Noise and disconnected strokes still remain
Howe’s method vs. Vo’s method on the DIBCO 2011 dataset
Main Contribution
• Hierarchical deep supervised network (DSN)
– Learns features at different levels from the image data itself to separate foreground from background in degraded document images
• The DSN extends the traditional convolutional neural network (CNN) to extract features at different levels
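Deep supervision is commonly implemented by attaching auxiliary (side) outputs to intermediate layers and adding their losses to the loss of the final output. The sketch below shows that general pattern in NumPy. It is not the paper's exact objective: the use of binary cross-entropy, the equal default side weights, and the separate fused output are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy between a probability map and labels."""
    prob = np.clip(prob, eps, 1.0 - eps)
    loss = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float(np.mean(loss))


def deeply_supervised_loss(side_probs, fused_prob, target, side_weights=None):
    """Weighted sum of side-output losses plus the loss of the fused output.

    side_probs:   list of probability maps, one per side output (assumed
                  already resized to the shape of `target`)
    fused_prob:   probability map of the final (fused) prediction
    side_weights: hypothetical per-side weights; defaults to 1.0 each
    """
    if side_weights is None:
        side_weights = [1.0] * len(side_probs)
    side_term = sum(
        w * binary_cross_entropy(p, target)
        for w, p in zip(side_weights, side_probs)
    )
    return side_term + binary_cross_entropy(fused_prob, target)
```

Because every side output receives its own error signal, intermediate layers are trained directly rather than only through gradients from the final layer.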
Proposed Architecture
Demonstration of the DSN model for dense prediction
Proposed Architecture (contd.)
Diagram of the proposed DSN-based document binarization model
Global Threshold
Global Prediction
Experimental Evaluation
Samples of the generated training image patches and ground-truth binary maps
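One simple way to generate aligned training patches and ground-truth maps is a sliding window over each image and its binary map. The sketch below assumes this approach; the slides do not state the patch size, stride, or sampling strategy actually used, so `patch_size`, `stride`, and the function name `extract_patches` are hypothetical.

```python
import numpy as np


def extract_patches(image, ground_truth, patch_size=64, stride=32):
    """Cut aligned (image patch, ground-truth patch) pairs with a sliding window."""
    if image.shape[:2] != ground_truth.shape[:2]:
        raise ValueError("image and ground truth must have the same height and width")
    h, w = image.shape[:2]
    pairs = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            rows = slice(top, top + patch_size)
            cols = slice(left, left + patch_size)
            # The same window is applied to both arrays so labels stay aligned.
            pairs.append((image[rows, cols], ground_truth[rows, cols]))
    return pairs
```

Overlapping windows (a stride smaller than the patch size) yield more training samples from the same page, at the cost of correlated patches.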
DIBCO 2011 Dataset
DIBCO 2013 Dataset
DIBCO 2013 Dataset (contd.)
DIBCO 2014 Dataset
H-DIBCO 2016 Dataset
Network Structure Analysis
Other Types of Documents
Korean historical document image
Chinese historical document image
Failure Analysis
Conclusion
• The binarization of degraded document images is a challenging problem in document analysis
– The DSN is a hierarchical deep supervised network architecture that incorporates side layers to improve training convergence
• Future work
– Handle weak information
– Adapt the method to music scores and paychecks
– Reduce the number of convolutional layers
Thank you
Questions are welcome!