Document Representation Refinement
for Precise Region Description
Christian Clausner, Stefan Pletschacher and
Apostolos Antonacopoulos
PRImA Lab, School of Computing, Science and
Engineering, University of Salford,
United Kingdom
Document Page Regions
DATeCH 2014 2
Segmentation,
Classification
• Region (block, zone): Connected area of a
document image with content of a single
specific type
• Examples: Text, graphic, table
Region Representation
• By geometric objects
– Bounding box
– Stack of rectangles
– Polygon
• By pixels
– Bitmap
– Run-length encoding
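As a rough illustration of how these representations differ, the sketch below derives a bounding box and a row-wise run-length encoding from the same small bitmap. The function names and the `(start_x, length)` run format are assumptions made for this example; the slides do not prescribe a concrete encoding.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # 1 = region pixel, 0 = background


def bounding_box(bitmap: Bitmap) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom), inclusive, or None if the bitmap is empty."""
    coords = [(x, y) for y, row in enumerate(bitmap) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def run_length_encode(bitmap: Bitmap) -> List[List[Tuple[int, int]]]:
    """Encode each row as a list of (start_x, length) runs of region pixels."""
    encoded = []
    for row in bitmap:
        runs, start = [], None
        for x, v in enumerate(row):
            if v and start is None:
                start = x                      # a run begins
            elif not v and start is not None:
                runs.append((start, x - start))  # a run ends
                start = None
        if start is not None:                  # run touches the right border
            runs.append((start, len(row) - start))
        encoded.append(runs)
    return encoded


# An L-shaped region: the bounding box covers 12 pixels, the region only 8.
l_shape = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
print(bounding_box(l_shape))       # (0, 0, 3, 2)
print(run_length_encode(l_shape))  # [[(0, 4)], [(0, 2)], [(0, 2)]]
```

The bounding box cannot distinguish the L-shape from a full rectangle, whereas the pixel-based encoding keeps the exact shape.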
Need for Precise Region Descriptions
• Precise description is crucial for all but the most
trivial document analysis and recognition
applications
• For performance evaluation:
The loss of quality introduced
by imprecise regions can be
bigger than the variation of
accuracy of the actual
recognition method
The Situation
• Trend towards more precise descriptions, but…
• Output of state-of-the-art OCR systems:
– Stacks of rectangles (ABBYY FineReader Engine 11)
– Bounding boxes (Tesseract OCR 3.02)
• Popular formats for layout analysis and OCR results:
– ALTO XML (boxes, ellipses, polygons (region level only))
– FineReader XML (stacks of rectangles (region level only))
– PAGE XML (polygons for all levels)
– hOCR (boxes)
Refinement through Polygonal Fitting
• Applicable to regions that
have child objects in the
document model
• A typical object hierarchy
contains regions, text lines,
words and glyphs (characters)
• Idea: Tightly wrap a polygon
around the child objects
Polygonal Fitting Approach
1. Create bitmasks for the child
objects and transfer them to an
empty bitmap
2. Fill the gaps between the child
objects by a smearing approach
3. Optional: Exclude neighbour
regions
4. Trace the contour of the
foreground and create a polygon
1 – Transferring Child Objects to Bitmap
• Starting point: Polygonal object (e.g. text line,
word, or glyph)
• Lossless conversion to rectangle-based interval
representation
• Transferring the rectangles to the target bitmap
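One possible way to sketch this step is a scanline conversion: each pixel row of the polygon becomes a list of horizontal intervals (one-pixel-high rectangles), which are then painted into the target bitmap. The sampling at pixel centres, the even-odd rule and the function names are assumptions of this example, not details taken from the slides. For polygons with only horizontal and vertical edges on integer coordinates, this conversion loses no information.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Intervals = Dict[int, List[Tuple[int, int]]]  # row -> [(x_start, x_end_exclusive)]


def polygon_to_intervals(points: List[Point]) -> Intervals:
    """Convert a polygon to per-row intervals using the even-odd rule at pixel centres."""
    ys = [y for _, y in points]
    n = len(points)
    intervals: Intervals = {}
    for y in range(math.floor(min(ys)), math.ceil(max(ys))):
        yc = y + 0.5  # scanline through the pixel centres of row y
        crossings = []
        for i in range(n):
            (x1, y1), (x2, y2) = points[i], points[(i + 1) % n]
            if (y1 <= yc) != (y2 <= yc):  # edge crosses the scanline
                crossings.append(x1 + (yc - y1) * (x2 - x1) / (y2 - y1))
        crossings.sort()
        row = []
        for a, b in zip(crossings[0::2], crossings[1::2]):
            # Pixel x is inside if its centre x + 0.5 lies in [a, b).
            start, end = math.ceil(a - 0.5), math.ceil(b - 0.5)
            if end > start:
                row.append((start, end))
        if row:
            intervals[y] = row
    return intervals


def paint_intervals(bitmap: List[List[int]], intervals: Intervals) -> None:
    """Set the pixels covered by the intervals to 1, clipping to the bitmap."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    for y, row in intervals.items():
        if 0 <= y < height:
            for start, end in row:
                for x in range(max(start, 0), min(end, width)):
                    bitmap[y][x] = 1


# A word outline given as a rectangle-shaped polygon, painted into a 5x4 bitmap.
word = [(1, 1), (4, 1), (4, 3), (1, 3)]
target = [[0] * 5 for _ in range(4)]
paint_intervals(target, polygon_to_intervals(word))
```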
2 – Smearing Approach
• Goal: Connect all foreground
components in the bitmap by
filling the gaps in-between
1. Alternately fill horizontal and
vertical gaps that are smaller
than a dynamic threshold
(the threshold is increased after
each iteration)
2. If necessary, use diagonal
smearing to connect remaining
components
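The alternating smearing can be sketched as below. This is a minimal illustration under several assumptions: the start threshold, the step size, the upper limit and the use of 4-connectivity are hypothetical choices, and the diagonal smearing fallback of item 2 is left out.

```python
from typing import List

Bitmap = List[List[int]]


def smear_line(line: List[int], threshold: int) -> List[int]:
    """Fill background gaps between two foreground pixels if the gap is shorter than threshold."""
    out = list(line)
    last_fg = None
    for i, v in enumerate(line):
        if v:
            if last_fg is not None and 0 < i - last_fg - 1 < threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def smear(bitmap: Bitmap, threshold: int, horizontal: bool) -> Bitmap:
    """Apply smear_line to every row (horizontal) or every column (vertical)."""
    if horizontal:
        return [smear_line(row, threshold) for row in bitmap]
    columns = [smear_line(list(col), threshold) for col in zip(*bitmap)]
    return [list(row) for row in zip(*columns)]


def count_components(bitmap: Bitmap) -> int:
    """Count 4-connected foreground components with an iterative flood fill."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen, count = set(), 0
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] and (x, y) not in seen:
                count += 1
                seen.add((x, y))
                stack = [(x, y)]
                while stack:
                    cx, cy = stack.pop()
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if (0 <= nx < width and 0 <= ny < height
                                and bitmap[ny][nx] and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            stack.append((nx, ny))
    return count


def connect_components(bitmap: Bitmap, start: int = 2, step: int = 1,
                       max_threshold: int = 50) -> Bitmap:
    """Alternate horizontal and vertical smearing with a growing threshold
    until one component remains or max_threshold is exceeded."""
    result = [list(row) for row in bitmap]
    threshold, horizontal = start, True
    while count_components(result) > 1 and threshold <= max_threshold:
        result = smear(result, threshold, horizontal)
        horizontal = not horizontal
        threshold += step  # assumed: the threshold grows after every pass
    return result


# Two words on one text line and a short second line, as child-object masks.
children = [
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
connected = connect_components(children)
```

Starting with a small threshold keeps the result tight: narrow gaps such as line spacing are closed first, and wider gaps are only bridged when that is still needed to connect everything.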
3 – Subtraction of Neighbours
• Optional step to avoid
overlap with adjacent
regions
• Simply erase the
corresponding pixels from
the created bitmap
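A minimal sketch of the subtraction, assuming the neighbouring regions are available as bitmaps of the same size as the region bitmap (the function name is hypothetical):

```python
from typing import List

Bitmap = List[List[int]]


def subtract_neighbours(region: Bitmap, neighbours: List[Bitmap]) -> Bitmap:
    """Return a copy of region with every pixel erased that any neighbour covers."""
    return [
        [1 if v and not any(n[y][x] for n in neighbours) else 0
         for x, v in enumerate(row)]
        for y, row in enumerate(region)
    ]


smeared = [[1, 1, 1], [1, 1, 1]]
adjacent = [[0, 0, 1], [0, 0, 1]]
trimmed = subtract_neighbours(smeared, [adjacent])
```

Erasing pixels can split the region into several components, so a real implementation may need to check connectivity again before tracing the outline.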
4 – Outline Tracing
• Trace the contour of the
foreground component
in the created bitmap
• Create polygon on-the-fly by adding
points for each change of direction
(corner)
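The tracing idea can be sketched as follows. The example walks along the pixel edges that separate foreground from background, clockwise from the top-left foreground pixel, and emits a polygon point only where the direction changes. It assumes a single 4-connected foreground component and traces only the outer contour (holes are ignored); the edge-following scheme is a choice made for this illustration, not necessarily the one used by the authors.

```python
from typing import Dict, List, Set, Tuple

Bitmap = List[List[int]]
Vec = Tuple[int, int]


def trace_outline(bitmap: Bitmap) -> List[Vec]:
    """Return the corner points of the outer contour, in pixel-corner coordinates."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0

    def fg(x: int, y: int) -> bool:
        return 0 <= x < width and 0 <= y < height and bool(bitmap[y][x])

    # Directed boundary edges, clockwise around the foreground (y grows downwards).
    outgoing: Dict[Vec, Set[Vec]] = {}
    for y in range(height):
        for x in range(width):
            if not fg(x, y):
                continue
            if not fg(x, y - 1):   # top edge, heading right
                outgoing.setdefault((x, y), set()).add((1, 0))
            if not fg(x + 1, y):   # right edge, heading down
                outgoing.setdefault((x + 1, y), set()).add((0, 1))
            if not fg(x, y + 1):   # bottom edge, heading left
                outgoing.setdefault((x + 1, y + 1), set()).add((-1, 0))
            if not fg(x - 1, y):   # left edge, heading up
                outgoing.setdefault((x, y + 1), set()).add((0, -1))

    start = next(((x, y) for y in range(height) for x in range(width) if fg(x, y)), None)
    if start is None:
        return []

    polygon = [start]              # the start is always a corner
    pos, direction = start, (1, 0)
    while True:
        pos = (pos[0] + direction[0], pos[1] + direction[1])
        if pos == start:
            break
        dx, dy = direction
        # Prefer a clockwise turn, then straight on, then a counter-clockwise turn.
        # The clockwise preference keeps diagonally touching pixels separate.
        for candidate in ((-dy, dx), (dx, dy), (dy, -dx)):
            if candidate in outgoing[pos]:
                break
        if candidate != direction:
            polygon.append(pos)    # direction changes here: add a corner
            direction = candidate
    return polygon


shape = [
    [1, 1, 1],
    [1, 1, 0],
]
outline = trace_outline(shape)
```

Because points are added only at corners, a long straight border contributes two points regardless of its length.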
Experiments
• Carried out on a dataset
of contemporary
documents consisting of
scanned magazine and
technical article pages
• Processed with Tesseract
OCR 3.02 (open source)
• Exported to PAGE XML
with and without
refinement
[Figure: region outlines of a sample page, original (unrefined) vs. refined]
Results
• Measurement of region overlaps (number and
area)
                    Overlapping Regions    Overlap Area (Megapixel)
Original Outlines   621 (45.8%)            19.9
Refined Outlines    286 (21.1%)            2.5
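The slides do not state how the two overlap measures were computed. As a hedged illustration only, the sketch below counts the regions that intersect at least one other region and sums the pairwise intersection areas, assuming axis-aligned boxes given as `(left, top, right, bottom)` with exclusive right and bottom edges. Polygonal regions would need a pixel- or polygon-based intersection instead.

```python
from itertools import combinations
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def intersection_area(a: Box, b: Box) -> int:
    """Area of the intersection of two boxes, 0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0


def overlap_stats(boxes: List[Box]) -> Tuple[int, int]:
    """Return (number of regions involved in any overlap, summed pairwise overlap area)."""
    overlapping, total_area = set(), 0
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        area = intersection_area(a, b)
        if area:
            overlapping.update((i, j))
            total_area += area
    return len(overlapping), total_area


regions = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(overlap_stats(regions))  # (2, 25)
```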
Impact on Performance Evaluation
• Real-world scenario
• Measure the performance of Tesseract OCR engine
• Evaluation metrics of previous ICDAR page
segmentation competitions
Average success rate using original outlines 81.1%
Average success rate using refined outlines 84.5%
Average improvement for all documents 3.4%
Maximum improvement 22.9%
Conclusion
• Existing geometric region data can be significantly refined by fitting
precise polygons around child objects
• Validity and impact on real-world scenarios have been shown
• Refinement in performance evaluation helps to eliminate problems
that arise from insufficient geometric descriptions → Concentrate
on real issues of OCR methods
• Positive effect on accuracy of presentation/repurposing systems
(highlighting, cropping, article tracking, etc.)
• Approach used in Aletheia ground truth editor and result viewer
(primaresearch.org/tools)