Digitizing books
STEPS, HINTS, VIDEOS
What is digitization?

• Digitization is the process of converting information into a digital format. In this format, information is organized into discrete units of data that can be separately addressed. This is the binary data that computers and many devices with computing capacity can process.
• Source: http://whatis.techtarget.com/definition/0,,sid9_gci896692,00.html
• See also: http://en.wikipedia.org/wiki/Digitizing
Steps of digitization

1. Choose the book you want to digitize.
2. Choose OCR software.
3. Scan your book. Choose the device: scanner, compact device, digital camera, or IRIScan.
4. Run optical character recognition.
5. Correct the recognized text.
6. Save as a text-searchable PDF document.

See other versions:
http://www.inquisition.ca/en/info/artic/comment_numeriser.htm
http://dlg.galileo.usg.edu/guide.html#01
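The correction step in the list above can be partly automated before a human proofreads the text. The following is only a minimal sketch, not a method taken from these slides: it assumes a tiny hypothetical word list (`KNOWN_WORDS`) and uses Python's standard `difflib` module to suggest the closest known word for each word the OCR program may have misread. The function name `suggest_correction` and the 0.8 similarity cut-off are illustrative choices.

```python
import difflib

# Hypothetical vocabulary for illustration; a real run would load a full dictionary.
KNOWN_WORDS = ["digitization", "process", "information", "format", "book"]


def suggest_correction(word, vocabulary=KNOWN_WORDS, cutoff=0.8):
    """Return the closest known word, or the word unchanged if none is close enough."""
    lowered = word.lower()
    if lowered in vocabulary:
        return word
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Typical OCR confusion: the letter "o" read as the digit "0".
ocr_output = "digitizati0n pr0cess".split()
corrected = [suggest_correction(w) for w in ocr_output]
print(corrected)
```

Words with no close match are left untouched, so a person still needs to review the result by hand.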
Text and images

• Text and images can be digitized similarly: a scanner captures an image (which may be an image of text) and converts it to an image file, such as a bitmap. An optical character recognition (OCR) program analyzes a text image for light and dark areas in order to identify each alphabetic letter or numeric digit, and converts each character into an ASCII code.
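To make "light and dark areas" concrete, here is a toy sketch of the first thing an OCR program might do with a scanned image: decide for every pixel whether it is ink or paper. The 5x5 grayscale values, the threshold of 128, and the function names are all made up for illustration. Real OCR software uses far more robust methods.

```python
# Toy 5x5 grayscale "scan" of the letter T: 0 = black ink, 255 = white paper.
# The pixel values are invented for this example.
scan = [
    [20, 35, 10, 40, 25],
    [240, 250, 30, 245, 235],
    [250, 238, 15, 242, 251],
    [244, 247, 22, 239, 248],
    [251, 236, 18, 246, 243],
]

THRESHOLD = 128  # assumed cut-off between "dark" and "light"


def binarize(image, threshold=THRESHOLD):
    """Mark each pixel as dark (1) or light (0)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def render(bitmap):
    """Draw the bitmap as text: '#' for dark pixels, '.' for light ones."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in bitmap)


bitmap = binarize(scan)
print(render(bitmap))
```

The printed pattern shows the dark pixels forming the shape of the letter, which is what the recognition step then has to identify.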
Choose OCR software

• Many programs are available for digitizing your documents.
• Wikipedia has a comparison list of optical character recognition software. Check it out! http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
• (I recommend ABBYY FineReader.)
• If you don't want to buy (or download) software, here's a free online OCR service: http://www.newocr.com/
What is OCR?

• OCR (optical character recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.
• Source: http://searchciomidmarket.techtarget.com/definition/OCR
• Read more: http://en.wikipedia.org/wiki/Optical_character_recognition
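The "analysis, then translation into character codes" idea can be illustrated with a deliberately simple technique, nearest-template matching. This is a sketch under stated assumptions, not how any particular OCR product works: the three 3x5 reference glyphs in `TEMPLATES` and the helper names are hypothetical, and the "scanned" glyph is hand-written with one noisy pixel.

```python
# Hypothetical 3x5 reference glyphs, one string per row ('#' = dark pixel).
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def mismatches(glyph, template):
    """Count the pixels that differ between two glyphs of the same size."""
    return sum(
        a != b
        for glyph_row, template_row in zip(glyph, template)
        for a, b in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda ch: mismatches(glyph, TEMPLATES[ch]))


# A "scanned" T with one noisy extra pixel in the fourth row.
scanned = ["###", ".#.", ".#.", "##.", ".#."]
char = recognize(scanned)
print(char, ord(char))  # the recognized character and its ASCII code
```

Despite the noisy pixel, the glyph is closest to the T template, so the image is translated into the character code 84.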
What is ASCII?

• ASCII (American Standard Code for Information Interchange) is the most common format for text files in computers and on the Internet. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit binary number (a string of seven 0s or 1s). 128 possible characters are defined.
• Source: http://searchciomidmarket.techtarget.com/definition/ASCII
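A short sketch shows the 7-bit representation described above. The function name `to_ascii_bits` is an illustrative choice. Only built-in Python features are used.

```python
def to_ascii_bits(text):
    """Map each character to its ASCII code written as a 7-bit binary string."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return [format(code, "07b") for code in codes]


print(to_ascii_bits("OCR"))  # ['1001111', '1000011', '1010010']
print(2 ** 7)                # 128 possible 7-bit patterns
```

Seven bits give 2^7 = 128 distinct patterns, which is why ASCII defines exactly 128 characters.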
How to scan the book

• With a scanner: http://www.wikihow.com/Scan-a-Book
  http://www.proportionalreading.com/scan.html
• With a compact device: http://www.ehow.com/how_6950098_scan-bookpdf-format.html
• With a digital camera: http://www.wikihow.com/Scan-a-Book-With-aDigital-Camera
• With IRIScan: http://www.youtube.com/watch?v=9bgcDHLe3Xg
Optical Character Recognition (image)
Correction image 1
Correction image 2
Videos

• How to digitize a book: http://www.youtube.com/watch?v=-M95Ob4kIak
• How to chop and scan a book: http://www.youtube.com/watch?v=8tx2JmW_p4c
• Scanning text using OCR software: http://www.youtube.com/watch?v=_SwrGtSY4-c
• How to OCR PDFs easily with Acrobat Batch OCR: http://www.youtube.com/watch?v=V6Iz3U5X-SU
• How to digitize a million books: http://www.youtube.com/watch?v=OlKhKyTS23E
How to put a scanned doc into PDF format

• http://www.ehow.com/how_8563246_put-scanned-document-pdf-format.html
• Some OCR programs can save directly to PDF.
• Enjoy reading on your digital device!



Made by Mario Laskovics (2012.04.03)

