Words and numbers
KU Leuven University Library Central Services
Digitisation
Intro: KU Leuven Digitisation
• University Library Central Services
• Digitisation projects and programmes
o Research, education, heritage
o Coordination, facilitation
• Imaging Lab
o Focus on quality
o Focus on innovation
Intro: LIBIS
• IT solutions for collection management
o Archives, libraries, museums
o Development and support for a network larger than just KU Leuven
o LIAS
• Solutions for researchers
o Scientific data management, collaboration, sharing
o Multiple environments
• Centre of expertise
• Project oriented
Lines of Work and Issues
• Output formats
• Historical languages: Latin
• Historical languages: Demotic and friends
• Printed statistical tables
• Manuscripts and handwritten materials
• Workflow management
Output formats
• SUCCEED
• OCR engines generate TEI that does not use all features of the standard.
• Reduces the value of OCR-generated TEI as a starting point for research.
• Looking for:
o A way to improve the quality of TEI generated by OCR engines
• Possible input:
o LIBIS expertise and know-how
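One way to make the gap concrete is to count which TEI elements an OCR-generated file actually uses. The sketch below is an illustration only, not the SUCCEED approach or the output of any particular OCR engine: the sample document and the list of elements of interest are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample resembling sparse OCR output:
# pages, paragraphs and line breaks only, no semantic markup.
SAMPLE_TEI = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <pb n="1"/>
    <p>Recensement general<lb/>des industries</p>
    <p>31 octobre 1896</p>
  </body></text>
</TEI>"""

# Assumed wish list of elements; not a normative TEI profile.
ELEMENTS_OF_INTEREST = ["p", "lb", "pb", "head", "table", "persName", "placeName", "date"]


def element_usage(tei_xml: str) -> Counter:
    """Count how often each TEI-namespaced element occurs in the document."""
    prefix = "{" + TEI_NS + "}"
    counts = Counter()
    for element in ET.fromstring(tei_xml).iter():
        if element.tag.startswith(prefix):
            counts[element.tag[len(prefix):]] += 1
    return counts


def missing_elements(tei_xml: str, wanted: list) -> list:
    """Return the wanted element names that never occur in the document."""
    used = element_usage(tei_xml)
    return [name for name in wanted if used[name] == 0]


print(missing_elements(SAMPLE_TEI, ELEMENTS_OF_INTEREST))
```

Run over a batch of OCR output, a report like this shows which parts of the standard are never used and therefore where enrichment would add most value.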
Historical languages: Latin
• Course notes by students of the Old University of Leuven
• Western Europe: Latin essential for historical research
• Fragmented efforts, hard to track, difficult to establish cooperation
• Looking for:
o Highly automated and accurate OCR = limited manual intervention
o Lexica, NER
• Possible input:
o Text material from different periods and locations
o Academic input: Neo-Latin, …
KU Leuven - Words and numbers - ICoC
Historical languages: other
• Latin is not the only important historical language
• Precursors of contemporary spoken languages
• No specific projects for now
• Certainly important for our researchers, Hebrew for instance
• Looking for:
o Initiatives we might join
Printed statistical tables
• Recensement général des industries et des métiers (31 octobre 1896)
• Nineteenth-century statistical material
• Very hard to use for research due to sheer size and complexity
• Solution: digitisation followed by OCR
• Output: spreadsheets or functional equivalents
• Looking for:
o Extremely accurate OCR for numeric materials
o Correct translation of dense table layout
o Tools for preparation of the digitised images and quality control
• Possible input:
o Digitised source material
o Expertise: Departments of Electrical Engineering, Economic History, Historical Demography
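For numeric tables, one common quality-control step is an arithmetic consistency check: where a printed row carries its own total, the recognised cells should add up to it. A minimal sketch, assuming every row ends in a printed total and that spaces, dots and commas are thousands separators; the sample figures are invented for illustration and are not taken from the 1896 census.

```python
from typing import List, Optional


def parse_cell(text: str) -> Optional[int]:
    """Parse an OCR'd numeric cell; return None if it is not a clean integer."""
    # Assumption: spaces, dots and commas only ever act as thousands separators.
    cleaned = text.replace(" ", "").replace(".", "").replace(",", "")
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None


def suspicious_rows(rows: List[List[str]]) -> List[int]:
    """Return indices of rows whose cells do not sum to the final (total) cell."""
    flagged = []
    for index, row in enumerate(rows):
        values = [parse_cell(cell) for cell in row]
        if None in values or sum(values[:-1]) != values[-1]:
            flagged.append(index)
    return flagged


rows = [
    ["1 204", "318", "1 522"],  # consistent
    ["97", "4l", "138"],        # letter 'l' misread for digit '1'
    ["560", "45", "650"],       # cells do not add up to the total
]
print(suspicious_rows(rows))  # prints [1, 2]
```

A check like this cannot say which cell is wrong, but it narrows manual review to the rows that fail.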
How to deal with complex layout, columns and digits?
Manuscripts and handwritten material
• RICH + Bible of Anjou
• Ready to contribute material as content holder
• Working on a programme about letters
Workflow management
• Digicorder + Teamwork
• How do others deal with workflow management?
• Where to position enrichment in digitisation workflow?
• Ready to participate in the production of webinars
Digicorder = tool to manage naming of projects and scans
Created by Diederik Lanoye using FileMaker
Options when creating unique names for scans and corresponding labels
Starting point = object to be digitised
Label = description of part of object or number of page or folio
Names for scans and corresponding labels
Information shown for each scanned image
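The naming idea described above (start from the object, then attach a label to each scan) can be sketched as follows. This is not Digicorder's actual scheme, which the slides do not spell out: the identifier format, the zero-padded sequence number and the function name are all assumptions made for illustration.

```python
from typing import Dict, List


def name_scans(object_id: str, labels: List[str], ext: str = "tif") -> Dict[str, str]:
    """Map a unique file name to each label, in scanning order."""
    names = {}
    for seq, label in enumerate(labels, start=1):
        # The sequence number keeps names unique even if two labels are identical.
        names[f"{object_id}_{seq:04d}.{ext}"] = label
    return names


# "OBJ0001" is a hypothetical object identifier.
scans = name_scans("OBJ0001", ["front cover", "f. 1r", "f. 1v"])
for file_name, label in scans.items():
    print(file_name, "->", label)
```

Keeping the label separate from the file name means a folio can be relabelled later without renaming any files.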
Teamwork = workflow management tool
Dashboard lists projects, tasks, milestones and responsibilities
Inside a project: tasks on a timeline
Milestones are defined for important moments in the workflow
Often placed at transitions between workflow stages
More information: https://www.teamwork.com/projects/
You never walk alone
o Issues are not specific to KU Leuven
o Sharing expertise to cover all aspects is the only way to go
o Valuable expertise in specific fields
• Neo-Latin and humanist Latin
• Historical demography and economic history
• Imaging
o On our wishlist:
• Cooperation in new and on-going developments
• Exchange of expertise
• Above all: action
Cooperation
• Wiki as a starting point, interesting initiative
• Who wants to join forces?
o Writing projects together
o Searching for funding
• Important:
o Automated
o Accurate
o Scalable and maintainable
o Cost-effective
• digitalisering@bib.kuleuven.be
• Hoping to return to Leuven with names, specific suggestions, and appointments for meetings to discuss proposals
Appendix: Center for Processing Speech and Images
• The Center for Processing Speech and Images (PSI) is one of the units within
the Department of Electrical Engineering (ESAT) at KU Leuven. It specialises
in computer vision, with object and object class recognition among its
most important research domains. Besides more general goals such as scene
understanding, segmentation and invariant object recognition, it has experience
with character recognition on licence plates and automatic recognition of
handwritten music scores for transcription into modern music notation.
With more than 60 researchers, it is one of the biggest research groups of its
kind in Europe and has extensive experience in national and international projects.
Two professors have received ERC grants from the European Commission and have
won several other prestigious prizes.

