2010 CALA MW Annual Conference



          Cleveland, Ohio and Western Reserve
              Digital Text Collection Project




                                Suzhen Chen
                              Richard Wisneski

              Kelvin Smith Library, Case Western Reserve University
                                   May 22, 2010
Institution
Case Western Reserve University (CWRU)
• Founded in 1967 by the federation of Case Institute of Technology (founded in 1881) and Western Reserve University (founded in 1826)
• A private research university in northeast Ohio
• ~10,000 students

Kelvin Smith Library
• Main library of CWRU
• ~1.7 million volumes
• ~60 library staff
Project (What)
Cleveland, Ohio and Western Reserve Digital Text Collection project
• A collection of digital resources on the history of Cleveland, Ohio and the Western Reserve, dating from the early 19th century to the early 20th century
• The collection covers various topics, including women of Cleveland, religion and housing
• About 100 text files have been added to the collection; more are to be added, including some manuscripts
Project (Why)
• Online representation of the collection
• Provide resources for historians, scholars and others who are interested in Cleveland, OH and the Western Reserve
• Serve learning and teaching purposes
• Promote scholarly communication
• Long-term preservation of regional history
Metadata standard
[Diagram: the choice of metadata standard depends on intended users, project needs, types of materials, subjects and genre, preservation needs, …]
TEI: Text Encoding Initiative
• A consortium that develops a standard for representing texts in digital form
• Maintains encoding guidelines for texts
• Often applied in the humanities, social sciences and linguistics
Examples: projects from other institutions
• Shakespeare Quartos Archive
• Newton Manuscript Project (University of Sussex)
• Early American Digital Archive (University of Maryland)
Example of TEI Header
• <titleStmt>
• <editionStmt>
• <publicationStmt>

…
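As a rough illustration of how these header elements fit together, the sketch below embeds a small TEI header in Python and reads a few fields back out. Every title, name and date in it is a placeholder rather than a record from the collection, and the TEI P5 namespace is an assumption about the encoding version.

```python
import xml.etree.ElementTree as ET

# Assumption: TEI P5 namespace; the project's own files may differ.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical header: all values below are placeholders.
SAMPLE_HEADER = f"""
<teiHeader xmlns="{TEI_NS}">
  <fileDesc>
    <titleStmt>
      <title>Sample Title: an electronic edition</title>
      <author>Sample Author</author>
    </titleStmt>
    <editionStmt>
      <edition>First electronic edition</edition>
    </editionStmt>
    <publicationStmt>
      <publisher>Sample Library</publisher>
      <date when="2010">2010</date>
    </publicationStmt>
    <sourceDesc>
      <bibl>Transcribed from a sample printed source.</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
"""

def header_summary(xml_text):
    """Return the title, edition and publisher recorded in a TEI header."""
    root = ET.fromstring(xml_text.strip())
    ns = {"tei": TEI_NS}
    return {
        "title": root.findtext(".//tei:titleStmt/tei:title", namespaces=ns),
        "edition": root.findtext(".//tei:editionStmt/tei:edition", namespaces=ns),
        "publisher": root.findtext(".//tei:publicationStmt/tei:publisher", namespaces=ns),
    }

print(header_summary(SAMPLE_HEADER))
```

The header travels with the text itself, so the bibliographic description stays attached to the file wherever it is moved.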
Example of an encoded text
TEI Metadata Standard
• Mark up specific genres such as prose, verse and drama
• Mark up content structure such as paragraphs and divisions
• Mark up features of a text such as quotations and footnotes
• Mark up texts for literary and linguistic analysis
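A minimal sketch of what such markup makes possible: once divisions, paragraphs, quotations, notes and verse lines are tagged, a program can count or extract them. The sample text is invented for illustration and omits the TEI namespace to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: prose with a quotation and a note, then verse.
SAMPLE_TEXT = """
<text>
  <body>
    <div type="chapter">
      <head>A Sample Chapter</head>
      <p>The speaker began: <q>This is a quotation.</q>
        <note place="bottom">This is a footnote.</note></p>
      <p>A second paragraph of prose.</p>
      <lg type="stanza">
        <l>A first line of verse,</l>
        <l>and a second line.</l>
      </lg>
    </div>
  </body>
</text>
"""

def count_elements(xml_text):
    """Count how often each element name occurs in the encoded text."""
    root = ET.fromstring(xml_text.strip())
    return Counter(el.tag for el in root.iter())

counts = count_elements(SAMPLE_TEXT)
for name in ("div", "p", "q", "note", "l"):
    print(name, counts[name])
```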
Example
http://tbe.kantl.be/TBE/TBE.htm?page=examples
Example of an encoded text
<lg>
<l rend="font-size(110%) indent(-60)">"Fury said to</l>
<l rend="font-size(100%) indent(-40px)">a mouse, That</l>
<l rend="font-size(100%) indent(0px)">he met</l>
<l rend="font-size(100%) indent(10px)">in the</l>
<l rend="font-size(100%) indent(20px)">house,</l>
<l rend="font-size(100%) indent(17px)">'Let us</l>
<l rend="font-size(100%) indent(5px)">both go</l>
<l rend="font-size(100%) indent(-7px)">to law:</l>
<l rend="font-size(100%) indent(-23px)"><hi rend="italic">I</hi> will</l>
<l rend="font-size(100%) indent(-26px)">prosecute</l>
<l rend="font-size(90%) indent(-40px)"><hi rend="italic">you.</hi> —</l>
<l rend="font-size(90%) indent(-30px)">Come, I'll</l>
<l rend="font-size(90%) indent(-20px)">take no</l>
<l rend="font-size(90%) indent(-7px)">denial;</l>
…
</lg>
http://tbe.kantl.be/TBE/TBE.htm?page=examples
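The rend attribute in the example above records layout as data, so software can read it back. The sketch below uses four lines copied from the example and extracts each line's indent. The helper name line_indents and the regular expression are assumptions made for illustration, not part of TEI or of the TEI By Example project.

```python
import re
import xml.etree.ElementTree as ET

# Four lines taken from the encoded example; the rest is omitted.
SAMPLE_LG = """
<lg>
<l rend="font-size(100%) indent(0px)">he met</l>
<l rend="font-size(100%) indent(10px)">in the</l>
<l rend="font-size(100%) indent(20px)">house,</l>
<l rend="font-size(100%) indent(-23px)"><hi rend="italic">I</hi> will</l>
</lg>
"""

# Matches indent(N) or indent(Npx), with an optional minus sign.
INDENT = re.compile(r"indent\((-?\d+)(?:px)?\)")

def line_indents(xml_text):
    """Return (indent, text) for each <l> element in a line group."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for line in root.iter("l"):
        match = INDENT.search(line.get("rend", ""))
        indent = int(match.group(1)) if match else 0
        # itertext() also collects text inside nested <hi> elements.
        text = "".join(line.itertext())
        result.append((indent, text))
    return result

for indent, text in line_indents(SAMPLE_LG):
    print(indent, text)
```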
TEI Metadata Standard
• Can represent various manifestations of a text or audio recording
• Independent of applications
• Extensible
• Accommodates encoding methods for data processing and analysis needs
• Supports better description, organization and classification of information
Implementation
Staff
Funding
Time management
…
Cleveland, Ohio and Western Reserve
Digital Text Collection project
• Finalize the project
• Establish workflows, policies and procedures
• Provide training
• Scan text files
• Run scans through optical character recognition software (ABBYY FineReader)
Procedures
• Spell-check the texts
• Create TEI headers: bibliographic description, revisions, source of text
• Encode the text
• Quality control
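One way to automate part of the quality control step is sketched below. It checks that a file is well-formed XML and that a few header elements are present. The list of required elements is an assumption chosen for illustration, not the project's actual checklist.

```python
import xml.etree.ElementTree as ET

# Assumption: these header elements are treated as mandatory here.
REQUIRED = ("titleStmt", "publicationStmt", "sourceDesc")

def quality_check(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    # Drop any namespace prefix so the check works with or without one.
    present = {el.tag.rsplit("}", 1)[-1] for el in root.iter()}
    return [f"missing <{name}>" for name in REQUIRED if name not in present]

# Hypothetical inputs: one complete, one incomplete, one broken.
GOOD = (
    "<TEI><teiHeader><fileDesc>"
    "<titleStmt><title>T</title></titleStmt>"
    "<publicationStmt><p>x</p></publicationStmt>"
    "<sourceDesc><p>y</p></sourceDesc>"
    "</fileDesc></teiHeader></TEI>"
)
INCOMPLETE = "<TEI><teiHeader><fileDesc><titleStmt/></fileDesc></teiHeader></TEI>"
BROKEN = "<TEI><teiHeader>"

for sample in (GOOD, INCOMPLETE, BROKEN):
    print(quality_check(sample))
```

A check like this catches only structural problems. Spelling, transcription accuracy and encoding choices still need human review.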
Implementation
• Workshops
• In-house documentation of best practices
• Standards
• Online resources
• Examples of completed work
• Assistance from supervisor
• Learning from each other
Example of in-house documentation
Cleveland, Ohio and Western Reserve
Digital Text Collection project
• Supports future metadata conversion and exchange, metadata harvesting and federated search
• Facilitates metadata sharing and cross-collection searching
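To make the idea of metadata conversion concrete, the sketch below maps a few TEI header fields to Dublin Core style element names. The crosswalk is a simplified assumption made for illustration, not the project's mapping, and the sample header contains placeholder values only.

```python
import xml.etree.ElementTree as ET

# Assumption: TEI P5 namespace.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical crosswalk: Dublin Core element name -> path in the TEI header.
CROSSWALK = {
    "title": ".//tei:titleStmt/tei:title",
    "creator": ".//tei:titleStmt/tei:author",
    "publisher": ".//tei:publicationStmt/tei:publisher",
    "date": ".//tei:publicationStmt/tei:date",
}

# Placeholder header, not a record from the collection.
SAMPLE_HEADER = """
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>Sample Title</title>
      <author>Sample Author</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Sample Library</publisher>
      <date when="2010">2010</date>
    </publicationStmt>
    <sourceDesc><bibl>Sample source.</bibl></sourceDesc>
  </fileDesc>
</teiHeader>
"""

def tei_to_dublin_core(xml_text):
    """Build a flat Dublin Core style record from a TEI header."""
    root = ET.fromstring(xml_text.strip())
    record = {}
    for dc_name, path in CROSSWALK.items():
        value = root.findtext(path, namespaces=TEI)
        if value:
            record[dc_name] = value.strip()
    return record

print(tei_to_dublin_core(SAMPLE_HEADER))
```

A flat record of this kind is the sort of output a harvesting or federated search service could consume, while the full TEI file remains the richer source.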
Future Improvement
• Make texts searchable on the web
• Provide hyperlinked, referenced electronic resources
Resources
• WWP Guide to Scholarly Text Encoding: http://www.wwp.brown.edu/encoding/guide/index.html
• Teach Yourself TEI: http://www.tei-c.org/Support/Learn/tutorials.xml
• A Gentle Introduction to XML: http://www.tei-c.org/release/doc/tei-p4-doc/html/SG.html
• A Companion to Digital Literary Studies: http://www.digitalhumanities.org/companion/DLS/
References
• TEI: Text Encoding Initiative, “TEI: Text Encoding Initiative,” 2010, http://www.tei-c.org/index.xml
• International Federation of Library Associations and Institutions, Cataloging Section, “Functional Requirements for Bibliographic Records: Final Report,” 1998, http://www.ifla.org.proxy2.library.uiuc.edu/VII/s13/frbr/frbr.htm
• TEI By Example Project, “TEI By Example Project,” 2010, http://tbe.kantl.be/TBE/TBE.htm?page=examples

…
