Dragging old data forward:
finding yourself an RDA Helper
Terry Reese, Gray Family Chair for Innovative Library Services
Email: terry.reese@oregonstate.edu
Vehicle for Research -- MarcEdit

• MarcEdit
   • http://people.oregonstate.edu/~reeset/marcedit




1
January 28, 2013
http://tardthegrumpycat.tumblr.com/page/2




Common Questions I hear

• What about the GMD?
• We code all our data in RDA; how do we deal with other
  people’s?
• What do we do with bulk data loads? Vendor data?
• Do we care about Legacy Data?
• My library has been encoding records with RDA fields for over
  a year and now they are incomplete. I have thousands – what
  can I do?
• WHAT ABOUT THE GMD?


So what is the RDA Helper?

• It’s a proof of concept to demonstrate that:

   1.      Most current RDA fields can be derived from existing data
   2.      Migration paths for legacy/bulk data can and should exist
   3.      Abbreviation expansion may not be as straightforward as we would
           like
   4.      GMD data can be automatically generated from existing RDA data
   5.      The tool can serve as a vehicle for experimentation
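
Point 4 can be pictured with a minimal Python sketch that derives a 245 $h value from the media type term in a 337 field. The mapping table, the function names `media_term` and `gmd_for`, and the sample field are assumptions made for illustration; this is not the RDA Helper's own rule set, and real records would also need the carrier type to settle ambiguous cases such as projected media.

```python
import re

# Hypothetical, deliberately small mapping from RDA media type terms (337 $a)
# to AACR2-style GMD terms. Unmapped terms yield no GMD.
GMD_BY_MEDIA_TYPE = {
    "audio": "sound recording",
    "video": "videorecording",
    "computer": "electronic resource",
    "microform": "microform",
}


def media_term(field_337):
    """Return the $a value from a mnemonic-format 337 line, or None."""
    match = re.search(r"\$a([^$]+)", field_337)
    return match.group(1).strip() if match else None


def gmd_for(field_337):
    """Return a '$h[...]' subfield string for the 245, or None if no GMD applies."""
    term = media_term(field_337)
    gmd = GMD_BY_MEDIA_TYPE.get(term.lower()) if term else None
    return f"$h[{gmd}]" if gmd else None


if __name__ == "__main__":
    print(gmd_for("=337  \\\\$aaudio$bs$2rdamedia"))
    print(gmd_for("=337  \\\\$aunmediated$bn$2rdamedia"))
```

Returning None for unmapped terms keeps the sketch from guessing: a record described as unmediated simply gets no $h.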




Scope of the project

• The RDA Helper has been limited to the practical
  implementation of RDA elements in MARC
   • Looking specifically at:
       •   336/337/338 field groups
       •   344/345/346/347 field groups
       •   380/381 field groups
       •   Evaluating the 260
       •   Processing Abbreviation Expansion
       •   GMD processing


• Determine how easy 3rd-party development/engagement with
  the RDA standard/metadata community will be going forward.
http://talkingleadership.wordpress.com/2012/05/01/building-a-feedback-relationship/



Hitting a brick wall




                   http://www.flickr.com/photos/camknows/8374910613/


Mining the Data

• Does the data already exist in MARC records?
   • Yes and no – while much of the data can be extrapolated, generating
     many of the new RDA-specific fields requires evaluating multiple data points.

• The most important data points?
   • LDR/007/008 – with these three data points, you can generate most
     RDA-specific field data.
   • GMD
   • 856
   • 300
   • 130
   • 240
   • 730
   • 740
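
As a rough illustration of how the LDR and 007 can drive the 336/337/338 fields, here is a minimal Python sketch. The lookup tables are a small hand-picked subset, the names `mnemonic` and `derive_rda_fields` are hypothetical, and treating a text or notated-music record with no 007 as a print volume is an assumption made for the example. None of this is the RDA Helper's actual logic.

```python
# Illustrative subsets only: (term, code) pairs keyed by MARC coded values.
CONTENT_BY_LDR06 = {
    "a": ("text", "txt"),
    "c": ("notated music", "ntm"),
    "e": ("cartographic image", "cri"),
    "g": ("two-dimensional moving image", "tdi"),
    "i": ("spoken word", "spw"),
    "j": ("performed music", "prm"),
}
MEDIA_BY_007_00 = {
    "s": ("audio", "s"),
    "v": ("video", "v"),
    "c": ("computer", "c"),
    "h": ("microform", "h"),
}
CARRIER_BY_007_0001 = {
    "sd": ("audio disc", "sd"),
    "vd": ("videodisc", "vd"),
    "cr": ("online resource", "cr"),
}
# Assumption: these record types with no 007 are treated as print volumes.
PRINT_TYPES = {"a", "c"}


def mnemonic(tag, term, code, source):
    """Format a field in mnemonic style with blank indicators."""
    return f"={tag}  \\\\$a{term}$b{code}$2{source}"


def derive_rda_fields(leader, field_007=None):
    """Derive 336/337/338 lines from the leader and an optional 007 string."""
    type_of_record = leader[6]
    fields = []
    content = CONTENT_BY_LDR06.get(type_of_record)
    if content:
        fields.append(mnemonic("336", *content, "rdacontent"))
    if field_007:
        media = MEDIA_BY_007_00.get(field_007[0])
        carrier = CARRIER_BY_007_0001.get(field_007[:2])
    elif type_of_record in PRINT_TYPES:
        media, carrier = ("unmediated", "n"), ("volume", "nc")
    else:
        media = carrier = None  # not enough evidence; emit nothing
    if media:
        fields.append(mnemonic("337", *media, "rdamedia"))
    if carrier:
        fields.append(mnemonic("338", *carrier, "rdacarrier"))
    return fields


if __name__ == "__main__":
    for line in derive_rda_fields("00000njm a2200000 a 4500", "sd fsngnnmmned"):
        print(line)
```

The sketch emits nothing when the evidence is thin, which mirrors the point above: the further you move from the LDR/007/008, the less safely a value can be extrapolated.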


Mining the Data

• Abbreviation Expansion is challenging
   • Real-world data is simply real-world crazy

   • Simple Example:
           =300    $a1 v.
           =300    $a1 vol.
           =300    $aOne v.
           =300    $a1 vols.
           =300    $aV.
           =300    $av.
           =300    $a12 v.




So how does this thing work?

• RDA Helper
   • http://www.youtube.com/watch?v=cqLMPp9vZVM&feature=player_embedded




So why create something like this at all?

• Admittedly, most of the promise behind RDA isn’t going to be
  found in these first baby steps in MARC, but…
   • To demonstrate that much of this initial work can be done automagically
     and that much of the data in our existing hybrid environments can be
     moved forward.
   • To provide a testable implementation for catalogers who are still
     uncomfortable with what these changes mean.
   • To support public libraries, many of which use ILS systems that rely
     on data that is going away, like the GMD, to create more user-friendly
     interfaces.
   • To support vendors that provide MARC records and offer a simplified
     path for moving their data forward.

Going forward




           http://www.flickr.com/photos/jannem/2079422115/sizes/z/in/photostream/

Thank you

Contact Information:
Terry Reese
Email: terry.reese@oregonstate.edu
Work: 541.737.6384

Getting MarcEdit:
http://people.oregonstate.edu/~reeset/marcedit






Editor's Notes

  • #3: Over the past couple of years of giving workshops on metadata processing, I’ve found that talking about RDA is like talking about religion and politics. It can really bring out the crazy.
  • #4: I wish I were kidding about the GMD.
  • #5: Experimentation – treating specific fields as objects for purposes of validation.
  • #6: The RDA Helper was designed for practical usage. There are a lot of concepts related to RDA that exist outside of MARC, but the RDA Helper is concerned with how these concepts translate into MARC.
  • #8: OSU gives me a lot of indirect support when it comes to my work around MarcEdit. Because of that, I usually find that I spend close to 2-3k a year to access ISO standards documents. These are international standards documents, and as a developer I don’t like it, but I think of it as the cost of doing business. However, I was unprepared to have to do the same to access what should be an open library standard. The library community is going to have to deal with RDA in some form, but I do worry that this specification will be dead on arrival for communities outside the library if we insist on keeping it behind a paywall.
  • #9: Is the data already there? You can use other data elements, but as you move down the tree, the ability to extrapolate data correctly becomes more difficult.
  • #10: You can use the expansion lists as a guide, but in testing, people create their own abbreviations and apply them unevenly.
  • #12: OSU is in this boat: our primary cataloger is on sabbatical and our technicians haven’t been formally trained. This tool gives them the ability to process records and start seeing what the data might look like.