Metadata for Data Rescue and Data at Risk
William L. Anderson, John L. Faundeen, Jane Greenberg, Fraser Taylor
PV2011, Toulouse, 17 November 2011
Presented by Nico Carver, in collaboration with the DARI SILS Student Learning Circle
Outline
• Major Questions
• Metadata Scheme Design
• Case Study
• Next Steps
• Acknowledgements
• Questions/Comments

Major Questions informing Research
• Where is at-risk data?
• How are scientists using historic data?
• How do we define at-risk?
• How do others define at-risk?
• What must be done to rescue data-at-risk?
Image: “8 inch floppy”, retrieved from: http://johnkingworld.com/aplus/images/storage-8inch-floppy.jpg

Major Question informing Scheme Design
• What is essential metadata for describing data-at-risk and aiding in data rescue?

Metadata requirements
• Be applicable across a range of disciplines and scientific research areas.
• Sufficiently support the data rescue mission.
Functions of the Inventory (each function is followed by its initial metadata properties)
• Describe data of scientific value that is at risk of being lost, unused, or destroyed.
  1. Science area
  2. Nature of data
  3. Date or date-span
  4. Location of original
  5. Present location
• Act as a starting point for the data rescue mission.
  6. Expected future
  7. Risk level

Metadata Frameworks Useful for Data-at-Risk
DARTG Chair Elizabeth Griffin’s initial proposed DARTG metadata properties:
  1. Science area
  2. Nature of data
  3. Date or date-span
  4. Location of original
  5. Present location
  6. Expected future
  7. Risk level
Metadata Frameworks Useful for Data-at-Risk
U.S. Geological Survey: “Create a Rescue Request”, URL: http://eros.usgs.gov/government/archive_rescue/archive_request.php

Metadata Frameworks Useful for Data-at-Risk
“Growing the Vocabulary”, URL: http://dublincore.org/resources/training/frd_20091217/Tutorial_FRD_baker-1.pdf

Metadata Frameworks Useful for Data-at-Risk
“The PREMIS Data Dictionary”, URL: http://www.loc.gov/standards/premis/v2/premis-dd-2-1.pdf
Data-at-Risk Inventory (DARI) Metadata Scheme: guiding principles
• Simple
• Broadly applicable
• Extensible

DARI Metadata Scheme (current)
DARTG DARI Metadata, Version 1.0
Metadata Element Name: Element Description
• Research Area(s): The domains represented by DARTG experts and the more general category of “Other”.
• Title: The name associated with the collection.
• Physical form of the data: Paper, photograph, specimen, record book, magnetic tape, etc.
• Content and context of the data: History, topic, etc. (if known).
• Name of current holder: Institution, organization or individual.
• Dates associated with data: Time period when data were collected.
• Size: Extent, volume, size.
• Data condition: Stable, deteriorating, etc.
• Risk level: Poor storage conditions, limited storage time, etc.
• Known access and restrictions: Public domain, private collection, etc.
• Notes: Any additional information.
• Contact information: Address or other contact information for the institution, organization or individual.
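
To make the Version 1.0 element list concrete, here is a minimal sketch of one inventory entry as a Python record. It is not the inventory's actual implementation: the class name `DariRecord`, the field names, the choice of which fields are required, the `extra` dictionary (a nod to the "extensible" principle), and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DariRecord:
    """Hypothetical record for one data-at-risk description.

    Field names are illustrative renderings of the twelve
    Version 1.0 element names; they are not an official schema.
    """
    research_areas: list[str]          # Research Area(s)
    title: str                         # Title
    physical_form: str                 # Physical form of the data
    content_and_context: str = ""      # Content and context of the data
    current_holder: str = ""           # Name of current holder
    dates: str = ""                    # Dates associated with data
    size: str = ""                     # Size
    data_condition: str = ""           # Data condition
    risk_level: str = ""               # Risk level
    access_and_restrictions: str = ""  # Known access and restrictions
    notes: str = ""                    # Notes
    contact_information: str = ""      # Contact information
    # Assumed extension point for discipline-specific properties.
    extra: dict[str, str] = field(default_factory=dict)

    def missing_elements(self) -> list[str]:
        """Names of core elements that are still empty."""
        return [name for name, value in asdict(self).items()
                if name != "extra" and not value]


# Invented sample entry, for illustration only.
example = DariRecord(
    research_areas=["Other"],
    title="Hypothetical station logbooks",
    physical_form="Record book",
    data_condition="Deteriorating",
    risk_level="Poor storage conditions",
)

print(json.dumps(asdict(example), indent=2))
print("Still to fill in:", example.missing_elements())
```

Keeping every element as free text mirrors the "simple" and "broadly applicable" principles; a controlled vocabulary for fields such as risk level would be a separate design decision that the slides do not specify.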
Case Study: introduction
Case Study: implementation
Case Study: Results
• 7 dataset descriptions total.
• 5 out of 7 were completed unassisted using the metadata template.
• 13.5 out of 16 metadata elements were considered useful on average (85%).
• 4 out of 5 scientists said they would use the inventory again.
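
The ratios reported in the results can be restated as percentages with a few lines of arithmetic. The sketch below uses only the counts given on the slide; the helper name `share` and the labels are assumptions. Note that 13.5 out of 16 works out to 84.4%, so the slide's 85% appears to be a rounded figure.

```python
def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts taken from the results slide; labels are paraphrased.
results = {
    "descriptions completed unassisted": (5, 7),
    "elements considered useful (average)": (13.5, 16),
    "scientists who would use the inventory again": (4, 5),
}

for label, (part, whole) in results.items():
    print(f"{label}: {part}/{whole} = {share(part, whole):.1f}%")
```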
Case Study: conclusions
• The purpose of the inventory had to be more clearly stated on the website.
• Instructions for filling out the web form had to be simple, but clear.
• 3 metadata properties were determined unnecessary; 4 properties were altered for clarity.
• The remaining metadata properties were successful in their ability to cut across scientific disciplines while fully describing data-at-risk.

Next Steps
• Complete focus groups and surveys at UNC-Chapel Hill and elsewhere to determine possible use cases.
• Disseminate information and generate interest for the inventory and the Data-at-Risk project.
• Finalize the inventory design and start populating it.
Submit a description: http://ibiblio.org/data-at-risk/contribution
Questions/Comments?
Acknowledgements:
• The University of North Carolina Center for Global Initiatives’ support of the Data At Risk Inventory SILS Student Learning Circle
• The Council for Scientific and Technical Data
• And the following people for their leadership, guidance, and assistance: Bill Anderson, School of Information, University of Texas at Austin; Jane Greenberg, School of Information and Library Science; Elizabeth Griffin, Herzberg Institute of Astrophysics; Dav Robertson, National Institute of Environmental Health Sciences, NIH; and Paul Jones & John Reuning, ibiblio, University of North Carolina at Chapel Hill.



Editor's Notes

  • #14: Topics to cover: ibiblio omeka prototyping schedule purpose