Smith RDAP11 NSF Data Management Plan Case Studies

  • 1. POLICIES FOR DATA SHARING, ACCESS AND REUSE
      MacKenzie Smith (MIT, ARL, CC)
  • 2. NSF DMP GUIDELINES WANT
      • Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
      • Policies and provisions for re-use, re-distribution, and the production of derivatives
    RDAP Summit ©2011, MacKenzie Smith
  • 3. WHAT IS DRIVING THIS?
      • Scientific progress requires international, interdisciplinary interoperability, including frictionless data integration at large scale (e.g. the Web)
      • Data interoperability includes
          • technical issues (data integration, protocols)
          • social issues (scientific norms, credit mechanisms or lack thereof)
          • legal issues (incompatible laws and policies for data and databases)
  • 4. DATA USE/REUSE/REDISTRIBUTION
      • Data use: Using research data for the current research purpose/activity to infer new knowledge about the research subject.
      • Data re-use: Using research data for a research purpose/activity other than that for which it was intended.
      Howard, T., Darlington, M., Ball, A., Culley, S., McMahon, C., 2010. Understanding and Characterizing Engineering Research Data for its Better Management. Project Report. Bath, UK: University of Bath, ERIM Project Document erim2rep100420mjd10.
  • 5. DATA USE/REUSE/REDISTRIBUTION
      • Data purposing: Making research data available and fit for the current research activity.
      • Data re-purposing: Making research data available and fit for a future known research activity.
      • Data re-use: Managing research data such that it will be available for a future unknown research activity.
  • 6. SUPPORTING DATA REUSE
      • Future users unknown, potentially interdisciplinary
      • You don’t know them and they don’t know you (or what you/your discipline expects)
      • Data documentation and policies need to be clear, not require contact or ad hoc negotiations (what if you’ve moved or you’re dead?)
  • 7. INTERNATIONAL COLLABORATIONS
      If I participate in a collaborative international research project, do I need to be concerned with data management policies established by institutions outside the United States?
      Yes. There may be cases where data management plans are affected by formal data protocols established by large international research consortia or set forth in formal science and technology agreements signed by the United States Government and foreign counterparts. Be sure to discuss this issue with your sponsored projects office (or equivalent) and your international research partner when first planning your collaboration.
  • 8. DATA LICENSING IN US
      • US Gov data in the Public Domain
          • explicit rights statement rare
      • Factual data not copyrightable in the US
          • creativity matters, ‘sweat of the brow’ does not
          • not much legal precedent in science
          • generally not known by users
      • EULAs in place for many data archives
          • all different, varying practicality, hard to enforce
  • 9. CREATIVE COMMONS
      • Tools for data sharing towards Web-scale interoperability (e.g. Linked Open Data)
          • CC0 or CC-BY
          • Public Domain mark
      • Best practice for URI-based attribution (e.g. to avoid attribution stacking)
  • 10. CREATIVE COMMONS
      • CC0 waives copyright and associated rights (e.g. data rights) where applicable
          • Important for interoperability with legal jurisdictions that have sui generis data rights (e.g. Europe)
      • CC-BY-SA bad for interoperability
      • CC-BY with attribution via URI (Aus and NZ examples)
      • Attribution stacking
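The idea of URI-based attribution can be illustrated with a minimal Python sketch. It assumes (this is not from the slides) that each dataset carries a persistent URI, a license label, and the URIs of any upstream datasets it must already credit. The `Dataset` class, the `merged_attribution` function, and the example.org URIs are all hypothetical names chosen for illustration. The point is only that a combined dataset can credit its sources with one de-duplicated list of URIs instead of an ever-growing stack of free-text credit lines.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical description of a shared dataset (illustration only)."""
    uri: str                       # persistent identifier used for attribution
    license: str                   # e.g. "CC0" or "CC-BY"
    credits: Tuple[str, ...] = ()  # URIs of upstream sources it must credit


def merged_attribution(datasets: Iterable[Dataset]) -> List[str]:
    """Return the de-duplicated, order-preserving list of URIs to credit
    when the given datasets are combined into a new one."""
    result: List[str] = []
    for ds in datasets:
        required = list(ds.credits)
        # Assumption: only CC-BY sources add an attribution requirement;
        # CC0 sources waive it.
        if ds.license == "CC-BY":
            required.append(ds.uri)
        for uri in required:
            if uri not in result:
                result.append(uri)
    return result


# Hypothetical example datasets.
a = Dataset("https://example.org/data/a", "CC-BY")
b = Dataset("https://example.org/data/b", "CC-BY", credits=(a.uri,))
c = Dataset("https://example.org/data/c", "CC0")

print(merged_attribution([a, b, c]))
```

Scientific norms may still call for citing CC0 sources; the sketch covers only the license-level attribution requirement.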
  • 11. ISSUES
      • Licenses
      • Attribution
      • Persistent IDs
      • Provenance
      • Metadata
      • Registries
  • 12. WHAT DO RESEARCHERS WANT? SUPPLY SIDE
      • CREDIT
      • CONTROL
      • CONFIDENCE (in appropriate use of their data)
      • and sometimes… IP
      • but always… FUNDING
  • 13. WHAT DO RESEARCHERS WANT? DEMAND SIDE
      • Easy reuse of their own data
      • Easy discovery of and access to outside data
      • Easier integration/interoperability of their own, other data (i.e. “re-purposing”)
  • 14. HOW CAN RESEARCHERS ACHIEVE THAT?
      • Standard copyright licenses or waivers
      • Standard terms & conditions (EULA)
      • … via their institutional repository!
      • Researchers want good advice, have zero interest in complex legal issues
      • IRs can establish practices that help researchers achieve their goals with low effort
  • 15. DMP BOILERPLATE
      Sharing. Project data will be made publicly accessible/downloadable from the university’s data archive website (via a standard Web UI) as … Once located on the archive website, image sets will be downloadable via standard Web protocols (i.e. HTTP). Included in the associated metadata for each image set will be rights information such as copyright and licensing terms for use and reuse of the data. Each image set will be assigned a unique, persistent URI (web identifier, resolvable as a URL) for use in citations. The university’s data archive uses Handles for persistent URIs.
  • 16. DMP BOILERPLATE
      Licensing. Images, even scientific research images generated by scanners, may be subject to copyright in the U.S., so images produced by the project will be collected and shared using a Creative Commons license, specifically CC-BY (i.e. with attribution to the copyright owner, who is the Principal Investigator for this project, with the approval of the university’s IP counsel). By using the CC-BY license, we are authorizing all interested researchers to use the image data produced by this project in whatever manner they choose, as long as they cite the Principal Investigator as the source of the data.
  • 17. DMP BOILERPLATE
      Licensing, cont. Metadata associated with the image sets will be released under a CC0 license (public domain dedication) since it is normally not copyrightable and we want it to be reusable in new contexts (e.g. Google indexes). With these licensing terms, future researchers will be able to combine the image data and associated metadata produced by this project with data produced from their own or other projects, to create super- or sub-sets of images needed for their own research (i.e. “derivative” datasets).
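As a rough illustration of the sharing and licensing boilerplate, the Python sketch below builds a catalog record for one image set. It is a sketch under stated assumptions, not the archive's actual schema. The field names (`identifier`, `rights`, `data_license`, `metadata_license`, `rights_holder`), the helper functions, and the handle `12345/example-1` are all hypothetical. The only external fact it relies on is that Handles can be resolved as URLs through the public proxy at hdl.handle.net.

```python
import json

# Public Handle proxy; a handle appended to it yields a resolvable URL.
HANDLE_PROXY = "https://hdl.handle.net/"


def handle_uri(handle: str) -> str:
    """Turn a handle such as '12345/example-1' into a citable URI."""
    return HANDLE_PROXY + handle


def image_set_record(title: str, handle: str, rights_holder: str) -> dict:
    """Build a hypothetical catalog record with explicit rights fields."""
    return {
        "title": title,
        "identifier": handle_uri(handle),
        "rights": {
            "data_license": "CC-BY",     # image data: attribution required
            "metadata_license": "CC0",   # metadata: public domain dedication
            "rights_holder": rights_holder,
        },
    }


# Hypothetical example values.
record = image_set_record(
    title="Example image set",
    handle="12345/example-1",
    rights_holder="Principal Investigator",
)
print(json.dumps(record, indent=2))
```

Keeping the rights statement in the record itself is what lets a future, unknown user determine the terms of reuse without contacting the original researcher.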
  • 18. DMP BOILERPLATE
      In the university’s central data archive, researchers will be able to determine the rights assigned to the project’s data via the metadata displayed in the UI for the dataset (i.e. in the rights fields of the relevant catalog record for the dataset). The archive’s search interface supports filtering searches by rights category (e.g. Public Domain, CC-BY, embargoed) so that researchers can search for only data that they may reuse in their own research.
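Filtering by rights category could look roughly like the following Python sketch. The catalog entries, the `rights_category` field name, and the set of categories treated as reusable are assumptions for illustration. A real archive would apply this filter in its search index rather than over an in-memory list.

```python
from typing import Dict, Iterable, List, Set

# Assumption: categories a researcher may reuse without negotiation.
REUSABLE: Set[str] = {"Public Domain", "CC0", "CC-BY"}


def filter_by_rights(
    records: Iterable[Dict[str, str]],
    allowed: Set[str] = REUSABLE,
) -> List[Dict[str, str]]:
    """Keep only records whose rights category is in the allowed set."""
    return [r for r in records if r.get("rights_category") in allowed]


# Hypothetical catalog entries.
catalog = [
    {"id": "set-1", "rights_category": "CC-BY"},
    {"id": "set-2", "rights_category": "embargoed"},
    {"id": "set-3", "rights_category": "Public Domain"},
]

reusable = filter_by_rights(catalog)
print([r["id"] for r in reusable])  # prints ['set-1', 'set-3']
```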
  • 19. CONCLUSION
      • IRs serving as data archives can
          • Standardize institutional data policies
          • Encourage OA
          • Lower barriers to researchers to comply with NSF intent
      • DMPs encourage use of IR over time, reassure NSF of consistent practice