How to Comply with Grants:
Writing Data Management Plans
and Providing Public Access
Margaret Henderson
Director, Research Data Management
mehenderson@vcu.edu
February 2016
Research Data: Recorded information, regardless of form or the media on which it
may be recorded, which constitute the original observations and methods of a study
and the analyses of these original data that are necessary for reconstruction and
evaluation of the Report(s) of a study made by one or more Investigators. Research
Data also includes all such recorded information gathered in anticipation of a Report.
Research Data differ among disciplines. The term may include but is not limited to
technical information, computer software, laboratory and other notebooks, printouts,
worksheets, other media, survey, memoranda, evaluations, notes, databases, clinical
case history records, study protocols, statistics, findings, conclusions, samples, physical
collections, other supporting materials created or gathered in the course of the
Research, Tangible Research Property, unique Research resources such as synthetic
compounds, organisms, cell lines, viruses, cell products, cloned DNA as well as genetic
sequences and mapping information, crystallographic coordinates, plants, animals and
spectroscopic data, and other compilations formed by selecting and assembling
preexisting materials in a unique way. The term does not include information
incidental to research administration such as financial, administrative, cost or pricing,
or management information.
http://www.policy.vcu.edu/sites/default/files/Research%20Data%20Ownership%2C%20Retention%2C%20Access%20and%20Securty.pdf
While VCU Owns the Data...
Principal Investigator is the Data Steward and is
responsible for the integrity, preservation and security
of Research Data.
Data Management Plan
Outlines how a researcher will:
• collect
• organize
• back up
• store
• share
the data for a project, and indicates who the
data steward will be.
Case 1
Case 2
http://retractionwatch.com/2016/01/05/a-new-excuse-for-data-fabrication-my-notebook-blew-into-a-manure-pit/
Case 3
http://blogs.nature.com/ofschemesandmemes/2014/04/08/imagine-not-getting-the-phd-youd-been-working-towards-datadramas/
FEDERAL REQUIREMENTS
NIH Public Access Policy
SEC. 218. The Director of the National Institutes of Health shall require that all
investigators funded by the NIH submit or have submitted for them to the
National Library of Medicine’s PubMed Central an electronic version of their
final peer-reviewed manuscripts upon acceptance for publication, to be
made publicly available no later than 12 months after the official date of
publication: Provided, That the NIH shall implement the public access policy
in a manner consistent with copyright law.
https://publicaccess.nih.gov/
NIH Data Sharing Policy
“Data should be made as widely and freely
available as possible while safeguarding the
privacy of participants, and protecting confidential
and proprietary data. To facilitate data sharing,
investigators submitting a research application
requesting $500,000 or more of direct costs in any
single year to NIH on or after October 1, 2003 are
expected to include a plan for sharing final research
data for research purposes, or state why data
sharing is not possible.”
http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
NIH Genomic Data Sharing Policy
“Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-
Wide Association Studies (GWAS)” (effective January 2015)
• “For the purposes of this policy, a genome-wide association study is
defined as any study of genetic variation across the entire human genome
that is designed to identify genetic associations with observable traits
(such as blood pressure or weight), or the presence or absence of a
disease or condition.”
• Applies to all NIH-funded research that generates large-scale human or
non-human genomic data, as well as the use of those data for subsequent
research.
• Requires “Genomic Data Sharing Plan”.
• Allows for expenses in project budget.
• Requires public availability of data in a “timely manner.”
• Recommends NIH-funded or third-party repositories for deposition.
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html
NSF Policies
NSF Data Sharing Policy
Investigators are expected to share with other researchers, at no more than
incremental cost and within a reasonable time, the primary data, samples,
physical collections and other supporting materials created or gathered in
the course of work under NSF grants. Grantees are expected to encourage
and facilitate such sharing. See Award & Administration Guide (AAG) Chapter
VI.D.4. http://www.nsf.gov/bfa/dias/policy/dmp.jsp
NSF Data Management Plan Requirements
Proposals submitted or due on or after January 18, 2011, must include a
supplementary document of no more than two pages labeled “Data
Management Plan”. This supplementary document should describe how the
proposal will conform to NSF policy on the dissemination and sharing of
research results. See Grant Proposal Guide (GPG) Chapter II.C.2.j for full
policy implementation. https://www.nsf.gov/eng/general/dmp.jsp
Slide courtesy of Amanda Whitmire
OSTP Memorandum
Increasing Access to the Results of Federally Funded Scientific
Research (February 22, 2013)
“ensuring that, … the direct results of federally funded scientific
research are made available to and useful for the public,
industry, and the scientific community. Such results include peer-
reviewed publications and digital data.”
“develop plans to make the results of federally-funded research
publicly available free of charge within 12 months after
original publication.”
https://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
THE PLANS SO FAR
http://guides.library.vcu.edu/publicaccess
Department of Health and Human Services
Guiding Principles and Common Approach for Enhancing Public
Access to the Results of Research Funded by HHS Operating
Divisions (February 2015)
Provides continuity across Operating Divisions:
NIH, CDC, FDA, AHRQ, and ASPR (voluntary)
• PubMed Central will serve as the repository for
publications funded by HHS Operating Divisions
• HHS will enable stakeholders to petition to shorten the
12-month maximum embargo period for publications
• HHS will take a common, stepped approach to establishing
data policy and infrastructure: assessment, inventory,
DMPs, pilots, training
NIH – National Institutes of Health
Plan for Increasing Access to Scientific Publications and Digital
Scientific Data from NIH Funded Scientific Research
(February 2015)
• In effect for publications; policy changes to support data to be
implemented by December 2015
• Applies to research/researchers funded wholly or in part by
the NIH
NIH
Publications
• Peer-reviewed scientific articles
• Deposit of final peer-reviewed
manuscript into PMC
• Upon acceptance, with maximum
12-month embargo
• Include appropriate costs in
proposals
• Reporting through eRA Commons
and My NCBI
• Withholding of funds
Data
• Unclassified digital scientific
research data
• Submission of DMP; deposit of
data into appropriate, existing,
publicly accessible repositories,
including NIH data repositories
• Upon acceptance for publication
(will explore)
• Include appropriate costs in
proposals
• Utilize existing reporting
structures
• “enforcement actions” including
withholding of funds
CDC – Centers for Disease Control and Prevention
CDC Plan for Increasing Access to Scientific Publications and
Digital Scientific Data Generated with CDC Funding
(January 2015)
• In effect for publications; October 2015 (FY 2016 funding
cycle) for data
• Applies to research/researchers funded by the CDC (intra- and
extramural)
• CDC Stacks http://stacks.cdc.gov/ for public health
information
CDC
Publications
• Peer-reviewed research
publications
• Deposit of final peer-reviewed
manuscript into NIHMS (CDC
Stacks, PMC)
• Upon acceptance, with maximum
12-month embargo
• eClearance scientific clearance
system; existing reporting
structures (Office of Science
Quality)
Data
• Unclassified digital scientific
research data
• Submission of DMP (generic CDC
template); deposit of data into
suitable platform (tbd)
• Upon acceptance for publication
or within 30 months of collection
• Include appropriate costs in
proposals
• Utilize existing reporting
structures
• Reduction or restriction of funds,
award termination, negative
influence on future awards
FDA – Food and Drug Administration
Plan to Increase Access to Results of FDA-Funded
Scientific Research (February 2015)
• Effective October 2015
• Applies to research/researchers funded wholly or
in part by the FDA (intra- and extramural)
https://open.fda.gov/ open-source APIs (application
programming interfaces)
FDA
Publications
• Peer-reviewed scientific
articles
• Deposit of final peer-reviewed
manuscript into PMC
• With maximum 12-month
embargo*
• Include appropriate costs in
proposals
• Utilize existing reporting
structures
• Termination of contract or
grant; withholding of funds
Data
• Digitally formatted scientific
data resulting from
unclassified research
• Submission of DMP; deposit of
data into discipline-specific
repositories
• Upon acceptance for
publication
• Include appropriate costs in
proposals
• Utilize existing reporting
structures
• Termination of contract or
grant; withholding of funds
AHRQ – Agency for Healthcare
Research and Quality
AHRQ Public Access to Federally Funded Research
(February 2015)
• Effective February 2015 for publications; October
2015 for data
• Applies to all research/researchers funded wholly or
in part by AHRQ (Intramural, extramural, or contract
researchers)
AHRQ
Publications
• Peer-reviewed scholarly
research articles
• Deposit of final peer-
reviewed manuscript into
PMC
• Maximum 12-month
embargo*
• Include appropriate costs in
proposals and applications
• Utilize existing reporting
structures (NIH)
• Withholding of funding
Data
• Unclassified research data
• Submission of DMP; deposit
of data to AHRQ or other
repository
• Upon acceptance for
publication
• Include appropriate costs in
proposals and applications
• Utilize existing reporting
structures
• Negative influence on
future funding
ASPR – Assistant Secretary for
Preparedness and Response
Public Access to Federally Funded Research:
Publications and Data (February 2015)
• Effective October 2015
• Applies to Researchers funded wholly or in part by
ASPR
ASPR
Publications
• Peer-reviewed scholarly
research articles
• Deposit of final peer-reviewed
manuscript into PMC
• Maximum 12-month
embargo*
• Appropriate costs included in
proposals and applications
• Existing reporting structures
(NIH)
• Withholding of funding
Data
• Digital scientific data
• Submission of DMP; deposit of
data to recognized scientific
repository
• 30 months from creation of
data set, or upon publication
• Appropriate costs included in
proposals and applications
• Staff and peer review; existing
reporting structures
• Negative influence on future
funding
DOD
• Plan
• SPARC overview http://sparc.arl.org/blog/dod-releases-draft-public-
access-plan
• Starting FY 2015
• Completed 4th quarter FY 2016
• Allows for inclusion of costs in proposals
• Requirements: DMPs cover data sharing and data are available before
making subsequent awards
• Plans for compliance monitoring and certification tokens
• Encourages authors to negotiate their copyright for papers
DTIC (Defense Technical Information Center): repository for full-text of peer-
reviewed author final manuscripts or publisher versions of research articles
and repository for metadata of digital scientific data sets.
DOD
Publications
• Peer-reviewed scholarly
publications arising from
unclassified, publicly releasable
research and programs.
• Deposit peer-reviewed scholarly
publications into DOD public
access archive system (DTIC)
• DOD will establish a system to
enable the submission of final,
peer-reviewed manuscripts.
• minimum 12 month embargo
Data
• Digitally formatted data arising
from unclassified, publicly
releasable research and
programs.
• Decentralized approach to data
storage.
• Require the submission of data
management plans.
• Allow for inclusion of costs for
data management and access.
• Will establish a system to enable
the identification, attribution,
(federated) storage, and access of
digital data.
DOE
• Public Access Plan
• Already started for Office of Science grants
• PAGES - a web-based portal that will provide free
public access to accepted peer-reviewed manuscripts
or published scientific journal articles within 12 months
of publication
• Data management plan requirements and guidance
• “unclassified and unrestricted”
• “Not all data need to be shared or preserved. The costs
and benefits of doing so should be considered in data
management planning.”
DOE
Publications
• Classified or protected data and
research will not be publicly
available.
• Public Access Gateway for Energy
and Science (PAGES) maintained
by Office of Scientific and
Technical Information (OSTI)
• Deposit abstract and metadata
and link to Version of Record
(publisher pdf, final MS in
repository, or DOE dark archive)
• within 12 months of publication
Data
• Office of Science started DMP
requirement in July 2014
(supports ⅔ of R&D)
• Other offices start Oct. 1, 2015
• DMP required and merits
evaluated.
• Covers unclassified and
unrestricted digital research data,
i.e. digital data required to
validate findings.
• Enterprise Data Inventory ->
Public Data Listing -> populates
data.gov
NSF
Plan – Starting January 2016
Public Access to Results of NSF-funded Research overview
SPARC Comments http://sparc.arl.org/blog/nsf-releases-incremental-plan-for-public-
access
“deposit final accepted manuscripts (or published articles) into the Department of
Energy’s “PAGES” repository – a dark archive – with public access to be provided via
links to publisher’s websites.”
“NSF plan will extend to papers published in “juried conference proceedings,” as well
as peer-reviewed journals, and the agency notes it intends to eventually include other
types of NSF-supported grey literature and educational materials under the final
policy”
“No indication of how the kinds of productive reuse (computation, text and data
mining etc.) set out by the White House Directive will be facilitated…”
NSF
Publications
• Working with DOE/OSTI to
create a version of PAGES for
NSF papers (i.e., the paper will
need to be available from the
publisher or IR).
• Voluntary deposit will start
December 2015.
• Need persistent ID and
machine-readable metadata.
• No more than 12 month
embargo.
Data
• All proposals need 2 page
DMP that will be part of merit
review and be monitored.
• Data in appropriate repository,
metadata required.
• Funds available to prepare
data for sharing.
• Exploring data deposit at time
of paper publication.
• No time frame for
preservation
NASA
• Plan
• Started February 2015
• Completed by October 2015
• SPARC Overview http://sparc.arl.org/blog/nasa-public-access-plan-available-uses-
nihs-pmc-platform
“The analysis compared the merits of the NIH PubMed Central (PMC) database, the
DOE’s Public Access Gateway for Energy and Science (PAGES) system, and the
Clearinghouse for the Open Research of the United States (CHORUS) platform
proposed by the publishing industry. Ultimately, NASA opted to work with NIH’s PMC
database.” (SPARC emphasis!)
DMP FAQ ROSES (Research Opportunities in Space and Earth Science)
http://science.nasa.gov/researchers/sara/faqs/dmp-faq-roses/
“First of all, be reassured that we are not going to force you to reveal your precious
proprietary data prior to publication. No personal, proprietary or ITAR data is
included.”
NASA
Publications
• NIH PMC will provide a
NASA‐branded portal to the full
functionality of the PMC system.
• 12 month embargo. Publishers
can petition for longer.
• Publications cited in reports must
be in repository.
Data
• At a minimum, the required DMP for
ROSES must promise to release
the data needed to reproduce
figures, tables and other
representations in publications,
at time of publication or within
reasonable time period.
• Publication should provide link to
data.
• Only the data used to support,
validate, and corroborate
published research findings are
required to be shared, per this
plan. Preliminary data, trial data,
etc. are not included.
• NASA will develop a data catalog
WHAT DOES THIS MEAN FOR YOU?
Public Access to Peer Reviewed Articles
Check Author’s Rights:
• work with the publisher before any publication
rights are transferred to ensure that all conditions of
the …public access policy can be met.
• advised not to sign any agreements with publishers
that do not allow the author to comply with the
…public access initiative.
*SPARC provides an author addendum that will allow
deposit in repositories: http://sparcopen.org/our-
work/author-rights/#addendum
Data Management Plans
• All agencies will require a data
management plan.
• “Not all data need to be shared or
preserved. The costs and benefits of doing
so should be considered in data
management planning.” DOE third principle
http://science.energy.gov/funding-opportunities/digital-data-management/
• DOE and NSF have indicated they will review
and evaluate DMPs
Data Sharing
• Digitally formatted data arising from unclassified, publicly
releasable research and programs.
• Decentralized approach to data storage.
• Allow for inclusion of costs for data management and access.
• Will establish a system to enable the identification, attribution,
(federated) storage, and access of digital data.
From NASA FAQ
• “First of all, be reassured that we are not going to force you to
reveal your precious proprietary data prior to publication. No
personal, proprietary or ITAR data is included.”
http://science.nasa.gov/researchers/sara/faqs/dmp-faq-roses/
BUT
NIH and NSF still have policies that cover more
than digital data.
NSF includes specimens, software, etc.
http://creativecommons.org/licenses/by-nc-nd/3.0/ http://theupturnedmicroscope.com/
Why NIH Wants Data Sharing
Data sharing achieves many important goals for the scientific
community, such as
• reinforcing open scientific inquiry
• encouraging diversity of analysis and opinion,
• promoting new research, testing of new or alternative hypotheses
and methods of analysis
• supporting studies on data collection methods and measurement
• facilitating education of new researchers
• enabling the exploration of topics not envisioned by the initial
investigators
• permitting the creation of new datasets by combining data from
multiple sources.
Why Funders Want Planning and
Public Access
• Make useful data/knowledge available to be developed into
something commercial.
• Show good stewardship of taxpayer/donor funds.
• Contribute to transparency and reproducibility of research.
• Leverages return on research investment
• Creates tool to manage research portfolio
• Avoids funding duplicative research
• Encourages greater interaction with results of funded
research
Benefits to Planning
• Find and understand data when it is needed.
• Less likely to be missing data or notes when it comes time to
process and analyze results.
• All project staff are aware of what they need to do when doing
research and collecting data.
• There is continuity if project staff leave or new researchers join.
• Avoid unnecessary duplication e.g. re-collecting or re-working data.
• Permissions and ownership are understood so there should be no
impediments to publication.
• Data underlying publications are maintained, allowing for validation
of results.
• Data can be found if it is needed for sharing.
• Data is available if needed for Freedom of Information Act (FOIA)
requests or to resolve intellectual property issues.
Benefits to Open or Public Access
• Increases the visibility, readership and impact of
author’s works
• Enhances interdisciplinary research
• Accelerates the pace of research, discovery and
innovation
• Leads to new collaborations between data users and
data creators
• Improves research and leads to better science
• Increases citations*
* A study by Piwowar, Day and Fridsma showed a 69% increase in citation,
http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0000308
Data Citation
• FORCE11 developed a data citation format;
Elsevier/Mendeley are working to ensure datasets have
DOIs and are cited.
https://www.elsevier.com/connect/data-citation-is-
becoming-real-with-force11-and-elsevier
• Data Citation Index available from VCU Libraries, and
other alternative metrics, e.g. downloads in Scholars
Compass, are available for data (and software).
WRITING YOUR DMP
What: Data types, samples, software, other materials.
Describe: Standards, metadata, if applicable. Readme or Data Dictionary.
Share: How will you share/make public your data?
Reuse: What can be done with your data? Licenses can help.
Preserve: How long and where data will be kept.
What: Data types, samples, software, other materials.
Examples of Research Data
• Documents (text, Word), spreadsheets
• Laboratory notebooks, field notebooks, diaries
• Questionnaires, transcripts, codebooks
• Audiotapes, videotapes
• Photographs, films
• Protein or genetic sequences
• Spectra
• Test responses
• Slides, artifacts, specimens, samples
• Collection of digital objects acquired and generated during the process of research
• Database contents (video, audio, text, images)
• Models, algorithms, scripts
• Contents of an application (input, output, logfiles for analysis software, simulation
software, schemas)
• Methodologies and workflows
• Standard operating procedures and protocols
Describe: Standards, metadata, if applicable. Readme or Data Dictionary.
https://twitter.com/DGuarch/status/663049353007931392
Describe – Document Your Data
• Readme File in all Folders
• Metadata
• Data Dictionary
Readme Files
• Names + contact information for people associated with the
project
• List of files, including a description of their relationship to one
another
• Copyright + licensing information
• Limitations of the data
• Funding sources / institutional support
• Any information necessary for someone with no knowledge of
your research to understand and / or replicate your work,
including methods.
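One way to make sure every folder gets a Readme with the items listed above is to generate a skeleton and fill it in. The following is a minimal sketch, not a required format: the section headings paraphrase the bullets above, and the function name and project title are hypothetical.

```python
# Section headings paraphrase the Readme bullets; adjust them to your project.
README_SECTIONS = [
    "People and contact information",
    "File list and how the files relate to one another",
    "Copyright and licensing",
    "Limitations of the data",
    "Funding sources and institutional support",
    "Methods and anything else needed to understand or replicate the work",
]


def build_readme(project_title, sections=README_SECTIONS):
    """Return a plain-text Readme skeleton with one TODO section per heading."""
    lines = [project_title, "=" * len(project_title), ""]
    for heading in sections:
        lines.append(heading)
        lines.append("-" * len(heading))
        lines.append("TODO")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical project title, for illustration only.
    print(build_readme("Example project"))
```

Saving the returned text as README.txt in each data folder gives every folder the same starting structure.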
Metadata
• Machine readable version of Readme
• Descriptive – describes object in question,
whole dataset and each element of the set
• Administrative – preservation, IP rights
• Structural – physical and logical structure of
digital object
• Metadata Standards Directory
http://rd-alliance.github.io/metadata-directory/
http://datadryad.org/resource/doi:10.5061/dryad.jg05d
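To illustrate what "machine readable" can mean here, the sketch below groups a record into the three kinds of metadata named above and serializes it as JSON. The field names, file names, and values are assumptions made for illustration; a real project should follow a schema from the Metadata Standards Directory.

```python
import json


def build_metadata(title, creators, license_name, files):
    """Group metadata into descriptive, administrative, and structural parts."""
    return {
        # Descriptive: what the dataset is and who made it.
        "descriptive": {"title": title, "creators": creators},
        # Administrative: rights and preservation information.
        "administrative": {"license": license_name},
        # Structural: which files make up the dataset and their formats.
        "structural": {"files": [{"name": n, "format": f} for n, f in files]},
    }


# Hypothetical example values.
record = build_metadata(
    title="Example dataset",
    creators=["A. Researcher"],
    license_name="CC BY 4.0",
    files=[("readings.csv", "text/csv"), ("README.txt", "text/plain")],
)
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```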
Data Dictionary
• Define terms used
• If measurements are made, gives units and
explains exactly how measured or calculated
• How item is recorded, especially when there
are multiple options, e.g. date
https://docs.google.com/spreadsheets/d/1PYOhBh6bglh6BkQFlpvNLOwlpzvQyguWAG8AkQMtU0s/edit#gid=0
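A data dictionary can be as simple as a small table kept next to the data. The sketch below is one possible layout, with hypothetical variable names: each entry defines a term, gives its unit, and records the chosen format, including one fixed date format to avoid the "multiple options" problem mentioned above.

```python
import csv
import io
from datetime import datetime

# Hypothetical variables, for illustration only.
DATA_DICTIONARY = [
    {"variable": "visit_date", "definition": "Date of clinic visit",
     "unit": "", "format": "YYYY-MM-DD"},
    {"variable": "weight", "definition": "Body weight, measured on a calibrated scale",
     "unit": "kg", "format": "decimal"},
]


def dictionary_as_csv(entries):
    """Serialize the data dictionary as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["variable", "definition", "unit", "format"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def is_year_month_day(value):
    """Return True when the value parses as a year-month-day date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


csv_text = dictionary_as_csv(DATA_DICTIONARY)
print(csv_text)
```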
Share: How will you share/make public your data?
Data Types to Share
• NIH - Final Research Data - Recorded factual material
commonly accepted in the scientific community as necessary
to document and support research findings. (spreadsheets,
images, scans of written notes if applicable, etc.)
• OSTP - Digitally formatted data arising from unclassified,
publicly releasable research and programs.
• NSF – all types, documents, videos, software, etc.
• Should also mention who is responsible for data, e.g. PI or
somebody else in group.
Sharing Exclusions
• preliminary analyses,
• drafts of scientific papers,
• plans for future research,
• peer reviews,
• communications with colleagues,
• Trade secrets, commercial information, materials necessary
to be held confidential by a researcher until they are
published, or similar information which is protected under
law,
• Personnel and medical information and similar information
that could be used to identify a particular person in a
research study.
Why Share?
• Helps to avoid duplication, thereby reducing costs and wasted
effort.
• Promotes scientific integrity and debate. See Collins and Tabak
article in Science on NIH plans to enhance reproducibility.
• Enables scrutiny of research findings and allows for validation of
results.
• Leads to new collaborations between data users and data creators.
• Improves research and leads to better science.
• Enables the exploration of topics not envisioned by the initial
investigators.
• Permits the creation of new datasets by combining data from
multiple sources.
• Increases citations. A study by Piwowar, Day and Fridsma showed a
69% increase in citations.
http://scholarscompass.vcu.edu/
Sharing
• VCU Libraries provides storage
and sharing of data,
publications, and other
materials.
• Persistent URL and Google
indexing will make your work
easily available.
• You must have copyright for
any submission.
Other Ways to Share Data
Upload to open repository; general, subject, or
institutional.
• figshare http://figshare.com/
• Zenodo https://zenodo.org/
• Open Science Framework https://osf.io/
• DataVerse http://dataverse.org/
• Search Registry of Research Data Repositories
http://www.re3data.org/
Supplemental file with journal article or link to
the upload.
– Be sure to check the contract.
– Will the data be available to the public as per
OSTP if grant funded?
– Will the rights conflict with institutional ownership
of the data?
– Journals often use Figshare or Dryad
http://datadryad.org/
Sharing Sensitive Data
http://iom.nationalacademies.org/Reports/2015/Sharing-Clinical-Trial-Data.aspx
Sensitive Data Access
• Researchers must request access to database,
explaining research and providing IRB
approval forms, e.g. registry
or
• Data must be deidentified or anonymized in
some way before being made publicly
available.
http://transparency.efpia.eu/responsible-data-sharing/efpia-clinical-trial-data-portal-gateway
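As a rough illustration of the second option, the sketch below drops direct identifiers from a record and replaces the participant ID with a keyed hash. This is pseudonymization only and is an assumption about one possible first step; it does not by itself satisfy any regulatory de-identification standard, and the column names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical names of columns that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(record, secret_key, id_field="participant_id"):
    """Drop direct identifiers and replace the ID with a keyed hash.

    secret_key must be bytes and must be stored separately from the data.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    message = str(record[id_field]).encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    cleaned[id_field] = digest[:12]  # shortened for readability
    return cleaned


if __name__ == "__main__":
    row = {"participant_id": 17, "name": "X", "email": "x@example.org", "weight": 70}
    print(pseudonymize(row, b"replace-with-a-real-secret"))
```

Indirect identifiers (dates, rare conditions, small geographic areas) still need review before any public release.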
Reuse: What can be done with your data? Licenses can help.
Public vs Open Access
Public
• free of cost to read
• not free to use or reuse
• usually not final version
• often embargoed
• journal generally owns
copyright
Open
• free of cost to read
• free to use or reuse, no
copyright or licensing
restrictions
• no embargoes
• author retains copyright
• see Peter Suber for
more information
http://legacy.earlham.edu/~peters/fos/overview.htm
Reuse – License Your Data
• Creative Commons licenses
https://creativecommons.org/licenses/
or use license chooser
https://creativecommons.org/choose/
• Open Data Commons
http://opendatacommons.org/
• Panton Principles
http://pantonprinciples.org/
Preserve: How long and where data will be kept.
Preserve
• How long must the data be kept?
– Minimum 5 years after publication or final grant
report.
– Check grant.
• What is the long-term value of the data?
– If it will be in a subject repository, you can say
indefinitely.
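The five-year minimum above is simple date arithmetic, sketched below. The function name is hypothetical, and the five-year default is only the minimum stated on this slide; the grant terms may require longer.

```python
from datetime import date


def earliest_disposal_date(end_event, years=5):
    """Return the date `years` after publication or the final grant report."""
    try:
        return end_event.replace(year=end_event.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March.
        return end_event.replace(year=end_event.year + years, month=3, day=1)


if __name__ == "__main__":
    print(earliest_disposal_date(date(2016, 2, 15)))
```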
Storage vs Backup
storage = working files
The files you access regularly and change frequently. In
general, losing your storage means losing current
versions of the data.
backup = regular process of copying data separate from
storage.
You don’t really need it until you lose data, but when
you need to restore a file it will be the most important
process you have in place.
Rule of 3
Keep THREE copies of your data –
TWO onsite –
ONE offsite
Example – One: Laptop – Two: External hard drive –
Three: Cloud storage
This ensures that your storage and backup are not all in
the same place – that’s too risky!
http://dataabinitio.com/?p=320
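Copies are only useful if they are intact. The sketch below shows one way to check that backup copies still match the working file by comparing SHA-256 checksums. The function names are hypothetical, and the check assumes all copies are reachable as local paths (for example, a mounted external drive and a synced cloud folder).

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copies):
    """Return True when there are at least two copies besides the original
    (three in total) and every copy has the same checksum as the original."""
    expected = sha256_of(original)
    return len(copies) >= 2 and all(sha256_of(c) == expected for c in copies)
```

Running a check like this on a schedule catches silent corruption or a stalled sync before the backup is actually needed.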
Where to Preserve Data
• Google Drive (faculty and staff only)
http://guides.library.vcu.edu/data/GoogleDrive
• Subject Repository, e.g. ICPSR
• Scholars Compass
• Government Repository, e.g. NCBI
Not every repository will preserve data for the
time period required by grants. Check the
contract.
Don’t Forget Print
• Set a schedule to scan lab notebooks and other print
materials (makes for a good back up and easier to share
data within group).
• Print original should have similar security to digital data (i.e.
good, secure storage and labelling of files).
Stored Example
Final MS for deposit
Data to support figure and images
ARE YOU DONE YET?
NSF Current Guidance
1. the types of data, samples, physical collections, software, curriculum
materials, and other materials to be produced in the course of the
project;
2. the standards to be used for data and metadata format and content
(where existing standards are absent or deemed inadequate, this should
be documented along with any proposed solutions or remedies);
3. policies for access and sharing including provisions for appropriate
protection of privacy, confidentiality, security, intellectual property, or
other rights or requirements;
4. policies and provisions for re-use, re-distribution, and the production of
derivatives; and
5. plans for archiving data, samples, and other research products, and for
preservation of access to them
http://www.nsf.gov/pubs/policydocs/pappguide/nsf15001/gpg_2.jsp#IIC2j
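Before submitting, a draft can be checked against the five elements above. The sketch below is a hypothetical self-check, not an NSF tool: the key names are shorthand invented here for the five numbered items.

```python
# Shorthand keys for the five numbered NSF elements (names are assumptions).
NSF_DMP_ELEMENTS = [
    "types_of_data",
    "standards_and_metadata",
    "access_and_sharing_policies",
    "reuse_and_redistribution",
    "archiving_and_preservation",
]


def missing_elements(plan):
    """Return the elements that are absent or blank in a draft plan."""
    return [key for key in NSF_DMP_ELEMENTS if not plan.get(key, "").strip()]


if __name__ == "__main__":
    draft = {
        "types_of_data": "Survey responses and interview transcripts",
        "standards_and_metadata": "",
    }
    print(missing_elements(draft))
```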
DMPTool
https://dmptool.org/
Help is Available
• Guides
– Research Data Management http://guides.library.vcu.edu/data
– DMPTool guide http://guides.library.vcu.edu/dmptool
– Comply with Public Access Mandates
http://guides.library.vcu.edu/publicaccess
• Consultations
• Training
• Contact me:
Margaret Henderson, MLIS, AHIP
Associate Professor
Director, Research Data Management
VCU Libraries
(804)628-2714
mehenderson@vcu.edu
https://www.flickr.com/photos/travelinlibrarian/223839049 by Michael Sauers

How to Comply with Grants: Writing Data Management Plans and Providing Public Access

  • 1. How to Comply with Grants: Writing Data Management Plans and Providing Public Access Margaret Henderson Director, Research Data Management mehenderson@vcu.edu February 2016
  • 3. Research Data: Recorded information, regardless of form or the media on which it may be recorded, which constitute the original observations and methods of a study and the analyses of these original data that are necessary for reconstruction and evaluation of the Report(s) of a study made by one or more Investigators. Research Data also includes all such recorded information gathered in anticipation of a Report. Research Data differ among disciplines. The term may include but is not limited to technical information, computer software, laboratory and other notebooks, printouts, worksheets, other media, survey, memoranda, evaluations, notes, databases, clinical case history records, study protocols, statistics, findings, conclusions, samples, physical collections, other supporting materials created or gathered in the course of the Research, Tangible Research Property, unique Research resources such as synthetic compounds, organisms, cell lines, viruses, cell products, cloned DNA as well as genetic sequences and mapping information, crystallographic coordinates, plants, animals and spectroscopic data, and other compilations formed by selecting and assembling preexisting materials in a unique way. The term does not include information incidental to research administration such as financial, administrative, cost or pricing, or management information. http://www.policy.vcu.edu/sites/default/files/Research%20Data%20Ownership%2C%20Retention%2C%20Access%20and%20Securty.pdf
  • 4. While VCU Owns the Data.. Principal Investigator is the Data Steward and is responsible for the integrity, preservation and security of Research Data.
  • 5. Data Management Plan Outlines how a researcher will: • collect • organize • back up • store • share the data for a project, and indicates who the data steward will be.
  • 10. NIH Public Access Policy SEC. 218. The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine’s PubMed Central an electronic version of their final peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication: Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law. https://publicaccess.nih.gov/
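As an illustration of the 12-month window in the policy text above, here is a minimal Python sketch. The function name latest_public_date and the rule of clamping to the last day of a shorter month are assumptions made for this example, not part of the NIH policy; the official date of publication is whatever the journal reports.

```python
import calendar
from datetime import date


def latest_public_date(published: date, embargo_months: int = 12) -> date:
    """Return the last date on which the manuscript may become public.

    Hypothetical helper: adds whole calendar months to the publication
    date and clamps the day when the target month is shorter.
    """
    month_index = published.month - 1 + embargo_months
    year = published.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(published.day, last_day))


if __name__ == "__main__":
    print(latest_public_date(date(2016, 2, 15)))
```

A shorter embargo, where a funder or publisher allows one, can be tried by passing a different embargo_months value.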
  • 11. NIH Data Sharing Policy “Data should be made as widely and freely available as possible while safeguarding the privacy of participants, and protecting confidential and proprietary data. To facilitate data sharing, investigators submitting a research application requesting $500,000 or more of direct costs in any single year to NIH on or after October 1, 2003 are expected to include a plan for sharing final research data for research purposes, or state why data sharing is not possible.” http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm
  • 12. NIH Genomic Data Sharing Policy “Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-Wide Association Studies (GWAS)” (effective January 2015) • “For the purposes of this policy, a genome-wide association study is defined as any study of genetic variation across the entire human genome that is designed to identify genetic associations with observable traits (such as blood pressure or weight), or the presence or absence of a disease or condition.” • Applies to all NIH-funded research that generates large-scale human or non-human genomic data, as well as the use of those data for subsequent research. • Requires “Genomic Data Sharing Plan”. • Allows for expenses in project budget. • Requires public availability of data in a “timely manner.” • Recommends NIH-funded or third-party repositories for deposition. http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html
  • 14. NSF Policies NSF Data Sharing Policy Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. See Award & Administration Guide (AAG) Chapter VI.D.4. http://www.nsf.gov/bfa/dias/policy/dmp.jsp NSF Data Management Plan Requirements Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled “Data Management Plan”. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. See Grant Proposal Guide (GPG) Chapter II.C.2.j for full policy implementation. https://www.nsf.gov/eng/general/dmp.jsp Slide courtesy of Amanda Whitmire
  • 15. OSTP Memorandum Increasing Access to the Results of Federally Funded Scientific Research (February 22, 2013) “ensuring that, … the direct results of federally funded scientific research are made available to and useful for the public, industry, and the scientific community. Such results include peer-reviewed publications and digital data.” “develop plans to make the results of federally-funded research publicly available free of charge within 12 months after original publication.” https://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
  • 16. THE PLANS SO FAR http://guides.library.vcu.edu/publicaccess
  • 17. Department of Health and Human Services Guiding Principles and Common Approach for Enhancing Public Access to the Results of Research Funded by HHS Operating Divisions (February 2015) Provides continuity across Operating Divisions: NIH, CDC, FDA, AHRQ, and ASPR (voluntary) • PubMed Central will serve as the repository for publications funded by HHS Operating Divisions • HHS will enable stakeholders to petition to shorten the 12-month maximum embargo period for publications • HHS will take a common, stepped approach to establishing data policy and infrastructure: assessment, inventory, DMPs, pilots, training
  • 18. NIH – National Institutes of Health Plan for Increasing Access to Scientific Publications and Digital Scientific Data from NIH Funded Scientific Research (February 2015) • In effect for publications; policy changes to support data to be implemented by December 2015 • Applies to research/researchers funded wholly or in part by the NIH
  • 19. NIH Publications • Peer-reviewed scientific articles • Deposit of final peer-reviewed manuscript into PMC • Upon acceptance, with maximum 12-month embargo • Include appropriate costs in proposals • Reporting through eRA Commons and My NCBI • Withholding of funds Data • Unclassified digital scientific research data • Submission of DMP; deposit of data into appropriate, existing, publicly accessible repositories, including NIH data repositories • Upon acceptance for publication (will explore) • Include appropriate costs in proposals • Utilize existing reporting structures • “enforcement actions” including withholding of funds
  • 20. CDC – Centers for Disease Control and Prevention CDC Plan for Increasing Access to Scientific Publications and Digital Scientific Data Generated with CDC Funding (January 2015) • In effect for publications; October 2015 (FY 2016 funding cycle) for data • Applies to research/researchers funded by the CDC (intra- and extramural) • CDC Stacks http://stacks.cdc.gov/ for public health information
  • 21. CDC Publications • Peer-reviewed research publications • Deposit of final peer-reviewed manuscript into NIHMS (CDC Stacks, PMC) • Upon acceptance, with maximum 12-month embargo • eClearance scientific clearance system; existing reporting structures (Office of Science Quality) Data • Unclassified digital scientific research data • Submission of DMP (generic CDC template); deposit of data into suitable platform (tbd) • Upon acceptance for publication or within 30 months of collection • Include appropriate costs in proposals • Utilize existing reporting structures • Reduction or restriction of funds, award termination, negative influence on future awards
  • 22. FDA – Food and Drug Administration Plan to Increase Access to Results of FDA-Funded Scientific Research (February 2015) • Effective October 2015 • Applies to research/researchers funded wholly or in part by the FDA (intra- and extramural) https://open.fda.gov/ open-source APIs (application program interface)
  • 23. FDA Publications • Peer-reviewed scientific articles • Deposit of final peer-reviewed manuscript into PMC • With maximum 12-month embargo* • Include appropriate costs in proposals • Utilize existing reporting structures • Termination of contract or grant; withholding of funds Data • Digitally formatted scientific data resulting from unclassified research • Submission of DMP; deposit of data into discipline-specific repositories • Upon acceptance for publication • Include appropriate costs in proposals • Utilize existing reporting structures • Termination of contract or grant; withholding of funds
  • 24. AHRQ – Agency for Healthcare Research and Quality AHRQ Public Access to Federally Funded Research (February 2015) • Effective February 2015 for publications; October 2015 for data • Applies to all research/researchers funded wholly or in part by AHRQ (Intramural, extramural, or contract researchers)
  • 25. AHRQ Publications • Peer-reviewed scholarly research articles • Deposit of final peer-reviewed manuscript into PMC • Maximum 12-month embargo* • Include appropriate costs in proposals and applications • Utilize existing reporting structures (NIH) • Withholding of funding Data • Unclassified research dataˡ • Submission of DMP; deposit of data to AHRQ▪ or other repository • Upon acceptance for publication • Include appropriate costs in proposals and applications • Utilize existing reporting structures • Negative influence on future funding
  • 26. ASPR – Assistant Secretary for Preparedness and Response Public Access to Federally Funded Research: Publications and Data (February 2015) • Effective October 2015 • Applies to Researchers funded wholly or in part by ASPR
  • 27. ASPR Publications • Peer-reviewed scholarly research articles • Deposit of final peer-reviewed manuscript into PMC • Maximum 12-month embargo* • Appropriate costs included in proposals and applications • Existing reporting structures (NIH) • Withholding of funding Data • Digital scientific dataˡ • Submission of DMP; deposit of data to recognized scientific repository • 30 months from creation of data set, or upon publication • Appropriate costs included in proposals and applications • Staff and peer review; existing reporting structures • Negative influence on future funding
  • 28. DOD • Plan • SPARC overview http://sparc.arl.org/blog/dod-releases-draft-public-access-plan • Starting FY 2015 • Completed 4th quarter FY 2016 • Allows for inclusion of costs in proposals • Requirements: DMPs cover data sharing and data are available before making subsequent awards • Plans for compliance monitoring and certification tokens • Encourages authors to negotiate their copyright for papers DTIC (Defense Technical Information Center): repository for full-text of peer-reviewed author final manuscripts or publisher versions of research articles and repository for metadata of digital scientific data sets.
  • 29. DOD Publications • Peer-reviewed scholarly publications arising from unclassified, publicly releasable research and programs. • Deposit peer-reviewed scholarly publications into DOD public access archive system (DTIC) • DOD will establish a system to enable the submission of final, peer-reviewed manuscripts. • minimum 12 month embargo Data • Digitally formatted data arising from unclassified, publicly releasable research and programs. • Decentralized approach to data storage. • Require the submission of data management plans. • Allow for inclusion of costs for data management and access. • Will establish a system to enable the identification, attribution, (federated) storage, and access of digital data.
  • 30. DOE • Public Access Plan • Already started for Office of Science grants • PAGES - a web-based portal that will provide free public access to accepted peer-reviewed manuscripts or published scientific journal articles within 12 months of publication • Data management plan requirements and guidance • “unclassified and unrestricted” • “Not all data need to be shared or preserved. The costs and benefits of doing so should be considered in data management planning.”
  • 31. DOE Publications • Classified or protected data and research will not be publicly available. • Public Access Gateway for Energy and Science (PAGES) maintained by Office of Scientific and Technical Information (OSTI) • Deposit abstract and metadata and link to Version of Record (publisher pdf, final MS in repository, or DOE dark archive) • within 12 months of publication Data • Office of Science started DMP requirement in July 2014 (supports ⅔ of R&D) • Other offices start Oct. 1, 2015 • DMP required and merits evaluated. • Covers unclassified and unrestricted digital research data, i.e. digital data required to validate findings. • Enterprise Data Inventory -> Public Data Listing -> populates data.gov
  • 32. NSF Plan – Starting January 2016 Public Access to Results of NSF-funded Research overview SPARC Comments http://sparc.arl.org/blog/nsf-releases-incremental-plan-for-public-access “deposit final accepted manuscripts (or published articles) into the Department of Energy’s “PAGES” repository – a dark archive – with public access to be provided via links to publisher’s websites.” “NSF plan will extend to papers published in “juried conference proceedings,” as well as peer-reviewed journals, and the agency notes it intends to eventually include other types of NSF-supported grey literature and educational materials under the final policy” “No indication of how the kinds of productive reuse (computation, text and data mining etc.) set out by the White House Directive will be facilitated…”
  • 33. NSF Publications • Working with DOE/OSTI to create a version of PAGES for NSF papers (i.e., the paper will need to be available from the publisher or an IR). • Voluntary deposit will start December 2015. • Need persistent ID and machine-readable metadata. • No more than 12-month embargo. Data • All proposals need a 2-page DMP that will be part of merit review and be monitored. • Data in appropriate repository, metadata required. • Funds available to prepare data for sharing. • Exploring data deposit at time of paper publication. • No time frame for preservation.
  • 34. NASA • Plan • Started February 2015 • Completed by October 2015 • SPARC Overview http://sparc.arl.org/blog/nasa-public-access-plan-available-uses-nihs-pmc-platform “The analysis compared the merits of the NIH PubMed Central (PMC) database, the DOE’s Public Access Gateway for Energy and Science (PAGES) system, and the Clearinghouse for the Open Research of the United States (CHORUS) platform proposed by the publishing industry. Ultimately, NASA opted to work with NIH’s PMC database.” (SPARC emphasis!) DMP FAQ ROSES (Research Opportunities in Space and Earth Science) http://science.nasa.gov/researchers/sara/faqs/dmp-faq-roses/ “First of all, be reassured that we are not going to force you to reveal your precious proprietary data prior to publication. No personal, proprietary or ITAR data is included.”
  • 35. NASA Publications • NIH PMC will provide a NASA‐branded portal to the full functionality of the PMC system. • 12 month embargo. Publishers can petition for longer. • Publications cited in reports must be in repository. Data • At a minimum required DMP for ROSES must promise to release the data needed to reproduce figures, tables and other representations in publications, at time of publication or within reasonable time period. • Publication should provide link to data. • Only the data used to support, validate, and corroborate published research findings are required to be shared, per this plan. Preliminary data, trial data, etc. are not included. • NASA will develop a data catalog
  • 36. WHAT DOES THIS MEAN FOR YOU?
  • 37. Public Access to Peer Reviewed Articles Check Author’s Rights: • work with the publisher before any publication rights are transferred to ensure that all conditions of the …public access policy can be met. • advised not to sign any agreements with publishers that do not allow the author to comply with the …public access initiative. *SPARC provides an author addendum that will allow deposit in repositories: http://sparcopen.org/our-work/author-rights/#addendum
  • 38. Data Management Plans • All agencies will require a data management plan. • “Not all data need to be shared or preserved. The costs and benefits of doing so should be considered in data management planning.” DOE third principle http://science.energy.gov/funding-opportunities/digital-data-management/ • DOE and NSF have indicated they will review and evaluate DMPs
  • 39. Data Sharing •Digitally formatted data arising from unclassified, publicly releasable research and programs. •Decentralized approach to data storage. •Allow for inclusion of costs for data management and access. •Will establish a system to enable the identification, attribution, (federated) storage, and access of digital data. From NASA FAQ •“First of all, be reassured that we are not going to force you to reveal your precious proprietary data prior to publication. No personal, proprietary or ITAR data is included.” http://science.nasa.gov/researchers/sara/faqs/dmp-faq-roses/
  • 40. BUT NIH and NSF still have policies that cover more than digital data. NSF includes specimens, software, etc.
  • 42. Why NIH Wants Data Sharing Data sharing achieves many important goals for the scientific community, such as • reinforcing open scientific inquiry • encouraging diversity of analysis and opinion, • promoting new research, testing of new or alternative hypotheses and methods of analysis • supporting studies on data collection methods and measurement • facilitating education of new researchers • enabling the exploration of topics not envisioned by the initial investigators • permitting the creation of new datasets by combining data from multiple sources.
  • 43. Why Funders Want Planning and Public Access • Make useful data/knowledge available to be developed into something commercial. • Show good stewardship of taxpayer/donor funds. • Contribute to transparency and reproducibility of research. • Leverage return on research investment. • Create a tool to manage the research portfolio. • Avoid funding duplicative research. • Encourage greater interaction with results of funded research.
  • 44. Benefits to Planning • Find and understand data when it is needed. • Less likely to be missing data or notes when it comes time to process and analyze results. • All project staff are aware of what they need to do when doing research and collecting data. • There is continuity if project staff leave or new researchers join. • Avoid unnecessary duplication e.g. re-collecting or re-working data. • Permissions and ownership are understood so there should be no impediments to publication. • Data underlying publications are maintained, allowing for validation of results. • Data can be found if it is needed for sharing. • Data is available if needed for Freedom of Information Act (FOIA) requests or to resolve intellectual property issues.
  • 45. Benefits to Open or Public Access • Increases the visibility, readership and impact of author’s works • Enhances interdisciplinary research • Accelerates the pace of research, discovery and innovation • Leads to new collaborations between data users and data creators • Improves research and leads to better science • Increases citations* * A study by Piwowar, Day and Fridsma showed a 69% increase in citation, http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0000308
  • 46. Data Citation • Force 11 developed a data citation format; Elsevier/Mendeley are working to ensure datasets have DOIs and are cited. https://www.elsevier.com/connect/data-citation-is-becoming-real-with-force11-and-elsevier • Data Citation Index available from VCU Libraries, and other alternative metrics, e.g. downloads in Scholars Compass, are available for data (and software).
  • 48. What: Data types, samples, software, other materials. Describe: Standards, metadata, if applicable; Readme or Data Dictionary. Share: How will you share/make public your data? Reuse: What can be done with your data? Licenses can help. Preserve: How long and where data will be kept.
  • 49. What Data types, samples, software, other materials.
  • 50. Examples of Research Data • Documents (text, Word), spreadsheets • Laboratory notebooks, field notebooks, diaries • Questionnaires, transcripts, codebooks • Audiotapes, videotapes • Photographs, films • Protein or genetic sequences • Spectra • Test responses • Slides, artifacts, specimens, samples • Collection of digital objects acquired and generated during the process of research • Database contents (video, audio, text, images) • Models, algorithms, scripts • Contents of an application (input, output, logfiles for analysis software, simulation software, schemas) • Methodologies and workflows • Standard operating procedures and protocols
  • 51. Describe Standards, metadata, if applicable. Readme or Data Dictionary
  • 53. Describe – Document Your Data • Readme File in all Folders • Metadata • Data Dictionary
  • 54. Readme Files • Names + contact information for people associated with the project • List of files, including a description of their relationship to one another • Copyright + licensing information • Limitations of the data • Funding sources / institutional support • Any information necessary for someone with no knowledge of your research to understand and / or replicate your work, including methods.
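The Readme elements listed above could be assembled into a plain-text file with a short script. This is a minimal sketch only: the function build_readme, its parameter names, and the section labels are hypothetical choices for illustration, not a required or standard Readme layout.

```python
def build_readme(project, contacts, files, license_text,
                 limitations, funding, methods):
    """Assemble a plain-text Readme from the elements on the slide.

    contacts: list of (name, email) pairs
    files: dict mapping file name to a short description
    All section labels are illustrative assumptions.
    """
    lines = [f"Project: {project}", "", "Contacts:"]
    lines += [f"  - {name} <{email}>" for name, email in contacts]
    lines += ["", "Files:"]
    lines += [f"  - {fname}: {desc}" for fname, desc in files.items()]
    lines += [
        "",
        f"Copyright and license: {license_text}",
        f"Limitations: {limitations}",
        f"Funding: {funding}",
        "",
        "Methods:",
        methods,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_readme(
        project="Example study",
        contacts=[("A. Researcher", "researcher@example.org")],
        files={"raw.csv": "unprocessed measurements",
               "clean.csv": "derived from raw.csv"},
        license_text="CC BY 4.0",
        limitations="Small sample",
        funding="Example funder, award number not shown",
        methods="Describe collection and processing steps here.",
    )
    print(text)
```

Writing the returned text to a file named README.txt in each data folder would match the "Readme File in all Folders" suggestion.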
  • 55. Metadata • Machine readable version of Readme • Descriptive – describes object in question, whole dataset and each element of the set • Administrative – preservation, IP rights • Structural – physical and logical structure of digital object • Metadata Standards Directory http://rd-alliance.github.io/metadata-directory/
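To make the three metadata types above concrete, here is a minimal sketch of a machine-readable record written as JSON. The function make_metadata and every field name in it are assumptions for illustration; they do not follow any particular metadata standard, and a real project should pick a standard for its discipline from the directory linked above.

```python
import json


def make_metadata(title, creators, description,
                  license_name, retention_years, files):
    """Build a JSON metadata record grouped by metadata type.

    The grouping mirrors the slide; the field names are hypothetical.
    """
    record = {
        # Descriptive: what the dataset is
        "descriptive": {
            "title": title,
            "creators": creators,
            "description": description,
        },
        # Administrative: preservation and IP rights
        "administrative": {
            "license": license_name,
            "retention_years": retention_years,
        },
        # Structural: how the digital object is laid out
        "structural": {"files": files},
    }
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(make_metadata(
        title="Example dataset",
        creators=["A. Researcher"],
        description="Illustrative record only",
        license_name="CC BY 4.0",
        retention_years=5,
        files=["raw.csv", "clean.csv"],
    ))
```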
  • 60. Data Dictionary • Define terms used • If measurements are made, gives units and explains exactly how measured or calculated • How item is recorded, especially when there are multiple options, e.g. date
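A data dictionary covering the points above can be kept as a small CSV file next to the data. The sketch below is one possible layout: the column names in FIELDS and the function write_data_dictionary are hypothetical, chosen only to match the bullets on the slide.

```python
import csv
import io

# Hypothetical column set: term, definition, units, method, recording format
FIELDS = ["variable", "definition", "units", "how_measured", "recorded_format"]


def write_data_dictionary(entries):
    """Return CSV text with one row per variable in the dataset."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


if __name__ == "__main__":
    print(write_data_dictionary([
        {
            "variable": "visit_date",
            "definition": "Date of clinic visit",
            "units": "",
            "how_measured": "Entered by study staff",
            "recorded_format": "YYYY-MM-DD",
        },
        {
            "variable": "weight",
            "definition": "Body weight at visit",
            "units": "kg",
            "how_measured": "Calibrated scale",
            "recorded_format": "decimal number",
        },
    ]))
```

Stating the date format explicitly, as in the first row, addresses the slide's point about items that can be recorded in multiple ways.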
  • 62. Share How will you share/make public your data?
  • 63. Data Types to Share • NIH - Final Research Data - Recorded factual material commonly accepted in the scientific community as necessary to document and support research findings. (spreadsheets, images, scans of written notes if applicable, etc.) • OSTP - Digitally formatted data arising from unclassified, publicly releasable research and programs. • NSF – all types, documents, videos, software, etc. • Should also mention who is responsible for data, e.g. PI or somebody else in group.
  • 64. Sharing Exclusions • preliminary analyses, • drafts of scientific papers, • plans for future research, • peer reviews, • communications with colleagues, • Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law, • Personnel and medical information and similar information that could be used to identify a particular person in a research study.
  • 65. Why Share? • Helps to avoid duplication, thereby reducing costs and wasted effort. • Promotes scientific integrity and debate. See Collins and Tabak article in Science on NIH plans to enhance reproducibility. • Enables scrutiny of research findings and allows for validation of results. • Leads to new collaborations between data users and data creators. • Improves research and leads to better science. • Enables the exploration of topics not envisioned by the initial investigators. • Permits the creation of new datasets by combining data from multiple sources. • Increases citations. A study by Piwowar, Day and Fridsma showed a 69% increase in citations.
  • 66. http://scholarscompass.vcu.edu/ Sharing • VCU Libraries provides storage and sharing of data, publications, and other materials. • Persistent URL and Google indexing will make your work easily available. • You must have copyright for any submission.
  • 67. Other Ways to Share Data Upload to open repository; general, subject, or institutional. • figshare http://figshare.com/ • Zenodo https://zenodo.org/ • Open Science Framework https://osf.io/ • DataVerse http://dataverse.org/ • Search Registry of Research Data Repositories http://www.re3data.org/
  • 68. Supplemental file with journal article or link to the upload. – Be sure to check the contract. – Will the data be available to the public as per OSTP if grant funded? – Will the rights conflict with institutional ownership of the data? – Journals often use Figshare or Dryad http://datadryad.org/
  • 70. Sensitive Data Access • Researchers must request access to database, explaining research and providing IRB approval forms, e.g. registry or • Data must be deidentified or anonymized in some way before being made publicly available.
  • 72. Reuse What can be done with your data? Licenses can help.
  • 73. Public vs Open Access Public: • free of cost to read • not free to use or reuse • usually not final version • often embargoed • journal generally owns copyright Open: • free of cost to read • free to use or reuse, no copyright or licensing restrictions • no embargoes • author retains copyright • see Peter Suber for more information http://legacy.earlham.edu/~peters/fos/overview.htm
  • 74. Reuse – License Your Data • Creative Commons licenses https://creativecommons.org/licenses/ or use license chooser https://creativecommons.org/choose/ • Open Data Commons http://opendatacommons.org/ • Panton Principles http://pantonprinciples.org/
  • 75. Preserve How long and where data will be kept.
  • 76. Preserve • How long must the data be kept? – Minimum 5 years after publication or final grant report. – Check grant. • What is the long-term value of the data? – If it will be in a subject repository, you can say indefinitely.
  • 77. Storage vs Backup storage = working files The files you access regularly and change frequently. In general, losing your storage means losing current versions of the data. backup = regular process of copying data separate from storage. You don’t really need it until you lose data, but when you need to restore a file it will be the most important process you have in place.
  • 78. Rule of 3 Keep THREE copies of your data – TWO onsite – ONE offsite Example – One: Laptop – Two: External hard drive – Three: Cloud storage This ensures that your storage and backups are not all in the same place – that’s too risky! http://dataabinitio.com/?p=320
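The Rule of 3 can be partly automated. The sketch below is a minimal example, assuming the second and third copies live in folders reachable from the working machine (for instance a mounted external drive and a locally synced cloud folder). The function names and any paths passed in are hypothetical. It copies a working file to each backup folder and confirms each copy with a SHA-256 checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy the working file to each backup folder and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Calling `backup_and_verify(Path("data.csv"), [external_drive_folder, cloud_sync_folder])` with two folders of your own gives the working copy plus two verified backups. Whether the third copy is truly offsite depends on where the cloud folder is actually stored.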
  • 79. Where to Preserve Data • Google Drive (faculty and staff only) http://guides.library.vcu.edu/data/GoogleDrive • Subject Repository, e.g. ICPSR • Scholars Compass • Government Repository, e.g. NCBI Not every repository will preserve data for the time period required by grants. Check the contract.
  • 80. Don’t Forget Print • Set a schedule to scan lab notebooks and other print materials (this makes for a good backup and makes it easier to share data within the group). • Print originals should have security similar to that of digital data (i.e. good, secure storage and labelling of files).
  • 81. Stored Example Final MS for deposit Data to support figure and images
  • 82. ARE YOU DONE YET?
  • 83. NSF Current Guidance 1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project; 2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies); 3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements; 4. policies and provisions for re-use, re-distribution, and the production of derivatives; and 5. plans for archiving data, samples, and other research products, and for preservation of access to them http://www.nsf.gov/pubs/policydocs/pappguide/nsf15001/gpg_2.jsp#IIC2j
  • 85. Help is Available • Guides – Research Data Management http://guides.library.vcu.edu/data – DMPTool guide http://guides.library.vcu.edu/dmptool – Comply with Public Access Mandates http://guides.library.vcu.edu/publicaccess • Consultations • Training • Contact me: Margaret Henderson, MLIS, AHIP Associate Professor Director, Research Data Management VCU Libraries (804)628-2714 mehenderson@vcu.edu

Editor's Notes

  • #11: Recommended in 2004 by the House Appropriations Committee, which proposed a 6-month embargo; voluntary in 2005 (per Peter Suber); mandatory requirement in 2008; funding withheld in 2013.
  • #12: This policy took effect earlier (2003) than public access, but it is limited to larger grants, so it isn’t as well known. The policy requires a Data Sharing Plan to describe how final research data will be shared, or to explain why data sharing is not possible. It applies to any projects funded by NIH over 500K since 2003.
  • #13: As data sharing becomes the norm, there will be more and more policies to make sure privacy and other ethical concerns are taken into account. Genomic Data Sharing Policy expands on previously implemented, long-standing policies to make data it funds publicly available in a timely manner. Genome wide association studies (GWAS) had data sharing policies initially implemented in 2007. No details on what should be in the genomic data sharing plan. “... Whole genome information, when combined with clinical and other phenotype data, offers the potential for increased understanding of basic biological processes affecting human health, improvement in the prediction of disease and patient care, and ultimately the realization of the promise of personalized medicine. In addition, rapid advances in understanding the patterns of human genetic variation and maturing high-throughput, cost-effective methods for genotyping are providing powerful research tools for identifying genetic variants that contribute to health and disease.”
  • #18: strategy for data policy development is “pragmatic” SPARC statement (March 2, 2015): assessment, inventory, DMPs, pilots, training; “setting a new default mode for research across the HHS” HHS indicates it will provide the means for both the public and internal operating or staff divisions to petition for shorter embargo periods – particularly for articles resulting from funding initiatives that are considered to have important scientific, public health or societal value requiring rapid communication. This is particularly important considering the variety of public health-related research conducted by the NIH, CDC, and FDA
  • #19: “To the extent allowed by law, this Policy [NIH Public Access Policy] meets all the requirements of the OSTP Directive.” (p 4) “NIH intends to make public access to digital scientific data the standard for all NIH-funded research.” -- “expanding its data sharing policy beyond current requirement…” “This [policy] outlines current NIH policies, programs, and procedures that support the overall goals of the OSTP memorandum and identifies further steps that may be taken to fulfill these goals in a more comprehensive manner to ensure public access to digital scientific data.” (p 24)
  • #20: data sharing plan and data management plan are different. Existing NIH policies establish expectations for data sharing (2007 FDA Amendment Act requiring applicable clinical trials to go to clinicaltrials.gov; 2003 NIH Data Sharing Policy; 2002 NIH Intramural Policy on large Database Sharing; 2014 NIH Genomic Data Sharing Policy; Grants Policy Statement requiring final progress reports to describe sharable data). DMP is modification to 2003 NIH Data Sharing Policy. Note that some funding mechanisms (training grants) may be exempted. deposit to existing repositories “before considering other means of making data available.” NIH will develop guidance for key elements to be included in a DMP; determining which data should be prioritized for preservation (6b); finding acceptable repositories not funded by NIH (8b); NIH will expand its database of existing repositories for example NIH Data Science – The Commons http://datascience.nih.gov/commons (as per Philip Bourne)
  • #21: In effect for publications per the Public Access to CDC Funded Publications Policy, which was issued in July 2013: http://www.cdc.gov/maso/Policy/policy596.pdf Like the NIH, seeks to update and expand existing policies to comply with the objectives outlined in the OSTP memo.
  • #22: funding for publications not mentioned in this policy; measures for failure to comply for publications not mentioned in this policy Section C discusses the data covered, following the OMB-A110 definition and including microarray and aggregated data, quantitative measurements, survey and interview data, observational data, environmental data (p 18), and the data exempted; CDC determines what is research or non-research data; DMP will be assessed during proposal review and quality may affect scores -- DMP is aggregate of the Resource Sharing Plan and the Translation Plans sections of FOA/NOA; CDC will develop an online data registry via DMP templates
  • #23: FDA Transparency Initiative: Increasing Public Access to FDA's Compliance and Enforcement Data currently focused on datasets in the following areas: Adverse events. FDA’s publicly available drug adverse event and medication error reports, and medical device adverse event reports. Recalls. Enforcement report data, containing information gathered from public notices about certain recalls of FDA-regulated products. Labeling. Structured Product Labeling (SPL) data for FDA-regulated human prescription drug, OTC drug and biological product labeling.
  • #24: SPARC notes emphasis on roles and responsibilities in policy, as well as scope - which defines what they consider to be data.
  • #25: SPARC notes emphasis on roles and responsibilities in policy, as well as scope - which defines what they consider to be data.
  • #26: *embargo will be subject to the HHS petition process ˡ OMB Circular A-110 definition of data;  scope statement (§2) identifies field, lab, and other data as specifically in scope, making other kinds of data such as models and code conditionally appropriate upon context, and other kinds out of scope (PII, proprietary, critical infrastructure, public use data). ▪ AHRQ will contract to develop a commercial repository and develop a data discovery index. IRs would  not be sufficient for publications; IRs may be used as “publicly accessible database,” but not mentioned in policy document. Overview from SPARC Overview from Scholarly Kitchen
  • #38: Also, check any rights when data is attached to a publication by the journal, or using journal recommended repository.