The value of data curation as part of the publishing process
Varsha Khodiyar, PhD
Biocuration 2019
Antarctica meltdown could double sea level rise
Data curation as part of publishing / 10th April 2019
A brief history of data curation at Springer Nature
• Scientific Data launched May 2014
• Novel manuscript format, the Data Descriptor
• Focus on data generation and data peer review
• Machine readable metadata file (ISA-Tab format) generated by in-house curators for each published Data Descriptor
www.nature.com/scientificdata
Data Descriptors have human and machine understandable components
• Human readable representation of study, i.e. article (HTML & PDF)
• Machine accessible representation of study, i.e. metadata
• Human readable summary of the metadata
Output from Scientific Data’s curation process
Machine readable overview of how sources and samples were turned into the digital data outputs.
Curator captures key dataset characteristics using ontology terms:
• Type of study
• What was measured
• How it was measured
• Any independent variables
• Sample characteristics e.g.
- Species
- Geographical location
- Environment type
scientificdata.isa-explorer.org
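The characteristics listed above could be held in a small machine readable record along the following lines. This is only an illustrative sketch: the class names, field names and the `EXAMPLE:` accessions are hypothetical placeholders, and the JSON output is not the ISA-Tab format that the curators actually produce.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass(frozen=True)
class OntologyTerm:
    """A human readable label paired with an ontology accession.

    The accessions used below are made-up placeholders, not real ontology IDs.
    """
    label: str
    accession: str


@dataclass
class CuratedStudySummary:
    """Hypothetical container for the characteristics a curator captures."""
    study_type: OntologyTerm
    measurement: OntologyTerm   # what was measured
    technology: OntologyTerm    # how it was measured
    independent_variables: list = field(default_factory=list)
    sample_characteristics: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialise the record to JSON so that software can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Invented example values, for illustration only.
summary = CuratedStudySummary(
    study_type=OntologyTerm("observational study", "EXAMPLE:0001"),
    measurement=OntologyTerm("sea surface temperature", "EXAMPLE:0002"),
    technology=OntologyTerm("temperature sensor", "EXAMPLE:0003"),
    independent_variables=[OntologyTerm("season", "EXAMPLE:0004")],
    sample_characteristics={
        "species": OntologyTerm("not applicable", "EXAMPLE:0005"),
        "geographical_location": OntologyTerm("Antarctica", "EXAMPLE:0006"),
        "environment_type": OntologyTerm("marine", "EXAMPLE:0007"),
    },
)

print(summary.to_json())
```

Pairing every label with an accession is what lets a search tool match on the ontology term rather than on free text.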
Publishing a data paper with Scientific Data
• Deposit data in an appropriate repository
• Draft a manuscript based on the template
• Submit your manuscript
• Peer review of the manuscript
• Revise the manuscript as required
• Make any changes requested by the data curators
• The data descriptor is published
A brief history of data curation at Springer Nature
• Research Data Support service (RDS) launched April 2017
• Expansion of data curation practice to other Springer Nature journals
• Provide support and advice on research data sharing, for authors and editors
• Promote best practice for sharing research data associated with a publication
www.springernature.com/la/authors/research-data
Springer Nature Research Data Support
To help authors and journals follow good practice in sharing and archiving of research data, we provide optional data deposition and curation services.
• Researchers submit their data files securely
• The Research Data team curates the data and metadata
• The data are published and linked to the author’s paper
More information is available on our website here:
http://www.springernature.com/gb/group/data-policy/data-support-services
Example of curation output from Research Data Support
• Comprehensive description including the data context of the study and data gathering method
• Altmetrics provide information on downloads and citations
• Relevant categories and keywords added to enhance discoverability of the data
• Dataset assigned a DOI
• Licence to clarify reuse conditions
Source: https://doi.org/10.6084/m9.figshare.5259415
Example author feedback report
Checks carried out by the curation team
• Most appropriate repository used?
• Data and metadata at the repository consistent with manuscript?
• Terms of use and terms of access for the data consistent with manuscript?
• Terms of data use consistent with journal policy?
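As a rough illustration, the four checks above could be tracked as a simple checklist that produces feedback for the author. The names `CheckResult` and `feedback_report` and the example outcome are hypothetical; the slides do not describe the tooling the curation team actually uses.

```python
from dataclasses import dataclass

# The four questions are taken from the slide; everything else is illustrative.
CURATION_CHECKS = (
    "Most appropriate repository used?",
    "Data and metadata at the repository consistent with manuscript?",
    "Terms of use and terms of access for the data consistent with manuscript?",
    "Terms of data use consistent with journal policy?",
)


@dataclass
class CheckResult:
    question: str
    passed: bool
    note: str = ""  # curator's comment for the author, if any


def feedback_report(results):
    """List every check that did not pass, together with the curator's note."""
    lines = []
    for result in results:
        if not result.passed:
            lines.append(f"- {result.question} {result.note}".rstrip())
    return "\n".join(lines) if lines else "No issues identified."


# Invented outcome for a made-up submission: the third check fails.
results = [CheckResult(question, passed=True) for question in CURATION_CHECKS]
results[2] = CheckResult(
    CURATION_CHECKS[2],
    passed=False,
    note="Licence stated in the manuscript differs from the repository record.",
)

report = feedback_report(results)
print(report)
```

Recording a note alongside each failed check keeps the feedback tied to the specific question that raised it.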
Possible outcomes of curation
• Addition of missing information
• Error correction
• Suggestions to increase FAIRness
• Improvements to manuscript tables, text or figures to aid understanding and reuse of the work
• Data access or data license conditions updated at repository or manuscript to aid accessibility
• Repository metadata improved to aid dataset discoverability
• Improvements to file names and/or file structure at the repository to aid understanding and reuse of the work
Manuscript text; manuscript figure; manuscript table; data files at the repository
Curation outcomes at Scientific Data (Study period March 2018 to March 2019)
• 77% of manuscripts - no issues identified
• 23% of manuscripts - at least 1 issue identified and resolved
• 10% of manuscripts - errors identified and resolved
RDS curation outcomes (Study period March 2018 to March 2019)
In 55% of RDS curation jobs, the curator suggested updates to the repository hosted data files
• Sensitive data removed
• Missing data added
• License conditions updated
• File format & naming improved
• Mandated data moved to specialist repositories
• Supplementary Information moved to repository
• Opaque language clarified
We encourage the use of community endorsed ontologies, standards and repositories where possible
springernature.com/gp/authors/research-data-policy/repositories/12327124
The Springer Nature research data curators
• Joseph Salter, Development Editor
• Tristan Matthews, Assistant Research Data Editor
• Graham Smith, Senior Research Data Editor
• Rebecca Grant, Research Data Manager
• Alexandra Philiastides, Assistant Research Data Editor
• Varsha Khodiyar, Data Curation Manager
Summary: Curation as part of the publishing process
• Springer Nature has had at least one research data curator since the launch of Scientific Data in 2014.
• Since 2017, data curation has been available as a separate service for increasing numbers of Springer Nature authors and editors.
• The Research Data team has built up significant expertise in the area of data publishing.
• Our curators are able to identify and help resolve both minor and major issues prior to articles and data being made public.
• Our curators increase the FAIRness of published research data.
• We focus on increasing the Findability and Accessibility of data and metadata in our curation processes.
• We encourage our authors to increase the Interoperability and Reusability of their data and metadata by using community ontologies for metadata and community research data infrastructure where this exists.
The story behind the image
Antarctica meltdown could double sea level rise
Researchers at Pennsylvania State University have been considering how quickly glacial ice melt in Antarctica would raise sea levels. By updating models with new discoveries and comparing them with past sea-level rise events, they predict that a melting Antarctica could raise oceans by more than 3 feet by the end of the century if greenhouse gas emissions continue unabated, roughly doubling previous total sea-level rise estimates. Rising seas could put many of the world’s coastlines underwater or at risk of flooding and storm surges.
Varsha Khodiyar, PhD
Data Curation Manager
varsha.khodiyar@nature.com
@varsha_khodiyar
go.nature.com/ResearchDataServices
researchdata.springernature.com
researchdata@springernature.com
nature.com/scientificdata
scientificdata@nature.com
@scientificdata
