The Horizon 2020 Open Data Pilot
Sarah Jones
Digital Curation Centre, University of Glasgow
sarah.jones@glasgow.ac.uk
Twitter: sjDCC
Why open access and open data?
“The European Commission’s vision
is that information already paid for
by the public purse should not be
paid for again each time it is
accessed or used, and that it
should benefit European companies
and citizens to the full.”
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
What is research data?
‘Research data’ refers to information, in particular facts
or numbers, collected to be examined and considered as a
basis for reasoning, discussion or calculation.
In a research context, examples of data include statistics,
results of experiments, measurements, observations
resulting from fieldwork, survey results, interview
recordings and images. The focus is on research data that
is available in digital form.
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020
v.1.0, 11 December 2013, Footnote 5, p3
What is open data?
Openly accessible research data can typically be accessed,
mined, exploited, reproduced and disseminated, free of
charge for the user.
Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020, p3
make your stuff available on the Web (whatever format) under an open licence
make it available as structured data (e.g. Excel instead of a scan of a table)
use non-proprietary formats (e.g. CSV instead of Excel)
use URIs to denote things, so that people can point at your stuff
link your data to other data to provide context
Tim Berners-Lee’s proposal for five star open data - http://5stardata.info
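As a rough illustration of the third star (a non-proprietary format), the sketch below writes a small table to CSV using only Python's standard library. The column names and values are invented for the example and do not come from the guidelines.

```python
import csv
import io

# Hypothetical measurements; names and values are illustrative only.
rows = [
    {"site": "A", "date": "2014-05-01", "temperature_c": 12.4},
    {"site": "B", "date": "2014-05-01", "temperature_c": 13.1},
]


def to_csv(records):
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

Plain CSV can be opened by almost any tool, which is the point of the third star; the fourth and fifth stars (URIs and links to other data) need more than a file format and are not covered here.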
H2020 OPEN DATA PILOT
Guidelines on Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
H2020 areas participating in the pilot
• Future and Emerging Technologies
• Research infrastructures – part e-Infrastructures
• Leadership in enabling and industrial technologies – Information and
Communication Technologies
• Societal Challenge: 'Secure, Clean and Efficient Energy' – part Smart
cities and communities
• Societal Challenge: 'Climate Action, Environment, Resource Efficiency
and Raw materials' – except raw materials
• Societal Challenge: 'Europe in a changing world – inclusive, innovative
and reflective Societies'
• Science with and for Society
Projects in other areas can participate on a voluntary basis
Why would researchers want to opt in? (1)
www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity
... validation of results
“It was a mistake in a spreadsheet that could
have been easily overlooked: a few rows left
out of an equation to average the values in a
column.
The spreadsheet was used to draw the
conclusion of an influential 2010 economics
paper: that public debt of more than 90% of
GDP slows down growth. This conclusion was
later cited by the International Monetary Fund
and the UK Treasury to justify programmes
of austerity that have arguably led to riots,
poverty and lost jobs.”
Why would researchers want to opt in? (2)
www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
“It was unbelievable. It’s not science
the way most of us have practiced
in our careers. But we all realised
that we would never get biomarkers
unless all of us parked our egos and
intellectual property noses outside
the door and agreed that all of our
data would be public immediately.”
Dr John Trojanowski, University of Pennsylvania
... scientific breakthroughs
Why would researchers want to opt in? (3)
“There is evidence that studies that make
their data available do indeed receive more
citations than similar studies that do not.”
Piwowar H. and Vision T.J. 2013 "Data reuse and the open data
citation advantage" https://peerj.com/preprints/1.pdf
9% - 30% increase
... more citations
Exemptions – reasons for opting out
• If results are expected to be commercially or industrially exploited
• If participation is incompatible with the need for confidentiality in
connection with security issues
• Incompatible with existing rules on the protection of personal data
• Would jeopardise the achievement of the main aim of the action
• If the project will not generate / collect any research data
• If there are other legitimate reasons to not take part in the Pilot
Can opt out at proposal stage OR during lifetime of project.
Should describe issues in the project Data Management Plan.
Which data does the pilot apply to?
• Data, including associated metadata, needed to
validate the results in scientific publications
• Other curated and/or raw data, including
associated metadata, as specified in the DMP
Doesn’t apply to all data (researchers to define as appropriate)
Don’t have to share data if inappropriate – exemptions apply
Metadata and documentation
Metadata: basic info e.g. title, author, dates, access rights...
Documentation: methods, code, data dictionary, context...
Use standards wherever possible for interoperability
www.dcc.ac.uk/resources/
metadata-standards
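To make the metadata and documentation distinction concrete, here is a minimal sketch of a metadata record and a completeness check. The field names are hypothetical, chosen to mirror the examples on the slide (title, author, dates, access rights); a real project should use a recognised standard for its discipline instead.

```python
import json

# Hypothetical minimal field set, based on the examples named on the slide.
REQUIRED_FIELDS = ("title", "author", "date_created", "access_rights")


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Illustrative record; all values are invented.
record = {
    "title": "Example survey responses",
    "author": "A. Researcher",
    "date_created": "2014-10-21",
    "access_rights": "open",
    # Documentation travels with the data: methods, data dictionary, etc.
    "documentation": ["methods.txt", "data_dictionary.csv"],
}

incomplete = {"title": "Untitled dataset", "author": ""}

print(json.dumps(record, indent=2))
print(missing_fields(incomplete))
```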
Requirements of the open data pilot
1. Develop (and update) a Data Management Plan
2. Deposit in a research data repository
3. Make it possible for third parties to access,
mine, exploit, reproduce and disseminate data –
free of charge for any user
4. Provide information on the tools and
instruments needed to validate the results (or
provide the tools)
1. Develop a Data Management Plan
Not a fixed document – should evolve and gain precision
• Deliver first version within initial 6 months of project
• More elaborate versions whenever important changes to the project occur. At
least at the mid-term and final review.
Two templates provided (annex 1 & 2)
Note that the Commission does NOT require applicants to submit a DMP
at the proposal stage. A DMP is therefore NOT part of the evaluation.
However, all project proposals submitted to "Research and Innovation
actions", as well as "Innovation actions", include a section on research
data management which is evaluated under the criterion 'Impact'.
Guidelines on Data Management in Horizon 2020, v.1.0, 11 December 2013
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
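The DMP timetable described on this slide can be sketched as a small date calculation. The project start date and duration below are hypothetical, and treating the mid-term review as the halfway point is an assumption for illustration; actual review dates are fixed in each grant agreement.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` after `start`, clamped to the month's end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def dmp_milestones(project_start, duration_months):
    """Rough DMP schedule: first version at 6 months, then two updates."""
    return {
        "first_version_due": add_months(project_start, 6),
        # Assumption: the mid-term review falls halfway through the project.
        "mid_term_update": add_months(project_start, duration_months // 2),
        "final_update": add_months(project_start, duration_months),
    }


# Hypothetical project: starts 1 January 2015 and runs for 36 months.
milestones = dmp_milestones(date(2015, 1, 1), 36)
for name, due in milestones.items():
    print(name, due.isoformat())
```

Further versions are also expected whenever the project changes significantly, which a fixed schedule like this cannot capture.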
DMPonline
A web-based tool to help researchers write DMPs
Includes a template for Horizon 2020
https://dmponline.dcc.ac.uk
2. Deposit in a repository
http://databib.org
http://service.re3data.org/search
Zenodo
• OpenAIRE-CERN joint effort
• Multidisciplinary repository
• Multiple data types
– Publications
– Long tail of research data
• Citable data (DOI)
• Links to funding, publications,
data & software
www.zenodo.org
3. License your data for reuse
http://www.dcc.ac.uk/resources/how-guides/license-research-data
Outlines pros and cons of each
approach and gives practical advice
on how to implement your licence
CREATIVE COMMONS LIMITATIONS
NC (Non-Commercial): What counts as commercial?
SA (Share Alike): Reduces interoperability
ND (No Derivatives): Severely restricts use
Horizon 2020 recommendation
is to use CC BY OR CC0
4. Provide info on tools needed for validation
Need to share much more than just the data
for research to be reproducible...
Difficult to validate data if you’re missing info on the
steps between the initial idea and end results
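One small, hedged example of "information on the tools": the sketch below records the software environment used for an analysis so it can be deposited alongside the data. The function name, the output structure and the listed tool are all hypothetical; this covers only the software side and says nothing about instruments or methods.

```python
import json
import platform


def describe_environment(extra_tools=None):
    """Collect basic details of the software used to produce results."""
    return {
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        # Hypothetical mapping of other tools to the versions that were used.
        "tools": dict(extra_tools or {}),
    }


environment = describe_environment({"analysis-script": "v1.0"})
print(json.dumps(environment, indent=2))
```

Saving this output as a text file next to the dataset gives third parties a starting point for re-running the analysis.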
Useful links
• Open Knowledge Foundation (advocacy, training, services,
handbook...) https://okfn.org
• MyExperiment and Taverna (sharing workflows)
http://www.myexperiment.org and
http://www.taverna.org.uk
• Software Sustainability Institute (UK-based)
http://www.software.ac.uk
• School of Data (training to help people use open data)
http://schoolofdata.org
• Digital Curation Centre (RDM guidance, tools and resources)
http://www.dcc.ac.uk/resources
Discussion
• What concerns / misconceptions need to be overcome
about open data?
• What guidance, tools and resources do you need to know
about to support projects in the open data pilot?
• What assistance is needed to review DMPs and monitor
the success of the pilot?
• What other issues or recommendations do you have?
