What is Open Science
and why is it important
for students?
12 June 2019, Trondheim, Norway
Living Atlas Seminar
http://bit.ly/gbifno-openscience
OPEN SCIENCE
WHAT IS OPEN SCIENCE?
Open science is the movement to make scientific research
(including publications, data, physical samples, and software)
and its dissemination accessible to all levels of an inquiring
society, amateur or professional (Woelfle et al. 2011).
cf. Wikipedia
Woelfle, M.; Olliaro, P.; Todd, M. H. (2011). "Open science is a research accelerator". Nature Chemistry. 3 (10): 745–748.
doi:10.1038/nchem.1149
WHAT IS OPEN SCIENCE?
Open science is transparent and accessible knowledge
that is shared and developed through collaborative
networks (Vicente-Saez et al. 2018).
cf. Wikipedia
Vicente-Saez, Ruben; Martinez-Fuentes, Clara (2018). "Open Science now: A systematic literature review for an integrated definition".
Journal of Business Research. 88: 428–436. doi:10.1016/j.jbusres.2017.12.043
WHAT IS OPEN SCIENCE?
Open Science can be seen as a continuation of, rather
than a revolution in, practices begun in the 17th century
with the advent of the academic journal (David 2004).
cf. Wikipedia
David, P. A. (2004). "Understanding the emergence of 'open science' institutions: Functionalist economics in historical context".
Industrial and Corporate Change. 13 (4): 571–589. doi:10.1093/icc/dth023
Open Access (OA): Research results
distributed online and free of costs or other
barriers – often meaning free access to
research articles.
Open Science: Researchers share their
methods, computer code and research data in
central data repositories.
Open Data: Data that is freely available to everyone to use
and re-publish as they wish, without restrictions
from copyright, patents or other mechanisms of
control.
FAIR data principles: findable, accessible,
interoperable and reusable.
FAIR data principles
Wilkinson et al. 2016 doi:10.1038/sdata.2016.18
Promotes maximum (re)use of research data.
Researchers need to do more than simply post their data on the web for it to be useful.
What is FAIR Data?
FINDABLE
• Data and supplementary materials have sufficiently rich
metadata and a unique and persistent identifier.
ACCESSIBLE
• Metadata and data are understandable to humans and
machines. Data is deposited in a trusted repository.
INTEROPERABLE
• Metadata use a formal, accessible, shared, and broadly
applicable language for knowledge representation.
REUSABLE
• Data and collections have clear usage licenses and
provide accurate information on provenance.
https://libereurope.eu/wp-content/uploads/2017/12/LIBER-FAIR-Data.pdf
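As a minimal sketch of how the four FAIR criteria above could be checked against a metadata record, the Python snippet below tests a plain dictionary for a persistent identifier, a repository, a metadata standard, a license and provenance. The field names and the function `fair_gaps` are hypothetical and chosen for illustration; they are not taken from the FAIR principles or from any repository's schema, and the DOI is a placeholder.

```python
# Hypothetical field names; real repositories define their own metadata schemas.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["repository"],
    "interoperable": ["metadata_standard"],
    "reusable": ["license", "provenance"],
}


def fair_gaps(record):
    """Return, per FAIR criterion, the fields that are missing or empty."""
    gaps = {}
    for criterion, fields in FAIR_FIELDS.items():
        missing = []
        for field in fields:
            value = record.get(field)
            if value is None or not str(value).strip():
                missing.append(field)
        if missing:
            gaps[criterion] = missing
    return gaps


example = {
    "identifier": "doi:10.9999/example",  # placeholder, not a registered DOI
    "title": "Vascular plant occurrences (example)",
    "description": "Illustrative dataset record.",
    "repository": "Example trusted repository",
    "metadata_standard": "EML",  # example value only
    "license": "",  # left empty on purpose
    "provenance": "Digitised from herbarium sheets.",
}

print(fair_gaps(example))  # prints {'reusable': ['license']}
```

A check like this only tests that fields are present. Whether the metadata are rich enough, or the repository trustworthy, still needs human judgement.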
SCIENCE CURRENCIES (CITATION)
● Peer-reviewed scholarly papers in high impact
journals (still) maintain considerable weight for scientific
careers.
● A movement is under way to build similar status for
open data, open metadata, and other open science
products…
Data Citation Principles
1. Data are legitimate, citable products of research.
2. Data citations give scholarly credit and attribution.
3. In scholarly literature, whenever claims are based on data, the data should
always be cited.
4. The method for identifying data is persistent, machine actionable,
globally unique and universal.
5. Data citations facilitate access to the data, or at least to the metadata.
6. Unique identifiers persist even beyond the lifespan of the data.
7. Data citations identify and give access to the specific data that support verification
of the claim (provenance, time-slice, version).
8. Citation methods are flexible, but attentive to interoperability of practices across communities.
Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014
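To make principles 4, 6 and 7 concrete, here is a small Python sketch that builds a citation string from a dataset record carrying a persistent identifier and a version. The `DatasetRecord` class, its field names and the citation layout are assumptions made for illustration; the declaration itself does not prescribe a format, and the DOI shown is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    creators: List[str]
    year: int
    title: str
    version: str
    publisher: str
    doi: str  # persistent, globally unique identifier (principles 4 and 6)


def format_citation(rec: DatasetRecord) -> str:
    """Build a human-readable citation that a machine can also resolve."""
    authors = "; ".join(rec.creators)
    # Citing the version identifies the specific data behind a claim (principle 7).
    return (
        f"{authors} ({rec.year}). {rec.title}. Version {rec.version}. "
        f"{rec.publisher}. https://doi.org/{rec.doi}"
    )


record = DatasetRecord(
    creators=["Example, A.", "Sample, B."],
    year=2019,
    title="Example occurrence dataset",
    version="1.2",
    publisher="Example Repository",
    doi="10.9999/example.1234",  # placeholder, not a registered DOI
)
print(format_citation(record))
```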
Open research data policies
Scientific journals (at Springer Nature) apply different
guidelines and requirements for the availability of the research
data underlying published research papers.
Springer Nature has produced a comprehensive report on practical
incentives and appropriate norms to promote open data.
http://www.springernature.com/gp/group/data-policy/policy-types
OPEN SCIENCE
Kunnskapsdepartementet (2016)
EU (2016) Competitiveness Council, 26-27/05/2016
EU (2007) INSPIRE Directive
Norway is to be a careful pioneer in open access to research results.
Norway is to follow the EU's ambition of full open access to publicly
funded research by 2020.
Results of research supported by public and public-private funds freely available to and reusable by anyone.
OPEN RESEARCH DATA
Forskningsrådet (2014). ISBN: 978-82-12-03361-0
The Research Council of Norway expects all research data from projects
funded by the Research Council to be made freely available as open data.
In some situations there can be valid and justified reasons for exceptions.
(2014)
WHY TEACH STUDENTS OPEN SCIENCE ?
● We are in the middle of an ongoing paradigm shift in
scientific practice (and impact metrics).
● The open science wave is moving fast!
● Young scientists (already today) need different
skills than were needed previously to succeed in
academia.
Expanding possibilities… (for novel curiosity-driven research)
Open science
Traditional science
Your student
REPRODUCIBILITY CRISIS
"Scientific irreproducibility —
the inability to repeat others'
experiments and reach the
same conclusion" (Nature 2016)
Baker (2016) 1,500 scientists lift the lid on reproducibility.
Nature. doi:10.1038/533452a
"Scientific
irreproducibility — the
inability to repeat others'
experiments and reach
the same conclusion —
is a growing concern".
Baker (2016) Nature
doi:10.1038/533452a
Open Science solution: researchers
share their methods, data, computer code
and results in central data repositories.
Note that we also need herbarium specimens
and bio-repositories (e.g. museums).
WILL ANYBODY TRUST CLOSED
SCIENCE AGAIN?
● Recent studies indicate that p-hacking [1] is a significant
problem, sometimes without the scientist even being
aware of doing it (Ioannidis 2005; Head et al. 2015).
● Pre-registered (open) data provide good insurance
against suspicion of both data dredging and plain data
falsification.
[1] “p-hacking,” (data dredging, data fishing, …) occurs when researchers collect or
select data or statistical analyses until nonsignificant results become significant.
Ioannidis (2005). "Why Most Published Research Findings Are False". PLoS Medicine. doi:10.1371/journal.pmed.0020124.
Head et al. (2015) The Extent and Consequences of P-Hacking in Science. PLoS Biol. doi:10.1371/journal.pbio.1002106
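The mechanism described in footnote [1] can be illustrated with a small simulation. The Python sketch below draws two groups from the same distribution (so there is no real effect), measures several unrelated outcomes, and reports a "finding" as soon as any one of them reaches p < 0.05. The number of outcomes, the sample size and the simple z-test with known unit variance are assumptions chosen for simplicity; this is not the method used by Ioannidis (2005) or Head et al. (2015).

```python
import random
from statistics import NormalDist, fmean


def p_value(a, b):
    """Two-sided z-test for a difference in means, assuming unit variance."""
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(outcomes, trials=1000, n=20, seed=1):
    """Share of null experiments with at least one outcome at p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        for _ in range(outcomes):
            a = [rng.gauss(0, 1) for _ in range(n)]
            b = [rng.gauss(0, 1) for _ in range(n)]
            if p_value(a, b) < 0.05:
                hits += 1
                break  # stop at the first "significant" outcome and report it
    return hits / trials


print("one planned test:    ", false_positive_rate(outcomes=1))
print("best of ten outcomes:", false_positive_rate(outcomes=10))
```

With a single planned test the false positive rate should stay near 5%. With ten outcomes to choose from it should rise to roughly 40%, since 1 - 0.95^10 is about 0.40. Pre-registering the outcome before collecting data removes that freedom to choose.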
Reuse of teaching curriculum
Why publish open data?
● Data produced using public funds should be regarded as a common good,
and should be made available for inspection, interpretation and re-use by
third parties.
● Needless duplication of data-collecting efforts and costs will be reduced.
● Open data increases transparency and overall quality of research.
● Published data can be re-analysed, verified, and improved by others.
● Data publication increases recognition and opportunities for collaboration.
● Published data can be cited and re-used, either alone or in combination
with other data.
● Data owners and collection managers can trace data use and citation.
● Data creators, their institutions and funding agencies can be credited.
● Data can be integrated with other datasets across space and time.
● Open data increases potential for interdisciplinary research and re-use in
new contexts not envisioned by the data creator.
Penev et al. (2017) https://doi.org/10.3897/rio.3.e12431
Data Management Plan (DMP)
A formal document that outlines HOW data
are to be handled during a research project,
and after the project is completed.
The goal is to plan data management BEFORE the project begins.
Including a plan for the COSTS of data management and archiving.
This saves time in the long run, and promotes data fitness for reuse.
Reduce duplication of existing scientific studies.
Reduce the loss of data.
https://en.wikipedia.org/wiki/Data_management_plan
Illustration CC BY Jørgen Stamp
Why write Data Management Plans?
A data management plan is a tool for
making your research reproducible
and thus trustworthy.
Good data curation saves you research time,
because you, your collaborators, and others,
will find, understand, and get access to your (own) research data.
Efficient data sharing provides broader distribution and impact for your
research results.
Open research data, available for reuse, strengthen open and curiosity-driven
research, and enable scientific breakthroughs not foreseen by
the original data producer.
https://en.wikipedia.org/wiki/Data_management_plan
Illustration CC BY Jørgen Stamp
What is Metadata?
Slide source CC BY EUDAT (2016) | Photo: CC-BY by Cea+ http://www.flickr.com/photos/centralasian/8071729256
Metadata, literally “data about
data” are an essential
component of a data
management system,
describing such aspects as
the “what, where, when, who
and how” pertaining to a
resource.
Why metadata?
In general, metadata should allow a prospective
end user of data to:
1. identify/discover its existence,
2. learn how to access or acquire the data,
3. understand its fitness-for-use,
4. learn how to transfer (obtain a copy of) the
data, and
5. learn how the data should be used.
Photo CC BY-SA Jennifer Fagan-Fry (NOAA) | GBIF Metadata Profile (2011) https://github.com/gbif/ipt/wiki/GMPHowToGuide
Data entropy
Illustration from: The Loss of Information about Data (Metadata) Over Time, Michener et al, 1997
What is a «data paper»?
A data paper is a peer reviewed document describing a
dataset, published in a peer reviewed journal. It takes effort
to prepare, curate and describe data. Data papers provide
recognition for this effort by means of a scholarly article.
• Get scholarly recognition for your datasets.
• Promote and improve the fitness for reuse of research data.
https://www.gbif.org/data-papers
Data papers explained
A data paper is a searchable metadata document, describing a particular
dataset or a group of datasets, published in the form of a peer-reviewed
article in a scholarly journal.
Unlike a conventional research article, the primary purpose of a data paper is to
describe data and the circumstances of their collection, rather than to report
hypotheses and conclusions.
GBIF has been working with partners in academic publishing to promote the
data paper as a means of bringing credit and recognition to all those
involved in data publication; to alert the scientific community to the existence
of biodiversity datasets and the value they can bring to particular research
projects; and as a mechanism for quality assessment and control of data
accessible through GBIF and other networks.
https://www.gbif.org/data-papers
Why publish data papers?
● Improve the usability (fitness for use) of your published data!
● Receive credit through indexing and citation of the published paper.
● Increase the visibility and credibility of data resources you publish.
● Track more efficiently the use and citations of your data resources.
● Receive feedback and peer-review on your dataset.
● Improve the quality of your data resources.
● Increase your network of collaborators.
● Get more out of your data resources.
● Promote your openly published datasets.
Why publish data papers?
Authoring clear, informative metadata is an essential step if biodiversity
data are going to be discovered and used to inform research and
decisions. This involves extra work, and data publishers need
incentives to do it. In the absence of such incentives, too many
datasets are published with poorly-documented metadata or, worse
still, no metadata at all.
Data papers help to overcome barriers to authoring of metadata by
providing clear acknowledgement of all those involved in the
collection, management, curation and publishing of biodiversity data.
https://www.gbif.org/data-papers
By publishing a data paper, you will:
Receive credit through indexing and citation of the published
paper, in the same way as with any conventional scholarly
publication, offering benefits to authors in terms of recognition and
career building.
Increase the visibility, usability and credibility of the data
resources you publish.
Track more effectively the usage and citations of the data you
publish.
https://www.gbif.org/data-papers
Data cleaning skills and services
DATA CLEANING
SKILLS
Corrected in GBIF in April 2013
Open science curriculum for students, June 2019
Machine-readable data …
"We are increasingly relying on machines that derive conclusions from models that they
themselves have created, models that are often beyond human comprehension, models
that “think” about the world differently than we do" (David Weinberger 2017).
Scientist versus machine
Singularity estimated to arrive in 2045, 26 years from now (Kurzweil 2005)
ca 2045
The future is already here —
it's just not very evenly distributed.
William Gibson
Will our data start watching us?
Who will our students compete
with in the future job market?

More Related Content

PPTX
Research Data Management Services at UWA (November 2015)
PPTX
Introduction to Research Data Management at UWA
PPTX
The FOSTER project - general overview
PPTX
Open science, open data - FOSTER training, Potsdam
PPTX
UWA Research Week 2016
PPTX
Research Data in the Arts and Humanities: A Few Difficulties
PDF
Open Access and Open Data: what do I need to know (and do)?
PDF
Digital Data Sharing: Opportunities and Challenges of Opening Research
Research Data Management Services at UWA (November 2015)
Introduction to Research Data Management at UWA
The FOSTER project - general overview
Open science, open data - FOSTER training, Potsdam
UWA Research Week 2016
Research Data in the Arts and Humanities: A Few Difficulties
Open Access and Open Data: what do I need to know (and do)?
Digital Data Sharing: Opportunities and Challenges of Opening Research

What's hot (18)

PPTX
The Future of Open Science
PDF
Research Data Management Services at UWA (July 2015)
PPTX
The Horizon 2020 Open Data Pilot
PDF
Digital Resources for Open Science
PDF
Public data archiving: Who does? Who doesn't? What can we do about it?
PPTX
The Landscape of Research Data Management
PPTX
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
PPTX
Managing and Sharing Research Data: Good practices for an ideal world...in th...
PDF
Principles, key responsibilities, and their intersection
PPTX
Winning the Tour de France, Research Data and Data Stewardship
PDF
Open Science Governance and Regulation/Simon Hodson
PDF
Research Integrity Advisor and Data Management
PDF
Open Science Incentives/Veerle van den Eynden
PDF
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
PPTX
Journal Data Sharing Policies rscd2018
PPTX
Rscd 2018 Journal policies - natasha simons
PPTX
Winning Horizon 2020 with Open Science
PPTX
Practical Research Data Management: tools and approaches, pre- and post-award
The Future of Open Science
Research Data Management Services at UWA (July 2015)
The Horizon 2020 Open Data Pilot
Digital Resources for Open Science
Public data archiving: Who does? Who doesn't? What can we do about it?
The Landscape of Research Data Management
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
Managing and Sharing Research Data: Good practices for an ideal world...in th...
Principles, key responsibilities, and their intersection
Winning the Tour de France, Research Data and Data Stewardship
Open Science Governance and Regulation/Simon Hodson
Research Integrity Advisor and Data Management
Open Science Incentives/Veerle van den Eynden
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Journal Data Sharing Policies rscd2018
Rscd 2018 Journal policies - natasha simons
Winning Horizon 2020 with Open Science
Practical Research Data Management: tools and approaches, pre- and post-award
Ad

Similar to Open science curriculum for students, June 2019 (20)

PDF
Enhance your rese​arch impact through open science
PDF
FAIR and open biodiversity collection data management
PPTX
Open Data Strategies and Research Data Realities
PDF
Strasser "Effective data management and its role in open research"
PPTX
Open Science: What, why, how?
PPTX
Open Science
PDF
Library Seminar 4: Open Science, 18.11.2021
PPTX
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
PDF
2021-01-27--biodiversity-informatics-gbif-(52slides)
PPTX
The Challenges of Making Data Travel, by Sabina Leonelli
PPTX
Introduction to open-data
PPTX
Open Science and Open Data for Librarians
PDF
Open data in ubi systems research - introduction to open science and open dat...
PPTX
The Horizon2020 Open Data Pilot - OpenAIRE Webinar
PDF
Research Data Management
PPT
What does open science mean? A stakeholder perspective
PPTX
Open data: Enhancing preservation, reproducibility, and innovation
PDF
Open science as roadmap to better data science research
PPTX
Open, FAIR data and RDM
PPTX
Learn to speak open
Enhance your rese​arch impact through open science
FAIR and open biodiversity collection data management
Open Data Strategies and Research Data Realities
Strasser "Effective data management and its role in open research"
Open Science: What, why, how?
Open Science
Library Seminar 4: Open Science, 18.11.2021
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
2021-01-27--biodiversity-informatics-gbif-(52slides)
The Challenges of Making Data Travel, by Sabina Leonelli
Introduction to open-data
Open Science and Open Data for Librarians
Open data in ubi systems research - introduction to open science and open dat...
The Horizon2020 Open Data Pilot - OpenAIRE Webinar
Research Data Management
What does open science mean? A stakeholder perspective
Open data: Enhancing preservation, reproducibility, and innovation
Open science as roadmap to better data science research
Open, FAIR data and RDM
Learn to speak open
Ad

More from Dag Endresen (20)

PDF
Joint GBIF Biodiversa+ symposium in Helsinki on 2024-04-16
PDF
Iliad webinar 2024-03-13, Accessing and publishing marine biodiversity data i...
PDF
Modelling Research Expeditions in Wikidata: Best Practice for Standardisation...
PDF
Ontologies for biodiversity informatics, UiO DSC June 2023
PDF
Evacuation of the Kherson herbarium
PDF
2023-05-08 GLIS SAC Rome
PDF
BioDT for the UiO Science section meeting 2023-03-24
PDF
Data and Stats Forum at MINA NMBU - 2023-04-26
PPTX
BioDATA final conference in Oslo, November 2022
PDF
GBIF data mobilisation for the Nansen Legacy, Tromsø, 2022-09-20
PDF
GBIF at Living Norway Open Science Lab 2022-03-03
PDF
GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...
PDF
Råd fra GBIF-Norge til datainfrastrukturutvalget i dialogmøte 2021-11-19
PDF
The role of biodiversity informatics in GBIF, 2021-05-18
PDF
GBIF and Biodiversity informatics for museums, 15 March 2021
PDF
2016-10-12 MUSIT & GBIF - Dataset portals
PDF
GBIF and Open Science
PDF
BioDATA capacity enhancement curriculum at GBIF GB26 Global Nodes Meeting in ...
PDF
GBIF-Norway node story lightning talk at GB26 in Leiden, October 2019
PDF
Museum collections as research data - October 2019
Joint GBIF Biodiversa+ symposium in Helsinki on 2024-04-16
Iliad webinar 2024-03-13, Accessing and publishing marine biodiversity data i...
Modelling Research Expeditions in Wikidata: Best Practice for Standardisation...
Ontologies for biodiversity informatics, UiO DSC June 2023
Evacuation of the Kherson herbarium
2023-05-08 GLIS SAC Rome
BioDT for the UiO Science section meeting 2023-03-24
Data and Stats Forum at MINA NMBU - 2023-04-26
BioDATA final conference in Oslo, November 2022
GBIF data mobilisation for the Nansen Legacy, Tromsø, 2022-09-20
GBIF at Living Norway Open Science Lab 2022-03-03
GBIF & GRScicoll, Høstseminar Norges museumsforbunds Seksjon for natur, 2021-...
Råd fra GBIF-Norge til datainfrastrukturutvalget i dialogmøte 2021-11-19
The role of biodiversity informatics in GBIF, 2021-05-18
GBIF and Biodiversity informatics for museums, 15 March 2021
2016-10-12 MUSIT & GBIF - Dataset portals
GBIF and Open Science
BioDATA capacity enhancement curriculum at GBIF GB26 Global Nodes Meeting in ...
GBIF-Norway node story lightning talk at GB26 in Leiden, October 2019
Museum collections as research data - October 2019

Recently uploaded (20)

PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPTX
master seminar digital applications in india
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
Complications of Minimal Access Surgery at WLH
PDF
VCE English Exam - Section C Student Revision Booklet
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
O7-L3 Supply Chain Operations - ICLT Program
PPTX
PPH.pptx obstetrics and gynecology in nursing
PDF
Anesthesia in Laparoscopic Surgery in India
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PDF
Insiders guide to clinical Medicine.pdf
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
master seminar digital applications in india
Renaissance Architecture: A Journey from Faith to Humanism
Microbial diseases, their pathogenesis and prophylaxis
Mark Klimek Lecture Notes_240423 revision books _173037.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
2.FourierTransform-ShortQuestionswithAnswers.pdf
Complications of Minimal Access Surgery at WLH
VCE English Exam - Section C Student Revision Booklet
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
O7-L3 Supply Chain Operations - ICLT Program
PPH.pptx obstetrics and gynecology in nursing
Anesthesia in Laparoscopic Surgery in India
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
Insiders guide to clinical Medicine.pdf
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
O5-L3 Freight Transport Ops (International) V1.pdf
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx

Open science curriculum for students, June 2019

  • 1. What is Open Science and why is it important for students? 12 June 2019, Trondheim, Norway Living Atlas Seminar http://bit.ly/gbifno-openscience
  • 3. WHAT IS OPEN SCIENCE? Open science is the movement to make scientific research (including publications, data, physical samples, and software) and its dissemination accessible to all levels of an inquiring society, amateur or professional (Woelfle et al. 2011). cf. Wikipedia Woelfle, M.; Olliaro, P.; Todd, M. H. (2011). "Open science is a research accelerator". Nature Chemistry. 3 (10): 745–748. doi:10.1038/nchem.1149
  • 4. WHAT IS OPEN SCIENCE? Open science is transparent and accessible knowledge that is shared and developed through collaborative networks (Vicente-Saez et al. 2018). cf. Wikipedia Vicente-Saez, Ruben; Martinez-Fuentes, Clara (2018). "Open Science now: A systematic literature review for an integrated definition". Journal of Business Research. 88: 428–436. doi:10.1016/j.jbusres.2017.12.043
  • 5. WHAT IS OPEN SCIENCE? Open Science can be seen as a continuation of, rather than a revolution in, practices begun in the 17th century with the advent of the academic journal (David 2004). cf. Wikipedia David, P. A. (2004). "Understanding the emergence of 'open science' institutions: Functionalist economics in historical context". Industrial and Corporate Change. 13 (4): 571–589. doi:10.1093/icc/dth023
  • 6. Open Access (OA): Research results distributed online and free of costs or other barriers – often meaning free access to research articles. Open Science: Researchers to share their methods, computer code and research data in central data repositories. Open Data: is freely available to everyone to use and re-publish as they wish, without restrictions from copyright, patents or other mechanisms of control. FAIR data principles: findable, accessible, interoperable and reusable.
  • 7. FAIR data principles Wilkinson et al. 2016 doi:10.1038/sdata.2016.18 FAIRdataprinciples Promotes maximum (re) use of research data. Researchers need to do more than simply post their data on the web for it to be useful.
  • 8. What is FAIR Data? FINDABLE • Data and supplementary materials have sufficiently rich metadata and a unique and persistent identifier. ACCESSIBLE • Metadata and data are understandable to humans and machines. Data is deposited in a trusted repository. INTEROPERABLE • Metadata use a formal, accessible, shared, and broadly applicable language for knowledge representation. REUSABLE • Data and collections have a clear usage licenses and provide accurate information on provenance. https://libereurope.eu/wp-content/uploads/2017/12/LIBER-FAIR-Data.pdf FAIRData
  • 9. SCIENCE CURRENCIES (CITATION) ● Peer-reviewed scholarly papers in high impact journals (still) maintain considerable weight for scientific careers. ● A movement is under way to build similar status for open data, open metadata, and other open science products…
  • 10. Data Citation Principles 1. Data to be legitimate citable products of research. 2. Data citations giving scholarly credit and attribution. 3. In scholarly literature, whenever claims are based on data, data should always be cited. 4. Persistent method for identification of data, that is machine actionable, globally unique, universal. 5. Data citation facilitate access to data or at least to metadata. 6. Unique identifiers that persist even beyond the lifespan of the data. 7. Data citation identify and access the specific data that support verification of the claim (provenance, time-slice, version). 8. Flexible, but attention to interoperability of practices across communities. Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014
  • 11. Open research data policies The scientific journals (at Springer Nature) practices different guidelines and requirements for availability to the underlying research data for published research papers. Springer Nature has made a comprehensive report on practical incentives and appropriate norms to promote open data. http://www.springernature.com/gp/group/data-policy/policy-types
  • 12. OPEN SCIENCE Kunnskapsdepartementet (2016) EU (2016) Competitiveness Council, 26-27/05/2016 EU (2007) INSPIRE Directive Norway is to be a careful pioneer in open access to research results. Norway to follow the ambition of EU on full open access to publicly funded research by 2020. Results of research supported by public and public-private funds freely available to and reusable by anyone.
  • 13. OPEN RESEARCH DATA Forskningsrådet (2014). ISBN: 978-82-12-03361-0 The Research Council of Norway expects all research data from projects funded by the Research Council to be made freely available as open data. In some situations there can be valid and justified reasons for exceptions. (2014)
  • 14. WHY TEACH STUDENTS OPEN SCIENCE ? ● We are in the middle of an ongoing paradigm shift in scientific practice (and impact metrics). ● The open science wave is moving fast! ● Young scientists will (already today) need different skills, than was needed previously – to succeed in academia.
  • 15. Expanding possibilities… (for novel curiosity-driven research) Open science Traditional science Your student
  • 16. REPRODUCIBILITY CRISIS "Scientific irreproducibility — the inability to repeat others' experiments and reach the same conclusion” (Nature 2016) Baker (2016) 1,500 scientists lift the lid on reproducibility. Nature. doi:10.1038/533452a
  • 17. "Scientific irreproducibility — the inability to repeat others' experiments and reach the same conclusion — is a growing concern”. Baker (2016) Nature doi:10.1038/533452a Open Science solution: researchers to share their methods, data, computer code and results in central data repositories. Note that we also need herbarium specimen and bio-repositories (eg. museums).
  • 18. WILL ANYBODY TRUST CLOSED SCIENCE AGAIN? ● Recent studies indicates that p-hacking [1] is a significant problem – sometimes even without the scientist even being aware of doing so (Ioannidis 2005; Head et al. 2015) ● Pre-registered (open) data provides a good insurance against suspicion of both data dredging (and plain data falsification). [1] “p-hacking,” (data dredging, data fishing, …) occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Ioannidis (2005). "Why Most Published Research Findings Are False". PLoS Medicine. doi:10.1371/journal.pmed.0020124. Head et al. (2015) The Extent and Consequences of P-Hacking in Science. PLoS Biol. doi:10.1371/journal.pbio.1002106
  • 19. Reuse of teaching curriculum
  • 20. Why publish open data? ● Data produced using public funds should be regarded as a common good, and should be made available for inspection, interpretation and re-use by third parties. ● Needless duplication of data-collecting efforts and costs will be reduced. ● Open data increases transparency and overall quality of research. ● Published data can be re-analysed, verified, and improved by others. ● Data publication increase recognition and opportunities for collaboration. ● Published data can be cited and re-used, either alone or in combination with other data. ● Data owners and collection managers can trace data use and citation. ● Data creators, their institutions and funding agencies can be credited. ● Data can be integrated with other datasets across space and time. ● Open data increases potential for interdisciplinary research and re-use in new contexts not envisioned by the data creator. Penev et al. (2017) https://doi.org/10.3897/rio.3.e12431 20
  • 21. Data Management Plan (DMP) A formal document that outlines HOW data are to be handled during a research project, and after the project is completed. The goal is to plan data management BEFORE the project begins. Including a plan for the COSTS of data management and archiving. This saves time in the long run, and promotes data fitness for reuse. Reduce duplication of existing scientific studies. Reduce the loss of data. https://en.wikipedia.org/wiki/Data_management_plan Illustration CC BY Jørgen Stamp
  • 22. Why write Data Management Plans? A data management plan is a tool for making your research reproducible and thus trustworthy. Good data curation saves you research time, because you, your collaborators, and others, will find, understand, and get access to your (own) research data. Efficient data sharing provides broader distribution and impact for your research results. Open research data, available for reuse, strengthens open and curiosity- driven research, and scientific breakthrough not originally foreseen by the original data producer. https://en.wikipedia.org/wiki/Data_management_plan Illustration CC BY Jørgen Stamp
  • 23. What is Metadata? Slide source CC BY EUDAT (2016) | Photo: CC-BY by Cea+ http://www.flickr.com/photos/centralasian/8071729256 Metadata, literally “data about data” are an essential component of a data management system, describing such aspects as the “what, where, when, who and how” pertaining to a resource. ‹#›
  • 24. Why metadata? In general, metadata should allow a prospective end user of data to: 1. identify/discover its existence, 2. learn how to access or acquire the data, 3. understand its fitness-for-use, 4. learn how to transfer (obtain a copy of) the data, and 5. learn how the data should be used. Photo CC BY-SA Jennifer Fagan-Fry (NOAA) | GBIF Metadata Profile (2011) https://github.com/gbif/ipt/wiki/GMPHowToGuide
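As a rough illustration of the "what, where, when, who and how" idea, the sketch below models a metadata record and reports which of those aspects are still undocumented. The class `DatasetMetadata`, its field names, the helper `missing_fields`, and the sample values are all hypothetical and chosen only for this example; they do not follow the GBIF Metadata Profile or any other real schema.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record (not a real metadata standard)."""
    what: str   # what the dataset contains
    where: str  # geographic coverage
    when: str   # temporal coverage
    who: str    # creator or contact
    how: str    # sampling or collection method


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Return the names of fields that are empty or contain only whitespace."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Illustrative values only; the "how" aspect is deliberately left blank.
example = DatasetMetadata(
    what="Bird observations",
    where="Trondheim, Norway",
    when="2019",
    who="Example Museum",
    how="",
)
print(missing_fields(example))
```

A check of this kind could run before a dataset is published, so that gaps in the description are caught while the people who collected the data can still fill them in.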
  • 25. Data entropy Illustration from: The Loss of Information about Data (Metadata) Over Time, Michener et al., 1997
  • 26. What is a «data paper»? A data paper is a peer-reviewed document describing a dataset, published in a peer-reviewed journal. It takes effort to prepare, curate and describe data. Data papers provide recognition for this effort by means of a scholarly article. • Get scholarly recognition for your datasets. • Promote and improve the fitness for reuse of research data. https://www.gbif.org/data-papers
  • 27. Data papers explained A data paper is a searchable metadata document, describing a particular dataset or a group of datasets, published in the form of a peer-reviewed article in a scholarly journal. Unlike a conventional research article, the primary purpose of a data paper is to describe data and the circumstances of their collection, rather than to report hypotheses and conclusions. GBIF has been working with partners in academic publishing to promote the data paper as a means of bringing credit and recognition to all those involved in data publication; to alert the scientific community to the existence of biodiversity datasets and the value they can bring to particular research projects; and as a mechanism for quality assessment and control of data accessible through GBIF and other networks. https://www.gbif.org/data-papers
  • 28. Why publish data papers? ● Improve the usability (fitness for use) of your published data! ● Receive credit through indexing and citation of the published paper. ● Increase the visibility and credibility of data resources you publish. ● Track more efficiently the use and citations of your data resources. ● Receive feedback and peer-review on your dataset. ● Improve the quality of your data resources. ● Increase your network of collaborators. ● Get more out of your data resources. ● Promote your openly published datasets.
  • 29. Why publish data papers? Authoring clear, informative metadata is an essential step if biodiversity data are going to be discovered and used to inform research and decisions. This involves extra work, and data publishers need incentives to do it. In the absence of such incentives, too many datasets are published with poorly-documented metadata or, worse still, no metadata at all. Data papers help to overcome barriers to authoring of metadata by providing clear acknowledgement of all those involved in the collection, management, curation and publishing of biodiversity data. https://www.gbif.org/data-papers
  • 30. By publishing a data paper, you will: Receive credit through indexing and citation of the published paper, in the same way as with any conventional scholarly publication, offering benefits to authors in terms of recognition and career building. Increase the visibility, usability and credibility of the data resources you publish. Track more effectively the usage and citations of the data you publish. https://www.gbif.org/data-papers
  • 32. DATA CLEANING SKILLS Corrected in GBIF in April 2013
  • 35. "We are increasingly relying on machines that derive conclusions from models that they themselves have created, models that are often beyond human comprehension, models that “think” about the world differently than we do" (David Weinberger 2017).
  • 36. Scientist versus machine Singularity estimated to arrive in 2045, 26 years from now (Kurzweil 2005)
  • 37. The future is already here — it's just not very evenly distributed. William Gibson Will our data start watching us?
  • 38. Who will our students compete with in the future job market?