Is the current measure of
excellence perverting Science?
A Data deluge is coming, it is time to act
Lourdes Verdes-Montenegro
Instituto de Astrofísica de Andalucía (IAA-CSIC)
Session Organiser: William Garnier (SKAO)
(Submitter and Manager)
Theme #3 Science policy and transformation of research practice
ANY PROBLEM WITH THE SCIENTIFIC METHOD??
Scientific Reproducibility is a fundamental principle of the Scientific
Method, a process established in the 17th century that marked the
beginning of modern science and laid the foundations for the Philosophy
of Science.
We all agree, “reproducibility is great!”, right?
Then… What is the problem?
Paradoxically:
• part of the scientific community claims that reproducibility is already
achieved (because they have a section describing their methods in
their papers, or because they share their data)
• the remainder mostly consider it a utopia
Questionnaire on reproducibility (1500 scientists, 2016)
• 70% of researchers have tried and failed to
reproduce another scientist's experiments
• > 50% have failed to reproduce their own!
Failure rates by field, others' experiments (own experiments):
• Chemistry: 90% (60%)
• Biology: 80% (60%)
• Physics and engineering: 70% (50%)
• Medicine: 70% (60%)
• Earth and environmental science: 60% (40%)
ANY PROBLEM WITH THE SCIENTIFIC METHOD??
ACTUALLY, YES!
Aha! So you don’t empathise?
Maybe with this?
• Reproducibility (theoretical definition): An experiment/study is reproducible if an
external researcher could repeat the same procedures and confirm the results using the
same setup, input data and methods
• Reproducibility (in practice): requires sharing the input data, methods, setup parameters,
output data and results, and the computational environment, together with details on the
context and the links between the pieces of the experiment.
UHMMM, WHAT DO YOU MEAN BY REPRODUCIBILITY?
Moving from narratives
(last 300 yrs) to the
actual output of research
BTW: Reproducibility is
not the aim, it is the means
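The "in practice" definition above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a method proposed in the talk: the function names (`build_manifest`, `is_reproduced`), the manifest fields and the toy data are all hypothetical, and comparing outputs byte for byte is a deliberately strict check that real analyses would often relax to a numerical tolerance.

```python
import hashlib
import json
import platform
import sys


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest that pins one exact version of some bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(input_data: bytes, method: str, parameters: dict,
                   output_data: bytes, context: str) -> dict:
    """Bundle the pieces listed in the 'in practice' definition into one record.

    All field names here are hypothetical; they only mirror the list on the slide.
    """
    return {
        "input_data_sha256": sha256_of(input_data),    # input data
        "method": method,                              # methods (tool or script name)
        "parameters": parameters,                      # setup parameters
        "output_data_sha256": sha256_of(output_data),  # output data and results
        "environment": {                               # computational environment
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "context": context,                            # context linking the pieces
    }


def is_reproduced(manifest: dict, rerun_output: bytes) -> bool:
    """Strict check: the rerun must give byte-identical output to the recorded one."""
    return sha256_of(rerun_output) == manifest["output_data_sha256"]


# Toy usage with made-up data (not taken from any real experiment).
manifest = build_manifest(
    input_data=b"1.0,2.0,3.0\n",
    method="mean_of_column",
    parameters={"column": 0},
    output_data=b"mean=2.0\n",
    context="toy example for illustration only",
)
print(json.dumps(manifest, indent=2))
print(is_reproduced(manifest, b"mean=2.0\n"))
```

A narrative methods section records none of these fields in a machine-checkable way, which is the gap between the two definitions.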
PERSPECTIVES
Data to the desktop: “individual scientist”
• I have the best code, which I know how to use and can do special things
• I do not trust any “pipeline” that you made
• partly because I know better how to do it
• partly because I read the news and there is a reproducibility crisis
• well, and I can hardly reproduce the results of my own papers
some years later...
• In general I want full control of the software and of the computational
environment
PERSPECTIVES
Computation to data, providers' perspective: Data Centres
• We need to install your software on our platform. Can we trust it? Can we run it?
Environment, dependencies, etc.
• Hey, we are offering services to the community, computation + tools. We would
be grateful if you allow us to share it with other users (with proper credit)
• Mmmm, sharing is great, but putting the software on the platform is not
enough: you need to provide the context for people to be able to rerun the
software on the same or other data
Mandatory in the Era of
Megascience infrastructures
PERSPECTIVES
Large alliances of scientists to develop Key Science Projects
• We have tools to generate Advanced Data Products, and we will put them
where the storage and computation are (Data Centres)
• But... we put effort into it; what would we gain from making the *additional effort*
to make it reusable? If we do, we will pave the way for competitors
• Well, maybe we will share in 4 years' time (typical PhD duration)
Mandatory (as well) to analyse the data
deluge from this next generation of facilities
PERSPECTIVES
Publishers
• Will we need different profiles of referees to evaluate the scientific discussion
together with the data quality and the methods (a.k.a. reproducibility)?
• If the data and the methods (tools) are in Data Centres, will our referees need
to become "users" of the Data Centres to be able to validate a paper?
• Will we be able to engage as many referees as may be needed?
• Will we need to validate the data, the tools, and the scientific analysis separately?
The challenge of going “beyond the PDF”
PERSPECTIVES
Policy makers / funding agencies
• How to measure reproducibility?
• How to weight it and/or aggregate with other indicators?
• Is it affordable / sustainable?
Reproducibility as a key element of Open Science
METRICS
... “Science is being
killed by numerical
ranking,”[...] Ranking
systems lure scientists
into pursuing high
rankings first and good
science second
Productivity seems to prevail
over Discovery
Reproducibility
crisis
THE SQUARE KILOMETRE ARRAY
SKA1: 197 dishes + 125,000 dipoles
SKA: 2500 dishes + 500,000 dipoles
• 1000 scientists & engineers from > 270 institutions, > 20 countries
• 11 Member countries
THE SQUARE KILOMETRE ARRAY
The Challenge: Extraction of scientific knowledge
• Direct delivery to end users is unfeasible
• Internationally distributed scientific teams
SKA Regional Centres will provide access to SKA data products, tools and
processing power to generate and analyse Advanced Data Products
ARE WE READY?
• We are in a race to exploit ever larger datasets:
in our quest for “efficiency” we risk forgetting about reproducibility
• Unless we are ready to change the way in which we, the scientists, work, there
is no guarantee that the quality of Science will improve.
The era of Big Data is beginning across the sciences
Now is the time to ask what kind of research mega-science
infrastructures want to do in the future
Are we ready to take up that challenge?