What is Reproducibility? The R* brouhaha and how Research Objects can help

Professor Carole Goble
The University of Manchester, UK
Software Sustainability Institute, UK
ELIXIR-UK, FAIRDOM Association e.V.
carole.goble@manchester.ac.uk
First International Workshop on Reproducible Open Science @ TPDL, 9 Sept 2016, Hannover, Germany

Acknowledgements
• Dagstuhl Seminar 16041, January 2016
– http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16041
• ATI Symposium Reproducibility, Sustainability and Preservation, April 2016
– https://turing.ac.uk/events/reproducibility-sustainability-and-preservation/
– https://osf.io/bcef5/files/
• C. Titus Brown
• Juliana Freire
• David De Roure
• Stian Soiland-Reyes
• Barend Mons
• Tim Clark
• Daniel Garijo
• Norman Morrison

“When I use a word," Humpty Dumpty
said in rather a scornful tone, "it means
just what I choose it to mean - neither
more nor less.”
Carroll, Through the Looking Glass
re-compute
replicate
rerun
repeat
re-examine
repurpose
recreate
reuse
restore
reconstruct
review
regenerate
revise
recycle
redo
robustness
tolerance
verification
compliance
validation
assurance
remix

Reproducibility of
Reproducibility Research

Computational Science
http://tpeterka.github.io/maui-project/
From: The Future of Scientific Workflows, Report of DOE Workshop 2015,
http://science.energy.gov/~/media/ascr/pdf/programdocuments/docs/workflows_final_report.pd
1. Observational,
experimental
2. Theoretical
3. Simulation
4. Data intensive

Goals of scientific publications:
(i) announce a result
(ii) convince readers it is correct.
Papers in experimental science
should describe the results and
provide a clear enough protocol to
allow successful repetition and
extension.
Papers in computational science
should describe the results and
provide the complete software
development environment, data
and set of instructions which
generated the figures.
Virtual Witnessing*
*Leviathan and the Air-Pump: Hobbes, Boyle, and the
Experimental Life (1985) Shapin and Schaffer.
Jill Mesirov
David Donoho

Datasets, Data collections
Standard operating procedures
Software, algorithms
Configurations,
Tools and apps, services
Codes, code libraries
Workflows, scripts
System software
Infrastructure
Compilers, hardware
Systems of
Systems
Heterogeneous hybrid
patchwork of tools and
service evolving over time

10 “Simple” Rules for Reproducible
Computational Research: RACE
1. For Every Result, Keep Track of How It Was
Produced
2. Avoid Manual Data Manipulation Steps
3. Archive the Exact Versions of All External
Programs Used
4. Version Control All Custom Scripts
5. Record All Intermediate Results, When Possible in
Standardized Formats
6. For Analyses That Include Randomness, Note
Underlying Random Seeds
7. Always Store Raw Data behind Plots
8. Generate Hierarchical Analysis Output, Allowing
Layers of Increasing Detail to Be Inspected
9. Connect Textual Statements to Underlying
Results
10. Provide Public Access to Scripts, Runs, and
Results
Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible
Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
Record Everything
Automate Everything
Contain Everything
Expose Everything
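Rules 1 and 6 (keep track of how a result was produced; note random seeds) can be sketched in a few lines of Python. This is a minimal illustration, not a method from the talk: the function names, the toy bootstrap analysis, and the fields of the record are all assumptions chosen for the example.

```python
import hashlib
import json
import platform
import random
import sys


def run_analysis(values, seed):
    """Toy stochastic analysis (hypothetical): mean of one bootstrap resample."""
    rng = random.Random(seed)  # Rule 6: the seed is explicit, never implicit
    resample = [rng.choice(values) for _ in values]
    return sum(resample) / len(resample)


def run_with_record(values, seed):
    """Run the analysis and return (result, record of how it was produced)."""
    result = run_analysis(values, seed)
    record = {  # Rule 1: enough context to trace the result back
        "function": run_analysis.__name__,
        "seed": seed,
        "input_sha256": hashlib.sha256(json.dumps(values).encode()).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "result": result,
    }
    return result, record


data = [2.0, 3.5, 5.0, 7.5]  # made-up input values
first, record = run_with_record(data, seed=42)
second, _ = run_with_record(data, seed=42)

print(json.dumps(record, indent=2))
print("same seed, same result:", first == second)
```

Storing the record next to the output ("Record Everything") is what lets a later reader rerun the same function with the same seed and input.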

Preparation pain
independent testing trials and tribulations
[Norman Morrison]
replication hostility no funding, time, recognition, place to publish
resource intensive access to the complete environment

Lab Analogy: Witnessing “Datascopes”
Input Data
Software
Output Data
Config
Parameters
Methods
techniques, algorithms,
spec. of the steps, models
Materials
datasets, parameters,
algorithm seeds
Instruments
codes, services, scripts,
underlying libraries,
workflows, ref resources
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
computational environment

“Micro” Reproducibility
“Macro” Reproducibility
Fixity
Validate
Verify
Trust

Repeat, Replicate, Robust
[C. Titus Brown]
https://2016-oslo-repeatability.readthedocs.org/en/latest/repeatability-discussion.html
Why the differences?
Reproduce, Trust

“an experiment is reproducible until
another laboratory tries to repeat it”
Alexander Kohn
Repeatability:
“Sameness”
Same result
1 Lab
1 experiment
Reproducibility:
“Similarity”
Similar result
> 1 Lab
> 1 experiment
Validate
Verify
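The "sameness" versus "similarity" distinction can be made concrete as two different comparison tests. The sketch below is only an illustration under assumptions: the function names, the 5% relative tolerance, and the measurement values are invented for the example, and what counts as "similar" is a domain decision.

```python
import math


def is_repeat(original, rerun):
    """Repeatability as 'sameness': same lab, same experiment, identical values."""
    return original == rerun


def is_reproduction(original, independent, rel_tol=0.05):
    """Reproducibility as 'similarity': values agree within a chosen tolerance.

    rel_tol=0.05 is an arbitrary illustrative threshold.
    """
    return len(original) == len(independent) and all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(original, independent)
    )


lab_a = [0.512, 1.204, 3.331]        # made-up original measurements
lab_a_rerun = [0.512, 1.204, 3.331]  # same lab, same set-up
lab_b = [0.520, 1.190, 3.300]        # independent lab, independent set-up

print("repeat (same lab):       ", is_repeat(lab_a, lab_a_rerun))
print("repeat (other lab):      ", is_repeat(lab_a, lab_b))
print("reproduction (other lab):", is_reproduction(lab_a, lab_b))
```

Here the second lab fails the sameness test but passes the similarity test, which is the outcome the slide associates with reproducibility across more than one lab.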

Method
Reproducibility
the provision of
enough detail about
study procedures and
data so the same
procedures could, in
theory or in actuality,
be exactly repeated.
Result Reproducibility
(aka replicability)
obtaining the same
results from the
conduct of an
independent study
whose procedures are
as closely matched to
the original experiment
as possible
What does research reproducibility mean? Steven N. Goodman, Daniele Fanelli, John
P. A. Ioannidis. Science Translational Medicine 8 (341), 341ps12.
[doi: 10.1126/scitranslmed.aaf5027]
http://stm.sciencemag.org/content/scitransmed/8/341/341ps12.full.pdf

Productivity
Track differences
Validate
Verify

reviewers want additional work
statistician wants more runs
analysis needs to be repeated
post-doc leaves,
student arrives
new/revised datasets
updated/new versions of
algorithms/codes
sample was contaminated
better kit - longer simulations
new partners, new projects
Personal & Lab
Productivity
Public Good
Reproducibility

“Datascope” Lab Analogy
Methods
Materials
algorithm seeds
Instruments
Laboratory
systems software,
Form
Function

“Datascope” Practicalities
Methods
Materials
algorithm seeds
Instruments
Laboratory
systems software,
Living Dependencies
Science,
methods,
datasets
questions stay,
answers change
breakage, labs
decay, services and
techniques come
and go, new
instruments,
updated datasets,
services, codes,
hardware
One offs, streams,
stochastics,
sensitivities,
scale, non-portable
data
black boxes
supercomputer
access
non-portable
software
licensing restrictions
unreliable resources
black boxes
complexity

T1 T2
evolving ref datasets,
new simulation codes
Environment
Archived vs Active
Contained vs Distributed
Regimented vs Free-for-all
Who owns the dependencies?
Dependencies -> Manage
Black boxes -> Expose
Dynamics -> Fixity
Reliability
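One way to turn "Dynamics -> Fixity" into practice is to record checksums of the materials at time T1 and compare them at T2. The following is a minimal sketch under assumptions: the helper names and the sample file are hypothetical, and SHA-256 is simply one reasonable choice of digest.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(directory):
    """Map each file under `directory` (relative path) to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_since(directory, recorded):
    """List files that were added, removed, or modified since `recorded`."""
    current = snapshot(directory)
    return sorted(
        name
        for name in set(current) | set(recorded)
        if current.get(name) != recorded.get(name)
    )


with tempfile.TemporaryDirectory() as workdir:
    ref = Path(workdir) / "reference.csv"       # hypothetical reference dataset
    ref.write_text("gene,count\nabc,10\n")
    at_t1 = snapshot(workdir)                   # fixity record taken at T1
    unchanged = changed_since(workdir, at_t1)
    ref.write_text("gene,count\nabc,12\n")      # the dataset evolves by T2
    drifted = changed_since(workdir, at_t1)

print("changed right after T1:", unchanged)
print("changed at T2:", drifted)
```

A checksum record does not stop a reference dataset from evolving; it only makes the change detectable, which is the precondition for managing it.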

Replicate harder than Reproduce?
Repeating the experiment or the set up?
Container Conundrum
Results Will Vary
Replicability Window
All experiments become less replicable over time
Prepare to repair

Levels of Computational Reproducibility
Coverage: how much of an experiment is reproducible
Original Experiment, Similar Experiment, Different Experiment
Portability
Depth: how much of an experiment is available
– Binaries + Data
– Source Code / Workflow + Data
– Binaries + Data + Dependencies
– Source Code / Workflow + Data + Dependencies
– Virtual Machine: Binaries + Data + Dependencies
– Virtual Machine: Source Code / Workflow + Data + Dependencies
– Figures + Data
[Freire, 2014]
Minimum: data and source code available under terms that permit inspection and execution.

Measuring Information Gain from Reproducibility
Research goal
Method/Alg.
Platform/Exec Env
Data Parameters
Input data
Actors
Information Gain
Implementation/Code
No change
Change
Don’t care
https://linkingresearch.wordpress.com/2016/02/21/dagstuhl-seminar-report-reproducibility-of-data-oriented-experiments-in-e-scienc/
http://www.dagstuhl.de/16041
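The change / no change / don't care labelling can be sketched as a comparison of two experiment descriptions along the dimensions listed above. This is a hedged illustration only: the dictionary keys are assumed spellings of the slide's labels, the example values are invented, and the sketch does not attempt to quantify the information gained from any particular combination.

```python
# Dimension names follow the slide's labels; the exact keys are assumptions.
DIMENSIONS = [
    "research_goal",
    "method",
    "implementation",
    "platform",
    "data_parameters",
    "input_data",
    "actors",
]


def compare_runs(original, rerun, dont_care=()):
    """Label each dimension as 'no change', 'change', or "don't care"."""
    labels = {}
    for dim in DIMENSIONS:
        if dim in dont_care:
            labels[dim] = "don't care"
        elif original.get(dim) == rerun.get(dim):
            labels[dim] = "no change"
        else:
            labels[dim] = "change"
    return labels


# Made-up descriptions of an original run and a rerun elsewhere.
original = {
    "research_goal": "estimate expression levels",
    "method": "bootstrap mean",
    "implementation": "analysis.py v1.0",
    "platform": "lab cluster",
    "data_parameters": {"resamples": 1000},
    "input_data": "counts-2016-01.csv",
    "actors": "original authors",
}
rerun = dict(original, platform="cloud VM", actors="independent group")

labels = compare_runs(original, rerun, dont_care=("actors",))
for dim, label in labels.items():
    print(f"{dim:16} {label}")
```

In this example only the platform changes, so agreement between the two runs would say something about portability and little about the method itself.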

How? Preserve by Reporting, Reproduce by Reading
Archived
Record
Description Zoo
standards, common metadata

How? Preserve by Maintaining, Repairing, Containing
Reproduce by Running, Emulating, Reconstructing
Active Instrument Byte level Buildability Zoo

provenance
portability, preservation
robustness, versioning
access description
standards
common APIs
licensing, identifiers
standards,
common metadata
change
variation sensitivity
discrepancy handling
packaging, containers
FAIR RACE Reproducibility Dimensions
dependencies
steps

Research Object
Standards-based metadata framework for logically and
physically bundling resources with context,
http://researchobject.org
Bigger on the inside than the outside
external referencing

Manifest
Construction
Aggregates
link things together
Annotations
about things & their
relationships
Container
Research Object Standards-based metadata framework for logically
and physically bundling resources with context, http://researchobject.org
Packaging content & links:
Zip files, BagIt, Docker images
Catalogues & Commons Platforms:
FAIRDOM
Manifest
Description
Dependencies
what else is
needed
Versioning
its evolution
Checklists
what should
be there
Provenance
where it
came from
Identification
locate things
regardless where
id
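The manifest ideas above (aggregates that link things together, annotations about things, and a checklist of what should be there) can be sketched as a small data structure. The field names, resource paths, and annotation kinds below are hypothetical and chosen for illustration; they are not the schema defined by the Research Object specifications.

```python
import json


def build_manifest(identifier, resources, annotations):
    """Assemble an illustrative manifest: what is aggregated, and what is said about it."""
    return {
        "id": identifier,
        # Aggregates: local files and external references alike.
        "aggregates": [{"uri": uri} for uri in resources],
        # Annotations: statements about aggregated things.
        "annotations": [
            {"about": about, "kind": kind, "value": value}
            for about, kind, value in annotations
        ],
    }


def checklist_gaps(manifest, required_kinds):
    """Checklist: which required kinds of annotation are missing?"""
    present = {annotation["kind"] for annotation in manifest["annotations"]}
    return sorted(set(required_kinds) - present)


# All identifiers and values below are made up for the example.
manifest = build_manifest(
    "urn:example:research-object-1",
    [
        "data/input.csv",
        "workflow/analysis.cwl",
        "https://example.org/reference-dataset",  # external referencing
    ],
    [
        ("workflow/analysis.cwl", "version", "1.2.0"),
        ("data/input.csv", "provenance", "derived from raw/run1.csv"),
    ],
)
gaps = checklist_gaps(manifest, ["provenance", "version", "dependencies"])

print(json.dumps(manifest, indent=2))
print("missing from checklist:", gaps)
```

Because the aggregated resources are referenced by URI, the object can point at things it does not physically contain, which is one reading of "bigger on the inside than the outside".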

Systems Biology
Commons
• Link data, models
and SOPs
• Standards
• Span resources
• Snapshot + DOIs
• Bundle and export
• Logical bundles

Belhajjame et al (2015) Using a suite of ontologies for preserving workflow-centric research objects,
J Web Semantics doi:10.1016/j.websem.2015.01.003
application/vnd.wf4ever.robundle+zip
Workflow Research Objects
exchange, portability and
maintenance
*https://2016-oslo-repeatability.readthedocs.org/en/latest/overview-and-agenda.html
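As a rough sketch of the ZIP packaging option, the following writes a bundle that declares the media type shown above. It assumes the bundle layout as commonly described for RO Bundles (an uncompressed `mimetype` entry first, and a manifest at `.ro/manifest.json`); the helper name and the file contents are hypothetical, and the specification should be consulted before relying on these details.

```python
import io
import json
import zipfile

MEDIA_TYPE = "application/vnd.wf4ever.robundle+zip"


def write_bundle(target, files, manifest):
    """Write an illustrative bundle: mimetype first, then manifest, then content."""
    with zipfile.ZipFile(target, "w") as bundle:
        # Assumed convention: 'mimetype' is the first entry and is stored uncompressed.
        bundle.writestr("mimetype", MEDIA_TYPE, compress_type=zipfile.ZIP_STORED)
        # Assumed manifest location inside the bundle.
        bundle.writestr(
            ".ro/manifest.json",
            json.dumps(manifest, indent=2),
            compress_type=zipfile.ZIP_DEFLATED,
        )
        for name, content in files.items():
            bundle.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)


buffer = io.BytesIO()  # in-memory target, so nothing is written to disk
write_bundle(
    buffer,
    {"data/input.csv": "gene,count\nabc,10\n"},  # made-up content
    {"aggregates": [{"uri": "data/input.csv"}]},  # minimal illustrative manifest
)

with zipfile.ZipFile(buffer) as bundle:
    names = bundle.namelist()
    declared = bundle.read("mimetype").decode()

print(names)
print(declared)
```

Putting the media type in a fixed first entry lets a tool recognise the archive as a research object without parsing the manifest.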

Asthma Research e-Lab
Dataset building and
releasing
Standardised
packing of Systems
Biology models
European Space
Agency RO Library
Large dataset
management for life
science workflows
LHC ATLAS
experiments
Notre Dame
U Rostock
Encyclopedia of DNA
Elements
PeptideAtlas

Words matter.
Reproducibility is not an end.
It's a means to an end.
Beware reproducibility zealots.
50 Shades of Reproducibility.
form vs function
A conundrum:
big co-operative data-driven
science makes reproducibility
desirable but also means
dependency and change are to be
expected.
Lab analogy for
computational science
