Building the
FAIR Research Commons:
A Data Driven Society of Scientists
Professor Carole Goble CBE FREng FBCS
The University of Manchester, UK
carole.goble@manchester.ac.uk
FAIR
Research
Commons
Symposium: The Future of a Data-Driven Society, Maastricht University, 25 Jan 2018
Data-Driven Science
Simulations, data exploration, data processing, analytics, text mining,
visual analytics, automated inference….
e-Science:
enabling Data Driven Science
e-Infrastructure:
enabling e-Science
Distributed computing
Data management, Catalogues
Virtual Research Environments
Metadata & Semantic Web technologies
Software Engineering Products and Services
Collaboration, Sharing & Publishing Platforms
Open
Science
Open Data
Reproducible Science
Personally Productive Science
“The FAIR Guiding Principles for scientific data management and stewardship”
Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
Technical: Principles, Metadata, Identifiers, Access policies
Political, Social, Economic: A Flag, A Meme
The Future of a Data-Driven Society
A Society of Scientists
Do Data Driven Science
Data Driven Scholarship
Data contributors,
curators, consumers
Biodiversity Scientists +
Research Infrastructure Techies
Project Teams … of Individuals
Collaborating and Competing Simultaneously
Knowledge Turning
Increase Flow of Information
• Across scattered resources, platform, people
• Coordination, collaboration
• Cumulative, Dynamic
[original figure: Josh Sommer]
Cumulative
Commons
Goble, De Roure, Bechhofer, Accelerating Knowledge Turns, IC3K, 2013, isbn: 978-3-642-37186-8
• Distributed, Fragmented, Siloed
• No single entry point
• Living software, models, data, catalogues, tools …
What’s the Commons?
Resources
• collectively created
• owned or shared
• between or among a
community
Governance
https://scholarlycommons.org/
Macro, Micro*, pooled
• public resources
• data centres
• journals
• dedicated projects
• governance
• majority of
researchers
• labs & universities
• generators
• my resources
*Meso too – but too complicated for 20 minutes! See
http://www.knowledge-exchange.info/event/ke-approach-open-scholarship
Some Data-driven Predictive Science
in Ecological Niche Modelling
predatory fish
the grazer endemic alga
[Obst, Leidenberger]
BioSTIF
Do Research
Research Infrastructure
Services
Assemble
Methods, Materials Experiment
Observe Simulate
Analyse
Results
Quality
Assessment
Track and Credit
Disseminate
Deposit &
Licence
Marketplace
Services
Publish
Share
Results
Any
research
product
Selected
products
Manage
Results
The Data-Driven Open Science
Public + Personal Commons
Science 2.0 Repositories: Time for a Change in Scholarly Communication Assante, Candela, Castelli, Manghi, Pagano, D-Lib 2015
“The questions don’t change but the
answers do” Dan Reed, Microsoft
Salami Slicing, Scattering
101 Innovations in Scholarly Communication - the Changing Research Workflow, Bosman and Kramer, 2015,
http://figshare.com/articles/101_Innovations_in_Scholarly_Communication_the_Changing_Research_Workflow/1286826
Research
Infrastructure
Services
Assemble
Methods, Materials Experiment
Observe Simulate
Analyse
Results
Quality
Assessment
Track and Credit
Disseminate
Deposit &
Licence
Marketplace
Services
Share
Results
Manage
Results
Building a FAIR Research Commons
Portable
Automated
Reproducible
Methods
Supporting
Collaborations
Science 2.0 Repositories: Time for a Change in Scholarly Communication
Assante, Candela, Castelli, Manghi, Pagano DOI: 10.1045/january2015-assante
Mesirov, J. Accessible Reproducible Research. Science
327(5964), 415-416 (2010)
Clear steps
Transparent
Comprehensible
Replicable
Logged
Accessible
Provenance
Standardised
Harmonised
Combined
Method
Materials
Variations X N
Repeat. Compare.
Log & Track
Provenance
Scale
Data-driven Science, Predictive Science
is Software-driven, Method-Driven
x
Data Science Analytics
Machine learning
Discovery, New algorithms
Data stewardship
Standardisation, Harmonisation,
Annotation and enrichment,
Maintaining access, preserving
Software stewardship
Updates, versions, porting
Prep & Processing
Data wrangling & curation
Instrument pipelines
Simulation sweeps
Method Commodities
Workflows ASAP
Automate, Scale, Abstract, Provenance
Taverna 14th Anniversary
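As a rough illustration of the Automate and Provenance parts of ASAP, the sketch below chains named steps and records what each one consumed and produced. It is a toy in plain Python, not Taverna or any other workflow system; `run_workflow` and the step names are hypothetical.

```python
def run_workflow(steps, value):
    """Run (name, function) steps in order and log each step's input and output."""
    provenance = []
    for name, func in steps:
        result = func(value)
        # One provenance record per step: what went in, what came out.
        provenance.append({"step": name, "input": value, "output": result})
        value = result
    return value, provenance


# Hypothetical two-step method: drop missing readings, then average them.
steps = [
    ("drop_missing", lambda xs: [x for x in xs if x is not None]),
    ("mean", lambda xs: sum(xs) / len(xs)),
]

result, log = run_workflow(steps, [1.0, None, 3.0])
for record in log:
    print(record["step"], "->", record["output"])
```

Because the log is produced by the runner rather than by the individual steps, the provenance comes as a side effect of automation instead of an extra chore for the scientist.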
Methods
techniques, algorithms, spec.
of the steps, models, versions,
robustness, statistical power …
Materials
datasets, parameters, thresholds,
versions, algorithm seeds, reference
datasets…
Instruments
tools, codes, services, scripts,
underlying libraries, versions,
workflows…
Laboratory
computational environment,
High performance access,
Operating system…
Data Instruments -> Data Scopes
Method Objects, fragile, updating ….
Maintain for Running
Document for Reading
Software is a first class member of
Data-driven Science
56% Of UK researchers
develop their own
research software
or scripts
73% Of UK researchers
have had no
formal software
engineering
training
Survey of researchers from 15 Russell Group universities conducted by SSI between August and October 2014.
406 respondents covering representative range of funders, discipline and seniority.
Goble, Better Software, Better Research, IEEE Internet Computing, doi: 10.1109/MIC.2014.88
De Roure, Goble, Software Design for Empowering Scientists, IEEE Software, doi: 10.1109/MS.2009.22
Research Software Engineers
National Capability
10th Anniversary
Workflow Commons
Groups
Social collaboration, credit and
citation around Research Objects
Replicate - Reproduce - Remix - Repurpose
Reuse – Repurpose – avoid Reinvent
FAIR Workflow
Research Object
Reproducibility, Portability, Repurpose
Repair. Preservation,
Executable Publishing
Metadata
Object
metadata, ontologies,
identifiers
Manifest
Provenance
Dependencies
Versions
Checklists
Annotations
Container
System
researchobject.org
Unbounded Objects
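To make the Research Object idea concrete, here is a minimal sketch of a manifest that aggregates resources together with provenance, dependencies, versions, a checklist and annotations. The field names and the functions `build_manifest`, `annotate` and `missing_items` are assumptions made for illustration; they do not reproduce the researchobject.org specifications.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical minimal-information checklist: keys every manifest should fill in.
REQUIRED = ("id", "version", "provenance", "aggregates")


def build_manifest(identifier, version, resources, dependencies, created_by):
    """Aggregate resources (path -> bytes) and their metadata into one manifest."""
    return {
        "id": identifier,
        "version": version,
        "provenance": {
            "createdBy": created_by,
            "createdOn": datetime.now(timezone.utc).isoformat(),
        },
        "dependencies": sorted(dependencies),
        "aggregates": [
            # A checksum lets a consumer verify that a resource is unchanged.
            {"path": path, "sha256": hashlib.sha256(content).hexdigest()}
            for path, content in sorted(resources.items())
        ],
        "annotations": [],
    }


def annotate(manifest, about, note):
    """Attach a free-text annotation to one aggregated resource."""
    manifest["annotations"].append({"about": about, "content": note})
    return manifest


def missing_items(manifest):
    """Return the checklist keys that are absent or empty in the manifest."""
    return [key for key in REQUIRED if not manifest.get(key)]
```

A consumer could call `missing_items` before reusing an object, which is one simple way to read the "Checklists" item on the slide.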
FAIR Methods, Different workflow systems
Living
Products
Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012
Don’t Publish, Release
Analogous to software
products and practices
rather than data or
articles
Agile Data-driven
Science
Treat ALL Products and
ALL Research Like Software
“evolving
manuscript”
Sir Mark Walport, Times Higher Education Supplement, 14 May 2015
Context
Relationships
Credit
Research Goods FAIR Exchange
Governance
Stewardship
Credit
Tracking
Lifecycles
Fixity…
Arxiv,
my Lab
myExperiment
GitHub,
Web Service myWebSite
bioModels.org,
openModeller
PubMed
Spreadsheet in
figshare
ArrayExpress,
BioSamples,
PRIDE, GBIF,
my Lab,
institutional
repository
Overlaying the
Research Commons
ecosystem
Unbounded
Composite
Living
Rots
Tracking, credit mining, comparison, auto-metadata,
blockchain, boundary objects….
1
3
2
A FAIR Knowledge Web of Research Objects
Map across metadata
Threaded publications
Navigate, Pivot-Focus, Cite
Self-describing
Unit for Reproducibility / Productivity, Portability,
Preservation, Executable Publishing
researchobject.org
Bechhofer et al (2013) Why linked data is not enough for scientists, https://doi.org/10.1016/j.future.2011.08.004
Bechhofer et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge, https://eprints.soton.ac.uk/268555/
Linked Data / Semantic Web
FAIR machine processable metadata
Standards-based generic
metadata framework
Provenance
Dependencies
Versions
Checklists
Annotations
The time is right …
Reproducible Document
Stack project
Social
Technology Process
Purpose
Publishers, Research
Infrastructures, Communities,
Library services, Agencies ….
Not Joe Public….
Research
Infrastructure
Services
Assemble
Methods, Materials Experiment
Observe Simulate
Analyse
Results
Quality
Assessment
Track and Credit
Disseminate
Deposit &
Licence
Marketplace
Services
Share
Results
Manage
Results
Building a FAIR Research Commons
Portable
Automated
Reproducible
Methods Supporting collaborations
to make & exchange FAIR
content
Systems Biology Projects
• SME multi-disciplinary groups
• Multi-site collaborations
• Competing
• Experimentalists, dry modellers
• Self-deposit, no stewardship skills
• Funder driven sharing
modellers
experimentalists
Build a Project Commons!!
• Foster stewardship
• Stimulate sharing
• Ensure retention
• Respect global community,
local project resources
http://fair-dom.org Wolstencroft et al., Nucleic Acids Research, 2016, doi: 10.1093/nar/gkw1032.
3 Studies
Model analysis,
construction, validation
24 Assays/Analysis
Simulations,
characterisations
16
19
13
2
1
Structured organisation
Retain context in one place, Release FAIR products
Use and deposit in the fragmented resources [Penkler, Snoep]
FAIRDOMHub Systems Biology Commons
http://fairdomhub.org
Distributed Commons, Integrated View
“During and within” publishing
Simulate
Compare
Validate
10th Anniversary
What methods are being used to
determine enzyme activity?
What SOP was used for this
sample?
Where is the validation data for this
model?
Is there any group
generating kinetic data?
Is this data available?
Track versions of my model
What's the relationship between the
data and model?
Which data belong to
which publications?
Self-controlled spaces
• enclaves -> public
Discover own assets
One entry point
• over external systems
Project Pals
Post-docs, Postgrads,
Data stewards
Building the Commons so they Come
The Programme Funders
Stewardship
Support
The Tragedy of the Commons? FAIR Play?
Values
of assets
of reproducibility
of metadata
economics of infras.
priorities
Behaviours
enclave sharing
hoarding, flirting, voyeurism
consumer-producer asymmetry
playground rules
Sweatshop
collaborating but competing
burden - time, skills
short term, shortcuts
principal investigators
tools & templates
seamless join-up
automation, stewards
reprod. debt is hard
The last mile
Self
Retention, Access
Productivity
Quick, Lightweight
Simple
Short Term
Credit
Trusted & Free
Just Enough
Skills?
Service
Sharing
Reproducibility
Accurate, Reusable
Rich
Long Term
Credit
Sustained
Just in Case
Stewards
Pushing FAIR upstream
“Sloppy Science Wins”
John Ioannidis,
Stanford School of Medicine
Open Science Fair, Athens 2017
Social
Technology Process
Professional
Stewardship
Ramps
Defeating
Cultural Inertia
Overcoming
The Tragedy of the Commons
Paying for it
By Side Effect
By side effect – metadata for FAIR
Universal tagging of Life
Science datasets, tools,
protocols, training materials
Web scale knowledge graph
Embedded ontologies and
metadata templates
Metadata harvesting by
stealth
https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/
Ask what you and Data Science
can do for the FAIR Commons
Building the
FAIR Research Commons:
A Data Driven Society of Scientists
Release FAIR
Research Objects
Manage
Datascopes
FAIR play incentives
FAIR
Research
Commons
All the members of the Wf4Ever team
Colleagues in Manchester’s
Information Management Group,
ELIXIR-UK, Bioschemas
http://www.researchobject.org
http://www.myexperiment.org
http://wf4ever.org
http://www.fair-dom.org
http://www.fairdomhub.org
http://seek4science.org
http://rightfield.org.uk
http://www.bioschemas.org
http://www.commonwl.org
http://www.bioexcel.eu
http://www.openphacts.org
https://www.force11.org/
Mark Robinson
Alan Williams
Jo McEntyre
Norman Morrison
Stian Soiland-Reyes
Paul Groth
Tim Clark
Alejandra Gonzalez-Beltran
Philippe Rocca-Serra
Ian Cottam
Susanna Sansone
Kristian Garza
Daniel Garijo
Catarina Martins
Alasdair Gray
Rafael Jimenez
Iain Buchan
Caroline Jay
Michael Crusoe
Katy Wolstencroft
Barend Mons
Sean Bechhofer
Matthew Gamble
Raul Palma
Jun Zhao
Josh Sommer
Matthias Obst
Jacky Snoep
David Gavaghan
Stuart Owen
Finn Bacall
Paolo Missier
Phil Crouch
Oscar Corcho
Dan Katz
Arfon Smith
David De Roure
Marco Roos
Massimiliano Assante
Paolo Manghi
EXTRAS
HIDDEN SLIDES
