The People Behind
Research Software
crediting from the informatics,
technical point of view
Professor Carole Goble,
University of Manchester, UK
Software Sustainability Institute UK
ELIXIR, ISBE, FAIRDOM
Views are my own
Science Europe LEGS Committee: Career Pathways in Multidisciplinary Research:
How to Assess the Contributions of Single Authors in Large Teams, 1-2 Dec 2015, Brussels.
Team Science: Ego-System
• Experimental scientists
• Theoretical scientists
• Modellers
• Social scientists
• Computer scientists
• Computational Scientists
• Scientific informaticians
• Specialist Tool developers
• Research Software Engineers
• Data engineers and curators
• Service & resource providers
• Infrastructure developers
• System Administrators
Many software, services
and public data resources
are team based
collaborations
Service vs Science in Projects
teams within teams
Biologists
Software frameworks
Tools, Infrastructure
Data platforms
Public data archives
Bioinformaticians
Comp Biologists
Local data curators
Informatics contribution to team
Reputation, Recognition, Productivity, Respect
Contribution to the informatics
– Technical publications in their own right
– Software publications: citation proxies
• Fossilise snapshot of authors as
contributors
– Specific code and curation tracking
– Usage metrics (downloads, reuse)
– Comp Sci - Conferences matter
– IMPACT
Compound, collaborative, living nature of
data and software
Acknowledgement by research teams
– “We are not the janitors” It’s not “free”.
– The Craftsmen of Science
– Careers, credibility and sustainability
– Recognised career role of Research Software
Engineer and BioCurator
– Recognition of professionalism, software and
data quality.
– Reward for LABOUR.
Informatics contribution to team
Reputation, Recognition, Productivity, Respect
*Survey of researchers from 15 UK Russell Group universities conducted by SSI between August - October
2014. 406 respondents covering representative range of funders, discipline and seniority.
Crediting informatics and data folks in life science teams
Credit
Biologists
Bioinformaticians
Cite
Local tool providers
Public data set providers
Service vs Science
Background vs Foreground
Data [and software] in the
foreground are most likely to be cited.
The same data [and software] viewed
as background are not explicitly
cited, though equally essential
Wynholds, et al (2012) Data, data use, and scientific inquiry: two case studies of data practices
10.1145/2232817.2232822
25% of publications that used
the public ArrayExpress
archive cited it*
The invisibility of software
esp software that is widely
used, infrastructural,
components or cross-discipline
*Rung, Brazma. Reuse of public genome-wide gene expression data. Nature Reviews Genetics 2012
What is a Team? Credit drift
Immediate
team
Background
team
“Foreground”
informatics
Authorship Authorship?
Cited?
Acknowledged
Cited?
Mentioned
Ignored
“Background”
informatics
Cited
The Currency of Recognition
Person Career
Peers
Funders
Institutions
Public
Resource Sustainability
Software mentions in the
biology literature (90 articles)
Howison and Bullard 2015 The visibility of software in the scientific literature: how
do scientists mention software and how effective are those mentions? J Assoc for
Info Science and Technology DOI: 10.1002/asi.23538
37% citations formal
87% software could be found
informal mentions very common
-> poor at providing crediting information
18% software author offered preferred citation
-> 32% who cited it ignored it
24% journals had a citation policy
Legal license
attribution
obligations
ignored
Team reciprocity rules
Download and Go. No.
Jam for Everyone.
sciencecodemanifesto.org
1. Software and Data Research Objects
into the Publishing Workflow
informal
mentions
replaced by
formal
http://ivory.idyll.org/blog/2015-authorship-on-software-papers.html
*http://arxiv.org/pdf/1407.5117v3.pdf
• Research Object-specific credit models
– Software, data, models….
– Credit based on use: downloads, reusability, reuse, FAIR
• Contribution: Credit distribution, propagation, dividends
– Transitive credit maps (Katz and Smith)*, CRediT**
• Use: Credit trajectories: tracing, tracking, mining
– Recovery from literature, identifier and provenance infrastructure,
standards, data/software level metrics services (DataCite),
repositories, machine readable and processable metadata.
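The credit distribution and propagation idea above can be illustrated with a minimal sketch. It assumes a simplified reading of transitive credit: every product has a credit map whose weights sum to 1, and any share given to another product is passed on through that product's own map until it reaches people. The product names, people and weights are hypothetical, and passing a dependency's whole share on to its contributors is a simplification for illustration, not the scheme as specified by Katz and Smith.

```python
from collections import defaultdict

# Hypothetical credit maps: product -> {contributor: fractional weight}.
# A contributor is either a person or another product with its own map.
CREDIT_MAPS = {
    "paper": {"alice": 0.5, "bob": 0.25, "analysis-tool": 0.25},
    "analysis-tool": {"carol": 0.75, "core-library": 0.25},
    "core-library": {"dave": 1.0},
}


def transitive_credit(product, credit_maps, weight=1.0, totals=None, path=()):
    """Propagate credit for `product` down to the people behind it."""
    if totals is None:
        totals = defaultdict(float)
    if product in path:
        # A product that (indirectly) credits itself would recurse forever.
        raise ValueError(f"cyclic credit map at {product!r}")
    for contributor, share in credit_maps[product].items():
        amount = weight * share
        if contributor in credit_maps:
            # The contributor is itself a product: pass its share onwards.
            transitive_credit(contributor, credit_maps, amount, totals,
                              path + (product,))
        else:
            # The contributor is a person: accumulate their credit.
            totals[contributor] += amount
    return dict(totals)


if __name__ == "__main__":
    for person, credit in transitive_credit("paper", CREDIT_MAPS).items():
        print(f"{person}: {credit}")
```

In this toy example the developers of the tool and of the library it depends on receive a share of the paper's credit even though they are not authors, which is the kind of "background" contribution the slides argue goes uncredited.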
3. Credit
networks &
credit currency
**http://casrai.org/CRediT
http://depsy.org/
2. Stop conflating credit with
Authorship
Contribution
Roles
Usage
Liz Allen: CRediT
4. Research units and credit models
that reflect software
Not Publish. Release paradigm. Portfolio paradigm.
Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012
Evolving Multi-stewarded
Multi-authored
Multi-platform
Reproducible
Executable papers
Connected
Body of work
Compound, Aggregated
https://dx.doi.org/10.1111/febs.13237
https://doi.org/10.15490/seek.1.investigation.56
http://www.fair-dom.org
An “evolving manuscript” would begin with a pre-publication,
pre-peer review “beta 0.9” version of an
article, followed by the approved published article itself, […]
“version 1.0”.
Subsequently, scientists would update this paper with
details of further work as the area of research develops.
Versions 2.0 and 3.0 might allow for the “accretion of
confirmation [and] reputation”.
Ottoline Leyser […] assessment criteria in science revolve
around the individual. “People have stopped thinking
about the scientific enterprise”.
http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
Ramps vs Revolutions
Technical ramps
• Machinery, tools, platforms,
repositories
Process ramps
• Research processes and
Publisher workflows
Social ramps
• Rules and policies
• Adoption by stakeholders
– interventions & automations
• Recognition by stakeholders
Credit is like love not money
Citations and across discipline boundaries.
Within discipline more like dividends.
All research products and all scholarly
labour are equally valued
(except by institutional promotion,
funding review and REF committees)
Public software and data resources
are not free.
Stewardship costs and needs crediting
Publishers adapt to “Publications”
that are dynamic Research Objects
(still need to snapshot)
http://www.software.ac.uk/software-credi
https://www.force11.org/group/software-citation-working-group
Links
• FAIRDOM
– http://www.fair-dom.org
• SEEK Platform
– http://www.seek4science.org
• Research Objects
– http://www.researchobject.org
• Software Sustainability Institute
– http://www.software.ac.uk
• Software Carpentry
– http://www.software-carpentry.org
• Force11
– http://www.force11.org
