Metrics & Citation
for Software (and Data)
Daniel S. Katz
dkatz@nsf.gov & d.katz@ieee.org
@danielskatz
Program Director, Division of
Advanced Cyberinfrastructure
(http://www.slideshare.net/danielskatz/metrics-citation-for-software-and-data)
Workshop on Supporting Scientific Discovery through
Norms and Practices for Software and Data Citation and
Attribution, Washington DC, 29 Jan 2015
National Science Foundation
• Federal agency created in 1950 “to promote the
progress of science; to advance the national
health, prosperity, and welfare; to secure the
national defense…”
• Annual budget of $7.3 billion (FY 2015)
• Funds 24 percent of all federally supported
basic research at US colleges and universities
• In many fields such as mathematics, computer
science and the social sciences, NSF is the
major source of federal funds
NSF
NATIONAL SCIENCE FOUNDATION
DIRECTORATE FOR
BIOLOGICAL
SCIENCES
(BIO)
James L. Olds,
Assistant Director
Jane Silverthorne,
Deputy AD
703.292.8400
DIRECTORATE FOR
EDUCATION & HUMAN
RESOURCES
(EHR)
Joan Ferrini-Mundy,
Assistant Director
James W. Lewis,
Deputy AD
703.292.8600
DIVISION OF BIOLOGICAL
INFRASTRUCTURE (DBI)
Scott Edwards,
Division Director
703.292.8470
DIVISION OF ENVIRONMENTAL
BIOLOGY (DEB)
Alan Tessier,
Acting Division Director
703.292.8480
DIVISION OF INTEGRATIVE
ORGANISMAL SYSTEMS (IOS)
William Zamer,
Acting Division Director
703.292.8420
DIVISION OF MOLECULAR &
CELLULAR BIOSCIENCES (MCB)
Gregory Warr,
Acting Division Director
703.292.8440
OFFICE OF EMERGING
FRONTIERS (EF)
Charles Liarakos,
Acting Division Director
703.292.8508
DIRECTORATE FOR
COMPUTER &
INFORMATION SCIENCE &
ENGINEERING (CISE)
James F. Kurose,
Assistant Director
Suzanne Iacono,
Deputy AD
703.292.8900
DIVISION OF CHEMICAL,
BIOENGINEERING, ENVIRONMENTAL &
TRANSPORT SYSTEMS (CBET)
JoAnn Lighty,
Division Director
703.292.8320
DIVISION OF CIVIL,
MECHANICAL & MANUFACTURING
INNOVATION (CMMI)
Deborah Goodings,
Acting Division Director
703.292.8360
DIVISION OF ELECTRICAL,
COMMUNICATIONS & CYBER
SYSTEMS (ECCS)
Samir El-Ghazaly,
Division Director
703.292.8339
DIVISION OF ENGINEERING
EDUCATION & CENTERS (EEC)
Don L. Millard,
Acting Division Director
703.292.8380
DIVISION OF INDUSTRIAL
INNOVATION & PARTNERSHIPS (IIP)
Joseph Hennessey,
Acting Division Director
703.292.8050
OFFICE OF EMERGING
FRONTIERS IN RESEARCH &
INNOVATION (EFRI)
Sohi Rastegar,
Senior Advisor
703.292.8301
DIRECTORATE FOR
GEOSCIENCES
(GEO)
Roger Wakimoto,
Assistant Director
Margaret Cavanaugh,
Deputy AD
703.292.8500
DIRECTORATE FOR
MATHEMATICAL &
PHYSICAL SCIENCES
(MPS)
Fleming Crim,
Assistant Director
Celeste M. Rohlfing,
Deputy AD
703.292.8800
DIVISION OF ASTRONOMICAL
SCIENCES (AST)
James Ulvestad,
Division Director
703.292.8820
DIVISION OF CHEMISTRY (CHE)
Steven Bernasek,
Division Director
703.292.8840
DIVISION OF MATERIALS
RESEARCH (DMR)
Mary Galvin-Donoghue,
Division Director
703.292.8810
DIVISION OF MATHEMATICAL
SCIENCES (DMS)
Michael Vogelius,
Division Director
703.292.8870
DIVISION OF PHYSICS (PHY)
Denise Caldwell,
Division Director
703.292.8890
OFFICE OF MULTIDISCIPLINARY
ACTIVITIES (OMA)
Clark Cooper,
Office Head
703.292.8800
DIRECTORATE FOR
SOCIAL, BEHAVIORAL, &
ECONOMIC SCIENCES
(SBE)
Fay L. Cook,
Assistant Director
Clifford Gabriel,
Deputy AD (Acting)
703.292.8700
DIVISION OF BEHAVIORAL &
COGNITIVE SCIENCES (BCS)
Mark Weiss,
Division Director
703.292.8740
DIVISION OF SOCIAL &
ECONOMIC SCIENCES (SES)
Jeryl Mumpower,
Division Director
703.292.8760
NATIONAL CENTER FOR
SCIENCE AND ENGINEERING
STATISTICS (NCSES)
John Gawalt,
Division Director
703.292.8780
National Science Foundation
4201 Wilson Boulevard
Arlington, Virginia 22230
TEL: 703.292.5111 | FIRS: 800.877.8339 | TDD: 800.281.8749 January 2015
DIRECTORATE FOR
ENGINEERING
(ENG)
Pramod P.
Khargonekar,
Assistant Director
Grace Wang,
Deputy AD
703.292.8300
DIVISION OF GRADUATE
EDUCATION (DGE)
Valerie Wilson,
Acting Division Director
703.292.8630
DIVISION OF HUMAN RESOURCE
DEVELOPMENT (HRD)
Sylvia James,
Division Director
703.292.8640
DIVISION OF RESEARCH ON
LEARNING IN FORMAL &
INFORMAL SETTINGS (DRL)
Sarah McDonald,
Acting Division Director
703.292.8620
DIVISION OF UNDERGRADUATE
EDUCATION (DUE)
Susan Singer,
Division Director
703.292.8670
DIVISION OF ATMOSPHERIC &
GEOSPACE SCIENCES (AGS)
Paul Shepson,
Division Director
703.292.8520
DIVISION OF EARTH
SCIENCES (EAR)
Carol Frost,
Division Director
703.292.8550
DIVISION OF OCEAN
SCIENCES (OCE)
Deborah Bronk,
Division Director
703.292.8580
DIVISION OF
POLAR PROGRAMS (PLR)
Kelly Falkner,
Division Director
703.292.8030
DIVISION OF COMPUTER &
NETWORK SYSTEMS (CNS)
Keith Marzullo,
Division Director
703.292.8950
OFFICE OF INFORMATION
& RESOURCE
MANAGEMENT
(OIRM)
Joanne S. Tornow,
Head / Chief Human
Capital Officer
Amy Northcutt,
Chief Information Officer
703.292.8100
OFFICE OF BUDGET,
FINANCE, & AWARD
MANAGEMENT
(BFA)
Martha A. Rubenstein,
Head / Chief Financial
Officer
Joanna E. Rom,
Deputy Head
703.292.8200
BUDGET DIVISION (BUD)
Michael Sieverts,
Division Director
703.292.8260
DIVISION OF ACQUISITION AND
COOPERATIVE SUPPORT (DACS)
Jeffery Lupis,
Division Director
703.292.8240
DIVISION OF FINANCIAL
MANAGEMENT (DFM)
Shirl Ruffin,
Division Director / Deputy CFO
703.292.8280
DIVISION OF ADMINISTRATIVE
SERVICES (DAS)
Mercedes Eugenia,
Division Director
703.292.8190
DIVISION OF INFORMATION
SYSTEMS (DIS)
Dorothy Aronson,
Division Director
703.292.8150
DIVISION OF HUMAN RESOURCE
MANAGEMENT (HRM)
Judy Sunley,
Division Director
703.292.8180
DIVISION OF GRANTS &
AGREEMENTS (DGA)
Karen Tiplady,
Division Director
703.292.8210
DIVISION OF INSTITUTION &
AWARD SUPPORT (DIAS)
Mary Santonastasso,
Division Director
703.292.8230
LARGE FACILITIES OFFICE
Matthew Hawkins,
Acting Deputy Director
703.292.4416
DIVISION OF COMPUTING &
COMMUNICATION
FOUNDATIONS (CCF)
Rao Kosaraju,
Division Director
703.292.8910
DIVISION OF ADVANCED
CYBERINFRASTRUCTURE (ACI)
Irene Qualters,
Division Director
703.292.8970
DIVISION OF INFORMATION &
INTELLIGENT SYSTEMS (IIS)
Lynne E. Parker,
Division Director
703.292.8930
Richard Buckius
Chief Operating
Officer
OFFICE OF THE GENERAL
COUNSEL (OGC)
Lawrence Rudolph,
General Counsel
Peggy Hoyle, Deputy GC
703.292.8060
OFFICE OF DIVERSITY &
INCLUSION (ODI)
Vacant, Head
703.292.8020
OFFICE OF LEGISLATIVE &
PUBLIC AFFAIRS (OLPA)
Dana Topousis, Acting Head
703.292.8070
OFFICE OF INTERNATIONAL &
INTEGRATIVE ACTIVITIES (OIIA)
Wanda Ward, Head
703.292.8040
OFFICE OF INSPECTOR
GENERAL (OIG)
Allison C. Lerner, Inspector General
703.292.7100
NATIONAL SCIENCE BOARD
OFFICE
Michael Van Woert
Executive Officer
703.292.7000
NATIONAL SCIENCE BOARD (NSB)
Dan E. Arvizu
Chair
Kelvin K. Droegemeier
Vice Chair
703.292.7000
OFFICE OF THE DIRECTOR
703.292.8000
Vacant
Deputy Director
France A. Córdova
Director
Advanced Cyberinfrastructure
(ACI) Division
• Supports and coordinates the development,
acquisition, and provision of state-of-the-art
cyberinfrastructure resources, tools, and
services
• Supports forward-looking research and
education to expand the future capabilities of
cyberinfrastructure
• Serves the growing community of scientists and
engineers, across all disciplines, whose work
relies on the power of advanced computation,
data-handling, and networking
Cyberinfrastructure
“Cyberinfrastructure consists of
computing systems,
data storage systems,
advanced instruments and
data repositories,
visualization environments, and
people,
all linked together by
software and
high performance networks,
to improve research productivity and
enable breakthroughs not otherwise possible.”
-- Craig Stewart
Software as Infrastructure
[Figure: layered diagram with the labels Science, Software, and Computing Infrastructure]
• Software (including services) essential for
the bulk of science
- About half the papers in recent issues of
Science were software-intensive projects
- Research becoming dependent upon
advances in software
- Significant software development being
conducted across NSF: NEON, OOI,
NEES, NCN, iPlant, etc.
• Wide range of software types: system, applications, modeling,
gateways, analysis, algorithms, middleware, libraries
• Software is not a one-time effort; it must be sustained
• Development, production, and maintenance are people intensive
• Software lifetimes are long compared with hardware
• Software has under-appreciated value
For software to be sustainable,
it must become infrastructure
NSF Software Infrastructure Projects
• 5 rounds of funding, 65 SSEs
• 4 rounds of funding, 35 SSIs
• 2 rounds of funding, 14 S2I2 conceptualizations
• SSE & SSI – NSF 14-520: Cross-NSF, all Directorates participating
• Next SSEs due Feb 2015; Next SSIs due June 2015
• See http://bit.ly/sw-ci for current projects
SI2 Solicitation and Decision Process
• Proposal reviews well -> my role becomes
matchmaking
– I want to find program officers with funds, and convince them
that they should spend their funds on the proposal
• Unidisciplinary project (e.g., bioinformatics app)
– Work with a single program officer, who either likes the proposal or
not
• Multidisciplinary project (e.g., molecular
dynamics)
– Work with multiple program officers, ...
• Omnidisciplinary project (e.g., http, math library)
– Try to work with all program officers; I am often told “it’s your
responsibility”
To judge software, need to
understand/forecast impact
Measuring Impact – Scenarios
1. Developer of open source physics simulation
– Possible metrics
• How many downloads? (easiest to measure, least value)
• How many contributors?
• How many uses?
• How many papers cite it?
• How many papers that cite it are cited? (hardest to measure,
most value)
2. Developer of open source math library
– Possible metrics are similar, but citations are less
likely
– What if users don’t download it?
• It’s part of a distro
• It’s pre-installed (and optimized) on an HPC system
• It’s part of a cloud image
• It’s a service
• Future impacts – let proposers suggest
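The two citation-based metrics in scenario 1 can be made concrete with a small sketch. It assumes a toy citation graph held in memory; the paper and software names (`paper_a`, `physics_sim`, `math_lib`) and both helper functions are hypothetical illustrations, not part of any NSF tooling.

```python
from typing import Dict, Set

# Hypothetical toy data: each key is a paper, each value is the set of
# items (papers or software) that the paper cites.
CITES: Dict[str, Set[str]] = {
    "paper_a": {"physics_sim"},
    "paper_b": {"physics_sim", "paper_a"},
    "paper_c": {"paper_a"},
    "paper_d": {"paper_b"},
    "paper_e": {"math_lib"},
}


def citing_papers(target: str, cites: Dict[str, Set[str]]) -> Set[str]:
    """Return the papers that cite `target` directly."""
    return {paper for paper, refs in cites.items() if target in refs}


def second_order_citations(target: str, cites: Dict[str, Set[str]]) -> int:
    """Count the citations received by the papers that cite `target`."""
    return sum(len(citing_papers(p, cites)) for p in citing_papers(target, cites))


if __name__ == "__main__":
    print("Papers citing physics_sim:", sorted(citing_papers("physics_sim", CITES)))
    print("Citations to those papers:", second_order_citations("physics_sim", CITES))
```

Both counts depend on the software being cited at all, which is the weak point the math-library scenario raises: a library that is pre-installed or bundled in a distro may never appear in such a graph.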
ACI Software Cluster Programs
• In these programs, ACI works with other NSF
units to support projects that lead to software
as an element of infrastructure
• Issue: amount of software that is
infrastructure grows over time, and grows
faster than NSF funding
Q: How can NSF ensure that software as
infrastructure continues to appear, without
funding all of it?
A: Incentives
• The devil is in the details
Other Software Discussions
• Working Towards Sustainable Software for
Science: Practice and Experience (WSSSPE)
– http://wssspe.researchcomputing.org.uk
– 3 workshops held
• Lessons:
Many of the issues in developing
sustainable software are social, not
technical
Software work is inadequately visible in
ways that “count” within the reputation
system underlying science
Where We Are
• To judge software, need to understand/forecast impact
• Q: How can NSF ensure that software as infrastructure
continues to appear, without funding all of it?
• A: Incentives
• Many of the issues in developing sustainable software are
social, not technical
• Software work is inadequately visible in ways that “count”
within the reputation system underlying science
Hypothesis: better measurement of
contributions can lead to rewards
(incentives), leading to career paths,
willingness to join communities, leading to
more sustainable software
A Problem
Credit for finding: Amy Brand, Digital Science
Another Problem
Credit for finding: Amy Brand, Digital Science
Last Problem
Credit for finding: Amy Brand, Digital Science
Moving Forward - NSF
• Recent CISE/ACI & SBE/SES Dear Colleague
Letter: Supporting Scientific Discovery through
Norms and Practices for Software and Data
Citation and Attribution (NSF 14-059,
http://www.nsf.gov/pubs/2014/nsf14059/nsf14059
.jsp)
– Need well-developed metrics to assess the
impact and quality of scientific software and
data
– Explore new norms and practices for software
and data citation and attribution, so that data
producers, software and tool developers, and
data curators are credited
• 6 projects and 3 collaborative workshops funded
Moving Forward - Dan
• Products (software, paper, data set) are
registered
– Credit map (weighted list of contributors—
people, products, etc.) is an input
– DOI is an output
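[Figure: credit map for a paper, with weighted edges to its contributors (Author A, Author B, ..., Paper M, ..., Software X, ..., Data K, ...); edge weights shown include 0.2, 0.05, and 0.1]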
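A credit map as described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the class name `CreditMap`, the `register` helper, the requirement that weights total 1.0, and the `example-id-N` identifiers (a stand-in for a real DOI) are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CreditMap:
    """Weighted list of contributors for one registered product."""

    product: str
    weights: Dict[str, float]  # contributor id -> fraction of credit

    def validate(self, tolerance: float = 1e-9) -> None:
        # Assumption: weights are positive fractions that total 1.0.
        if any(w <= 0 for w in self.weights.values()):
            raise ValueError("every weight must be positive")
        total = sum(self.weights.values())
        if abs(total - 1.0) > tolerance:
            raise ValueError(f"weights sum to {total}, expected 1.0")


def register(credit_map: CreditMap, registry: Dict[str, CreditMap]) -> str:
    """Validate a credit map, store it, and return an identifier for the product."""
    credit_map.validate()
    # Stand-in for a real DOI; a real system would mint one through a registration agency.
    identifier = f"example-id-{len(registry) + 1}"
    registry[identifier] = credit_map
    return identifier


if __name__ == "__main__":
    registry: Dict[str, CreditMap] = {}
    paper = CreditMap(
        "paper",
        {"author_a": 0.4, "author_b": 0.3, "software_x": 0.2, "data_k": 0.1},
    )
    print(register(paper, registry))
```

The credit map is the input and the identifier is the output, mirroring the two sub-bullets above.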
Moving Forward - Dan
– Enables transitive credit [1]
• E.g., paper 1 provides 25% credit to software A, and
software A provides 10% credit to library X -> library X gets
2.5% credit for paper 1
• Helps developer show: “my tools are important”
– Issues:
• Social: Trust in person who registers a product
• Technological: How [2], Registration system
[1] D. S. Katz, "Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products," Journal of Open Research Software, v.2(1): e20, 2014. DOI: 10.5334/jors.be
[2] D. S. Katz, A. M. Smith, "Implementing Transitive Credit with JSON-LD," 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), 2014. URL: http://arxiv.org/abs/1407.5117
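[Figure: chained credit maps. One map credits Author 1, ..., Paper 4, ..., Software 12, ... with weights 0.1, 0.1, and 0.3; it is linked to the credit map for a paper, whose contributors are Author A, Author B, ..., Paper M, ..., Software X, ..., Data K, ... with weights including 0.2, 0.05, and 0.1]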
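The transitive credit arithmetic in the example above (25% of 10% is 2.5%) can be sketched as a walk over chained credit maps. Only the 0.25 and 0.10 weights come from the slide; the other contributors and weights, the function name `transitive_credit`, and the assumption that the maps contain no cycles are illustrative.

```python
from typing import Dict, Optional

# Hypothetical credit maps: product -> {contributor: fraction of credit}.
# Only the 0.25 and 0.10 weights come from the slide's example.
CREDIT_MAPS: Dict[str, Dict[str, float]] = {
    "paper_1": {"author_a": 0.75, "software_a": 0.25},
    "software_a": {"developer_d": 0.90, "library_x": 0.10},
    "library_x": {"developer_e": 1.0},
}


def transitive_credit(
    product: str,
    maps: Dict[str, Dict[str, float]],
    share: float = 1.0,
    totals: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Accumulate the credit each direct or indirect contributor gets for `product`.

    Assumes the credit maps are acyclic; a cycle would recurse forever.
    """
    if totals is None:
        totals = {}
    for contributor, weight in maps.get(product, {}).items():
        credit = share * weight
        totals[contributor] = totals.get(contributor, 0.0) + credit
        # Recurse only when the contributor is itself a registered product.
        if contributor in maps:
            transitive_credit(contributor, maps, credit, totals)
    return totals


if __name__ == "__main__":
    for name, credit in sorted(transitive_credit("paper_1", CREDIT_MAPS).items()):
        print(f"{name}: {credit:.3f}")
```

Here `library_x` receives 0.25 × 0.10 = 0.025 of the credit for `paper_1`, matching the 2.5% in the slide. Summing such values across every paper that depends on a tool is one way a developer could show that "my tools are important."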
Moving Forward – Project CRediT
• Goal: develop a contributor role taxonomy to enable greater granularity &
transparency around contributions to scholarly published output in science
• http://projectcredit.net
• Rationale:
– Publishers: increase transparency; reduce author disputes; simplify process of chasing authors; identifying peer reviewers
– Research funders: supporting grant applications; understanding impact; awarding credit; identifying peer reviewers; identifying new funding opportunities
– Researchers: gaining credit for true contribution; credit for ‘new’/specific roles; identify collaborators; benefit junior reviewers; reduce authorship politics?
– Research institutions: support tenure & appointment; new esteem & credit metrics for staff; understanding impact
• Comments to a.brand@digital-science.com & l.allen@wellcome.ac.uk
Moving Forward – Project CRediT
Role | Description
Study conception | Idea; formulation of research question; statement of hypothesis
Methodology | Development or design of methodology; creation of models
Computation | Programming, software development; designing computer programs; implementation of computer code and supporting algorithms
Formal analysis | Application of statistical, mathematical, or other formal techniques to analyze study data
Investigation; performed the experiments | Conducting the research and investigation process, specifically performing the experiments
Investigation; data/evidence collection | Conducting the research and investigation process, specifically data/evidence collection
Resources | Provision of study materials, reagents, patients, laboratory samples, animals, instrumentation, or other analysis tools
Data curation | Management activities to annotate (produce metadata) and maintain research data for initial use and later re-use
Writing/manuscript preparation: writing the initial draft | Preparation, creation, and/or presentation of published work, specifically writing the initial draft
Writing/manuscript preparation: critical review, commentary, or revision | Preparation, creation, and/or presentation of published work, specifically critical review, commentary, or revision
Writing/manuscript preparation: visualization/data presentation | Preparation, creation, and/or presentation of published work, specifically visualization/data presentation
Supervision | Responsibility for supervising research; project orchestration; principal investigator or other lead stakeholder
Project administration | Coordination or management of research activities leading to this publication
Funding acquisition | Acquisition of the financial support for the project leading to this publication
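One way to make the role taxonomy above machine-readable is an enumeration plus a per-product mapping from contributors to roles. This is a sketch under assumptions: the role labels are copied from the table, but the `Role` enum, the `contributors_with` helper, and the sample contributors are hypothetical and are not Project CRediT's own format.

```python
from enum import Enum
from typing import Dict, List, Set


class Role(Enum):
    """Contributor roles, with labels taken from the table above."""

    STUDY_CONCEPTION = "Study conception"
    METHODOLOGY = "Methodology"
    COMPUTATION = "Computation"
    FORMAL_ANALYSIS = "Formal analysis"
    INVESTIGATION_EXPERIMENTS = "Investigation; performed the experiments"
    INVESTIGATION_COLLECTION = "Investigation; data/evidence collection"
    RESOURCES = "Resources"
    DATA_CURATION = "Data curation"
    WRITING_INITIAL_DRAFT = "Writing/manuscript preparation: writing the initial draft"
    WRITING_REVIEW = (
        "Writing/manuscript preparation: critical review, commentary, or revision"
    )
    WRITING_VISUALIZATION = (
        "Writing/manuscript preparation: visualization/data presentation"
    )
    SUPERVISION = "Supervision"
    PROJECT_ADMINISTRATION = "Project administration"
    FUNDING_ACQUISITION = "Funding acquisition"


def contributors_with(role: Role, contributions: Dict[str, Set[Role]]) -> List[str]:
    """Return, in sorted order, the contributors who hold `role`."""
    return sorted(name for name, roles in contributions.items() if role in roles)


if __name__ == "__main__":
    # Hypothetical contributions to a single paper.
    paper_roles: Dict[str, Set[Role]] = {
        "author_a": {Role.STUDY_CONCEPTION, Role.WRITING_INITIAL_DRAFT},
        "author_b": {Role.COMPUTATION, Role.DATA_CURATION},
        "author_c": {Role.COMPUTATION, Role.FORMAL_ANALYSIS},
    }
    print(contributors_with(Role.COMPUTATION, paper_roles))
```

Recording roles this way lets the people who did the Computation or Data curation work be identified explicitly, rather than inferred from author order.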
Moving Forward – Software Discovery Index
• NIH workshop, May 2014, within Big Data to
Knowledge (BD2K) initiative
– http://softwarediscoveryindex.org/
• Explored challenges facing the biomedical
research community in locating, citing, and
reusing biomedical software
• Identified fundamental prerequisite for success:
an automated, broadly accessible system
enabling comprehensive identification of
biomedical software.
• SDI Objectives:
– to assign standard and unambiguous identifiers to reference
all software
– to track specific metadata features that describe that
software
– to enable robust querying of all relevant information for users
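The three SDI objectives (identifiers, metadata, querying) can be illustrated with a tiny in-memory index. Everything here is an assumption made for illustration: the `SoftwareIndex` class, the `sw:NNNNNN` identifier scheme, and the metadata keys are hypothetical and do not describe the actual Software Discovery Index design.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SoftwareRecord:
    identifier: str
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


class SoftwareIndex:
    """Toy index: assigns identifiers, stores metadata, answers exact-match queries."""

    def __init__(self) -> None:
        self._records: Dict[str, SoftwareRecord] = {}

    def register(self, name: str, **metadata: str) -> str:
        # Hypothetical identifier scheme: a zero-padded sequence number.
        identifier = f"sw:{len(self._records) + 1:06d}"
        self._records[identifier] = SoftwareRecord(identifier, name, dict(metadata))
        return identifier

    def query(self, **criteria: str) -> List[SoftwareRecord]:
        """Return records whose metadata matches every given key/value pair."""
        return [
            record
            for record in self._records.values()
            if all(record.metadata.get(k) == v for k, v in criteria.items())
        ]


if __name__ == "__main__":
    index = SoftwareIndex()
    index.register("aligner", domain="genomics", language="C")
    index.register("imager", domain="radiology", language="Python")
    for record in index.query(domain="genomics"):
        print(record.identifier, record.name)
```

A real index would need persistent, globally unique identifiers and richer queries; the sketch only shows how the three objectives relate to one another.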
Moving Forward – Software Discovery Index
• Complementary with BD2K Data Discovery Index (DDI)
• Data vs. Software Characteristics
• Research Resource Identifiers (RRIDs) as prototype?
– http://scicrunch.com/resources
• Note strong biomedical focus of SDI and DDI
– initial case or limiting?
[Table: Data vs. Software characteristics, with columns Issue, Data, and Software; the per-cell marks are not preserved in this transcript. Issues compared:]
– Storage-limited
– Number of {datasets | software}
– Complex metadata
– Cited consistently and effectively
– Consistently accessible long-term
– Dependent on other data and software
(Credit: Chris Wellington & Vivien Bonazzi, NIH)
Moving Forward - Scholarly Contributions
Workshop & FORCE11
• FORCE11 – Open community aiming to improve future
research communication and e-Scholarship
– http://force11.org
• Scholarly Communications Workshop @ FORCE2015,
Oxford, UK, Jan 11 2015
• Goals:
1. Develop collaborative, interdisciplinary group to technically
implement a scholarly contribution roles ontology in
context of VIVO-ISF
2. Skeleton of scholarly products and the contribution roles
that people have towards each
3. Plan for technical next steps and development of proposal
to get funding to support this work
• Interest led to Force11 Attribution working group
– Webpage: http://www.force11.org/group/attributionwg
– Mailing List: attribution@force11.org
Moving Forward - Community
• Lots of challenges remain – within and across projects
• Career paths – Is there a role for non-tenure-track researchers
who produce software, data, etc. in universities?
– Assuming yes, do universities recognize and support this? If not,
how to get them to?
• What is needed to support reproducibility of science, in terms of
data and software?
• Versioning & provenance
• Lots of entities with similar interests in both software and data,
e.g. JISC, RCUK, NIH, DOE, Sloan & Moore, Mozilla, Apache
– Identifier work from Zenodo/GitHub, DataCite, CrossRef, VIVO, ...
• Need institutional buy-in, incorporation in researcher profiles
• Publisher involvement is essential
– Software papers vs software?
• Future of Google Scholar?
• Continued participation in WSSSPE invited, leading to actions
• Other ideas and questions are welcome, now or later
– dkatz@nsf.gov or d.katz@ieee.org
Resources
• NSF Software as Infrastructure Vision:
http://www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf12113
• Implementation of NSF Software Vision:
http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817
• Software Infrastructure for Sustained Innovation (SI2) Program
– Scientific Software Elements (SSE) & Scientific Software Integration (SSI) solicitation:
http://www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf14520
– 2013 PI meeting: https://sites.google.com/site/si2pimeeting/
– 2014 PI meeting: https://sites.google.com/site/si2pimeeting2014/
– Awards: http://bit.ly/sw-ci
• Working towards Sustainable Software for Science: Practice and Experiences (WSSSPE)
– Home: http://wssspe.researchcomputing.org.uk (includes links to all slides & papers)
– 1st workshop paper: http://arxiv.org/abs/1404.7414
– 2nd workshop site: http://wssspe.researchcomputing.org.uk/wssspe2/
• NSF 14-059: “Dear Colleague Letter - Supporting Scientific Discovery through Norms and
Practices for Software and Data Citation and Attribution”
– http://www.nsf.gov/pubs/2014/nsf14059/nsf14059.jsp
• Transitive Credit Papers
– http://dx.doi.org/10.5334/jors.be
– http://arxiv.org/abs/1407.5117
• Project CRediT: http://projectcredit.net
• NIH Software Discovery Index: http://softwarediscoveryindex.org/
• FORCE11: http://force11.org/
– Attribution Working Group: http://www.force11.org/group/attributionwg
Credits:
• SI2 Program:
– Current program officers: Daniel S. Katz, Rudolf Eigenmann, William Y. B. Chang,
John C. Cherniavsky, Almadena Y. Chtchelkanova, Cheryl L. Eavey, Evelyn
Goldfield, Sol Greenspan, Daryl W. Hess, Peter H. McCartney, Bogdan Mihaila,
Dimitrios V. Papavassiliou, Andrew D. Pollington, Barbara Ransom, Thomas
Russell, Massimo Ruzzene, Nigel A. Sharp, Paul Werbos, Eva Zanzerkia
– Formerly-involved program officers: Manish Parashar, Gabrielle Allen, Sumanta
Acharya, Eduardo Misawa, Jean Cottam-Allen, Thomas Siegmund
• WSSSPE:
– Organizers: Daniel S. Katz, Gabrielle Allen, Neil Chue Hong, Karen Cranston,
Manish Parashar, David Proctor, Matthew Turk, Colin C. Venters, Nancy Wilkins-
Diehr
– WSSSPE1 summary paper authors: Daniel S. Katz, Sou-Cheng T. Choi, Hilmar
Lapp, Ketan Maheshwari, Frank Löffler, Matthew Turk, Marcus D. Hanwell, Nancy
Wilkins-Diehr, James Hetherington, James Howison, Shel Swenson, Gabrielle D.
Allen, Anne C. Elster, Bruce Berriman, Colin Venters
– Keynote speakers: Phil Bourne, Arfon Smith, Kaitlin Thaney, Neil Chue Hong
• Project CRediT
– Leads: Liz Allen, Amy Brand, full group at http://projectcredit.net/
• NIH Software Discovery Index
– http://softwarediscoveryindex.org/
• Force11 community
– http://force11.org/

Metrics & Citation for Software (and Data)

  • 1. Metrics & Citation for Software (and Data) Daniel S. Katz dkatz@nsf.gov & d.katz@ieee.org @danielskatz Program Director, Division of Advanced Cyberinfrastructure (http://www.slideshare.net/danielskatz/metrics-citation-for- software-and-data) Workshop on Supporting Scientific Discovery through Norms and Practices for Software and Data Citation and Attribution, Washington DC, 29 Jan 2015
  • 2. National Science Foundation • Federal agency created in 1950 "to promote the progress of science; to advance the national health, prosperity, and welfare; to secure the national defense…” • Annual budget of $7.3 billion (FY 2015) • Funds 24 percent of all federally supported basic research at US colleges and universities • In many fields such as mathematics, computer science and the social sciences, NSF is the major source of federal funds
  • 3. NSF: National Science Foundation organization chart, January 2015
    – DIRECTORATE FOR BIOLOGICAL SCIENCES (BIO): James L. Olds, Assistant Director; Jane Silverthorne, Deputy AD; 703.292.8400
    – DIRECTORATE FOR EDUCATION & HUMAN RESOURCES (EHR): Joan Ferrini-Mundy, Assistant Director; James W. Lewis, Deputy AD; 703.292.8600
    – DIVISION OF BIOLOGICAL INFRASTRUCTURE (DBI): Scott Edwards, Division Director; 703.292.8470
    – DIVISION OF ENVIRONMENTAL BIOLOGY (DEB): Alan Tessler, Acting Division Director; 703.292.8480
    – DIVISION OF INTEGRATIVE ORGANISMAL SYSTEMS (IOS): William Zamer, Acting Division Director; 703.292.8420
    – DIVISION OF MOLECULAR & CELLULAR BIOSCIENCES (MCB): Gregory Warr, Acting Division Director; 703.292.8440
    – OFFICE OF EMERGING FRONTIERS (EF): Charles Liarakos, Acting Division Director; 703.292.8508
    – DIRECTORATE FOR COMPUTER & INFORMATION SCIENCE & ENGINEERING (CISE): James F. Kurose, Assistant Director; Suzanne Iacono, Deputy AD; 703.292.8900
    – DIVISION OF CHEMICAL, BIOENGINEERING, ENVIRONMENTAL & TRANSPORT SYSTEMS (CBET): JoAnn Lighty, Division Director; 703.292.8320
    – DIVISION OF CIVIL, MECHANICAL & MANUFACTURING INNOVATION (CMMI): Deborah Goodings, Acting Division Director; 703.292.8360
    – DIVISION OF ELECTRICAL, COMMUNICATIONS & CYBER SYSTEMS (ECCS): Samir El-Ghazaly, Division Director; 703.292.8339
    – DIVISION OF ENGINEERING EDUCATION & CENTERS (EEC): Don L. Millard, Acting Division Director; 703.292.8380
    – DIVISION OF INDUSTRIAL INNOVATION & PARTNERSHIPS (IIP): Joseph Hennessey, Acting Division Director; 703.292.8050
    – OFFICE OF EMERGING FRONTIERS IN RESEARCH & INNOVATION (EFRI): Sohi Rastegar, Senior Advisor; 703.292.8301
    – DIRECTORATE FOR GEOSCIENCES (GEO): Roger Wakimoto, Assistant Director; Margaret Cavanaugh, Deputy AD; 703.292.8500
    – DIRECTORATE FOR MATHEMATICAL & PHYSICAL SCIENCES (MPS): Fleming Crim, Assistant Director; Celeste M. Rohlfing, Deputy AD; 703.292.8800
    – DIVISION OF ASTRONOMICAL SCIENCES (AST): James Ulvestad, Division Director; 703.292.8820
    – DIVISION OF CHEMISTRY (CHE): Steven Bernasek, Division Director; 703.292.8840
    – DIVISION OF MATERIALS RESEARCH (DMR): Mary Galvin-Donoghue, Division Director; 703.292.8810
    – DIVISION OF MATHEMATICAL SCIENCES (DMS): Michael Vogelius, Division Director; 703.292.8870
    – DIVISION OF PHYSICS (PHY): Denise Caldwell, Division Director; 703.292.8890
    – OFFICE OF MULTIDISCIPLINARY ACTIVITIES (OMA): Clark Cooper, Office Head; 703.292.8800
    – DIRECTORATE FOR SOCIAL, BEHAVIORAL, & ECONOMIC SCIENCES (SBE): Fay L. Cook, Assistant Director; Clifford Gabriel, Deputy AD (Acting); 703.292.8700
    – DIVISION OF BEHAVIORAL & COGNITIVE SCIENCES (BCS): Mark Weiss, Division Director; 703.292.8740
    – DIVISION OF SOCIAL & ECONOMIC SCIENCES (SES): Jeryl Mumpower, Division Director; 703.292.8760
    – NATIONAL CENTER FOR SCIENCE AND ENGINEERING STATISTICS (NCSES): John Gawalt, Division Director; 703.292.8780
    – National Science Foundation, 4201 Wilson Boulevard, Arlington, Virginia 22230; TEL: 703.292.5111 | FIRS: 800.877.8339 | TDD: 800.281.8749
    – DIRECTORATE FOR ENGINEERING (ENG): Pramod P. Khargonekar, Assistant Director; Grace Wang, Deputy AD; 703.292.8300
    – DIVISION OF GRADUATE EDUCATION (DGE): Valerie Wilson, Acting Division Director; 703.292.8630
    – DIVISION OF HUMAN RESOURCE DEVELOPMENT (HRD): Sylvia James, Division Director; 703.292.8640
    – DIVISION OF RESEARCH ON LEARNING IN FORMAL & INFORMAL SETTINGS (DRL): Sarah McDonald, Acting Division Director; 703.292.8620
    – DIVISION OF UNDERGRADUATE EDUCATION (DUE): Susan Singer, Division Director; 703.292.8670
    – DIVISION OF ATMOSPHERIC & GEOSPACE SCIENCES (AGS): Paul Shepson, Division Director; 703.292.8520
    – DIVISION OF EARTH SCIENCES (EAR): Carol Frost, Division Director; 703.292.8550
    – DIVISION OF OCEAN SCIENCES (OCE): Deborah Bronk, Division Director; 703.292.8580
    – DIVISION OF POLAR PROGRAMS (PLR): Kelly Falkner, Division Director; 703.292.8030
    – DIVISION OF COMPUTER & NETWORK SYSTEMS (CNS): Keith Marzullo, Division Director; 703.292.8950
    – OFFICE OF INFORMATION & RESOURCE MANAGEMENT (OIRM): Joanne S. Tornow, Head / Chief Human Capital Officer; Amy Northcutt, Chief Information Officer; 703.292.8100
    – OFFICE OF BUDGET, FINANCE, & AWARD MANAGEMENT (BFA): Martha A. Rubenstein, Head / Chief Financial Officer; Joanna E. Rom, Deputy Head; 703.292.8200
    – BUDGET DIVISION (BUD): Michael Sieverts, Division Director; 703.292.8260
    – DIVISION OF ACQUISITION AND COOPERATIVE SUPPORT (DACS): Jeffery Lupis, Division Director; 703.292.8240
    – DIVISION OF FINANCIAL MANAGEMENT (DFM): Shirl Ruffin, Division Director / Deputy CFO; 703.292.8280
    – DIVISION OF ADMINISTRATIVE SERVICES (DAS): Mercedes Eugenia, Division Director; 703.292.8190
    – DIVISION OF INFORMATION SYSTEMS (DIS): Dorothy Aronson, Division Director; 703.292.8150
    – DIVISION OF HUMAN RESOURCE MANAGEMENT (HRM): Judy Sunley, Division Director; 703.292.8180
    – DIVISION OF GRANTS & AGREEMENTS (DGA): Karen Tiplady, Division Director; 703.292.8210
    – DIVISION OF INSTITUTION & AWARD SUPPORT (DIAS): Mary Santonastasso, Division Director; 703.292.8230
    – LARGE FACILITIES OFFICE: Matthew Hawkins, Acting Deputy Director; 703.292.4416
    – DIVISION OF COMPUTING & COMMUNICATION FOUNDATIONS (CCF): Rao Kosaraju, Division Director; 703.292.8910
    – DIVISION OF ADVANCED CYBERINFRASTRUCTURE (ACI): Irene Qualters, Division Director; 703.292.8970
    – DIVISION OF INFORMATION & INTELLIGENT SYSTEMS (IIS): Lynne E. Parker, Division Director; 703.292.8930
    – Richard Buckius, Chief Operating Officer
    – OFFICE OF THE GENERAL COUNSEL (OGC): Lawrence Rudolph, General Counsel; Peggy Hoyle, Deputy GC; 703.292.8060
    – OFFICE OF DIVERSITY & INCLUSION (ODI): Vacant, Head; 703.292.8020
    – OFFICE OF LEGISLATIVE & PUBLIC AFFAIRS (OLPA): Dana Toupousis, Acting Head; 703.292.8070
    – OFFICE OF INTERNATIONAL & INTEGRATIVE ACTIVITIES (OIIA): Wanda Ward, Head; 703.292.8040
    – OFFICE OF INSPECTOR GENERAL (OIG): Allison C. Lerner, Inspector General; 703.292.7100
    – NATIONAL SCIENCE BOARD OFFICE: Michael Van Woert, Executive Officer; 703.292.7000
    – NATIONAL SCIENCE BOARD (NSB): Dan E. Arvizu, Chair; Kelvin K. Droegemeier, Vice Chair; 703.292.7000
    – OFFICE OF THE DIRECTOR: France A. Córdova, Director; Vacant, Deputy Director; 703.292.8000
  • 4. Advanced Cyberinfrastructure (ACI) Division • Supports and coordinates the development, acquisition, and provision of state-of-the-art cyberinfrastructure resources, tools, and services • Supports forward-looking research and education to expand the future capabilities of cyberinfrastructure • Serves the growing community of scientists and engineers, across all disciplines, whose work relies on the power of advanced computation, data-handling, and networking
  • 5. Cyberinfrastructure “Cyberinfrastructure consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks, to improve research productivity and enable breakthroughs not otherwise possible.” -- Craig Stewart
  • 6. Software as Infrastructure (Figure labels: Science, Software, Computing Infrastructure) • Software (including services) essential for the bulk of science - About half the papers in recent issues of Science were software-intensive projects - Research becoming dependent upon advances in software - Significant software development being conducted across NSF: NEON, OOI, NEES, NCN, iPlant, etc. • Wide range of software types: system, applications, modeling, gateways, analysis, algorithms, middleware, libraries • Software is not a one-time effort; it must be sustained • Development, production, and maintenance are people-intensive • Software lifetimes are long compared with hardware • Software has under-appreciated value • For software to be sustainable, it must become infrastructure
  • 7. NSF Software Infrastructure Projects • 5 rounds of funding, 65 SSEs • 4 rounds of funding, 35 SSIs • 2 rounds of funding, 14 S2I2 conceptualizations • SSE & SSI – NSF 14-520: Cross-NSF, all Directorates participating • Next SSEs due Feb 2015; next SSIs due June 2015 • See http://bit.ly/sw-ci for current projects
  • 8. SI2 Solicitation and Decision Process • Proposal reviews well -> my role becomes matchmaking – I want to find program officers with funds, and convince them that they should spend their funds on the proposal • Unidisciplinary project (e.g. bioinformatics app) – Work with single program officer, either likes the proposal or not • Multidisciplinary project (e.g., molecular dynamics) – Work with multiple program officers, ... • Omnidisciplinary project (e.g. http, math library) – Try to work with all program officers, often am told “it’s your responsibility” To judge software, need to understand/forecast impact
  • 9. Measuring Impact – Scenarios 1. Developer of open source physics simulation – Possible metrics • How many downloads? (easiest to measure, least value) • How many contributors? • How many uses? • How many papers cite it? • How many papers that cite it are cited? (hardest to measure, most value) 2. Developer of open source math library – Possible metrics are similar, but citations are less likely – What if users don’t download it? • It’s part of a distro • It’s pre-installed (and optimized) on an HPC system • It’s part of a cloud image • It’s a service • Future impacts – let proposers suggest
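The last two metrics in scenario 1 (papers that cite the software, and citations of those papers) can be sketched as two lookups over a citation graph. This is a minimal illustration only: the `cited_by` data, the product names, and both function names are hypothetical, and a real measurement would depend on a citation index that records software citations at all.

```python
from typing import Dict, Set

# Hypothetical citation data: each key is cited by the works in its set.
cited_by: Dict[str, Set[str]] = {
    "physics-sim": {"paper-1", "paper-2"},
    "paper-1": {"paper-3", "paper-4"},
    "paper-2": {"paper-4"},
}


def direct_citations(product: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Works that cite the product itself."""
    return set(graph.get(product, set()))


def second_order_citations(product: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Works that cite at least one work citing the product."""
    found: Set[str] = set()
    for citing_work in direct_citations(product, graph):
        found |= graph.get(citing_work, set())
    return found


print(len(direct_citations("physics-sim", cited_by)))
print(len(second_order_citations("physics-sim", cited_by)))
```

Neither count captures the second scenario, where a library reaches users through a distro, an HPC installation, or a service and is rarely cited directly.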
  • 10. ACI Software Cluster Programs • In these programs, ACI works with other NSF units to support projects that lead to software as an element of infrastructure • Issue: amount of software that is infrastructure grows over time, and grows faster than NSF funding Q: How can NSF ensure that software as infrastructure continues to appear, without funding all of it? A: Incentives • The devil is in the details
  • 11. Other Software Discussions • Working Towards Sustainable Software for Science: Practice and Experience (WSSSPE) – http://wssspe.researchcomputing.org.uk – 3 workshops held • Lessons: – Many of the issues in developing sustainable software are social, not technical – Software work is inadequately visible in ways that “count” within the reputation system underlying science
  • 12. Where We Are • To judge software, need to understand/forecast impact • Q: How can NSF ensure that software as infrastructure continues to appear, without funding all of it? • A: Incentives • Many of the issues in developing sustainable software are social, not technical • Software work is inadequately visible in ways that “count” within the reputation system underlying science Hypothesis: better measurement of contributions can lead to rewards (incentives), leading to career paths, willingness to join communities, leading to more sustainable software
  • 13. A Problem Credit for finding: Amy Brand, Digital Science
  • 14. Another Problem Credit for finding: Amy Brand, Digital Science
  • 15. Last Problem Credit for finding: Amy Brand, Digital Science
  • 16. Moving Forward - NSF • Recent CISE/ACI & SBE/SES Dear Colleague Letter: Supporting Scientific Discovery through Norms and Practices for Software and Data Citation and Attribution (NSF 14-059, http://www.nsf.gov/pubs/2014/nsf14059/nsf14059.jsp) – Need well-developed metrics to assess the impact and quality of scientific software and data – Explore new norms and practices for software and data citation and attribution, so that data producers, software and tool developers, and data curators are credited • 6 projects and 3 collaborative workshops funded
  • 17. Moving Forward - Dan • Products (software, paper, data set) are registered – Credit map (weighted list of contributors: people, products, etc.) is an input – DOI is an output
    (Figure: credit map for a paper, with weighted links to Author A, Author B, Paper M, Software X, and Data K; weights shown include 0.2, 0.05, and 0.1)
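The registration step above (credit map in, identifier out) might look like the sketch below. Everything named here is an assumption made for illustration: `RegisteredProduct`, `register_product`, the placeholder identifier, and the example weights do not come from the slides, and the rule that weights must sum to 1 is assumed rather than stated on the slide.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RegisteredProduct:
    identifier: str               # stands in for the DOI issued on registration
    credit_map: Dict[str, float]  # contributor (person or product) -> weight


def register_product(identifier: str, credit_map: Dict[str, float]) -> RegisteredProduct:
    """Validate a credit map and return the registered product record."""
    if not credit_map:
        raise ValueError("credit map must list at least one contributor")
    if any(weight <= 0 for weight in credit_map.values()):
        raise ValueError("every weight must be positive")
    total = sum(credit_map.values())
    # Assumption: a complete credit map distributes exactly 100% of the credit.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total}, expected 1.0")
    return RegisteredProduct(identifier, dict(credit_map))


# Hypothetical identifier and weights, not taken from the slide's figure.
paper = register_product(
    "doi:10.0000/hypothetical-paper",
    {"Author A": 0.4, "Author B": 0.3, "Software X": 0.2, "Data K": 0.1},
)
print(paper.identifier, paper.credit_map)
```

Listing products such as Software X next to people in the same map is what later allows credit to flow onward to whatever Software X itself depends on.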
  • 18. Moving Forward - Dan – Enables transitive credit [1] • E.g., paper 1 provides 25% credit to software A, and software A provides 10% credit to library X -> library X gets 2.5% credit for paper 1 • Helps developer show: “my tools are important” – Issues: • Social: Trust in person who registers a product • Technological: How [2], Registration system
    [1] D. S. Katz, "Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products," Journal of Open Research Software, v.2(1): e20, 2014. DOI: 10.5334/jors.be
    [2] D. S. Katz, A. M. Smith, "Implementing Transitive Credit with JSON-LD," 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), 2014. URL: http://arxiv.org/abs/1407.5117
    (Figure: the paper's credit map, with Author A, Author B, Paper M, Software X, and Data K, shown alongside a second credit map listing Author 1, Paper 4, and Software 12, each link carrying a weight)
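The 25% × 10% = 2.5% example can be reproduced by walking credit maps recursively and multiplying weights along each path. This is a sketch under stated assumptions, not the method of the cited papers: the function name, the dictionary layout, and the "authors" and "developers" entries are hypothetical, and it assumes an intermediate product keeps its own credit while also passing credit on.

```python
from typing import Dict, FrozenSet

CreditMaps = Dict[str, Dict[str, float]]


def transitive_credit(product: str, credit_maps: CreditMaps) -> Dict[str, float]:
    """Return each contributor's direct plus indirect credit for `product`."""
    totals: Dict[str, float] = {}

    def walk(node: str, weight: float, path: FrozenSet[str]) -> None:
        for contributor, share in credit_maps.get(node, {}).items():
            if contributor in path:
                continue  # skip cycles so the recursion always terminates
            credit = weight * share
            totals[contributor] = totals.get(contributor, 0.0) + credit
            walk(contributor, credit, path | {contributor})

    walk(product, 1.0, frozenset({product}))
    return totals


# Hypothetical credit maps mirroring the slide's example.
maps: CreditMaps = {
    "paper 1": {"software A": 0.25, "authors": 0.75},
    "software A": {"library X": 0.10, "developers": 0.90},
}
credit = transitive_credit("paper 1", maps)
print(round(credit["library X"], 4))
```

Because intermediate products keep their credit here, the totals for one paper can exceed 1; a scheme that instead forwards credit fully to the leaves would be an equally plausible reading of the slide.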
  • 19. Moving Forward – Project CRediT • Goal: develop a contributor role taxonomy to enable greater granularity & transparency around contributions to scholarly published output in science • http://projectcredit.net • Comments to a.brand@digital-science.com & l.allen@wellcome.ac.uk • Rationale:
    – Publishers: Increase transparency; Reduce author disputes; Simplify process of chasing authors; Identifying peer reviewers
    – Research funders: Supporting grant applications; Understanding impact; Awarding credit; Identifying peer reviewers; Identifying new funding opportunities
    – Researchers: Gaining credit for true contribution; Credit for ‘new’/specific roles; Identify collaborators; Benefit junior reviewers; Reduce authorship politics?
    – Research institutions: Support tenure & appointment; New esteem & credit metrics for staff; Understanding impact
  • 20. Moving Forward – Project CRediT • Role: Description
    – Study conception: Idea; formulation of research question; statement of hypothesis
    – Methodology: Development or design of methodology; creation of models
    – Computation: Programming, software development; designing computer programs; implementation of computer code and supporting algorithms
    – Formal analysis: Application of statistical, mathematical, or other formal techniques to analyze study data
    – Investigation; performed the experiments: Conducting the research and investigation process, specifically performing the experiments
    – Investigation; data/evidence collection: Conducting the research and investigation process, specifically data/evidence collection
    – Resources: Provision of study materials, reagents, patients, laboratory samples, animals, instrumentation, or other analysis tools
    – Data curation: Management activities to annotate (produce metadata) and maintain research data for initial use and later re-use
    – Writing/manuscript preparation; writing the initial draft: Preparation, creation, and/or presentation of published work, specifically writing the initial draft
    – Writing/manuscript preparation; critical review, commentary, or revision: Preparation, creation, and/or presentation of published work, specifically critical review, commentary, or revision
    – Writing/manuscript preparation; visualization/data presentation: Preparation, creation, and/or presentation of published work, specifically visualization/data presentation
    – Supervision: Responsibility for supervising research; project orchestration; principal investigator or other lead stakeholder
    – Project administration: Coordination or management of research activities leading to this publication
    – Funding acquisition: Acquisition of the financial support for the project leading to this publication
  • 21. Moving Forward – Software Discovery Index • NIH workshop, May 2014, within Big Data to Knowledge (BD2K) initiative – http://softwarediscoveryindex.org/ • Explored challenges facing the biomedical research community in locating, citing, and reusing biomedical software • Identified fundamental prerequisite for success: an automated, broadly accessible system enabling comprehensive identification of biomedical software. • SDI Objectives: – to assign standard and unambiguous identifiers to reference all software – to track specific metadata features that describe that software – to enable robust querying of all relevant information for users
  • 22. Moving Forward – Software Discovery Index • Complementary with BD2K Data Discovery Index (DDI) • Data vs. Software Characteristics • Research Resource Identifiers (RRIDs) as prototype? – http://scicrunch.com/resources • Note strong biomedical focus of SDI and DDI – initial case or limiting?
    (Table: Data vs. Software compared on each issue: storage-limited; number of {datasets | software}; complex metadata; cited consistently and effectively; consistently accessible long-term; dependent on other data and software. Credit: Chris Wellington & Vivien Bonazzi, NIH)
  • 23. Moving Forward - Scholarly Contributions Workshop & FORCE11 • FORCE11 – Open community aiming to improve future research communication and e-Scholarship – http://force11.org • Scholarly Communications Workshop @ FORCE2015, Oxford, UK, Jan 11 2015 • Goals: 1. Develop collaborative, interdisciplinary group to technically implement a scholarly contribution roles ontology in context of VIVO-ISF 2. Skeleton of scholarly products and the contribution roles that people have towards each 3. Plan for technical next steps and development of proposal to get funding to support this work • Interest led to Force11 Attribution working group – Webpage: http://www.force11.org/group/attributionwg – Mailing List: attribution@force11.org
  • 24. Moving Forward - Community • Lots of challenges remain – within and across projects • Career paths – Is there a role for non-tenure-track researchers who produce software, data, etc. in universities? – Assuming yes, do universities recognize and support this? If not, how to get them to? • What is needed to support reproducibility of science, in terms of data and software? • Versioning & provenance • Lots of entities with similar interests in both software and data, e.g. JISC, RCUK, NIH, DOE, Sloan & Moore, Mozilla, Apache – Identifier work from Zenodo/GitHub, DataCite, CrossRef, VIVO, ... • Need institutional buy-in, incorporation in researcher profiles • Publisher involvement is essential – Software papers vs software? • Future of Google Scholar? • Continued participation in WSSSPE invited, leading to actions • Other ideas and questions are welcome, now or later – dkatz@nsf.gov or d.katz@ieee.org
  • 25. Resources • NSF Software as Infrastructure Vision: http://www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf12113 • Implementation of NSF Software Vision: http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504817 • Software Infrastructure for Sustained Innovation (SI2) Program – Scientific Software Elements (SSE) & Scientific Software Integration (SSI) solicitation: http://www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf14520 – 2013 PI meeting: https://sites.google.com/site/si2pimeeting/ – 2014 PI meeting: https://sites.google.com/site/si2pimeeting2014/ – Awards: http://bit.ly/sw-ci • Working towards Sustainable Software for Science: Practice and Experiences (WSSSPE) – Home: http://wssspe.researchcomputing.org.uk (includes links to all slides & papers) – 1st workshop paper: http://arxiv.org/abs/1404.7414 – 2nd workshop site: http://wssspe.researchcomputing.org.uk/wssspe2/ • NSF 14-059: “Dear Colleague Letter - Supporting Scientific Discovery through Norms and Practices for Software and Data Citation and Attribution” – http://www.nsf.gov/pubs/2014/nsf14059/nsf14059.jsp • Transitive Credit Papers – http://dx.doi.org/10.5334/jors.be – http://arxiv.org/abs/1407.5117 • Project CRediT: http://projectcredit.net • NIH Software Discovery Index: http://softwarediscoveryindex.org/ • FORCE11: http://force11.org/ – Attribution Working Group: http://www.force11.org/group/attributionwg
  • 26. Credits: • SI2 Program: – Current program officers: Daniel S. Katz, Rudolf Eigenmann, William Y. B. Chang, John C. Cherniavsky, Almadena Y. Chtchelkanova, Cheryl L. Eavey, Evelyn Goldfield, Sol Greenspan, Daryl W. Hess, Peter H. McCartney, Bogdan Mihaila, Dimitrios V. Papavassiliou, Andrew D. Pollington, Barbara Ransom, Thomas Russell, Massimo Ruzzene, Nigel A. Sharp, Paul Werbos, Eva Zanzerkia – Formerly-involved program officers: Manish Parashar, Gabrielle Allen, Sumanta Acharya, Eduardo Misawa, Jean Cottam-Allen, Thomas Siegmund • WSSSPE: – Organizers: Daniel S. Katz, Gabrielle Allen, Neil Chue Hong, Karen Cranston, Manish Parashar, David Proctor, Matthew Turk, Colin C. Venters, Nancy Wilkins-Diehr – WSSSPE1 summary paper authors: Daniel S. Katz, Sou-Cheng T. Choi, Hilmar Lapp, Ketan Maheshwari, Frank Löffler, Matthew Turk, Marcus D. Hanwell, Nancy Wilkins-Diehr, James Hetherington, James Howison, Shel Swenson, Gabrielle D. Allen, Anne C. Elster, Bruce Berriman, Colin Venters – Keynote speakers: Phil Bourne, Arfon Smith, Kaitlin Thaney, Neil Chue Hong • Project CRediT – Leads: Liz Allen, Amy Brand, full group at http://projectcredit.net/ • NIH Software Discovery Index – http://softwarediscoveryindex.org/ • Force11 community – http://force11.org/