Athens, 8 Sept 2017
AGENDA
• Introduction to the FAIRdat tool and its future goals -
20 minutes
• Explore FAIRdat in small groups: assessment of datasets
from various disciplines - 45 minutes
• Feedback and suggestions for improvement - 25 minutes
www.dans.knaw.nl
DANS is an institute of KNAW and NWO
@pkdoorn @dansknaw
FAIR metrics - Starring your Data Sets, Athens, 8 Sept 2017
A Lightweight FAIR Data Assessment Tool
Peter Doorn, Elly Dijk & Marjan Grootveld, DANS
Thanks to Emily Thomas and Eleftheria Tsoupra
Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005
First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989
Mission: promote and provide permanent access to digital research resources
DANS is about keeping data FAIR
DANS and Data Seal of Approval
• 2005: DANS to promote and provide permanent
access to digital research resources
• Formulate quality guidelines for digital repositories
including DANS
• 2006: 5 basic principles as basis for 16 DSA
guidelines
• 2009: international DSA Board
• Almost 70 seals acquired around the globe, but with
a focus on Europe
• 2016: collaboration of DSA and World Data System,
creating new Core Trust Seal
https://www.datasealofapproval.org/en/
https://goo.gl/kZb1Ga
The Certification Framework
ISO 16363:2012 - Audit and certification
of trustworthy digital repositories
http://www.iso16363.org/
DIN 31644 standard “Criteria for trustworthy
digital archives”
http://www.langzeitarchivierung.de
http://www.datasealofapproval.org/
https://www.icsu-wds.org/
https://goo.gl/kZb1Ga
Repository requirements dealing with “FAIRness”
R2. The repository maintains all applicable licenses covering data access and use and
monitors compliance.
R3. The repository has a continuity plan to ensure ongoing access to and preservation
of its holdings.
R4. The repository ensures, to the extent possible, that data are created, curated,
accessed, and used in compliance with disciplinary and ethical norms.
R7. The repository guarantees the integrity and authenticity of the data.
R8. The repository accepts data and metadata based on defined criteria to ensure
relevance and understandability for data users.
R10. The repository assumes responsibility for long-term preservation and manages
this function in a planned and documented way.
R11. The repository has appropriate expertise to address technical data and metadata
quality and ensures that sufficient information is available for end users to make
quality-related evaluations.
R13. The repository enables users to discover the data and refer to them in a
persistent way through proper citation.
R14. The repository enables reuse of the data over time, ensuring that appropriate
metadata are available to support the understanding and use of the data.
Resemblance DSA – FAIR principles
DSA Principles (for data repositories) | FAIR Principles (for data sets)
data can be found on the internet | Findable
data are accessible | Accessible
data are in a usable format | Interoperable
data are reliable | Reusable
data can be referred to (citable) |
The resemblance is not perfect:
• usable format (DSA) is an aspect of interoperability (FAIR)
• FAIR explicitly addresses machine readability
• etc.
A certified TDR already offers a baseline data quality level
OSFair2017 Training | FAIR metrics - Starring your data sets
Combine and operationalize: DSA & FAIR
• Growing demand for quality criteria for
research datasets and ways to assess their
fitness for use
• Combine the principles of core repository
certification and FAIR
• Use the principles as quality criteria:
• Core certification – digital repositories
• FAIR principles – research data (sets)
• Operationalize the principles as an
instrument to assess FAIRness of existing
datasets in certified TDRs
Experiences with Data Reviews at DANS
started in 2011
M. Grootveld, J. van Egmond and B. Sørensen
https://goo.gl/Tf4HFN
Badges for assessing aspects of data
quality and “openness”
These badges do not define good practice; they
certify that a particular practice was followed.
Sources: Open Data Institute (UK), Center for Open Science (US), Tim Berners-Lee's
5-star deployment scheme for Open Data
Different approaches to FAIR
Requirements for new data creation
Simple assessment of the FAIR profile for existing data in certified repositories
Horizon 2020 Commission Expert Group on turning FAIR data into reality:
https://goo.gl/9WZqyq
WDS/RDA Assessment of Data Fitness for Use WG:
https://www.rd-alliance.org/groups/assessment-data-fitness-use
GO-FAIR Metrics Group:
https://www.dtls.nl/fair-data/fair-metrics-group/ and http://fairmetrics.org/
Framework for FAIR assessment: Define metrics enabling automated assessment of the
degree to which online resources comply with the FAIR Principles
FAIR badge scheme
• Proxy for data “quality” or “fitness
for (re-)use”
• Prevent interactions among
dimensions to ease scoring
• Consider Reusability as the
resultant of the other three:
– the average FAIRness as an indicator
of data quality
– (F+A+I)/3=R
• Manual and automatic scoring
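As a worked illustration of the formula (F+A+I)/3=R, here is a minimal Python sketch. The function name `reusability_score` and the range check are assumptions made for this example; the slides only give the formula and the 1 to 5 star scale.

```python
def reusability_score(findable: int, accessible: int, interoperable: int) -> float:
    """Average the three star ratings (each 1 to 5) into R, following (F+A+I)/3 = R.

    Hypothetical helper: the name and the validation are assumptions,
    only the averaging rule comes from the slide.
    """
    for name, stars in (("F", findable), ("A", accessible), ("I", interoperable)):
        if not 1 <= stars <= 5:
            raise ValueError(f"{name} must be between 1 and 5 stars, got {stars}")
    return (findable + accessible + interoperable) / 3


# A dataset with F=4, A=5 and I=3 stars:
print(reusability_score(4, 5, 3))  # prints 4.0
```

Because R is a plain average, a low score on one dimension can be offset by high scores on the others; whether that is desirable is a design question the badge scheme leaves open.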
[Example badge: star ratings for F, A, I and R, with 2 User Reviews, 1 Archivist Assessment, 24 Downloads]
First we attempted to operationalise R –
Reusable as well… but we did not succeed
Reusable – is it a separate dimension? By definition subjective:
reusability depends on what you want to use the data for!
Idea for operationalization → Solution
R1. plurality of accurate and relevant attributes ≈ F2: “data are described with rich metadata” → F
R1.1. clear and accessible data usage license → A
R1.2. provenance (for replication and reuse) → F
R1.3. meet domain-relevant community standards → I
Data is in a TDR – unsustained data will not remain usable → Aspect of Repository → Data Seal of Approval
Explication on how data was or can be used is available → F
Data is automatically usable by machines → I
Findable (defined by metadata (PID included) and documentation)
1. Neither PID nor metadata/documentation
2. PID without or with insufficient metadata
3. Sufficient/limited metadata without PID
4. PID with sufficient metadata
5. Extensive metadata and rich additional documentation available
Accessible (defined by presence of user license)
1. Neither metadata nor data are accessible
2. Metadata are accessible but data is not accessible (no clear terms of reuse in
license)
3. User restrictions apply (e.g. privacy, commercial interests, embargo period)
4. Public access (after registration)
5. Open access unrestricted
Interoperable (defined by data format)
1. Proprietary (privately owned), non-open format data
2. Proprietary format, accepted by Certified Trustworthy Data Repository
3. Non-proprietary, open format = ‘preferred format’
4. As well as in the preferred format, data is standardised using a standard
vocabulary format (for the research field to which the data pertain)
5. Data additionally linked to other data to provide context
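The Findable scale above can be sketched as a small lookup. This is an illustrative reading of the five levels, not the FAIRdat implementation: the `Metadata` enum, the function name `findable_stars`, and the treatment of edge cases the slide does not cover (for example rich metadata without a PID) are all assumptions.

```python
from enum import IntEnum


class Metadata(IntEnum):
    """Hypothetical metadata quality levels; judging them remains subjective."""
    NONE = 0
    INSUFFICIENT = 1
    SUFFICIENT = 2
    RICH = 3


def findable_stars(has_pid: bool, metadata: Metadata) -> int:
    """Map PID presence and metadata quality to the 1 to 5 Findable levels."""
    if has_pid:
        if metadata is Metadata.RICH:
            return 5  # extensive metadata and rich additional documentation
        if metadata is Metadata.SUFFICIENT:
            return 4  # PID with sufficient metadata
        return 2      # PID without or with insufficient metadata
    if metadata >= Metadata.SUFFICIENT:
        return 3      # sufficient metadata without PID (assumption: rich also lands here)
    return 1          # assumption: no PID and no or insufficient metadata


print(findable_stars(True, Metadata.SUFFICIENT))  # prints 4
```

The Accessible and Interoperable scales could be encoded in the same way, each with its own inputs (license terms, file format lists).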
Creating the FAIR Data Assessment Tool
Prototype:
https://www.surveymonkey.com/r/fairdat
Using an online questionnaire system
Explanatory documentation
Website FAIRDAT under construction
• To contain FAIR data assessments from any repository or website, linking to the
location of the data set via (persistent) identifier
• The repository can show the resultant badge, linking back to the FAIRDAT website
[Example badge: star ratings for F, A, I and R, with 2 User Reviews, 1 Archivist Assessment, 24 Downloads]
Neutral, Independent
Analogous to DSA website
Display FAIR badges in any repository (Zenodo,
Dataverse, Mendeley Data, figshare, B2SAFE, …)
Can FAIR Data Assessment be automatic?
Criterion | Automatic? (Y/N/Semi) | Subjective? (Y/N/Semi) | Comments
F1 No PID / No Metadata | Y | N | Dealt with by Repository
F2 PID / Insuff. Metadata | S | S | Insufficient metadata is subjective
F3 No PID / Suff. Metadata | S | S | Sufficient metadata is subjective
F4 PID / Sufficient Metadata | S | S | Sufficient metadata is subjective
F5 PID / Rich Metadata | S | S | Rich metadata is subjective
A1 No License / No Access | Y | N | Dealt with by Repository
A2 Metadata Accessible | Y | N | Dealt with by Repository
A3 User Restrictions | Y | N | Dealt with by Repository
A4 Public Access | Y | N | Dealt with by Repository
A5 Open Access | Y | N | Dealt with by Repository
I1 Proprietary Format | S | N | Depends on list of proprietary formats
I2 Accepted Format | S | S | Depends on list of accepted formats
I3 Archival Format | S | S | Depends on list of preferred formats
I4 + Harmonized | N | S | Depends on domain vocabularies
I5 + Linked | S | N | Depends on semantic methods used
Optional: qualitative assessment / data review
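To see at a glance which criteria could be scored without human judgement, the table above can be encoded as data and filtered. The following is a sketch only: the `Criterion` type and the helper names are hypothetical, while the Y/N/S values are copied from the table.

```python
from typing import List, NamedTuple


class Criterion(NamedTuple):
    code: str        # e.g. "F1"
    automatic: str   # "Y", "N" or "S" (semi)
    subjective: str  # "Y", "N" or "S" (semi)


# Values transcribed from the "Can FAIR Data Assessment be automatic?" table.
CRITERIA = [
    Criterion("F1", "Y", "N"), Criterion("F2", "S", "S"), Criterion("F3", "S", "S"),
    Criterion("F4", "S", "S"), Criterion("F5", "S", "S"),
    Criterion("A1", "Y", "N"), Criterion("A2", "Y", "N"), Criterion("A3", "Y", "N"),
    Criterion("A4", "Y", "N"), Criterion("A5", "Y", "N"),
    Criterion("I1", "S", "N"), Criterion("I2", "S", "S"), Criterion("I3", "S", "S"),
    Criterion("I4", "N", "S"), Criterion("I5", "S", "N"),
]


def fully_automatic(criteria: List[Criterion]) -> List[str]:
    """Criteria marked automatic (Y) and not subjective (N)."""
    return [c.code for c in criteria if c.automatic == "Y" and c.subjective == "N"]


def needs_human(criteria: List[Criterion]) -> List[str]:
    """Criteria marked as not automatable at all (N)."""
    return [c.code for c in criteria if c.automatic == "N"]


print(fully_automatic(CRITERIA))  # prints ['F1', 'A1', 'A2', 'A3', 'A4', 'A5']
print(needs_human(CRITERIA))      # prints ['I4']
```

Read this way, the table suggests that all Accessible levels can be handled by the repository, whereas most Findable and Interoperable levels need at least some human input.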
Testing the FAIRdat prototype
• The tool runs a series of questions (maximum of 5 per principle) which
follow routing options to display the star rating scored per principle.
• Explore FAIRdat in small groups: assessment of datasets from various
disciplines - 45 minutes
• Feedback and suggestions for improvement - 25 minutes
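The question routing described above can be pictured as a small decision graph: each yes/no answer either leads to the next question or ends in a star rating. The sketch below does this for the Interoperable principle. The question wording, the node names and the `ROUTING` structure are assumptions for illustration; they are not taken from the actual FAIRdat questionnaire.

```python
from typing import Dict, Tuple, Union

# Each node: (question text, target if yes, target if no).
# A target is either another node name or a final star rating (int).
Target = Union[str, int]
ROUTING: Dict[str, Tuple[str, Target, Target]] = {
    "open_format": ("Is the data in a non-proprietary, open (preferred) format?",
                    "vocabulary", "accepted"),
    "accepted": ("Is the proprietary format accepted by a certified trustworthy data repository?",
                 2, 1),
    "vocabulary": ("Is the data standardised using a standard vocabulary for its research field?",
                   "linked", 3),
    "linked": ("Is the data additionally linked to other data to provide context?",
               5, 4),
}


def interoperable_stars(answers: Dict[str, bool], start: str = "open_format") -> int:
    """Follow the routing from the start question until a star rating is reached."""
    node: Target = start
    while not isinstance(node, int):
        _question, if_yes, if_no = ROUTING[node]
        node = if_yes if answers[node] else if_no
    return node


# Open format, standard vocabulary used, but not linked to other data:
print(interoperable_stars({"open_format": True, "vocabulary": True, "linked": False}))  # prints 4
```

With this layout no path asks more than three questions, which stays within the maximum of five per principle mentioned on the slide.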
Links:
Handout: https://goo.gl/749dmf
FAIRdat prototype: https://www.surveymonkey.com/r/fairdat
Feedback form: https://www.surveymonkey.com/r/fair_feedback
Towards a FAIR Framework?
Analogous to Certification Framework?
[Diagram: three levels, from top to bottom: Formal, Extended, Core]
All noses in the same direction?
Thank you for listening!
peter.doorn@dans.knaw.nl
www.dans.knaw.nl
http://www.dtls.nl/go-fair/
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar
Thanks to Ingrid Dillo and Emily Thomas for their contributions





