Towards a FAIR data lifecycle
Jessica Parland-von Essen
22.10.2020
https://orcid.org/0000-0003-4460-3906
FAIR
FINDABLE
• Described in relevant catalog with enough detail
• Landing page with globally unique persistent identifier
ACCESSIBLE
• Can be retrieved over the internet
• Versioning and lifecycle are documented
• Tombstone page if data is deleted
INTEROPERABLE
• Common, documented, and open formats
RE-USABLE
• Well documented and intelligible
• Rights clearly stated
https://doi.org/10.5281/zenodo.4045402
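The criteria above can be read as a checklist for a dataset's metadata record. The following is a minimal sketch, not an established schema: the field names (`catalog_description`, `landing_page`, `pid`, `download_url`, `version_history`, `file_format`, `documentation`, `license`) and the example record are assumptions made up for illustration. The tombstone page is left out because it only applies once data has been deleted.

```python
# Hypothetical mapping from each FAIR principle on the slide to the
# metadata fields that would evidence it. The field names are invented
# for this sketch; the slide does not define a schema.
CHECKS = {
    "Findable": ("catalog_description", "landing_page", "pid"),
    "Accessible": ("download_url", "version_history"),
    "Interoperable": ("file_format",),
    "Re-usable": ("documentation", "license"),
}


def fair_gaps(record):
    """Return, per principle, the fields that are missing or empty."""
    gaps = {}
    for principle, fields in CHECKS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Made-up example record with only three fields filled in.
example = {
    "pid": "10.1234/example",
    "license": "CC-BY-4.0",
    "file_format": "text/csv",
}

for principle, missing in fair_gaps(example).items():
    print(principle, "is missing:", ", ".join(missing))
```

A check like this only covers the presence of information, which corresponds roughly to the basic level of FAIRness; it says nothing about the quality of the description.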
FAIR Ecosystem Components and FAIR Digital Objects
http://doi.org/10.5281/zenodo.3565428
https://doi.org/doi:10.2777/1524
Shallow FAIR and Deep FAIR
• Research Information: necessary research information, PIDs, machine-readable license
• Research Data: all data elements are machine accessible
Research Data Types
• ACTIVE DATA: raw, continuously updated
• DYNAMIC RESEARCH DATA: version controlled, possible to cite
• RESEARCH DATASET PUBLICATION: immutable
Documentation, validation
https://doi.org/10.23978/inf.77419
• Discovering, Acquiring, Accessing, Ingesting, Documenting, Preparing, Processing, Documenting, Storing, Sharing, Publishing, Citing, Preservation
• Metadata, Database, HPC, Code, Workflows, Articles, LAM, Semantic artefacts, PIDs
Data requirements on different levels for enabling FAIR?
• LEVEL 0: Output from automated data collection
• LEVEL 1: Near Real Time data
• LEVEL 1: Internal Working data
• LEVEL 2: Final quality-checked gap-filled dataset
• LEVEL 3: Elaborated Data Products
Figure labels: Metadata, Control, EXTERNAL
Interoperability and persistence
• SSHOC reference ontology
• FAIRsFAIR Recommendations for semantic artefacts
• Choosing open formats and protocols
• Good data lifecycle management planning
• Using FAIR enabling services
• Managing reproducibility vs citations
Persistent Identifiers
A PID should be globally unique, i.e. nobody else in the world should use the same string to refer to anything else. In practice this means that a PID has a controlled syntax and a governed namespace (generally consisting of a namespace indicator (prefix) and a local identifier (suffix)) and is issued and managed by a clearly specified registration authority.
A PID should be resolvable, i.e. provide a way for both machines and humans to access the digital object itself, the state information and/or landing page (in current practice this means the identifier can be translated to a fully defined URI, at any moment, without the requirement that it resolves to the same URL over time).
A PID should be persistent, i.e. remain unique and resolvable with a persistent syntax. The object it represents should ideally also be persistent, but even if that last persistence is broken, the PID should guarantee not to be reused for any other object in the future.
https://doi.org/10.5281/zenodo.4001631
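The three properties can be illustrated with a small sketch. It assumes a DOI-style syntax in which prefix and suffix are separated by the first slash, and it assumes `https://doi.org/` as the resolver base. `Pid`, `parse_pid` and `ToyRegistry` are hypothetical names invented here; a real registration authority does far more than this in-memory toy.

```python
from dataclasses import dataclass

# Assumption: DOI-style identifiers resolved through the doi.org proxy.
RESOLVER_BASE = "https://doi.org/"


@dataclass(frozen=True)
class Pid:
    prefix: str  # namespace indicator, governed by the registration authority
    suffix: str  # local identifier, unique within the prefix

    def uri(self):
        """Translate the identifier into a fully defined URI."""
        return f"{RESOLVER_BASE}{self.prefix}/{self.suffix}"


def parse_pid(text):
    """Split 'prefix/suffix' at the first slash; reject malformed input."""
    prefix, sep, suffix = text.strip().partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix identifier: {text!r}")
    return Pid(prefix, suffix)


class ToyRegistry:
    """Hypothetical registry that never hands out the same PID twice."""

    def __init__(self):
        self._targets = {}  # Pid -> current URL, or None once retired

    def register(self, pid, url):
        if pid in self._targets:
            raise ValueError("PID already issued, it must not be reused")
        self._targets[pid] = url

    def move(self, pid, new_url):
        """The target URL may change over time; the PID stays the same."""
        if self._targets.get(pid) is None:
            raise KeyError("unknown or retired PID")
        self._targets[pid] = new_url

    def retire(self, pid):
        """Object gone: keep the PID reserved (tombstone) instead of freeing it."""
        self._targets[pid] = None

    def resolve(self, pid):
        """Return the current URL, or None for a tombstoned PID."""
        return self._targets[pid]


pid = parse_pid("10.5281/zenodo.4001631")
print(pid.prefix, pid.suffix, pid.uri())
```

Keeping retired identifiers in the table, rather than deleting them, is what lets the registry refuse reuse even after the object itself is gone.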
Co-creation & co-development
23/10/2020
Always design a thing by considering it in its next larger context – a chair in a room, a room in a house, a house in an environment, an environment in a city plan.
Eliel Saarinen, Finnish architect (1873–1950)
LA2 / CC BY-SA, Wikimedia (https://creativecommons.org/licenses/by-sa/4.0)



Editor's Notes

  • #3: F = Findable: when a dataset has a persistent identifier, e.g. a DOI, the link to the dataset always works even if the storage location changes. A = Accessible: the research dataset's identifier works as a hyperlink through which the data and its descriptive metadata can be accessed with a web browser. I = Interoperable: the principle of interoperability requires open file formats and common standards, so that there are no more files that cannot be opened. R = Re-usable (describing the data supports this): data can be used when it has metadata and a licence that states the terms of use.
  • #4: Figure 8 source: TFiR https://doi.org/doi:10.2777/1524 Diagram 2 source: http://doi.org/10.5281/zenodo.3565428
  • #5: The first use case is the visibility of your work and outputs. When reporting on your work to funders and publishing outputs, a basic level of FAIRness and PID use is sufficient to enable findability, simple citation and output registration with core descriptive metadata. This is the context of what is usually called research information (sometimes referred to as current research information). The most common and useful PIDs for this are the research output DOI and the ORCID for the creator(s)/contributor(s). There are also other systems available to identify other kinds of entities to help further linking of information, such as organisations or protocols. Funders and employers might, for instance, require linking to other contextual reference data such as lists of grants, funders and affiliated organisations. This kind of information is becoming more important, but the actual data quality depends on the functionalities each service provides. If the services used for dataset publication or reporting don’t require PIDs, or don’t offer reference (meta)datasets or integration with PIDs for these kinds of things, it is difficult for the researcher to provide this information in an unambiguous way. The other use case for PIDs is the management of the research data itself. Here the PIDs can have different functions: (a) creating deep FAIR research datasets as research outputs, where all individual data elements are machine accessible (see panel F in Figure 1), or (b) managing and documenting the actual workflow, data and related information during research to ensure reproducibility of research results.
  • #6: An archive or generic repository usually operates with research dataset publications, which are a sort of publication: complex, but immutable, and archived as output and evidence for research. This case is quite easy, PID-wise. But in real life there are many steps and varieties of data before this, and this should be taken into account when citing, for instance. How can we support sufficient reproducibility without overflowing all systems with PIDs that should be kept and maintained forever? Dynamic data citation.
  • #9: It is NOT recommended that the researcher or any individual person is the PID owner; ownership, as well as management, should be governed in a sustainable way.
    ● Data Versioning: For retrieving earlier states of datasets, the data needs to be versioned. Markers shall indicate inserts, updates and deletes of data in the database.
    ● Data Timestamping: Ensure that operations on data are timestamped, i.e. any additions or deletions are marked with a timestamp.
    ● Data Identification: The data used shall be identified via a PID pointing to a time-stamped query, resolving to a landing page.
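The three points on versioning, timestamping and identification can be sketched as an append-only change log that is replayed up to a cited moment. This is a minimal illustration under stated assumptions: `Change`, `VersionedStore`, the `citations` table and the identifier `hypothetical-pid-1` are all invented here, and a real system would keep this in a database and resolve the PID to a landing page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Change:
    key: str
    value: Optional[str]  # None marks a delete
    at: datetime          # timestamp of the operation


class VersionedStore:
    """Append-only log: inserts, updates and deletes are never overwritten."""

    def __init__(self):
        self._log = []

    def put(self, key, value, at):
        self._log.append(Change(key, value, at))

    def delete(self, key, at):
        self._log.append(Change(key, None, at))

    def as_of(self, at):
        """Rebuild the state of the data as it was at the given time."""
        state = {}
        # sorted() is stable, so changes with equal timestamps keep log order.
        for change in sorted(self._log, key=lambda c: c.at):
            if change.at > at:
                break
            if change.value is None:
                state.pop(change.key, None)
            else:
                state[change.key] = change.value
        return state


def utc(day):
    """Helper for the example: a UTC timestamp in October 2020."""
    return datetime(2020, 10, day, tzinfo=timezone.utc)


store = VersionedStore()
store.put("station-1", "4.2", utc(1))
store.put("station-1", "4.3", utc(5))  # later correction
store.delete("station-1", utc(9))

# Hypothetical citation table: a PID points to a time-stamped query.
citations = {"hypothetical-pid-1": utc(2)}
print(store.as_of(citations["hypothetical-pid-1"]))
```

Storing only the query timestamp per citation, rather than a copy of the data, is one way to support reproducibility without multiplying archived objects.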