dans.knaw.nl
DANS is an institute of KNAW and NWO
CESSDA Persistent Identifiers
Workshop PID Information Types for the Social Sciences
May 29, 2017, The Hague
Vyacheslav Tykhonov
Senior Information Scientist (DANS)
vyacheslav.tykhonov@dans.knaw.nl
DANS data repositories with Persistent Identifiers
Within the context of DANS’ mission, it is obligatory that every (digital) object
archived via DANS has a PID, so that it can be (re)located and cited. DANS
uses PIDs for both (digital) objects and people.
DataverseNL for ongoing research projects
• every dataset has its own handle (for Dutch universities)
• revisions of a dataset do not change the handle; each new version changes only
the citation
EASY for permanent data archiving (DOIs)
• every archived dataset has a DOI
• every version of a dataset archived from DataverseNL produces a new DOI
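The split between the two services can be sketched as a small data model. This is only an illustration: the class name, the handle and the DOI values are hypothetical placeholders and are not taken from DataverseNL or EASY.

```python
from dataclasses import dataclass, field


@dataclass
class DataverseDataset:
    """Hypothetical model: one stable handle, one new DOI per archived version."""
    handle: str                                         # placeholder handle
    version: int = 1
    archived_dois: dict = field(default_factory=dict)   # version -> DOI in the archive

    def revise(self) -> None:
        # A revision keeps the handle; only the version in the citation changes.
        self.version += 1

    def archive(self, doi: str) -> None:
        # Archiving the current version records a new DOI for that version.
        self.archived_dois[self.version] = doi

    def citation(self) -> str:
        return f"Example Dataset, V{self.version}, {self.handle}"


ds = DataverseDataset(handle="hdl:12345/EXAMPLE")
ds.archive("doi:10.5072/example-v1")   # placeholder DOI
ds.revise()
ds.archive("doi:10.5072/example-v2")   # placeholder DOI
print(ds.citation())
print(ds.archived_dois)
```

The handle is set once and never reassigned, while the DOI mapping grows by one entry per archived version.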
• DANS has developed a plugin to archive datasets deposited
in Dataverse temporary storage to Trusted Digital
Repositories (TDRs)
• Before putting datasets in the long-term archive, users
should create an account in the TDR and get the proper permissions
to archive their data
• The archival plugin is open source software and can easily be
extended to support other TDRs:
https://github.com/DANS-KNAW/dataverse-bridge
The “Archive” button is available to local Dataverse administrators,
who can use it to push datasets to the EASY archive for long-term
preservation.
The administrator can choose where to archive the dataset:
Archivematica, Islandora, FEDORA or DANS EASY
(EASY is the default option).
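A minimal sketch of such a target selection with a default is shown below. The enum and function names are hypothetical and do not reflect the plugin's actual code.

```python
from enum import Enum


class ArchiveTarget(Enum):
    """Archive targets named on the slide (the identifiers are hypothetical)."""
    EASY = "DANS EASY"
    ARCHIVEMATICA = "Archivematica"
    ISLANDORA = "Islandora"
    FEDORA = "FEDORA"


def choose_target(requested=None):
    """Return the requested target, falling back to EASY when none is given."""
    if requested is None:
        return ArchiveTarget.EASY
    # Look the target up by member name, ignoring case.
    return ArchiveTarget[requested.upper()]


default_target = choose_target()
explicit_target = choose_target("islandora")
print(default_target.value, "|", explicit_target.value)
```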
The archiving process runs in the background: it extracts
data and metadata from the dataset and creates an
archival (BagIt) package containing all files and
their checksums.
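To illustrate what a BagIt-style package with checksums looks like, here is a minimal sketch using only the Python standard library. It is not the plugin's implementation: the function names and file names are hypothetical, and the declared BagIt version (1.0) and the SHA-256 algorithm are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, files: dict) -> Path:
    """Write payload files under data/ plus a SHA-256 manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in sorted(files.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    # Bag declaration; version 1.0 is an assumption for this sketch.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute every checksum listed in the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "dataset-bag", {
        "survey.csv": b"id,answer\n1,yes\n",           # hypothetical data file
        "metadata.json": b'{"title": "Example dataset"}',
    })
    bag_ok = verify_bag(bag)
    bag_files = sorted(p.name for p in bag.iterdir())

print(bag_ok, bag_files)
```

Because the manifest lists one checksum per payload file, the receiving archive can detect a corrupted or incomplete transfer before accepting the deposit.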
After the archiving process has finished, the “Archive”
button disappears from the page. The dataset citation
is extended with a DOI pointing to the archived version
of the dataset in EASY.
The archived version of the dataset is available on an
EASY landing page and can be cited in research papers.
The archived dataset automatically gets a DOI and a URN
pointing to the archived revision (version) of the dataset.
All files from the dataset get permission levels
corresponding to the versions of the files
stored in Dataverse.
Dataverse as Archival Service
• We are working on extending Dataverse so that a DOI is
generated for every version of a dataset, allowing it to work as
permanent storage
• Citations can contain duplicate metadata, but the dataset content
(data files) should be different
• The archival part can be hosted by the same Dataverse,
depending on the plugin settings
CESSDA PID plugin
• Universal plugin to get DOIs and handles in the same
Dataverse instance
• The prefix for every organisation will be generated based on the
configuration and authentication settings of the plugin
• Dataverse can be switched between support for ongoing research and
archiving (in separate subdataverses)
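One way to picture configuration-driven prefixes is a lookup table keyed by subdataverse. The sketch below is an assumption about how such settings could be organised, not the plugin's real configuration format. All names and prefixes are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PidSettings:
    scheme: str   # "doi" or "hdl"
    prefix: str   # prefix registered for the organisation (placeholder here)


# Hypothetical plugin configuration: one entry per subdataverse.
PLUGIN_CONFIG = {
    "ongoing-research": PidSettings(scheme="hdl", prefix="12345"),
    "archive": PidSettings(scheme="doi", prefix="10.5072"),
}


def mint_pid(subdataverse: str, local_id: str) -> str:
    """Build a PID string from the settings of the given subdataverse."""
    settings = PLUGIN_CONFIG[subdataverse]
    return f"{settings.scheme}:{settings.prefix}/{local_id}"


print(mint_pid("ongoing-research", "ABC123"))
print(mint_pid("archive", "ABC123"))
```

With this layout the same Dataverse instance issues handles in one subdataverse and DOIs in another, purely as a matter of configuration.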
Challenges
• We need a PID “Proxy” service that collects information about all
DOIs generated for different versions of datasets with handles
• Depending on the location and status of the dataset, every
citation should contain a handle (Netherlands), a URN:NBN
(Europe) and a DOI (worldwide)
• Statistics about all citations of datasets in research papers
should be aggregated and provided as part of the “Proxy” service
to build our own “PageRank” index
• Big Data and Linked Open Data archiving with Persistent
Identifiers
• A higher level of granularity (separate files, subsets,
fragments, time services) to make citations more accurate
• Maintenance of tombstone pages
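The core of the proposed “Proxy” service, linking one handle to the DOIs of its versions and aggregating citation counts, can be sketched in memory as follows. The class and method names are hypothetical and all identifiers are placeholders. A real service would need persistent storage and a data source for citations.

```python
from collections import defaultdict


class PidProxy:
    """Hypothetical in-memory registry linking a handle to its version DOIs."""

    def __init__(self):
        self._versions = defaultdict(dict)   # handle -> {version: doi}
        self._citations = defaultdict(int)   # doi -> number of recorded citations

    def register(self, handle, version, doi):
        self._versions[handle][version] = doi

    def record_citation(self, doi):
        self._citations[doi] += 1

    def dois_for(self, handle):
        """Return the DOIs of all registered versions, ordered by version."""
        versions = self._versions.get(handle, {})
        return [doi for _, doi in sorted(versions.items())]

    def citation_count(self, handle):
        """Aggregate citations over every version DOI of the dataset."""
        return sum(self._citations.get(doi, 0) for doi in self.dois_for(handle))


proxy = PidProxy()
proxy.register("hdl:12345/EXAMPLE", 1, "doi:10.5072/example-v1")
proxy.register("hdl:12345/EXAMPLE", 2, "doi:10.5072/example-v2")
proxy.record_citation("doi:10.5072/example-v1")
proxy.record_citation("doi:10.5072/example-v2")
proxy.record_citation("doi:10.5072/example-v2")
print(proxy.dois_for("hdl:12345/EXAMPLE"))
print(proxy.citation_count("hdl:12345/EXAMPLE"))
```

Aggregating per handle rather than per DOI means citations of different versions count towards the same dataset.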
Big Data repository with Persistent Identifiers
The approach is suitable for product development companies (industry) and
organisations and institutions (CESSDA) looking for sustainable (Big) Data
archiving services.
A Big Data object in Dataverse consists of:
• metadata with authorship and citation information
• data usage licence
• persistent DOI or handle
• information on how to obtain a key (API token) to start using the API endpoint(s)
• link to the API endpoint delivering the data
• representation of the API (interactive documentation, Swagger)
• data provenance
• controlled vocabularies to meet domain-specific community standards (optional)
A public demonstration is available on the Dataverse demo website.
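The components listed above map naturally onto a single record. The sketch below is a hypothetical representation, not a Dataverse schema: the field names are invented for illustration and every value (URLs, PID, licence) is a placeholder.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class BigDataObject:
    """Hypothetical record mirroring the components of a Big Data object."""
    metadata: dict                 # authorship and citation information
    licence: str                   # data usage licence
    pid: str                       # persistent DOI or handle
    token_instructions: str        # how to obtain an API token
    api_endpoint: str              # link to the endpoint delivering the data
    api_documentation: str         # interactive documentation (e.g. Swagger)
    provenance: str                # data provenance
    controlled_vocabularies: list = field(default_factory=list)  # optional


obj = BigDataObject(
    metadata={"title": "Example stream", "author": "Example Author"},
    licence="CC0 1.0",
    pid="doi:10.5072/example-bigdata",
    token_instructions="Request a token from the data provider (placeholder).",
    api_endpoint="https://api.example.org/v1/data",
    api_documentation="https://api.example.org/docs",
    provenance="Collected by an example project (placeholder).",
)
record = asdict(obj)
print(sorted(record))
```

Only the controlled vocabularies have a default, which matches their status as the single optional component in the list.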
Linked Data hubs as archived object
Source: PID object
Questions?
