http://data.odw.tw
Metadata as Linked Data for Research Data Repositories
Cheng-Jen Lee, Andrea Wei-Ching Huang, Tyng-Ruey Chuang
Institute of Information Science, Academia Sinica, Taiwan
The metadata of research data
increases the access to and
reuse of the data.
(Willis, Greenberg and White, 2012)
Analysis and synthesis of metadata goals for scientific data. Journal of the American
Society for Information Science and Technology 63(8): 1505–1520, DOI:
http://dx.doi.org/10.1002/asi.22683
INTRODUCTION
METHODS
Current research repositories cannot
meet the need for innovative solutions
that provide feature-rich services to
support data publishing, such as
visualization, validation & reuse in
different applications.
(Assante, Candela, Castelli & Tani, 2016)
Are scientific data repositories coping with research data publishing? Data Science
Journal, 15, p. 6. DOI: http://doi.org/10.5334/dsj-2016-006
From 2014 to 2018, Stanford, Harvard,
and Cornell are collaborating on
linked data. The goal is to gather
contextual information about research
resources such as books, articles, serials,
datasets, and multimedia into a
semantic-web-based Scholarly Resource
Semantic Information Store (SRSIS).
https://www.ld4l.org/ and https://wiki.duraspace.org/display/ld4l/Project+Rationale
RESULTS
Voc4odw Ontology
http://voc.odw.tw
Before After
CULTURAL OBJECT
http://data.odw.tw/record/d2148340
1. A Use Case for Curation, Publication & Reuse
of Metadata as Linked Data.
2. A New Method to Manage Data for General-use
& Discipline-specific Repositories.
3. Data Semantically Enriched with Vocabularies and
Knowledge Bases via Adaptable Mechanisms.
o 843,309 CC-licensed metadata
records in 14 domains, reused from
the Union Catalog of Digital
Archives Taiwan.
o 44,806,400 triples (Linked Data)
encoded with the 15 Dublin Core
elements and provenance
information.
o 25,913,304 triples from 832,803
records, semantically refined through
spatial & temporal normalization,
mapping, and linking with domain
knowledge (external vocabularies,
ontologies, and knowledge bases).
o The 14 domains include Archaeology,
Architecture, Archives, Artifacts,
Biology, Geology, Manuscript,
Multimedia, NewsMedia, PaintCal,
RareBook, ResearchReuse, StoneRub.
o 80 projects and 74 agents associated
with the metadata records are curated
as linked data and identified by
Wikidata IDs. The agents have roles in
NGO (2), Museum (5), Library (2),
Government (9), Archive (1), and
Academia (55).
o For Open Science: CKAN
(Comprehensive Knowledge Archive
Network) is used as the main platform
that makes linked metadata available,
citable, and validated.
o Availability: data are shared in
multiple formats (CSV, XML, Turtle,
RDF/XML, JSON-LD) and can be consumed
by both humans & machines.
o Validation and Reproducibility: each
record is encoded with detailed
provenance, and a complete
mechanism for publishing articles,
data, and code is designed
and implemented.
o A flexible and adaptable ontology
is provided for describing different
data contexts (common knowledge or
domain knowledge), event
concepts (people, place, time),
and objects, organized into
meaningful groups of different
vocabularies.
o Data visualization is enhanced
and integrated through a system
design for spatial and temporal
mapping, filtering, and linking.
o 18 international vocabularies are used
for modeling common knowledge,
and 5 domain-specific vocabularies
are applied for place, time, art and
humanities, or biology. 3
knowledge bases (GeoNames,
Wikidata, and Encyclopedia of Life)
are mapped and linked.
o The SPARQL language and
endpoints provide analytic
semantic queries over both local and
external data. In addition, data from the
RDF triplestore can easily be used
in 3rd-party applications.
o Multiple DataClean Versions Mechanism: we treat
data cleaning as a kind of interpretation. Refined
Versions (R Versions, i.e. r1, r2, r3…) provide
different contexts for the different needs of users.
o Multiple LinkedKnowledge Bases Mechanism:
more knowledge bases, such as DBpedia, WorldCat, or
LinkedGeoData, can be linked in the future via different
R Versions without sacrificing the integrity of
the original Version, which is encoded with DC 15.
o Multiple SemanticStructure Versions Mechanism:
different interpretations result from the use of
different vocabularies. Solutions include the
co-existence of multiple R Versions with different
vocabularies, or transforming vocabularies via
SPARQL.
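The idea that refined versions sit beside an untouched original can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Record` class, its `add_version` method, and the flat field dictionaries are hypothetical and are not the repository's actual data model.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, Mapping


@dataclass
class Record:
    """Hypothetical record: one read-only original plus any number of R Versions."""

    original: Mapping[str, str]  # e.g. Dublin Core fields as plain strings
    refined: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Keep a read-only copy so that no refinement can overwrite the original.
        self.original = MappingProxyType(dict(self.original))

    def add_version(self, name: str, changes: Mapping[str, str]) -> None:
        """Store a refined version: the original fields plus one interpretation's changes."""
        if name in self.refined:
            raise ValueError(f"version {name!r} already exists")
        self.refined[name] = {**self.original, **changes}


record = Record({"dc:date": "採集日期: 1993-04-25"})
# r1 adds a new property; r2 reinterprets the existing field instead.
record.add_version("r1", {"dwc:eventDate": "1993-04-25"})
record.add_version("r2", {"dc:date": "1993-04-25"})
```

Both versions coexist, and `record.original` still holds the literal exactly as it was imported.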
@prefix agent: <http://data.odw.tw/agent/> .
@prefix cc: <http://creativecommons.org/ns#>.
@prefix data: <http://data.odw.tw/record/>.
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcat: <http://www.w3.org/ns/dcat#>.
@prefix dct: <http://purl.org/dc/terms/>.
@prefix gns: <http://sws.geonames.org/>.
@prefix project: <http://data.odw.tw/project/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix r4r: <http://guava.iis.sinica.edu.tw/r4r/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix schema: <http://schema.org/>.
@prefix voc: <http://voc.odw.tw/ontology#>.
@prefix xml: <http://www.w3.org/XML/1998/namespace>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
data:d2148340 a data:Reused,
r4r:RRObject,
dcat:Dataset ;
r4r:hasLicense <http://creativecommons.org/licenses/by-nc-nd/2.5/> ;
r4r:hasProvenance data:p20160530-d2148340 ;
r4r:locateAt data:d2148340 ;
dc:contributor "採集者: 呂文賓"^^rdf:PlainLiteral,
"採集者(英文): Wen-Pen Leu"^^rdf:PlainLiteral ;
dc:coverage "國家: 台灣"^^rdf:PlainLiteral,
"最低海拔: 1650"^^rdf:PlainLiteral,
"行政區: 宜蘭縣大同鄉"^^rdf:PlainLiteral ;
dc:creator "鑑訂者: 吉占和"^^rdf:PlainLiteral ;
dc:date "採集日期: 1993-04-25"^^rdf:PlainLiteral ;
dc:identifier "標本館號: 43501"^^rdf:PlainLiteral,
"編目號: Wen-Pen Leu 2018"^^rdf:PlainLiteral ;
dc:language "中文"^^rdf:PlainLiteral ;
dc:publisher "中央研究院生物多樣性研究中心"^^rdf:PlainLiteral ;
dc:rights "中央研究院 生物多樣性研究中心 植物標本館 Herbarium, Research Center for Biodiversity, Academia Sinica, Taipei (HAST)"^^rdf:PlainLiteral ;
dc:source "台灣本土植物資料庫(http://taiwanflora.sinica.edu.tw/)"^^rdf:PlainLiteral ;
dc:subject "屬: 一葉蘭屬"^^rdf:PlainLiteral,
"屬(英文): Pleione"^^rdf:PlainLiteral,
"界: 植物界"^^rdf:PlainLiteral,
"界(英文): Plantae"^^rdf:PlainLiteral,
"目: 天門冬目"^^rdf:PlainLiteral,
"目(英文): Asparagales"^^rdf:PlainLiteral,
"科: 蘭科"^^rdf:PlainLiteral,
"科(英文): ORCHIDACEAE"^^rdf:PlainLiteral,
"綱: 單子葉植物綱"^^rdf:PlainLiteral,
"綱(英文): Monocotyledons"^^rdf:PlainLiteral,
"門: 種子植物門"^^rdf:PlainLiteral,
"門(英文): Spermatophyta"^^rdf:PlainLiteral ;
dc:title "中文種名: 台灣一葉蘭"^^rdf:PlainLiteral,
"學名: Pleione formosana Hayata"^^rdf:PlainLiteral ;
dcat:themeTaxonomy data:Biology ;
prov:hadPrimarySource voc:CatalogRecord ;
prov:wasGeneratedBy project:q21095860 .
data:p20160530-d2148340 a data:Provenance,
r4r:Provenance,
prov:Activity ;
r4r:isPackagedWith data:d2148340 ;
prov:atLocation <http://sws.geonames.org/6728700> ;
prov:endedAtTime "2016-10-19T17:48:39.392807+08:00"^^xsd:dateTime ;
prov:startedAtTime "2016-10-19T17:48:39.388657+08:00"^^xsd:dateTime ;
prov:wasAssociatedWith <http://data.odw.tw>,
agent:q20872470 ;
prov:wasStartedBy <http://catalog.digitalarchives.tw/item/00/20/c7/f4.html> .
<http://catalog.digitalarchives.tw/item/00/20/c7/f4.html> a voc:CatalogRecord,
dcat:Catalog,
prov:Revision ;
cc:license <http://creativecommons.org/licenses/by-nc-nd/2.5/> ;
data:digiArchiveID "43501"^^rdf:PlainLiteral ;
data:objectID "2148340"^^rdf:PlainLiteral ;
schema:thumbnail <http://image.digitalarchives.tw/ImageCache/00/5b/4e/1e.jpg> ;
prov:atLocation <http://sws.geonames.org/7280290> ;
prov:generatedAtTime "2011-05-13"^^rdf:PlainLiteral ;
prov:hadPrimarySource <http://www.hast.biodiv.tw/specimens/SpecimenDetailC.aspx?specimenOrderNum=43501> ;
prov:value "內容主題:生物:植物界:種子植物門:單子葉植物綱:天門冬目:蘭科"@zh,
"典藏機構與計畫:中央研究院:生物多樣性研究中心:台灣本土植物數位化典藏"@zh ;
prov:wasGeneratedBy project:q21095859 .
<http://image.digitalarchives.tw/ImageCache/00/5b/4e/1e.jpg> a dct:MediaTypeOrExtent ;
prov:wasRevisionOf <http://img.hast.biodiv.tw/specimenSmall/specimenSmall004/3/S_043501.jpg> .
<http://www.hast.biodiv.tw/specimens/SpecimenDetailC.aspx?specimenOrderNum=43501> a voc:PrimarySource,
prov:PrimarySource .
<http://creativecommons.org/licenses/by-nc-nd/2.5/> a cc:License,
dct:RightsStatement ;
rdfs:label "CC2.5:BY-NC-ND"@en ;
rdfs:comment "ICON license"@en,
"MetaDesc license"@en .
data:d2148340 a data:Refined,
r4r:Data,
dcat:Dataset ;
r4r:hasProvenance data:p20160912-d2148340 ;
r4r:locateAt data:d2148340 ;
txn:hasEOLPage <http://eol.org/pages/1134120> ;
dct:requires evt84:event-d2148340,
evt84:phyCre-d2148340 ;
dcat:landingPage <http://data.odw.tw/r1/r1-r2148340> ;
dcat:themeTaxonomy data:Biology .
evt84:event-d2148340 a event:Event,
voc:UnKnownEvent ;
event:product dwc:Location,
schema:GeoShape,
voc:Object ;
dwc:minimumElevationInMeters "1650.0" ;
gn:parentCountry <http://sws.geonames.org/1668284> ;
gn:parentFeature <http://sws.geonames.org/1667637>,
<http://sws.geonames.org/1674197> ;
skos:inScheme dwc:Occurrence,
voc:Context ;
skos:scopeNote "something happened at some place" .
evt84:phyCre-d2148340 a schema:CreateAction,
voc:KnownEvent ;
event:factor dct:PhysicalResource,
voc:Object ;
event:product dwc:PreservedSpecimen,
voc:Object ;
dwc:eventDate "1993-04-25" ;
skos:editorialNote "採集日期" ;
skos:inScheme dwc:HumanObservation,
voc:Context ;
skos:scopeNote "specimen collection process" .
data:p20160912-d2148340 a data:Provenance,
r4r:Provenance,
prov:Activity ;
r4r:isPackagedWith data:d2148340 ;
prov:atLocation <http://sws.geonames.org/6728700> ;
prov:endedAtTime "2016-10-19T17:56:35.505929+08:00" ;
prov:startedAtTime "2016-10-19T17:56:35.498577+08:00" ;
prov:wasAssociatedWith <http://data.odw.tw>,
agent:q20872470 ;
prov:wasInfluencedBy r1:r1-r2148340 .
<http://eol.org/pages/1134120> a dwc:Taxon ;
rdfs:label "Pleione formosana" .
<http://sws.geonames.org/1667637> a voc:Place ; rdfs:label "Datong Xiang", "大同鄉" .
<http://sws.geonames.org/1668284> a voc:Place ; rdfs:label "Taiwan", "台湾" .
<http://sws.geonames.org/1674197> a voc:Place ; rdfs:label "Yilan", "宜蘭縣" .
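Because each provenance activity above carries `prov:startedAtTime` and `prov:endedAtTime`, a consumer can check the timestamps and measure how long the activity ran. The following is a minimal sketch using only the Python standard library; the function name `activity_duration` is an assumption made for illustration, and the timestamps are copied from the two activities in the listing.

```python
from datetime import datetime, timedelta


def activity_duration(started: str, ended: str) -> timedelta:
    """Return ended - started for two ISO 8601 timestamps with UTC offsets."""
    start = datetime.fromisoformat(started)
    end = datetime.fromisoformat(ended)
    if end < start:
        raise ValueError("activity ends before it starts")
    return end - start


# Activity data:p20160530-d2148340 (the reused record).
reused = activity_duration(
    "2016-10-19T17:48:39.388657+08:00",
    "2016-10-19T17:48:39.392807+08:00",
)
# Activity data:p20160912-d2148340 (the refined record).
refined = activity_duration(
    "2016-10-19T17:56:35.498577+08:00",
    "2016-10-19T17:56:35.505929+08:00",
)
print(reused, refined)
```

A check like this is one simple way to validate provenance before relying on it, since a malformed or reversed timestamp raises an error instead of passing silently.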
Data in Different Contexts
Before 1993-04-25,
a plant in nature could be recognized as
Pleione formosana, “the plant”.
“The plant” was somewhere around
Datong, Yilan, Taiwan.
A researcher, Wen-Pen Leu, carried out a
specimen collection activity on 1993-04-25
and made “the plant” into a
specimen for science. “The plant” has
been curated in the Herbarium,
Research Center for Biodiversity,
Academia Sinica, Taipei (HAST).
Since 2003, HAST has completed
the “Database of Native Plants in
Taiwan” and digitized “the plant” with
an image of its herbarium specimen and
its metadata. “The plant”
became a science object served to
natural scientists via the internet and the web.
HAST uses its own data schema to
store “the plant”, fitting its database
and domain knowledge requirements.
Around 2011-05-13, HAST
collaborated with the Union
Catalog of Digital Archives Taiwan
(CATDAT). “The plant” was then
converted from HAST's own data schema, and
its metadata were imported into CATDAT
as a catalogue record and a cultural object
based on the 15 Dublin Core elements.
A Reused Data:
“The plant” is now derived data,
data:d2148340 (data:Reused),
modelled from the science and cultural
object of this Pleione formosana.
It shares the meaning of
r4r:RRObject in the R4R Ontology:
any resource that serves as a
component for reuse is defined as a
Reusing Related Object (RRObject).
A Semantically Refined Data:
Derived data, r1:r1-r2148340
(data:Refined), are extracted
from the data:Reused to enrich the
semantics of the resource. The semantic
meanings of different interpretations
are provided by the conceptual
model. At the same time,
different interpretations or derivation
processes are curated in different
refined versions (e.g. r1, r2, r3...).
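One refinement step of this kind can be sketched as follows, assuming the "label: value" convention seen in the reused record's literals. The helper names and the two-entry mapping are hypothetical; they only mirror the two values that appear in the refined record (`dwc:minimumElevationInMeters "1650.0"` and `dwc:eventDate "1993-04-25"`), not the project's real pipeline.

```python
import re
from datetime import date
from typing import Dict, Iterable, Tuple


def split_labelled_literal(literal: str) -> Tuple[str, str]:
    """Split a 'label: value' literal at its first colon (ASCII or full-width)."""
    parts = re.split(r"\s*[:：]\s*", literal, maxsplit=1)
    if len(parts) != 2:
        raise ValueError(f"no label found in {literal!r}")
    return parts[0].strip(), parts[1].strip()


def refine(literals: Iterable[str]) -> Dict[str, str]:
    """Normalize the labelled literals we know how to interpret; skip the rest."""
    refined: Dict[str, str] = {}
    for literal in literals:
        label, value = split_labelled_literal(literal)
        if label == "最低海拔":  # minimum elevation
            refined["dwc:minimumElevationInMeters"] = str(float(value))
        elif label == "採集日期":  # collection date
            # Parsing validates the date; isoformat() writes it back normalized.
            refined["dwc:eventDate"] = date.fromisoformat(value).isoformat()
    return refined


refined = refine(["最低海拔: 1650", "採集日期: 1993-04-25", "國家: 台灣"])
```

Literals without a known label are left alone here, which matches the idea that each refined version records one interpretation and the reused data stay intact. Unlabelled values that contain a colon, such as a URL in `dc:source`, would need separate handling.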
A dcat:Dataset:
The data:d2148340 at data.odw.tw is a
collection of data (which includes
different versions of refined data),
published or curated by a single agent,
and is available for access or download
in one or more formats.
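Offering one dataset in several formats can be illustrated with the Python standard library. This sketch covers only plain JSON and CSV; the RDF serializations (Turtle, RDF/XML, JSON-LD) need an RDF library and are left out. The flat `record` dictionary and its field names are simplifications assumed for the example.

```python
import csv
import io
import json
from typing import Dict

# Hypothetical flat view of the record; values are taken from the listing above.
record: Dict[str, str] = {
    "id": "data:d2148340",
    "title": "Pleione formosana Hayata",
    "date": "1993-04-25",
    "theme": "data:Biology",
}


def to_json(rec: Dict[str, str]) -> str:
    """Serialize the record as JSON with keys in a stable order."""
    return json.dumps(rec, ensure_ascii=False, sort_keys=True)


def to_csv(rec: Dict[str, str]) -> str:
    """Serialize the record as a header row plus one data row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=sorted(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buffer.getvalue()


print(to_json(record))
print(to_csv(record))
```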
THE STORY OF “THE PLANT”: FOR HUMAN, FOR MACHINE
[Figure panels: Pre-digital Contexts; Post-digital Contexts; SCIENCE OBJECT; Semantics in LOD Context, http://data.odw.tw/r1/r1-r2148340, in which the taxonomic dc:subject values of data:d2148340 (界 through 屬, in Chinese and English) are shown with dwc:taxonRank, txn:taxonRank, and biol:rank.]
FILTERING & VISUALIZATION FEATURES PROVIDED BY CKAN
Herbarium, Research Center for Biodiversity, Academia Sinica
界(英文): Plantae
界: 植物界
門(英文): Spermatophyta
門: 種子植物門
綱(英文): Monocotyledons
綱: 單子葉植物綱
目(英文): Asparagales
目: 天門冬目
科(英文): Orchidaceae
科: 蘭科
屬(英文): Pleione
屬: 一葉蘭屬
Institute of Ecology and Evolutionary Biology,
College of Life Science, National Taiwan University
o 界(英文):Plantae
o 界:植物界
o 門(英文):Spermatophyta
o 門:胚胎植物門
o 綱(英文):Angiospermae
o 綱:被子植物綱
o 目(英文):Orchidales
o 目:蘭目
o 科(英文):Orchidaceae
o 科:蘭科
o 屬(英文):Pleione
o 屬:一葉蘭屬
o 種小名:bulbocodioides
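The two classifications above agree on family and genus but differ at the order level, which is the kind of difference facet filtering makes visible. The sketch below is generic Python and does not use CKAN's API; the `institution` key and the short names `HAST` and `NTU` are labels assumed for illustration, and only three of the listed ranks are included.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical in-memory records built from the two lists above.
records: List[Dict[str, str]] = [
    {"institution": "HAST", "目(英文)": "Asparagales", "科(英文)": "Orchidaceae", "屬(英文)": "Pleione"},
    {"institution": "NTU", "目(英文)": "Orchidales", "科(英文)": "Orchidaceae", "屬(英文)": "Pleione"},
]


def facet_counts(recs: List[Dict[str, str]], facet: str) -> Counter:
    """Count how many records carry each value of one facet."""
    return Counter(rec[facet] for rec in recs if facet in rec)


def filter_by(recs: List[Dict[str, str]], facet: str, value: str) -> List[Dict[str, str]]:
    """Keep only the records whose facet equals the given value."""
    return [rec for rec in recs if rec.get(facet) == value]


print(facet_counts(records, "目(英文)"))
print(filter_by(records, "目(英文)", "Orchidales"))
```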

More Related Content

PDF
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
PPTX
Dataset Metadata, Tools and Approaches for Access and Preservation
PPTX
Validata: A tool for testing profile conformance
PDF
The DATS model: datasets descriptions for data discovery in DataMed
PPTX
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
PPTX
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
PDF
DataTags, The Tags Toolset, and Dataverse Integration
PPTX
Supporting Dataset Descriptions in the Life Sciences
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN
Dataset Metadata, Tools and Approaches for Access and Preservation
Validata: A tool for testing profile conformance
The DATS model: datasets descriptions for data discovery in DataMed
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
DataTags, The Tags Toolset, and Dataverse Integration
Supporting Dataset Descriptions in the Life Sciences

What's hot (20)

PPTX
The Rhetoric of Research Objects
PDF
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
PPTX
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
PPTX
Washington Linked Data Authority Service at University of Houston
PPTX
A Big Picture in Research Data Management
PPTX
PPTX
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
PPTX
The Dataverse Commons
PPTX
Linking Scientific Metadata (presented at DC2010)
PDF
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
PDF
IASSIST identifiers By Joan Starr
PDF
The DataTags System: Sharing Sensitive Data with Confidence
PDF
Eswc2018 wimu slides
PPTX
Research Data Sharing: A Basic Framework
PPTX
The Research Object Initiative: Frameworks and Use Cases
PPTX
Dataverse on the MOC
PPTX
Sources of Change in Modern Knowledge Organization Systems
PPTX
An Identifier Scheme for the Digitising Scotland Project
PPTX
The Neuroscience Information Framework: A Scalable Platform for Information E...
PPTX
Leveraging publication metadata to help overcome the data ingest bottleneck
The Rhetoric of Research Objects
Open Source Tools Facilitating Sharing/Protecting Privacy: Dataverse and Data...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Washington Linked Data Authority Service at University of Houston
A Big Picture in Research Data Management
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
The Dataverse Commons
Linking Scientific Metadata (presented at DC2010)
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
IASSIST identifiers By Joan Starr
The DataTags System: Sharing Sensitive Data with Confidence
Eswc2018 wimu slides
Research Data Sharing: A Basic Framework
The Research Object Initiative: Frameworks and Use Cases
Dataverse on the MOC
Sources of Change in Modern Knowledge Organization Systems
An Identifier Scheme for the Digitising Scotland Project
The Neuroscience Information Framework: A Scalable Platform for Information E...
Leveraging publication metadata to help overcome the data ingest bottleneck
Ad

Similar to Metadata as Linked Data for Research Data Repositories (20)

PDF
Reuse of Structured Data: Semantics, Linkage, and Realization
PDF
20160818 Semantics and Linkage of Archived Catalogs
PDF
Interlinking Standardized OpenStreetMap Data and Citizen Science Data in the ...
PDF
Interpretation, Context, and Metadata: Examples from Open Context
PPTX
A Generic Scientific Data Model and Ontology for Representation of Chemical Data
PDF
20120411 travelalliancemcguinnessfinal
PDF
The Semantic Web: RPI ITWS Capstone (Fall 2012)
PDF
Open Government Data on the Web - A Semantic Approach
PPTX
“Open Data Web” – A Linked Open Data Repository Built with CKAN
PDF
20120419 linkedopendataandteamsciencemcguinnesschicago
PDF
Semantic citation
PPT
A Framework for Ontology Usage Analysis
PDF
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
PDF
ITWS Capstone Lecture (Spring 2013)
PDF
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
PPTX
20130622 okfn hackathon t2
PPTX
Realizing Semantic Web - Light Weight semantics and beyond
PDF
Semantics and Linked Data for CyberGIS -- AAG 2013 Frontiers and Roadmaps Se...
PPTX
鏈結資料在圖書館的應用20131107
PDF
Semantic Linking & Retrieval for Digital Libraries
Reuse of Structured Data: Semantics, Linkage, and Realization
20160818 Semantics and Linkage of Archived Catalogs
Interlinking Standardized OpenStreetMap Data and Citizen Science Data in the ...
Interpretation, Context, and Metadata: Examples from Open Context
A Generic Scientific Data Model and Ontology for Representation of Chemical Data
20120411 travelalliancemcguinnessfinal
The Semantic Web: RPI ITWS Capstone (Fall 2012)
Open Government Data on the Web - A Semantic Approach
“Open Data Web” – A Linked Open Data Repository Built with CKAN
20120419 linkedopendataandteamsciencemcguinnesschicago
Semantic citation
A Framework for Ontology Usage Analysis
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
ITWS Capstone Lecture (Spring 2013)
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
20130622 okfn hackathon t2
Realizing Semantic Web - Light Weight semantics and beyond
Semantics and Linked Data for CyberGIS -- AAG 2013 Frontiers and Roadmaps Se...
鏈結資料在圖書館的應用20131107
Semantic Linking & Retrieval for Digital Libraries
Ad

More from andrea huang (14)

PDF
結構資料的再次使用:語意、連結與實作
PDF
20160602 典藏目錄的語意與連結
PDF
How to clean data less through Linked (Open Data) approach?
PDF
A preliminary study on Wikipedia Dbpdeia and Wikidata
PDF
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
PDF
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
PDF
20130805 Activating Linked Open Data in Libraries Archives and Museums
PDF
101203 An event ontology for crisis-disaster information
PDF
081016 Social Tagging, Online Communication, and Peircean Semiotics
PDF
060817 Participation Collaboration Mapping
PDF
070928 Collaborative Geospatial Mapping And Data Authorization
PDF
041018 Community Gis
PDF
051102 Online Community Mapping
PDF
051207 Commonsense Geography Meets Web Technology
結構資料的再次使用:語意、連結與實作
20160602 典藏目錄的語意與連結
How to clean data less through Linked (Open Data) approach?
A preliminary study on Wikipedia Dbpdeia and Wikidata
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P...
20130805 Activating Linked Open Data in Libraries Archives and Museums
101203 An event ontology for crisis-disaster information
081016 Social Tagging, Online Communication, and Peircean Semiotics
060817 Participation Collaboration Mapping
070928 Collaborative Geospatial Mapping And Data Authorization
041018 Community Gis
051102 Online Community Mapping
051207 Commonsense Geography Meets Web Technology

Recently uploaded (20)

PDF
Assigned Numbers - 2025 - Bluetooth® Document
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
Tartificialntelligence_presentation.pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Empathic Computing: Creating Shared Understanding
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Encapsulation theory and applications.pdf
PPTX
Machine Learning_overview_presentation.pptx
PDF
Spectral efficient network and resource selection model in 5G networks
PPTX
Spectroscopy.pptx food analysis technology
PDF
cuic standard and advanced reporting.pdf
PPT
Teaching material agriculture food technology
PPTX
Programs and apps: productivity, graphics, security and other tools
PPTX
SOPHOS-XG Firewall Administrator PPT.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
A comparative analysis of optical character recognition models for extracting...
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Network Security Unit 5.pdf for BCA BBA.
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
Assigned Numbers - 2025 - Bluetooth® Document
Building Integrated photovoltaic BIPV_UPV.pdf
Tartificialntelligence_presentation.pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
Empathic Computing: Creating Shared Understanding
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Encapsulation theory and applications.pdf
Machine Learning_overview_presentation.pptx
Spectral efficient network and resource selection model in 5G networks
Spectroscopy.pptx food analysis technology
cuic standard and advanced reporting.pdf
Teaching material agriculture food technology
Programs and apps: productivity, graphics, security and other tools
SOPHOS-XG Firewall Administrator PPT.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
A comparative analysis of optical character recognition models for extracting...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Network Security Unit 5.pdf for BCA BBA.
Group 1 Presentation -Planning and Decision Making .pptx

Metadata as Linked Data for Research Data Repositories

  • 1. http://data.odw.tw Metadata as Linked Data for Research Data Repositories Cheng-Jen Lee, Andrea Wei-Ching Huang, Tyng-Ruey Chuang Institute of Information Science, Academia Sinica, Taiwan Analysis and synthesis of metadata goals for scientific data. Journal of the American Society for Information Science and Technology 63(8): 1505–1520, DOI: http://dx.doi.org/10.1002/asi.22683 The metadata of research data increases the access to and reuse of the data. (Willis, Greenberg and White, 2012) Analysis and synthesis of metadata goals for scientific data. Journal of the American Society for Information Science and Technology 63(8): 1505–1520, DOI: http://dx.doi.org/10.1002/asi.22683 INTRODUCTION METHODS Current research repositories can not meet the needs of innovative solutions providing feature-rich services for helping data publishing such as visualization, validation & reuse in different applications. (Assante, Candela, Castelli & Tani, 2016) Are scientific data repositories coping with research data publishing?. Data Science Journal, 15, p.6. DOI: http://doi.org/10.5334/dsj-2016-006 From 2014 to 2018, Stanford, Harvard, and Cornell are collaboratively working on linked data. The goal is to gather contextual information about research resources such as books, articles, serials, datasets, and multimedia into a semantic-web-based Scholarly Resource Semantic Information Store (SRSIS). https://www.ld4l.org/ and https://wiki.duraspace.org/display/ld4l/Project+Rationale RESULTS Voc4odw Ontology http://voc.odw.tw Before After CULTURAL OBJECT http://data.odw.tw/record/d2148340 1. An Use Case for Curation, Publication & Reuse of Metadata as Linked Data. 2. A New Method to Manage Data for General- use & Discipline-specific Repositories. 3. Data Semantically Enriched with Vocabularies and Knowledge Bases via Adaptable Mechanisms. 
o 843,309 CC licensed metadata records of 14 domains reused from the Union Catalog of Digital Archives Taiwan. o 44,806,400 triples (Linked Data) encoded with Dublin Core 15 Elements and Provenance Information. o 25,913,304 triples from 832,803 records semantically refined with spatial & temporal normalization, mapping, and linking with domain knowledges (external vocabularies, ontologies. and knowledges bases). o 14 domains include Archaeology, Architecture, Archives, Artifacts, Biology, Geology, Manuscript, Multimedia, NewsMedia, PaintCal , RareBook, ResearchReuse, StoneRub. o 80 projects and 74 agents associated with metadata records are curated by their linked data formats and Wikidata ID: they have roles in NGO (2), Museum (5), Library (2), Government (9), Archive (1) and Academia (55). o For Open Science: using the CKAN (Comprehensive Knowledge Archive Network) as a major solution that makes linked metadata available, citable, and validated. o Availability: data shared with multiple formats, CSV, XML, Turtle, RDF/XML, JSON-LD, consumed both by human & machine. o Validation and Reproducibility: each data encoded with provenance in details while at the same time a complete mechanism for publishing article, data and code is designed and implemented. o A flexible and adaptable ontology for describing different data context (common knowledge or domain knowledge), event concepts (people, place, time) and objects collected by meaningful groups of different vocabularies is provided . o Data Visualization is enhanced and integrated through spatial and temporal mapping, filtering and linking system design. o 18 international vocabularies used for modeling common knowledge, and 5 domain specific vocabularies for place, time, art and humanity, or biology are applied. 3 knowledge bases like GeoNames, Wikidata, and Encyclopedia of Life are mapped and linked. o The use of SPARQL language and endpoints provide data analytic semantic queries both in local and external. 
In addition, data from the RDF triplestore can be easily used in 3rd-party applications. o Multiple DataClean Versions Mechanism: we treat data cleaning as a kind of interpretation. Refined Versions (R Versions i.e. r1, r2, r3… ) provide different contexts to different needs of users. o Multiple LinkedKnowledge Bases Mechanism: more knowledge bases like DBpedia, WordCat, or LinkedGeoData can be linked in future via different R Versions without sacrificing the integrality of the original Version, encoded with DC 15 . o Multiple SemanticStructure Versions Mechanism: different interpretations results from the use of different vocabularies . Co-exists of multiple R Versions with different vocabularies or transforming vocabularies via SPARQL are solutions. @prefix agent: <http://data.odw.tw/agent/> . @prefix cc: <http://creativecommons.org/ns#>. @prefix data: <http://data.odw.tw/record/>. @prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix dcat: <http://www.w3.org/ns/dcat#>. @prefix dct: <http://purl.org/dc/terms/>. @prefix gns: <http://sws.geonames.org/>. @prefix project: <http://data.odw.tw/project/> . @prefix prov: <http://www.w3.org/ns/prov#> . @prefix r4r: <http://guava.iis.sinica.edu.tw/r4r/> . @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>. @prefix schema: <http://schema.org/>. @prefix voc: <http://voc.odw.tw/ontology#>. @prefix xml: <http://www.w3.org/XML/1998/namespace>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . 
data:d2148340 a data:Reused, r4r:RRObject, dcat:Dataset ; r4r:hasLicense <http://creativecommons.org/licenses/by-nc-nd/2.5/> ; r4r:hasProvenance data:p20160530-d2148340 ; r4r:locateAt data:d2148340 ; dc:contributor "採集者: 呂文賓"^^rdf:PlainLiteral, "採集者(英文): Wen-Pen Leu"^^rdf:PlainLiteral ; dc:coverage "國家: 台灣"^^rdf:PlainLiteral, "最低海拔: 1650"^^rdf:PlainLiteral, "行政區: 宜蘭縣大同鄉"^^rdf:PlainLiteral ; dc:creator "鑑訂者: 吉占和"^^rdf:PlainLiteral ; dc:date "採集日期: 1993-04-25"^^rdf:PlainLiteral ; dc:identifier "標本館號: 43501"^^rdf:PlainLiteral, "編目號: Wen-Pen Leu 2018"^^rdf:PlainLiteral ; dc:language "中文"^^rdf:PlainLiteral ; dc:publisher "中央研究院生物多樣性研究中心"^^rdf:PlainLiteral ; dc:rights "中央研究院 生物多樣性研究中心 植物標本館 Herbarium, Research Center for Biodiversity, Academia Sinica, Taipei (HAST)"^^rdf:PlainLiteral ; dc:source "台灣本土植物資料庫(http://taiwanflora.sinica.edu.tw/)"^^rdf:PlainLiteral ; dc:subject "屬: 一葉蘭屬"^^rdf:PlainLiteral, "屬(英文): Pleione"^^rdf:PlainLiteral, "界: 植物界"^^rdf:PlainLiteral, "界(英文): Plantae"^^rdf:PlainLiteral, "目: 天門冬目"^^rdf:PlainLiteral, "目(英文): Asparagales"^^rdf:PlainLiteral, "科: 蘭科"^^rdf:PlainLiteral, "科(英文): ORCHIDACEAE"^^rdf:PlainLiteral, "綱: 單子葉植物綱"^^rdf:PlainLiteral, "綱(英文): Monocotyledons"^^rdf:PlainLiteral, "門: 種子植物門"^^rdf:PlainLiteral, "門(英文): Spermatophyta"^^rdf:PlainLiteral ; dc:title "中文種名: 台灣一葉蘭"^^rdf:PlainLiteral, "學名: Pleione formosana Hayata"^^rdf:PlainLiteral ; dcat:themeTaxonomy data:Biology ; prov:hadPrimarySource voc:CatalogRecord ; prov:wasGeneratedBy project:q21095860 . 
data:p20160530-d2148340 a data:Provenance, r4r:Provenance, prov:Activity ; r4r:isPackagedWith data:d2148340 ; prov:atLocation <http://sws.geonames.org/6728700> ; prov:endedAtTime "2016-10-19T17:48:39.392807+08:00"^^xsd:DateTime ; prov:startedAtTime "2016-10-19T17:48:39.388657+08:00"^^xsd:DateTime ; prov:wasAssociatedWith <http://data.odw.tw>, agent:q20872470 ; prov:wasStartedBy <http://catalog.digitalarchives.tw/item/00/20/c7/f4.html> . <http://catalog.digitalarchives.tw/item/00/20/c7/f4.html> a voc:CatalogRecord, dcat:Catalog, prov:Revision ; cc:license <http://creativecommons.org/licenses/by-nc-nd/2.5/> ; data:digiArchiveID "43501"^^rdf:PlainLiteral ; data:objectID "2148340"^^rdf:PlainLiteral ; schema:thumbnail <http://image.digitalarchives.tw/ImageCache/00/5b/4e/1e.jpg> ; prov:atLocation <http://sws.geonames.org/7280290> ; prov:generatedAtTime "2011-05-13"^^rdf:PlainLiteral ; prov:hadPrimarySource <http://www.hast.biodiv.tw/specimens/SpecimenDetailC.aspx?specimenOrderNum=43501> ; prov:value "內容主題:生物:植物界:種子植物門:單子葉植物綱:天門冬目:蘭科"@zh, "典藏機構與計畫:中央研究院:生物多樣性研究中心:台灣本土植物數位化典藏"@zh ; prov:wasGeneratedBy project:q21095859 . <http://image.digitalarchives.tw/ImageCache/00/5b/4e/1e.jpg> a dct:MediaTypeOrExtent ; prov:wasRevisionOf <http://img.hast.biodiv.tw/specimenSmall/specimenSmall004/3/S_043501.jpg> . <http://www.hast.biodiv.tw/specimens/SpecimenDetailC.aspx?specimenOrderNum=43501> a voc:PrimarySource, prov:PrimarySource . <http://creativecommons.org/licenses/by-nc-nd/2.5/> a cc:License, dct:RightsStatement ; rdfs:label "CC2.5:BY-NC-ND"@en ; rdfs:comment "ICON license"@en, "MetaDesc license"@en . 
data:d2148340 a data:Refined, r4r:Data, dcat:Dataset ;
    r4r:hasProvenance data:p20160912-d2148340 ;
    r4r:locateAt data:d2148340 ;
    txn:hasEOLPage <http://eol.org/pages/1134120> ;
    dct:requires evt84:event-d2148340, evt84:phyCre-d2148340 ;
    dcat:landingPage <http://data.odw.tw/r1/r1-r2148340> ;
    dcat:themeTaxonomy data:Biology .

evt84:event-d2148340 a event:Event, voc:UnKnownEvent ;
    event:product dwc:Location, schema:GeoShape, voc:Object ;
    dwc:minimumElevationInMeters "1650.0" ;
    gn:parentCountry <http://sws.geonames.org/1668284> ;
    gn:parentFeature <http://sws.geonames.org/1667637>, <http://sws.geonames.org/1674197> ;
    skos:inScheme dwc:Occurrence, voc:Context ;
    skos:scopeNote "something happened at some place" .

evt84:phyCre-d2148340 a schema:CreateAction, voc:KnownEvent ;
    event:factor dct:PhysicalResource, voc:Object ;
    event:product dwc:PreservedSpecimen, voc:Object ;
    dwc:eventDate "1993-04-25" ;
    skos:editorialNote "採集日期" ;
    skos:inScheme dwc:HumanObservation, voc:Context ;
    skos:scopeNote "specimen collection process" .

data:p20160912-d2148340 a data:Provenance, r4r:Provenance, prov:Activity ;
    r4r:isPackagedWith data:d2148340 ;
    prov:atLocation <http://sws.geonames.org/6728700> ;
    prov:endedAtTime "2016-10-19T17:56:35.505929+08:00" ;
    prov:startedAtTime "2016-10-19T17:56:35.498577+08:00" ;
    prov:wasAssociatedWith <http://data.odw.tw>, agent:q20872470 ;
    prov:wasInfluencedBy r1:r1-r2148340 .

<http://eol.org/pages/1134120> a dwc:Taxon ;
    rdfs:label "Pleione formosana" .

<http://sws.geonames.org/1667637> a voc:Place ;
    rdfs:label "Datong Xiang", "大同鄉" .

<http://sws.geonames.org/1668284> a voc:Place ;
    rdfs:label "Taiwan", "台湾" .

<http://sws.geonames.org/1674197> a voc:Place ;
    rdfs:label "Yilan", "宜蘭縣" .
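The Reused and Refined records above carry some of the same facts in different forms: the Reused record keeps each value as a "label: value" string under a Dublin Core element, while the Refined record states it under a Darwin Core property (dwc:eventDate "1993-04-25", dwc:minimumElevationInMeters "1650.0"). The following is a minimal Python sketch of such a step. The LABEL_TO_PROPERTY table and both function names are hypothetical, inferred only by comparing the two records; the poster does not show the repository's actual refinement code.

```python
# Hypothetical mapping from Chinese field labels used in the Reused record's
# literals to properties that appear in the Refined record.
LABEL_TO_PROPERTY = {
    "採集日期": "dwc:eventDate",                 # collection date
    "最低海拔": "dwc:minimumElevationInMeters",  # minimum elevation
}


def split_literal(literal):
    """Split a 'label: value' literal into (label, value)."""
    label, sep, value = literal.partition(":")
    if not sep:
        # No label present, e.g. dc:language "中文".
        return None, literal.strip()
    return label.strip(), value.strip()


def refine(literals):
    """Return {property: value} for the literals whose label is mapped."""
    refined = {}
    for literal in literals:
        label, value = split_literal(literal)
        prop = LABEL_TO_PROPERTY.get(label)
        if prop is None:
            continue  # unmapped labels stay only in the Reused record
        if prop == "dwc:minimumElevationInMeters":
            # "1650" becomes "1650.0", the form used in the Refined record.
            value = str(float(value))
        refined[prop] = value
    return refined


# Literals copied from the Reused record.
reused_literals = [
    "採集者: 呂文賓",
    "最低海拔: 1650",
    "採集日期: 1993-04-25",
]
print(refine(reused_literals))
```

Keeping the mapping in a plain table means a further refined version (r2, r3, and so on) could use a different table without touching the Reused record.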
Data in Different Contexts

Before 1993-04-25, a wild plant that could be recognized as Pleione formosana, “the plant”, was growing somewhere around Datong, Yilan, Taiwan.

A researcher, Wen-Pen Leu, collected it on 1993-04-25 and made “the plant” into a scientific specimen. “The plant” has since been curated in the Herbarium, Research Center for Biodiversity, Academia Sinica, Taipei (HAST).

Since 2003, HAST has completed the “Database of Native Plants in Taiwan” and digitized “the plant”, together with the image of its herbarium specimen and its metadata. “The plant” thereby became a science object available to natural scientists over the web. HAST stores “the plant” in its own data schema, which fits its database and domain-knowledge requirements.

Around 2011-05-13, HAST collaborated with the Union Catalog of Digital Archives Taiwan (CATDAT). The metadata of “the plant” was converted from HAST's own schema to the 15 Dublin Core elements and imported into CATDAT as a catalogue record and a cultural object.

A Reused Data: “The plant” is now a derived data item, data:d2148340 (data:Reused), modelled from the science object and the cultural object of this Pleione formosana. It shares the meaning of r4r:RRObject in the R4R Ontology: any resource that serves as a component for reuse is defined as a Reusing Related Object (RRObject).

A Semantically Refined Data: A derived data item, r1:r1-r2148340 (data:Refined), extracts content from the data:Reused record to enrich the semantics of the resource. The conceptual model provides the semantic meanings of the different interpretations, and different interpretations or derivation processes are curated as separate refined versions (e.g. r1, r2, r3...).
A dcat:Dataset: The data:d2148340 at data.odw.tw is a collection of data (including the different versions of refined data) that is published or curated by a single agent and is available for access or download in one or more formats.

THE STORY OF “THE PLANT”
[Figure. Panel labels: Pre-digital Contexts, Post-digital Contexts, SCIENCE OBJECT, FOR HUMAN, FOR MACHINE.]

Semantics in LOD Context
http://data.odw.tw/r1/r1-r2148340
[Figure: graph of data:d2148340 with the properties dwc:taxonRank, txn:taxonRank and biol:rank and twelve rank labels in Chinese and English: Plantae (植物界), Spermatophyta (種子植物門), Monocotyledons (單子葉植物綱), Asparagales (天門冬目), ORCHIDACEAE (蘭科), Pleione (一葉蘭屬).]

FILTERING & VISUALIZATION FEATURES PROVIDED BY CKAN

Herbarium, Research Center for Biodiversity, Academia Sinica
o Kingdom (界): Plantae, 植物界
o Phylum (門): Spermatophyta, 種子植物門
o Class (綱): Monocotyledons, 單子葉植物綱
o Order (目): Asparagales, 天門冬目
o Family (科): Orchidaceae, 蘭科
o Genus (屬): Pleione, 一葉蘭屬

Institute of Ecology and Evolutionary Biology, College of Life Science, National Taiwan University
o Kingdom (界): Plantae, 植物界
o Phylum (門): Spermatophyta, 胚胎植物門
o Class (綱): Angiospermae, 被子植物綱
o Order (目): Orchidales, 蘭目
o Family (科): Orchidaceae, 蘭科
o Genus (屬): Pleione, 一葉蘭屬
o Specific epithet (種小名): bulbocodioides
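The two classification lists above place the same genus, Pleione, under different higher ranks, which is the kind of difference that rank-based filtering has to cope with. The short Python sketch below finds the ranks at which the two herbaria disagree. The English rank keys are translations of 界, 門, 綱, 目, 科 and 屬, and the function name differing_ranks is hypothetical; only the English taxon names are compared, and the values are copied from the lists.

```python
# English taxon names per rank, copied from the two CKAN facet lists.
hast = {
    "kingdom": "Plantae",
    "phylum": "Spermatophyta",
    "class": "Monocotyledons",
    "order": "Asparagales",
    "family": "Orchidaceae",
    "genus": "Pleione",
}
ntu = {
    "kingdom": "Plantae",
    "phylum": "Spermatophyta",
    "class": "Angiospermae",
    "order": "Orchidales",
    "family": "Orchidaceae",
    "genus": "Pleione",
}


def differing_ranks(a, b):
    """Return the ranks present in both records whose names differ.

    Comparison ignores letter case, because the same family appears as
    both "ORCHIDACEAE" and "Orchidaceae" in the source.
    """
    return [
        rank
        for rank in a
        if rank in b and a[rank].casefold() != b[rank].casefold()
    ]


print(differing_ranks(hast, ntu))
```

With these values the two records agree from family downward and differ at class and order, so a filter on family or genus would match both specimens while a filter on order would not.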