A Linkset Quality Metric
Measuring Multilingual Gain
In SKOS Thesauri
name.surname@ge.imati.cnr.it
IMATI - Istituto di Matematica Applicata e Tecnologie
Informatiche "Enrico Magenes", GENOVA Section,
Consiglio Nazionale delle Ricerche, Italy
Riccardo Albertoni, Monica De Martino, Paola Podestà
MOTIVATIONS
So many bubbles
there, THAT’S SO
COOL!!
BUT ….
Can I exploit that
third party data for
my OWN
ANALYSES?
Linking Open Data cloud diagram 2014, by Max
Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard
Cyganiak. http://lod-cloud.net/
MOTIVATION
"QUALITY IS THE ISSUE
WHEN REUSING"
Well-founded works on
quality for DATASETS, …
but …
What are the arrows in LOD
good for?!
NO WELL-GROUNDED CONCEPTS
about what makes a
LINKSET suitable for a
TARGET APPLICATION
Considering LOD's promise
to make the Web
evolve into a Global Data
Space,
LINKSET QUALITY
should be
AS IMPORTANT AS
DATASET QUALITY
Riccardo Albertoni
WHAT IS A LINKSET? (see VoID)
Riccardo Albertoni, Asunción Gómez-Pérez:
Assessing linkset quality for complementing third-party
datasets. EDBT/ICDT Workshops 2013: 52-59
owl:sameAs linksets
skos:exactMatch linksets (THIS WORK!!!)
[Figure: a linkset L connecting entities of a subject dataset X to entities of an object dataset Y]
Inspired by LusTRE, a framework of interlinked
Environmental Thesauri in the EU project
PROPOSAL: LINKSET IMPORTING
A quality scoring function on linksets to answer:
How good is the linkset for importing the object dataset's
information into the subject SKOS thesaurus?
Importing on links
LkImp4p(ObjEntity, Property, Link, Language) --> [0%, 100%]
LkImp4p, given a link, evaluates the percentage of
values not present in the subject but "gainable"
from the object through the link
The overall Linkset Importing is defined as the average
contribution of importing on the single links of the linkset
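The averaging step can be sketched in a few lines of Python. This is only an illustration of "average contribution of importing on single links": the function name `linkset_importing` and the choice to return 0 for an empty linkset are assumptions, not taken from the paper.

```python
from statistics import mean


def linkset_importing(link_scores):
    """Average the per-link importing scores (each in [0, 100]) of a linkset."""
    scores = list(link_scores)
    # Assumption: an empty linkset yields no gain, so we return 0.0.
    return mean(scores) if scores else 0.0


# Three links whose individual importing scores are 0%, 100% and 50%.
print(linkset_importing([0.0, 100.0, 50.0]))  # 50.0
```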
LINK IMPORTING: Examples
Accurate formal definition in the paper,
here I am using a simplified version
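\[
\mathrm{LkImp4p}(l, P) = 100 \cdot \left( 1 - \frac{\left| \mathrm{Val}_P(\mathrm{subject}(l)) \right|}{\left| \mathrm{Val}_P(\mathrm{subject}(l)) \cup \mathrm{Val}_P(\mathrm{object}(l)) \right|} \right)
\]
where Val_P(subject(l)) and Val_P(object(l)) are the sets of values of property P in the subject and in the object of link l
If the denominator is 0, we define LkImp4p(...) = 0
LkImp4p(...) = 100% iff there are values in the object and no values in the subject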
LkImp4p assumes the link is
correct…
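A minimal Python sketch of the simplified formula, assuming the property values of the linked subject and object entities are already available as plain sets. The function name `lk_imp4p` and the set-based inputs are illustrative choices, not the paper's formal definition.

```python
def lk_imp4p(subject_values, object_values):
    """Simplified link importing: share (in %) of values gained from the object."""
    subject_values = set(subject_values)
    union = subject_values | set(object_values)
    if not union:
        # Denominator is 0: no values on either side, so the score is defined as 0.
        return 0.0
    return 100.0 * (1 - len(subject_values) / len(union))


print(lk_imp4p({"a"}, {"a"}))       # 0.0   nothing new in the object
print(lk_imp4p(set(), {"b"}))       # 100.0 subject empty, object has values
print(lk_imp4p({"a"}, {"b", "c"}))  # 2 of the 3 values in the union are new
```

As on the slide, the score says nothing about whether the link itself is correct; it only measures what would be gained if it is.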
LINK IMPORTING: Examples
LkImp4pL(x3, skos:prefLabel, l2, 'en') = 0%
No multilingual gain, as "Dog@en" is already in the subject
LINK IMPORTING: Examples
LkImp4pL(x3, skos:altLabel, l2, 'it') = 100%
Cagnolino@it is gained, and it is the only altLabel in Italian in the
complemented subject
LINK IMPORTING: Examples
LkImp4pL(x3, skos:altLabel, l2, _) = 50%
Disregarding the language ("_" means unspecified language),
x3 has another altLabel (Puppy@en),
so we gain one out of two altLabels.
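The three label examples can be reproduced with a small language-aware sketch. The data below is a reconstruction from the slide text only (the original figure is not available), so the exact labels attached to the object entity are assumptions; `lk_imp4p_lang` is a hypothetical name, and `lang=None` plays the role of "_".

```python
def lk_imp4p_lang(subject_labels, object_labels, lang=None):
    """Link importing over (text, language) pairs, optionally filtered by language."""
    def select(labels):
        # Keep every label when no language is given, else only the matching ones.
        return {(text, tag) for (text, tag) in labels if lang is None or tag == lang}

    subj = select(subject_labels)
    union = subj | select(object_labels)
    if not union:
        return 0.0
    return 100.0 * (1 - len(subj) / len(union))


# Assumed toy data: subject entity x3 and the object entity linked to it by l2.
x3_pref, obj_pref = [("Dog", "en")], [("Dog", "en")]
x3_alt, obj_alt = [("Puppy", "en")], [("Cagnolino", "it")]

print(lk_imp4p_lang(x3_pref, obj_pref, "en"))  # 0.0
print(lk_imp4p_lang(x3_alt, obj_alt, "it"))    # 100.0
print(lk_imp4p_lang(x3_alt, obj_alt))          # 50.0
```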
LINKSET IMPORTING: Examples
LkImp4pL(x5, skos:broader, l3, _) = 50%
It can be applied to any property, e.g., skos:broader.
In this case, we gain entities instead of RDF literals.
x3 and y3 are mapped, so only y6 is considered as a gained entity.
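For entity-valued properties, object entities that are already mapped to a subject entity should not count as gained. The sketch below illustrates that idea under assumptions read off the slide text: x5 has broader concept x3, the object entity linked by l3 has broader concepts y3 and y6, and y3 is mapped to x3. The helper name and the dictionary-based mapping are hypothetical.

```python
def lk_imp4p_entities(subject_values, object_values, object_to_subject):
    """Link importing for entity values, merging entities already linked together."""
    subj = set(subject_values)
    # Rewrite each object entity to its mapped subject entity when a link exists,
    # so that mapped pairs such as (x3, y3) are counted once.
    translated = {object_to_subject.get(value, value) for value in object_values}
    union = subj | translated
    if not union:
        return 0.0
    return 100.0 * (1 - len(subj) / len(union))


mapping = {"y3": "x3"}  # assumed: y3 and x3 are connected by a link in the linkset
print(lk_imp4p_entities({"x3"}, {"y3", "y6"}, mapping))  # 50.0
```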
EXAMPLE of APPLICATION (eENVplus)
LINKSET IMPORTING can be exploited to check the
complementation potential of any SKOS property.
But in the application, we focus on skos:prefLabel
and skos:altLabel in order to address
INCOMPLETE LANGUAGE COVERAGE
INCOMPLETE LANGUAGE COVERAGE arises when
skos:prefLabel and skos:altLabel are provided in all
the expected languages only for a subset of the
thesaurus concepts
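Incomplete language coverage, as defined above, can be detected with a simple check. The sketch assumes the labels of each concept have already been extracted as (text, language) pairs; the function name, the concept identifiers and the toy thesaurus are all illustrative.

```python
def concepts_missing_languages(labels_by_concept, expected_languages):
    """Map each incompletely covered concept to the languages it lacks."""
    expected = set(expected_languages)
    missing = {}
    for concept, labels in labels_by_concept.items():
        present = {lang for (_text, lang) in labels}
        gap = expected - present
        if gap:
            missing[concept] = sorted(gap)
    return missing


# Toy thesaurus: c1 is labelled in both expected languages, c2 only in English.
thesaurus = {
    "c1": [("Dog", "en"), ("Cane", "it")],
    "c2": [("Cat", "en")],
}
print(concepts_missing_languages(thesaurus, ["en", "it"]))  # {'c2': ['it']}
```

Concepts reported here are the ones for which a linkset with a high importing score in the missing languages could help.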
EXAMPLE of APPLICATION (eENVplus)
EARTh LINKSETS to
• GEMET (E2GEM - 4365 links)
• AGROVOC (E2AGR - 1436 links)
[Charts: Multilingual Gain for skos:prefLabel; Multilingual Gain for skos:altLabel]
Is E2GEM better than E2AGR?
In terms of number of links, sure! (4365 links >> 1436 links)
The best depends on which set of languages
and properties we are focusing on…
But at least our Importing quality measure
enables a deeper analysis than
the number of links or link coverage
CONCLUSION
Two-fold contribution
1. We draw the community's attention to the critical
issue of linkset quality
2. We propose LINKSET IMPORTING to measure the
gain when complementing thesauri
• Applicable for estimating multilingual gain
• Importing as an estimator for the "completeness of
complemented thesauri" (experimental validation)
• Example in the context of EU project eENVplus
Future work
Further scoring functions to fully characterize
linkset quality space and dimensions
2nd Workshop on Linked Data Quality LDQ2015
at ESWC 2015 - June 1, 2015 - Portorož, Slovenia
THANK YOU
Questions?
ALBERTONI@GE.IMATI.CNR.IT
