{RDF} Data Quality Assessment
Connecting the pieces...
About me...
Overview
What is
data quality?
What is
??? quality?
Quality of life...
Quality of OS
Data Quality is multidimensional
Which one is better?
ex:Foo
    a dbo:Person ;
    dbo:birthDate "2000-01-01"^^xsd:date .
ex:Bar
    a foaf:Person ;
    foaf:age 18 .
ex:Baz
    wkd:p31 wk:Q5 ;
    wkd:p569 "2000-01-01"^^xsd:date .
Would you use this information for …
ex:Chickenpox
    a ex:InfectiousDisease ;
    ex:symptoms "rash", "fever", "headache" ;
    ex:treatWithVaccine ex:VaricellaVaccine .
ex:VaricellaVaccine
    a ex:Vaccine ;
    ex:treats ex:Chickenpox, ex:HerpesZoster .
- a visualization?
- a disease website?
- automated treatment?
Data Quality is fitness for use
Data Quality Dimension themes
Accessibility Dimensions
Contextual Dimensions
Intrinsic Dimensions
Representational Dimensions
How good do you need it to get?
Quality Cost ($)
Where things
can go wrong
Where data
can go wrong
Data quality assessment - connecting the pieces...
birthDate max cardinality 1
birthDate min cardinality 1
birthDate must be xsd:date
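The three constraints above can be sketched as plain Python checks. This is only an illustration under stated assumptions: triples are modelled as tuples of strings, `Literal` is a hypothetical stand-in for an RDF literal, the helper `check_birth_date` and the sample resources are invented here, and `dbo:birthDate` is assumed to be the property meant. A real setup would more likely delegate this to a validation engine.

```python
from collections import namedtuple

# Hypothetical stand-in for an RDF literal: lexical value plus datatype name.
Literal = namedtuple("Literal", ["value", "datatype"])

XSD_DATE = "xsd:date"
XSD_STRING = "xsd:string"


def check_birth_date(triples, subject, prop="dbo:birthDate"):
    """Return a list of violation messages for one subject."""
    values = [o for s, p, o in triples if s == subject and p == prop]
    violations = []
    if len(values) < 1:
        violations.append(f"{subject}: {prop} min cardinality 1 violated")
    if len(values) > 1:
        violations.append(f"{subject}: {prop} max cardinality 1 violated")
    for v in values:
        if not isinstance(v, Literal) or v.datatype != XSD_DATE:
            violations.append(f"{subject}: {prop} must be xsd:date, got {v!r}")
    return violations


# Made-up sample data: one clean resource, one without a birth date,
# one with two birth dates of which one is not an xsd:date.
data = [
    ("ex:Ok", "rdf:type", "dbo:Person"),
    ("ex:Ok", "dbo:birthDate", Literal("2000-01-01", XSD_DATE)),
    ("ex:Missing", "rdf:type", "dbo:Person"),
    ("ex:Twice", "rdf:type", "dbo:Person"),
    ("ex:Twice", "dbo:birthDate", Literal("2000-01-01", XSD_DATE)),
    ("ex:Twice", "dbo:birthDate", Literal("January 2000", XSD_STRING)),
]

for person in ("ex:Ok", "ex:Missing", "ex:Twice"):
    for message in check_birth_date(data, person):
        print(message)
```

Each constraint maps to one independent check, so a resource can fail several at once: `ex:Twice` breaks both the max cardinality and the datatype rule.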
Evolution & quality
See http://aligned-project.eu
Sounds good so
far… now what?
Strategies for managing quality
> Needs good tool support
> Generic tools missing
> Validation engines improved
Validate closer to the source of the error
> Always in the K range
> Scales with source size
> Errors scale as well
Automate, automate & automate...
ex:name
a rdf:Property ;
rdfs:range rdf:langString .
Schema.ttl
ex:Foo
a dbo:Person ;
ex:name "Foo @en" .
Data.ttl
Automate, automate & automate...
ex:name
a rdf:Property ;
rdfs:range rdf:langString .
Schema.ttl
ex:Foo
a dbo:Person ;
ex:name "Foo @en" .    # wrong: the language tag is inside the string
ex:name "Foo"@en .     # right: a language-tagged literal
Data.ttl
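A check like the one on this slide can be derived from the schema automatically. The sketch below is a toy version, not the output of any real tool: triples are tuples of strings, `Literal` is a hypothetical stand-in for an RDF literal, and the function names and the `LEAKED_TAG` heuristic are assumptions made for illustration.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for an RDF literal; lang is None for plain strings.
Literal = namedtuple("Literal", ["value", "lang"])

RDF_LANGSTRING = "rdf:langString"

# Heuristic: a language tag that leaked into the string itself, e.g. "Foo @en".
LEAKED_TAG = re.compile(r"\s*@[A-Za-z]{2,3}(-[A-Za-z0-9]+)*\s*$")


def langstring_properties(schema):
    """Collect every property whose rdfs:range is rdf:langString."""
    return {s for s, p, o in schema if p == "rdfs:range" and o == RDF_LANGSTRING}


def check_lang_tags(schema, data):
    """Report values of langString properties that carry no language tag."""
    props = langstring_properties(schema)
    violations = []
    for s, p, o in data:
        if p not in props:
            continue
        if not isinstance(o, Literal) or o.lang is None:
            hint = ""
            if isinstance(o, Literal) and LEAKED_TAG.search(o.value):
                hint = " (language tag appears to be inside the string)"
            violations.append(f"{s} {p}: expected a language-tagged literal{hint}")
    return violations


schema = [
    ("ex:name", "rdf:type", "rdf:Property"),
    ("ex:name", "rdfs:range", "rdf:langString"),
]
bad_data = [
    ("ex:Foo", "rdf:type", "dbo:Person"),
    ("ex:Foo", "ex:name", Literal("Foo @en", None)),
]
good_data = [
    ("ex:Foo", "rdf:type", "dbo:Person"),
    ("ex:Foo", "ex:name", Literal("Foo", "en")),
]

print(check_lang_tags(schema, bad_data))
print(check_lang_tags(schema, good_data))
```

Because the test is generated from the `rdfs:range` declaration, adding another `rdf:langString` property to the schema extends the check without writing new code.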
CI/CD is your best friend
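One way to wire quality checks into a CI/CD pipeline is a small gate that turns violations into a failing exit code. This is a minimal sketch, not a specific tool's interface: `quality_gate` and the two example checks are hypothetical, and each check is assumed to be a zero-argument callable that returns a list of violation messages.

```python
import sys


def quality_gate(checks):
    """Run each check; return 0 if all pass, 1 otherwise (a CI-friendly exit code)."""
    failures = []
    for name, check in checks:
        for message in check():
            failures.append(f"[{name}] {message}")
    for line in failures:
        print(line, file=sys.stderr)
    return 1 if failures else 0


# Hypothetical checks, standing in for real validation runs.
def no_violations():
    return []


def one_violation():
    return ["ex:Foo ex:name: expected a language-tagged literal"]


exit_code = quality_gate([("cardinality", no_violations), ("langString", one_violation)])
print("exit code:", exit_code)
```

In a CI job the returned value would be passed to `sys.exit`, so that a data change introducing violations fails the build in the same way a failing unit test would.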
Recap
Thank you! Questions?
@jimkont
kontokostas.com
slideshare.net/jimkont
