The RDF Report Card
Beyond the Triple Count

26th September 2011
SemTechBiz 2011
Leigh Dodds
@ldodds

http://kasabi.com
http://slideshare.net/ldodds
Triple counts tell us nothing
Triple counts are not a quality indicator
http://dbpedia.org/resource/London
6 triples for Population Density

Property                                                       Count   Values
http://dbpedia.org/ontology/PopulatedPlace/populationDensity   2       4807.0; 4806.971873853451
http://dbpedia.org/ontology/populationDensity                  2       4806.971874; 4807.000000
http://dbpedia.org/property/populationDensityKm                1       4807
http://dbpedia.org/property/populationDensitySqMi              1       12450
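
One way to see that these six triples carry a single fact is to normalise every value to people per km² and compare the results. The sketch below is illustrative only. The values are copied from the slide, but the shortened property names, the assumption that the two ontology properties are expressed per km², the `SQ_KM_PER_SQ_MI` constant, and the rounding to whole numbers are all assumptions made for this example.

```python
# Values copied from the slide. Property names are shortened for readability,
# and the unit attached to each property is an assumption.
DENSITY_TRIPLES = [
    ("ontology/PopulatedPlace/populationDensity", 4807.0, "km2"),
    ("ontology/PopulatedPlace/populationDensity", 4806.971873853451, "km2"),
    ("ontology/populationDensity", 4806.971874, "km2"),
    ("ontology/populationDensity", 4807.000000, "km2"),
    ("property/populationDensityKm", 4807, "km2"),
    ("property/populationDensitySqMi", 12450, "mi2"),
]

SQ_KM_PER_SQ_MI = 2.589988  # assumption: standard area conversion factor


def distinct_densities(triples):
    """Return the set of densities (per km2, rounded) that the triples express."""
    result = set()
    for _prop, value, unit in triples:
        # Convert per-square-mile figures to per-square-kilometre figures.
        per_km2 = value / SQ_KM_PER_SQ_MI if unit == "mi2" else value
        result.add(round(per_km2))
    return result


facts = distinct_densities(DENSITY_TRIPLES)
print(len(DENSITY_TRIPLES), "triples,", len(facts), "distinct value(s)")
```

Under these assumptions, all six triples collapse to the same rounded density, which is the redundancy the slide points at.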
12 triples for Location (1)

Property        Count   Value
georss:point    1       51.507222222222225 -0.1275
geo:geometry    1       POINT(-0.1275 51.5072)
geo:lat         1       51.507221
geo:long        1       -0.127500
12 triples for Location (2)

Property          Count   Value
dbpprop:latd      1       51
dbpprop:latm      1       30
dbpprop:lats      1       26
dbpprop:latns     1       N
dbpprop:longd     1       0
dbpprop:longm     1       7
dbpprop:longs     1       39
dbpprop:longew    1       W
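
The eight degree/minute/second triples restate the same point that geo:lat and geo:long already give. A minimal sketch of the conversion, using the values from the slide; the function name `dms_to_decimal` is hypothetical and chosen only for this example:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter
    (N, S, E or W) to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# dbpprop:latd/latm/lats/latns and dbpprop:longd/longm/longs/longew
lat = dms_to_decimal(51, 30, 26, "N")
lon = dms_to_decimal(0, 7, 39, "W")
print(round(lat, 6), round(lon, 6))
```

The result can be compared directly with the geo:lat and geo:long values on the previous slide.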
~4.6m redundant triples
Triple counts don't indicate utility
http://bbc.co.uk/programmes

2.5 million unique users per week, 60 req/s *

* http://www.guardian.co.uk/media/pda/2011/apr/06/bbc-yves-raimond
http://bbc.co.uk/programmes

Dataset is less than 50 million triples
Beyond the Triple Count
Dataset Information Spectrum

Low Detail                                 High Detail
Summary and overview of dataset content    Detailed data model documentation & guides

                    More Information
Dataset Information Spectrum
Low Detail                          High Detail

Metadata
● Title, Description
● Provenance
● Publication dates
● Licensing
● Usage cues
● Related datasets
Dataset Information Spectrum
Low Detail                          High Detail

Scope
● What types of entity?
● How many of each type?
● Coverage
    ● Geographic
    ● Events (time)
Dataset Information Spectrum
Low Detail                          High Detail

Structure
● URI Scheme
● Vocabulary meshing
    ● How is a person described?
Dataset Information Spectrum
Low Detail                          High Detail

Internals
● List of Schemas & RDF terms
● Class/property usage counts
● Triple counts
● Named graph structure
● Source files
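
Class and property usage counts can be derived with a single pass over the triples. The following is a rough sketch, not the tooling used in the talk: it treats a dataset as a list of (subject, predicate, object) tuples, and the `SAMPLE` data and `usage_counts` name are invented placeholders.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def usage_counts(triples):
    """Count how often each class is instantiated and each property is used."""
    class_counts = Counter()
    property_counts = Counter()
    for _subject, predicate, obj in triples:
        property_counts[predicate] += 1
        # An rdf:type triple records one instance of the class in object position.
        if predicate == RDF_TYPE:
            class_counts[obj] += 1
    return class_counts, property_counts


# Invented placeholder data, for illustration only.
SAMPLE = [
    ("ex:a", "rdf:type", "ex:Place"),
    ("ex:a", "ex:label", "A"),
    ("ex:b", "rdf:type", "ex:Place"),
    ("ex:b", "ex:label", "B"),
    ("ex:c", "rdf:type", "ex:Person"),
]

classes, properties = usage_counts(SAMPLE)
print(classes.most_common())
print(properties.most_common())
```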
RDF Report Card Example
Summarising Content of a Dataset
●   Find all classes in all datasets in Kasabi
●   Tag each class against a pre-defined set of categories
     ●   Customized version of top-level schema.org classes
●   Generate a report card for each dataset listing types of entity
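
The three steps above can be sketched as a lookup from class to category followed by a roll-up of instance counts. This is an illustrative assumption about how such a report card might be computed, not Kasabi's implementation. The category names are borrowed from top-level schema.org types because the customised set used in the talk is not listed here, and the `CATEGORY_OF` mapping and the counts are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from class names to report-card categories.
CATEGORY_OF = {
    "foaf:Person": "Person",
    "mo:MusicArtist": "Person",
    "mo:Record": "CreativeWork",
    "geo:SpatialThing": "Place",
}


def report_card(class_counts, category_of=CATEGORY_OF):
    """Roll per-class instance counts up into per-category totals.

    Classes with no known category are grouped under "Other".
    """
    card = Counter()
    for cls, count in class_counts.items():
        card[category_of.get(cls, "Other")] += count
    return dict(card)


# Invented instance counts for a single dataset.
example_counts = {"foaf:Person": 3, "mo:MusicArtist": 2, "mo:Record": 5, "ex:Unknown": 1}
print(report_card(example_counts))
```

Grouping unmapped classes under "Other" keeps the card complete even when the category list lags behind the vocabularies in use.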
Report Card Categories
Ordnance Survey
http://beta.kasabi.com/dataset/ordnance-survey-linked-data
BBC Music
http://beta.kasabi.com/dataset/bbc-music
British National Bibliography
http://beta.kasabi.com/dataset/british-national-bibliography-bnb
NHS Performance Data
http://beta.kasabi.com/dataset/nhs-performance-data
Summary
●   Triple counts tell us nothing
●   Vital to present the quality & utility of our data
    ●   Data publishing platforms should support this
●   "Progressive disclosure"
    ●   Right detail at the right time
●   Dataset analysis can generate useful summaries
    ●   e.g. an RDF report card