Institute for Web Science & Technologies – WeST
Perplexity of Index Models over
Evolving Linked Data
Thomas Gottron, Christian Gottron
May 27th, 2014
ESWC, Crete
Thomas Gottron, ESWC, 27.5.2014: Perplexity of Index Models Over Evolving LOD
Motivation
[Figure: an index built "once upon a time..." and a new index built "... some time later", after the data has evolved. Open question: how accurate is the old index?]
Index Models over Linked Data
Data Format
• Linked Data as N-Quads (s, p, o, c):
  • triple (s, p, o): what is the information?
  • context URI c: where does it come from?
(Abstract) Index Models
• D: data elements to be retrieved (payload)
• K: key elements to access the data (index elements)
• σ: selection function, σ: K → 𝒫(D), i.e. how to get the data for a key
[Figure: a search data structure for efficient storage and retrieval. Each key maps to its data items (payload): k1 → d1,1, d1,2, d1,3, ...; k2 → d2,1, d2,2; k3 → d3,1, d3,2, d3,3, ...; kn → dn,1, dn,2, dn,3, ...]
Concrete Example: Subject Based Index Model
Key → data items (payload)
ukob:Gottron → (ukob:Gottron, rdf:type, foaf:Person), (ukob:Gottron, foaf:knows, ukob:Staab), ...
ukob:Staab → (ukob:Staab, swrc:institution, ukob:WeST), (ukob:Staab, foaf:name, "Steffen Staab"), ...
ukob:Schegi → (ukob:Schegi, rdf:type, foaf:Person), (ukob:Schegi, foaf:name, "Stefan Scheglmann")
...
tud:CGottron → (tud:CGottron, swrc:institution, tud:KOM), (tud:CGottron, foaf:knows, ukob:Gottron), ...
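The subject-based example can be sketched in a few lines of Python. This is only an illustration of the abstract model (keys K, payload D, selection function σ), not the implementation from the authors' repository; the helper names `build_subject_index` and `select` are hypothetical, and literals are stored as plain strings.

```python
from collections import defaultdict

# Triples copied from the slide's example (prefixes kept as plain strings).
TRIPLES = [
    ("ukob:Gottron", "rdf:type", "foaf:Person"),
    ("ukob:Gottron", "foaf:knows", "ukob:Staab"),
    ("ukob:Staab", "swrc:institution", "ukob:WeST"),
    ("ukob:Staab", "foaf:name", "Steffen Staab"),
    ("ukob:Schegi", "rdf:type", "foaf:Person"),
    ("ukob:Schegi", "foaf:name", "Stefan Scheglmann"),
    ("tud:CGottron", "swrc:institution", "tud:KOM"),
    ("tud:CGottron", "foaf:knows", "ukob:Gottron"),
]


def build_subject_index(triples):
    """Hypothetical helper: key = subject URI, payload = triples with that subject."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((s, p, o))
    return dict(index)


def select(index, key):
    """Selection function sigma: the data items stored for a key (empty if unknown)."""
    return index.get(key, [])


subject_index = build_subject_index(TRIPLES)
```

Other index models would differ only in how the key is derived from a statement (for example the predicate, the object, or the context URI).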
12 Implemented Index Models
• Triple based
• Meta data
• Schema-level
https://github.com/gottron/lod-index-models
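[Figure: the key elements of the index models, marked on (s, p, o) triple patterns. Labels shown: s, p, o, term, c, PLD, type, SchemEX, and patterns over the types t of a subject, the properties p of a subject, the inverse properties p-1 of an object, and p and t combined.]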
Index Accuracy over Evolving Data
Comparing Indices
[Figure: two indices, each mapping keys k1, k2, k3, ..., kn to their data items, one built "once upon a time..." and one "... some time later". Open question: how do the two compare?]
Metrics
• First indicator of interest:
  • Stability of the key element set
Jaccard similarity: the relative size of the overlap of two sets
$$\mathrm{Jaccard}(K_1, K_2) = \frac{|K_1 \cap K_2|}{|K_1 \cup K_2|}$$
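As a minimal sketch, the Jaccard similarity of two key sets can be computed directly with Python sets. The function name `jaccard` and the convention for two empty sets are assumptions made for this illustration.

```python
def jaccard(keys_a, keys_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of keys."""
    a, b = set(keys_a), set(keys_b)
    union = a | b
    if not union:
        # Assumption: two empty key sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


old_keys = {"k1", "k2", "k3"}
new_keys = {"k2", "k3", "k4"}
similarity = jaccard(old_keys, new_keys)  # 2 shared keys out of 4 distinct keys
```

A value of 1 means the key set did not change at all, while 0 means the two indices share no key.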
How to Measure Accuracy?
• Queries?
  • No established query log for the data set used
  • Different key elements require different queries
  • Cover all of the index
• Distributions!
  • Relevant to several applications
  • Established metrics for comparison
Obtaining a Distribution from an Index
[Figure: the index as a mapping σ: K → 𝒫(D) from the keys k1, k2, k3, ..., kn to their sets of data items. For each key k, the data items are counted, |σ(k)| (in the figure: 4, 2, 10 and 8 for k1, k2, k3 and kn), and the counts are turned into relative frequencies, which gives a distribution p over K.]
$$P(k) = \frac{|\sigma(k)|}{M}$$
Here M is the total of the counts over all keys, so that the values P(k) sum to 1.
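The step from an index to a distribution can be sketched as follows: count the data items per key and divide by the total. The function name `key_distribution` is hypothetical, and the toy index only reuses the four counts visible in the figure (4, 2, 10, 8).

```python
def key_distribution(index):
    """Relative frequency P(k) = |sigma(k)| / M, with M the total item count."""
    counts = {key: len(items) for key, items in index.items()}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("index holds no data items")
    return {key: count / total for key, count in counts.items()}


# Toy index with the counts from the figure; the payload items are placeholders.
toy_index = {
    "k1": ["d"] * 4,
    "k2": ["d"] * 2,
    "k3": ["d"] * 10,
    "kn": ["d"] * 8,
}
p = key_distribution(toy_index)
```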
Comparing Indices
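[Figure: the index built "once upon a time..." yields a distribution q over K, the index built "... some time later" yields a distribution p over K. Open question: how do q and p compare?]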
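Comparing Distributions
• Information theoretic measures
[Figure: the two distributions q and p over K that are to be compared.]
Entropy of P: the expected length (in bits) of an optimal encoding of a (randomly chosen) key
$$H(P) = -\sum_{x \in K} P(x)\,\mathrm{ld}(P(x))$$
where ld is the logarithm to base 2.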
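A short sketch of the entropy computation, assuming a distribution is represented as a dictionary from keys to probabilities. Keys with probability 0 are skipped, following the usual convention that 0 · ld(0) = 0.

```python
import math


def entropy(p):
    """Entropy H(P) in bits: -sum over x of P(x) * log2(P(x))."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)


# Four equally likely keys need 2 bits each in an optimal encoding.
uniform_four = {"k1": 0.25, "k2": 0.25, "k3": 0.25, "k4": 0.25}
bits = entropy(uniform_four)
```

A skewed distribution has a lower entropy than a uniform one over the same keys, because frequent keys can be given shorter codes.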
Metrics
Cross-entropy of P and Q: the expected length when the encoding is based on a different distribution
$$H(P, Q) = -\sum_{x \in K} P(x)\,\mathrm{ld}(Q(x))$$
Perplexity: how many uniformly distributed keys would have the same entropy
$$PP(P, Q) = 2^{H(P, Q)}$$
Normalized perplexity: perplexity relative to a uniform distribution over the keys
$$PP_{Norm}(P, Q) = \frac{2^{H(P, Q)}}{|K|}$$
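The three metrics can be sketched as follows, again with distributions as dictionaries. Two points are assumptions of this sketch rather than something the slides state: a key that P produces but Q assigns probability 0 makes the cross-entropy infinite, and |K| defaults to the number of keys in the union of both distributions unless `num_keys` is passed explicitly.

```python
import math


def cross_entropy(p, q):
    """H(P, Q) = -sum over x of P(x) * log2(Q(x)), in bits."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # terms with P(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # Q rules out a key that P actually produces
        total -= px * math.log2(qx)
    return total


def perplexity(p, q):
    """PP(P, Q) = 2 ** H(P, Q)."""
    return 2.0 ** cross_entropy(p, q)


def normalized_perplexity(p, q, num_keys=None):
    """PP(P, Q) divided by |K|; |K| defaults to the union of keys (assumption)."""
    if num_keys is None:
        num_keys = len(set(p) | set(q))
    return perplexity(p, q) / num_keys


p = {"a": 0.7, "b": 0.1, "c": 0.1, "d": 0.1}
q_uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
```

With a uniform model Q over four keys, every key costs 2 bits regardless of P, so the perplexity is 4 and the normalized perplexity is 1.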
Metrics: How to Interpret Perplexity
• Perplexity based on cross entropy
• "How surprised are you about the outcome of an experiment, given that you have some expectations?"
Example: p is an unfair die over the faces 1 to 6, with probability 1/2 for one face and 1/10 for each of the other five faces; q is the model.
• q uniform (1/6 per face): PP = 2^{2.585} = 6, PPNorm = 2^{2.585} / 6 = 1
• q identical to p: PP = 2^{2.161} = 4.472, PPNorm = 0.745
• q a second unfair model (axis ticks at 3/25 and 2/5): PP = 2^{2.287} = 4.880, PPNorm = 0.813
• q with the same shape as p, but with the probability 1/2 on a different face: PP = 2^{3.090} = 8.513, PPNorm = 1.418
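The die examples can be checked numerically. The sketch below reproduces three of the four cases: the uniform model, the model identical to p, and the model that puts the probability 1/2 on the wrong face. Which face carries the 1/2 is an assumption (face 6 for p, face 1 for the shifted model); only the shape matters for the result. The model with ticks at 3/25 and 2/5 is left out because its exact probabilities are not given in the text.

```python
import math


def die_perplexity(p, q):
    """Return (PP, PPNorm) for distributions p and q over the same die faces."""
    h = -sum(px * math.log2(q[face]) for face, px in p.items() if px > 0)
    pp = 2.0 ** h
    return pp, pp / len(p)


faces = range(1, 7)
# Unfair die: one face with 1/2, five faces with 1/10 (heavy face 6 is an assumption).
p = {f: (0.5 if f == 6 else 0.1) for f in faces}
q_uniform = {f: 1 / 6 for f in faces}
q_identical = dict(p)
# Same shape as p, but the heavy face is placed on face 1 instead (assumption).
q_shifted = {f: (0.5 if f == 1 else 0.1) for f in faces}

for name, q in [("uniform", q_uniform), ("identical", q_identical), ("shifted", q_shifted)]:
    pp, pp_norm = die_perplexity(p, q)
    print(f"{name}: PP = {pp:.3f}, PPNorm = {pp_norm:.3f}")
```

A normalized perplexity below 1 means the model predicts the outcomes better than a uniform guess, and a value above 1 means it is worse than assuming nothing at all.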
Stability of Index Models over Evolving Data
Comparing Indices
[Figure: the index built "once upon a time..." and the index built "... some time later" are compared in two ways: by the Jaccard similarity of their key sets and by the perplexity of their key distributions.]
Experimental Setup
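[Figure: a sequence of data snapshots T0 (base), T1, T2, T3, ..., Tn-1, Tn. For each snapshot, an index is constructed and its distribution is estimated; the "deviation" of each of T1, ..., Tn from the base T0 is then measured.]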
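The setup can be sketched as a loop that compares every later snapshot with the base snapshot T0. Everything below is an illustration under stated assumptions, not the authors' pipeline: each snapshot is represented as a dictionary of item counts per key, the base index plays the role of the model q and the later snapshot the role of p, |K| is taken as the union of both key sets, and the model is smoothed additively (parameter `alpha`) so that keys unseen in T0 do not produce an infinite cross-entropy. The slides do not say how unseen keys are handled.

```python
import math


def deviation(base_counts, later_counts, alpha=1.0):
    """Return (Jaccard of key sets, normalized perplexity) of a snapshot vs. the base."""
    keys = set(base_counts) | set(later_counts)
    if not keys:
        raise ValueError("both snapshots are empty")
    later_total = sum(later_counts.values())
    if later_total == 0:
        raise ValueError("later snapshot holds no data items")
    jaccard = len(set(base_counts) & set(later_counts)) / len(keys)
    # Additive smoothing of the base model q over the union of keys (assumption).
    q_total = sum(base_counts.values()) + alpha * len(keys)
    h = 0.0
    for key in keys:
        p_k = later_counts.get(key, 0) / later_total
        if p_k > 0:
            q_k = (base_counts.get(key, 0) + alpha) / q_total
            h -= p_k * math.log2(q_k)
    return jaccard, (2.0 ** h) / len(keys)


def track(snapshots, alpha=1.0):
    """Compare T1..Tn against the base snapshot T0 (the first list element)."""
    base = snapshots[0]
    return [deviation(base, later, alpha) for later in snapshots[1:]]


# Toy snapshots: T1 equals the base, T2 drops key "b" and gains key "c".
snapshots = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"a": 1, "c": 3}]
results = track(snapshots)
```

Plotting the two values per snapshot over time would give curves of the kind the result slides refer to.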
Results: Jaccard Similarity of Key Set
Results: Normalised Perplexity
Results: Normalised Perplexity (Zoom in)
Conclusion
Summary
• Evaluation of the stability of 12 LOD index models
• Application-independent evaluation framework
• Good stability of schema-level indices
Future Work
• Index-specific assessment of quality based on samples
• Accuracy in answering queries
Thanks!
Contact:
Thomas Gottron
WeST – Institute for Web Science and Technologies
Universität Koblenz-Landau
gottron@uni-koblenz.de
#eswc2014GottronTC

