DL-FOIL: Class Expression Learning Revisited
Nicola Fanizzi, Giuseppe Rizzo, Claudia d’Amato, Floriana Esposito
LACAM - Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro
EKAW 2018, Nancy, France – 15th November 2018
Outline
1 Introduction
2 The problem
3 DL-Foil
4 Evaluation
5 Conclusions, Ongoing & Future Work
Giuseppe Rizzo (LACAM-Dip.Informatica, Bari) DL-Foil EKAW 2018, Nancy, France – 15th November 2018
Introduction
Motivations
Goal
Eliciting candidate concept descriptions for semi-automatic knowledge base
completion
TBox: candidate (equivalence) axioms
ABox: candidate (class) assertions by classifying individuals
Solutions
(Supervised) machine learning methods:
E.g. concept learning: symbolic methods that produce a concept
description from a set of pos./neg./unlabeled examples
Motivations:
Previous solutions and current limits
DL-Foil: produces a concept description in disjunctive form that provides a
consistent classification of the examples
ternary problem (pos., neg., unlabeled exs.) under the OWA
partial descriptions generated on the fly to cover as many pos. exs. as
possible
selection among a set of candidates generated according to a heuristic
Problems:
generated descriptions may cover no positive examples
unlabeled individuals contribute equally to the score used for candidate evaluation
Contribution: improving both the specialization procedure and the
heuristic, which now considers the actual number of unlabeled individuals
The problem
The learning problem
Let K = ⟨T, A⟩ be a DL knowledge base.
Given a concept name C and
a training set Tr = (Ps, Ns, Us),
find a concept description D such that, letting K' = ⟨T ∪ {C ≡ D}, A⟩,
the following entailments hold:
∀a ∈ Ps: K' ⊨ C(a)
∀b ∈ Ns: K' ⊨ ¬C(b)
i.e. correct w.r.t. the examples and general for predictive purposes
The learning problem: an example
Let K = { Man ⊑ Person, Man ⊑ ¬Woman, Woman ⊑ Person,
Man(a), Man(b), Man(c), hasChild(a,d), hasChild(b,e),
Woman(d), Woman(f), Artist(e), Dog(z) }
Target concept: Father, i.e. a man with at least one child
Ps = {a, b}
Ns = {d, f} (due to Man ⊑ ¬Woman)
Us = {c, e, z}
Induced concept: Father ≡ Man ⊓ ∃hasChild.⊤
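As an illustration of the three-way labelling in this example, the sketch below classifies each individual against Man ⊓ ∃hasChild.⊤ under the open-world assumption. It is a hand-rolled stand-in for a DL reasoner that only knows this toy KB; the names `types`, `has_child`, `is_man` and `is_father` are assumptions made for the illustration, not part of DL-Foil.

```python
# Toy ABox from the example; a hand-rolled stand-in for a DL reasoner.
types = {"a": {"Man"}, "b": {"Man"}, "c": {"Man"},
         "d": {"Woman"}, "f": {"Woman"}, "e": {"Artist"}, "z": {"Dog"}}
has_child = {("a", "d"), ("b", "e")}


def is_man(x):
    """+1 if Man(x) is entailed, -1 if ¬Man(x) is entailed, 0 if unknown."""
    if "Man" in types[x]:
        return 1
    if "Woman" in types[x]:  # Man ⊑ ¬Woman, hence Woman ⊑ ¬Man
        return -1
    return 0


def is_father(x):
    """Three-valued membership in Man ⊓ ∃hasChild.⊤ under the OWA."""
    man = is_man(x)
    if man == -1:
        return -1  # one conjunct is provably false
    knows_child = any(parent == x for parent, _ in has_child)
    if man == 1 and knows_child:
        return 1
    return 0  # a missing hasChild assertion is not a negation


labels = {x: is_father(x) for x in sorted(types)}
positives = {x for x, v in labels.items() if v == 1}
negatives = {x for x, v in labels.items() if v == -1}
unlabeled = {x for x, v in labels.items() if v == 0}
```

Note that c stays unlabeled: Man(c) holds, but the absence of a hasChild assertion does not entail that c has no child.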
DL-Foil
The algorithm
Given Tr and a partial description C in a disjunctive form (initialized C = ⊥):
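Start from C' = ⊤
Refine C': generate the candidates {Di | Di ⊑ C'} and find the best Di
If neg./unl. examples are still covered: set C' = Di and refine again
If no neg. exs. are covered: set C = C ⊔ Di and remove the pos. examples covered by Di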
DL-Foil: Covering Procedure Example
Input:
Tr = {a, b, d, f, c, e, z}
C = ⊥
Trace of the algorithm
1st ref. step: C' = ⊤ (covered examples: Tr)
D* = ¬Woman (covered examples: a, b, c; c ∈ Us, so further specialization is required)
2nd ref. step: C' = ¬Woman
D* = ¬Woman ⊓ ∃hasChild.Person (covered examples: a, b, i.e. all of Ps)
C = ⊥ ⊔ (¬Woman ⊓ ∃hasChild.Person)
Ps = Ps \ {a, b} = ∅
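The trace above can be replayed with a small sketch of the covering loop. Everything here is an assumption made for illustration: concepts are plain strings, `COVER` is a hypothetical table of instance-retrieval results copied from the trace (a reasoner would compute them), `REFINE` is a hypothetical stand-in for the refinement operator, the logarithm base is taken to be 2, and the inner loop stops when neither negative nor unlabeled examples are covered, as the trace suggests.

```python
import math

PS, NS, US = {"a", "b"}, {"d", "f"}, {"c", "e", "z"}

# Hypothetical instance-retrieval results, copied from the trace.
COVER = {
    "⊤": {"a", "b", "c", "d", "e", "f", "z"},
    "¬Woman": {"a", "b", "c"},
    "Person": {"a", "b", "c", "d", "f"},
    "¬Woman ⊓ ∃hasChild.Person": {"a", "b"},
    "¬Woman ⊓ Artist": set(),
}
# Hypothetical refinement table standing in for the operator ρ.
REFINE = {
    "⊤": ["¬Woman", "Person"],
    "¬Woman": ["¬Woman ⊓ ∃hasChild.Person", "¬Woman ⊓ Artist"],
}


def counts(concept, pos):
    """Numbers of covered positive, negative and unlabeled examples."""
    cov = COVER[concept]
    return len(cov & pos), len(cov & NS), len(cov & US)


def gain(d0, d1, pos):
    """FOIL-style gain of the specialization d1 over d0."""
    p0, n0, u0 = counts(d0, pos)
    p1, n1, u1 = counts(d1, pos)
    if p0 == 0 or p1 == 0:
        return float("-inf")
    return p1 * (math.log2(p1 / (p1 + n1 + u1)) - math.log2(p0 / (p0 + n0 + u0)))


def covering_sketch():
    pos = set(PS)
    disjuncts = []  # C = ⊥ is the empty disjunction
    while pos:
        current = "⊤"
        # Specialize while negative or unlabeled examples are still covered.
        while COVER[current] & (NS | US):
            best = max(REFINE.get(current, []),
                       key=lambda d: gain(current, d, pos), default=None)
            if best is None or not COVER[best] & pos:
                return disjuncts  # the toy table has nothing useful left
            current = best
        disjuncts.append(current)  # C = C ⊔ Di
        pos -= COVER[current]      # remove the covered positives
    return disjuncts


result = covering_sketch()
```

On this toy input the loop first picks ¬Woman over Person (higher gain), then specializes it once more, and ends with a single disjunct.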
Specializing a concept
Generate n concepts D ⊑ C by performing a sort of random sampling in the DL
concept space
Add a conjunct (randomly selected from the signature):
ρ1 D = C ⊓ A
ρ2 D = C ⊓ ¬A
ρ3 D = C ⊓ ∀R.⊤
ρ4 D = C ⊓ ∃R.⊤
Refine an existing sub-description (randomly selected):
ρ5 D = C1 ⊓ ··· ⊓ B ⊓ ··· ⊓ Cn
if C = C1 ⊓ ··· ⊓ A ⊓ ··· ⊓ Cn and B ⊑ A
ρ6 D = C1 ⊓ ··· ⊓ ¬B ⊓ ··· ⊓ Cn
if C = C1 ⊓ ··· ⊓ ¬A ⊓ ··· ⊓ Cn and A ⊑ B
ρ7 D = C1 ⊓ ··· ⊓ ∃R.F ⊓ ··· ⊓ Cn
if C = C1 ⊓ ··· ⊓ ∃R.E ⊓ ··· ⊓ Cn and F ∈ ρ(E)
ρ8 D = C1 ⊓ ··· ⊓ ∀R.F ⊓ ··· ⊓ Cn
if C = C1 ⊓ ··· ⊓ ∀R.E ⊓ ··· ⊓ Cn and F ∈ ρ(E)
Specializing a concept: examples
Let C = Person be the concept to be refined
D = Person ⊓ Man (using ρ1)
D = Person ⊓ ¬Woman (using ρ2)
D = Person ⊓ ∀hasChild.⊤ (using ρ3)
D = Person ⊓ ∃hasChild.⊤ (using ρ4)
Let C = Person ⊓ ∃hasChild.⊤
D = Person ⊓ ∃hasChild.Man
D = Person ⊓ ∃hasChild.Dog ← satisfiable w.r.t. the KB, but covers no pos. exs.!
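A minimal sketch of the "add a conjunct" rules ρ1 to ρ4, assuming concepts are represented as plain strings and the signature is given as two lists. The function name `refine_by_conjunct`, its parameters and the string encoding are hypothetical choices for this illustration; rules ρ5 to ρ8 are not covered.

```python
import random


def refine_by_conjunct(concept, concept_names, role_names, n, seed=0):
    """Sample n specializations of `concept` by adding one random conjunct."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    refinements = []
    for _ in range(n):
        rule = rng.choice(["rho1", "rho2", "rho3", "rho4"])
        if rule in ("rho1", "rho2"):
            atom = rng.choice(concept_names)
            conjunct = atom if rule == "rho1" else "¬" + atom
        else:
            role = rng.choice(role_names)
            quantifier = "∀" if rule == "rho3" else "∃"
            conjunct = f"{quantifier}{role}.⊤"
        refinements.append(f"{concept} ⊓ {conjunct}")
    return refinements


samples = refine_by_conjunct("Person", ["Man", "Woman"], ["hasChild"], n=5)
```

Each sample is syntactically a specialization of Person; whether it is satisfiable or covers any positive example still has to be checked against the knowledge base.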
Improving concept specialization
Further constraints are used in DL-FOIL to avoid "uninformative"
concepts:
the specialization procedure implementing ρ generates only concepts that
cover at least one positive example
the score of each specialization must exceed a threshold
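The two constraints can be read as a filter over the sampled candidates. The sketch below assumes a coverage lookup and a score lookup are available; `keep_informative`, the `COVER` and `SCORE` tables, their values and the threshold of 0 are all made up for the illustration.

```python
def keep_informative(candidates, cover, positives, score, threshold=0.0):
    """Keep candidates covering at least one positive and scoring above threshold."""
    return [d for d in candidates
            if cover(d) & positives and score(d) > threshold]


# Hypothetical coverage sets and scores for two refinements.
COVER = {"Person ⊓ ∃hasChild.Person": {"a", "b"},
         "Person ⊓ ∃hasChild.Dog": set()}
SCORE = {"Person ⊓ ∃hasChild.Person": 0.8,
         "Person ⊓ ∃hasChild.Dog": 0.0}

kept = keep_informative(list(COVER), COVER.get, {"a", "b"}, SCORE.get)
```

The candidate with filler Dog is dropped because it covers no positive example, even though it may be satisfiable.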
The score
DL-Foil selects the concept maximizing an information-theoretic heuristic:
g(D0, D1) = p1 · [ log( p1 / (p1 + n1 + u1) ) − log( p0 / (p0 + n0 + u0) ) ]
D0: the former partial definition
D1: the specialization
p1, n1, u1: the actual number of pos., neg., unl. exs. covered by D1
p0, n0, u0: the actual number of pos., neg., unl. exs. covered by D0
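As a worked instance of the score, take the first refinement step of the Father example: D0 = ⊤ covers 2 pos., 2 neg. and 3 unl. examples, while D1 = ¬Woman covers 2 pos., 0 neg. and 1 unl. The sketch below assumes base-2 logarithms (the slide only says "log"; another base rescales every score by the same factor) and returns minus infinity when no positive is covered, which is a choice made here rather than something stated in the slides.

```python
import math


def gain(p0, n0, u0, p1, n1, u1):
    """g(D0, D1) = p1 * [log(p1/(p1+n1+u1)) - log(p0/(p0+n0+u0))]."""
    if p0 == 0 or p1 == 0:
        return float("-inf")  # assumption: such candidates are never preferred
    before = math.log2(p0 / (p0 + n0 + u0))
    after = math.log2(p1 / (p1 + n1 + u1))
    return p1 * (after - before)


# D0 = ⊤ (2 pos., 2 neg., 3 unl.) specialized to D1 = ¬Woman (2, 0, 1).
g = gain(2, 2, 3, 2, 0, 1)  # 2 * log2(7/3), about 2.44
```

Because u1 appears in the denominator, a specialization that still covers many unlabeled individuals scores lower than one covering the same positives with fewer unlabeled ones.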
Evaluation
Preliminary Experiments
Concept membership prediction
5 publicly available ontologies
15 artificially generated datasets:
random target concept generation
ground truth: individuals labeled according to the membership w.r.t. target
Competitor: CELOE
Experimental design: 0.632 bootstrap
Indices: membership w.r.t. the induced concept against membership
w.r.t. the target

                Prediction outcome
Actual value    pos.             neg.             unl.
pos.            match (M)        commission (C)   omission (O)
neg.            commission (C)   match (M)        omission (O)
unl.            induction (I)    induction (I)    match (M)
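The four indices in the table can be computed from paired labels as in the sketch below. The encoding (+1 for pos., -1 for neg., 0 for unl.) and the function names `outcome` and `rates` are assumptions made for the illustration; the label lists are invented and are not data from the experiments.

```python
from collections import Counter


def outcome(actual, predicted):
    """Map one (actual, predicted) pair to M, C, O or I."""
    if actual == predicted:
        return "M"  # match
    if actual == 0:
        return "I"  # induction: definite prediction for an unlabeled individual
    if predicted == 0:
        return "O"  # omission: no prediction for a labeled individual
    return "C"      # commission: opposite definite labels


def rates(actual, predicted):
    """Percentage of each outcome over all test individuals."""
    tally = Counter(outcome(a, p) for a, p in zip(actual, predicted))
    total = len(actual)
    return {k: 100.0 * tally[k] / total for k in "MCOI"}


# Invented labels for eight individuals.
actual = [1, 1, -1, -1, 0, 0, 0, 1]
predicted = [1, -1, -1, 0, 1, 0, 0, 1]
result = rates(actual, predicted)
```

The four percentages always sum to 100, which is why each dataset in the results reports M%, C%, O% and I% together.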
Outcomes
Dataset     Index   DL-FOIL          CELOE
BIOPAX      M%      95.73 ± 03.74    94.53 ± 01.17
BIOPAX      C%      00.13 ± 00.20    03.24 ± 00.85
BIOPAX      O%      01.90 ± 03.31    01.62 ± 00.38
BIOPAX      I%      02.23 ± 00.40    00.61 ± 00.18
NTN         M%      97.78 ± 05.05    97.41 ± 00.15
NTN         C%      00.05 ± 00.07    00.00 ± 00.00
NTN         O%      02.17 ± 05.00    00.00 ± 00.00
NTN         I%      00.01 ± 00.01    02.59 ± 00.15
HDISEASE    M%      88.75 ± 01.09    88.08 ± 01.09
HDISEASE    C%      00.04 ± 00.10    00.00 ± 00.00
HDISEASE    O%      03.64 ± 01.30    07.69 ± 00.90
HDISEASE    I%      07.57 ± 01.42    04.23 ± 00.24
FINANCIAL   M%      93.52 ± 01.02    87.40 ± 04.74
FINANCIAL   C%      00.22 ± 00.21    06.33 ± 04.33
FINANCIAL   O%      00.00 ± 00.00    00.00 ± 00.01
FINANCIAL   I%      06.26 ± 00.88    06.26 ± 00.52
GEOSKILLS   M%      82.60 ± 04.69    50.20 ± 02.31
GEOSKILLS   C%      00.00 ± 00.00    23.66 ± 02.61
GEOSKILLS   O%      13.33 ± 04.43    01.34 ± 00.12
GEOSKILLS   I%      04.07 ± 04.09    24.80 ± 00.89
Conclusions, Ongoing & Future Work
Conclusions & Extensions
Modified version of DL-Foil with a different specialization procedure and
heuristic
The evaluation shows good results in terms of match rate
Ongoing & Future Work
New evaluations on larger knowledge bases
New specialization procedures
New heuristics
Scalability
Parallel computation
Distributed computation (Spark, Flink...)
Thank You!