Using Neural Networks to aggregate Linked Data rules
Ilaria Tiddi, Mathieu d’Aquin, Enrico Motta
Knowledge Discovery patterns
• Useful, explicit information about some collected data.
• Problems
  Quantity - too many
  Quality - not interesting
[KD pipeline diagram: raw data → Preprocessing → clean data → Mining → PATTERNS → Interpreting → Knowledge]
Dedalo: interpreting patterns with Linked Data
• Given the patterns of a clustering process
• Find explanations for one cluster with information from Linked Data
KMi researchers grouped together according to their co-authorship
Dedalo: interpreting patterns with Linked Data
• Linked Data can help non-experts understand the pattern
“Researchers working on projects led by someone interested in Semantic Web”
Dedalo: the bottleneck
• Explanations are atomic.
“People working with Enrico Motta” F1=66%
“People interested in Semantic Web” F1=66%
“People interested in Ontologies” F1=66%
• We want to aggregate them to improve the explanation of the cluster
“People working with Enrico Motta OR interested in Semantic Web” F1=93%
“People interested in Semantic Web AND Ontologies” F1=86%
Rule aggregation state of the art
• Pre-production of patterns
  • Machine Learning techniques
  • Artificial Neural Networks (ANN) for feature reduction
• Pattern post-processing
  • Interestingness measures (IR, statistics…)
  • Semantic knowledge (ontologies, taxonomies)
Proposition: use ANNs for post-processing patterns (i.e. for aggregating Linked Data rules)
Using ANNs to combine rules
• We want to know if two rules r1 and r2 are worth combining
• There must be a relationship between the features of two rules that can help in deciding their combination
• We know their F-score, hence Precision and Recall
[Diagram: the entity sets covered by r1 and r2, with UNION(r1,r2) and INTERSECTION(r1,r2)]
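A minimal sketch of the idea (not the paper's code): if each rule is modelled as the set of entities it covers, Precision, Recall and F1 against the target cluster follow directly, and UNION/INTERSECTION of two rules are plain set operations on their coverage. All names and toy entity ids below are illustrative.

```python
def prf(covered, cluster):
    """Precision, Recall and F1 of a rule, given the set of entities it covers."""
    tp = len(covered & cluster)
    precision = tp / len(covered) if covered else 0.0
    recall = tp / len(cluster)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# toy data (hypothetical entity ids)
cluster = {"a", "b", "c", "d"}       # the cluster to explain
r1 = {"a", "b", "x"}                 # entities covered by rule r1
r2 = {"c", "d", "x"}                 # entities covered by rule r2

scores_union = prf(r1 | r2, cluster)  # UNION(r1, r2)
scores_inter = prf(r1 & r2, cluster)  # INTERSECTION(r1, r2)
```

Here the union covers the whole cluster at the cost of one extra entity, while the intersection covers nothing useful, which is exactly why only some combinations are worth making.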
Model training and testing
• Model configuration
  • Model: feedforward multilayer perceptron
  • Neuron structure: 9 – 12 – 2
  • Inputs: P(r1), P(r2), R(r1), R(r2), F1(r1), F1(r2), and their absolute differences
• Training and test set
  • 30,000 automatically labeled combinations (unions and intersections) for training
  • Boolean label: whether the F1 of the combination has increased
  • 30,000 combinations for testing
  • Error rate: 0.24 (MSE)
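The 9 – 12 – 2 architecture above can be sketched in plain NumPy. The weights here are random stand-ins for the trained model; the two sigmoid outputs would estimate whether the union and the intersection of the pair improve F1.

```python
import numpy as np

def features(p1, r1, f1, p2, r2, f2):
    """The 9 inputs: P, R, F1 of both rules plus their absolute differences."""
    return np.array([p1, p2, r1, r2, f1, f2,
                     abs(p1 - p2), abs(r1 - r2), abs(f1 - f2)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(12, 9)), np.zeros(12)   # input layer -> 12 hidden neurons
W2, b2 = rng.normal(size=(2, 12)), np.zeros(2)    # hidden layer -> 2 outputs

def nnet(x):
    """Forward pass of the 9 - 12 - 2 multilayer perceptron."""
    return sigmoid(W2 @ sigmoid(W1 @ x) + b2)

out = nnet(features(0.7, 0.6, 0.66, 0.5, 0.9, 0.66))  # two scores in (0, 1)
```

In practice the weights would be fitted on the 30,000 labeled pairs by backpropagation, minimizing the mean squared error mentioned above.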
Predicting combinations
• A process to combine rules, given:
  • a large set of ranked rules H
  • the learned nnet model
  • a prediction indicator p(r1,r2) = nnet(r1,r2) * max(f(r1), f(r2))
• Start from the top(H) rule
  • predict p(top(H), ri) for each rule ri in H
  • combine the rules if p(top(H), ri) is above a given threshold
  • add the new rule to H
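One iteration of this process can be sketched as below. The names are illustrative, not the paper's code: `H` is a list of rules ranked by F-score, and `nnet_score` stands for the trained model's confidence that a pair is worth combining.

```python
def combine_step(H, f, nnet_score, threshold):
    """One iteration: pair top(H) with every other rule, keep promising pairs."""
    top = H[0]                              # top(H): the best-ranked rule
    combined = []
    for ri in H[1:]:
        # prediction indicator p(r1, r2) = nnet(r1, r2) * max(f(r1), f(r2))
        p = nnet_score(top, ri) * max(f(top), f(ri))
        if p > threshold:                   # combine only above the threshold
            combined.append((top, ri))      # new rules to add back into H
    return combined

# toy usage with dummy F-scores and a constant-confidence dummy model
scores = {"r1": 0.66, "r2": 0.66, "r3": 0.30}
pairs = combine_step(["r1", "r2", "r3"],
                     f=scores.get,
                     nnet_score=lambda a, b: 0.9,
                     threshold=0.5)
```

The new rules would then be scored, inserted into the ranking, and the loop repeated until no combination clears the threshold.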
Experiments - datasets
• KMiA – authors clustered according to the papers written together
• KMiP – papers clustered according to the abstracts' words
• Huds – students clustered according to the books they borrowed

Data     Rules   RAM   Time (sec)   Initial top(H)   Ending top(H)
KMiA1      369    4G         60’’            71.1%           86.3%
KMiA2      511    4G         60’’            60.6%           63.9%
KMiP1      747    4G         75’’            54.9%           63.9%
KMiP2    1,746    4G        160’’            30.6%           84.1%
Hud1    11,937   10G      2,500’’            20.2%           66.9%
Hud2    11,151   10G      3,000’’            13.3%           67.3%
Experiments - strategies
Compare the NNET process with other strategies:
Random   (baseline)  combine a random rule with the top(H)
AllComb  (baseline)  combine everything in H with everything in H
Top100   (naïve)     combine the first 100 rules in H only
First    (naïve)     always combine the top(H) with H
Delta                combine all rules above a threshold
NNET                 combine any pair predicted
NNET50               combine if the prediction is higher than 50% of the highest score at the current iteration
Experiments - results
[Line chart: top(H) F-score over iterations (0–2500) for Random, AllComb, Top100, First, Delta, NNET and NNET50 — Huds1 example.]
Experiments - results
[Line charts: top(H) F-score over iterations for each strategy — KMiA1, KMiA2, KMiP1, KMiP2 and Huds2 examples.]
Experiments - performance
• Comparing performances

               Speed   Accuracy   Scalability
Random           ++       -           ++
AllComb          +        ++          --
Top100           +        --          +
First            +        --          --
Delta            +        --          --
NNET / NNET50    ++       ++          ++
Conclusions and future work
• An approach to predict rule combinations based on Artificial Neural Networks
• Model trained on information (Precision, Recall, F-score) about the rules
• Saves time and computational costs (vs. the other strategies)
• Evaluating Dedalo on Google Trends: why is a trend popular, according to Linked Data?
http://linkedu.eu/dedalo/eval/
THANKS FOR YOUR ATTENTION 
ilaria.tiddi@open.ac.uk 
@IlaTiddi 
Questions?


Editor's Notes

  • #9: To train a neural network you need some measure of error between the computed outputs and the desired target outputs of the training data. The most common measure of error is mean squared error.
  • #10: Combine the rules whose combination is most likely to result in a higher score.