Learning Probabilistic Relational Models using Non-Negative Matrix Factorization

Anthony Coutant, Philippe Leray, Hoel Le Capitaine
DUKe (Data, User, Knowledge) Team, LINA
26th June, 2014
7ème Journées Francophones sur les Réseaux Bayésiens et les Modèles Graphiques Probabilistes

Context
• Probabilistic Relational Models (PRM)
– Attribute uncertainty in relational datasets
• Relational datasets: attributes + links
• PRM with Reference Uncertainty (PRM-RU) models link uncertainty
• Partitioning individuals is necessary in PRM-RU

Problem & Proposal
• PRM-RU partitions individuals based on attributes only
• We propose to cluster the relationship information instead
• We show that:
– Attribute partitioning does not explain all relationships
– Relational partitioning can explain attribute-oriented relationships

Flat datasets – Bayesian Networks
• Individuals are assumed i.i.d.
[Figure: Bayesian network Grade 1 -> Ranking <- Grade 2]

P(G1)    A      B
         0.25   0.75

P(G2)    A      B
         0.25   0.75

P(R | G1, G2)    A,A   A,B   B,A   B,B
1st division     0.8   0.5   0.5   0.2
2nd division     0.2   0.5   0.5   0.8

Dataset
G1   G2   R
A    B    1st
B    A    1st
B    B    2nd
B    B    2nd
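As a minimal sketch of how the tables on this slide combine, the snippet below encodes the two priors and P(R | G1, G2), then computes the marginal P(R) and the log-likelihood of the four-row dataset. The function and variable names are illustrative choices, not taken from the slides.

```python
import math
from itertools import product

# CPTs copied from the slide: priors on the two grades and P(R | G1, G2).
P_G1 = {"A": 0.25, "B": 0.75}
P_G2 = {"A": 0.25, "B": 0.75}
P_R_GIVEN_G = {
    ("A", "A"): {"1st": 0.8, "2nd": 0.2},
    ("A", "B"): {"1st": 0.5, "2nd": 0.5},
    ("B", "A"): {"1st": 0.5, "2nd": 0.5},
    ("B", "B"): {"1st": 0.2, "2nd": 0.8},
}

# The four i.i.d. rows of the slide's dataset: (G1, G2, R).
DATASET = [("A", "B", "1st"), ("B", "A", "1st"),
           ("B", "B", "2nd"), ("B", "B", "2nd")]


def joint(g1, g2, r):
    """P(G1=g1, G2=g2, R=r), using the factorization of the network."""
    return P_G1[g1] * P_G2[g2] * P_R_GIVEN_G[(g1, g2)][r]


def marginal_ranking():
    """P(R), obtained by summing the joint over both grades."""
    result = {"1st": 0.0, "2nd": 0.0}
    for g1, g2 in product(P_G1, P_G2):
        for r in result:
            result[r] += joint(g1, g2, r)
    return result


def dataset_log_likelihood(rows):
    """Log-likelihood of the rows, which are assumed i.i.d."""
    return sum(math.log(joint(g1, g2, r)) for g1, g2, r in rows)


print({r: round(p, 4) for r, p in marginal_ranking().items()})
# {'1st': 0.35, '2nd': 0.65}
print(dataset_log_likelihood(DATASET))
```

The i.i.d. assumption is what lets the dataset likelihood decompose into a plain sum over rows; relational data breaks this assumption, which motivates the following slides.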

Relational datasets – Relational schema
[Figure: relational schema and one instance.
Schema: Student (Intelligence, Ranking) and Course (Difficulty, Evaluation), linked through Registration (Grade, Satisfaction) with 1,n / 1 cardinalities on each side.
Instance: Student Jane Doe (Intelligence: high, Ranking: 1st division); Registration #4563 (Grade: A, Satisfaction: high); Course Phil101 (Difficulty: high, Evaluation: high).]

Probabilistic Relational Models (PRM)

P(R | MEAN(G))    A     B
1st division      0.8   0.2
2nd division      0.2   0.8

[Figure: the relational schema (Student, Registration, Course) annotated with the PRM dependency structure over Evaluation, Intelligence, Grade, Satisfaction, Difficulty and Ranking, using MEAN aggregators, and an instance whose attribute values are all unknown: courses Math and Phil, registrations #4563, #5621 and #6251, students Jane Doe and John Smith.]

Probabilistic Relational Models (PRM) (continued)
[Figure: the PRM (schema with dependency structure, MEAN aggregators and P(R | MEAN(G))) applied to the instance (courses Math and Phil, registrations #4563, #5621 and #6251, students Jane Doe and John Smith) yields a GBN (Ground Bayesian Network) with one node per instance attribute: Math.Diff, Math.Eval, Phil.Diff, Phil.Eval, #4563.Grade, #5621.Grade, #6251.Grade, #4563.Satis, #5621.Satis, #6251.Satis, JD.Int, JS.Int, JD.Rank, JS.Rank, plus MEAN aggregation nodes.]
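A rough sketch of the grounding step, restricted to the single class-level dependency that the slides state explicitly, Ranking <- MEAN(Grade). The assignment of registrations to students and courses is NOT given on the slides and is purely hypothetical here, as are the function and variable names.

```python
# Hypothetical instance: which registration links which student and course
# is an assumption; the slides only list the individuals.
REGISTRATIONS = {
    "#4563": {"student": "JD", "course": "Phil"},
    "#5621": {"student": "JD", "course": "Math"},
    "#6251": {"student": "JS", "course": "Math"},
}


def ground_ranking_dependency(registrations):
    """Unroll the class-level dependency Ranking <- MEAN(Grade).

    Returns a dict mapping each ground node to the sorted list of its
    parents in the ground Bayesian network.
    """
    parents = {}
    for reg_id, link in registrations.items():
        grade_node = f"{reg_id}.Grade"
        parents.setdefault(grade_node, [])
        # One MEAN aggregation node per student, fed by that student's grades.
        mean_node = f"MEAN({link['student']}.Grade)"
        parents.setdefault(mean_node, []).append(grade_node)
        # The student's ranking depends only on the aggregated grade.
        parents[f"{link['student']}.Rank"] = [mean_node]
    return {node: sorted(p) for node, p in parents.items()}


gbn = ground_ranking_dependency(REGISTRATIONS)
for node, node_parents in gbn.items():
    print(node, "<-", node_parents)
```

The aggregator is what keeps the conditional table P(R | MEAN(G)) the same size whatever the number of registrations per student.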

Uncertainty in Relational datasets
[Figure: two panels over the same instance (Student Jane Doe, Registration #4563 with Grade A, Course Phil101 with Evaluation high, all other attribute values unknown).
Panel "Attributes uncertainty (PRM)": only attribute values are uncertain.
Panel "Attributes and link uncertainty (PRM extensions)": the links between individuals are uncertain as well.]

PRM with reference uncertainty
• Reference uncertainty: P(r.Course = ci, r.Student = sj | r.exists = true)
• One random variable for each individual id? Not generalizable
• Solution: partitioning
[Figure: schema Course (Difficulty, Evaluation), Registration, Student (Intelligence, Ranking), where the Course and Student references of Registration are random variables. Open questions shown: P(Student | Course.Difficulty)? P(Course)?]

PRM with reference uncertainty (continued)
• P(Student | ClusterStudent) follows a uniform distribution
[Figure: Registration with reference variables Course and Student, each depending on a cluster variable (ClusterCourse, ClusterStudent); Course has attribute Evaluation, Student has attribute Ranking.]

P(CStudent | S.Intelligence)    low   high
C1                              0     1
C2                              1     0

P(Student | CStudent)    C1   C2
s1                       0    1
s2                       1    0
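The two tables above can be reproduced with a small sketch: an attribute-based partition function sends each student to a cluster, and P(Student | CStudent) is then uniform over the members of that cluster. Reading the tables together implies s1 has low and s2 has high Intelligence; the larger population and all function names below are hypothetical.

```python
from collections import defaultdict


def partition_by_attribute(intelligence):
    """Attribute-based partition function from the slide: high -> C1, low -> C2."""
    return {"high": "C1", "low": "C2"}[intelligence]


def student_given_cluster(students):
    """Build P(Student | CStudent), uniform over the members of each cluster.

    `students` maps a student id to its Intelligence value.
    """
    members = defaultdict(list)
    for sid, intelligence in students.items():
        members[partition_by_attribute(intelligence)].append(sid)
    return {
        cluster: {sid: 1.0 / len(ids) for sid in ids}
        for cluster, ids in members.items()
    }


# Slide example: s1 falls in C2 and s2 in C1.
slide_population = {"s1": "low", "s2": "high"}
print(student_given_cluster(slide_population))

# Hypothetical larger population: three students share cluster C1,
# so each of them gets probability 1/3 given C1.
bigger_population = {"s1": "low", "s2": "high", "s3": "high", "s4": "high"}
print(student_given_cluster(bigger_population))
```

With this factorization the model only needs one parameter per cluster rather than one per individual, which is why partitioning makes reference uncertainty generalizable.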

PRM with reference uncertainty (continued)
• P(Student | ClusterStudent) follows a uniform distribution
[Figure: same model and tables as on the previous slide, plus a partition function mapping student attribute values (labels shown: low, high, Bio) to the clusters C1 and C2. Students population stats: 50% / 50%.]

Attributes-oriented Partition Functions in PRM-RU
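• PRM-RU: clustering from attributes
• Assumption: attributes explain the relationship
• Not generalizable: relationship information is not used for partitioning
[Figure: Course and Student individuals colored by attribute value and linked by the relationship; P(Green | Red) = 1, P(Purple | Blue) = 1. Do attributes explain the relationship? YES]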

[Figure: two Course/Student graphs, each with P(Green | Red) = 1. Labels shown: YES for the first, IS THAT SO? for the second.]

[Figure: Course/Student graphs with their conditional probabilities: P(Green | Red) = 1; P(Green | Red) = 1; P(Green | Red) = 0.5 and P(Purple | Red) = 0.5. Labels shown: YES, IS THAT SO?, NO.]

Relationship-oriented Partitioning
• Objective: find the partitioning that maximizes intra-partition edges
[Figure: Course/Student graph split into two partitions p1 and p2, with P(Student.p1 | Course.p1) = 1 and P(Student.p2 | Course.p2) = 1.]
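To make the objective concrete, here is a small sketch that scores a candidate partitioning by the fraction of relationship edges falling inside a partition, and estimates P(Student partition | Course partition) from edge counts. The toy edge list, the partition assignments and the function names are all hypothetical.

```python
def intra_partition_ratio(edges, course_part, student_part):
    """Fraction of (course, student) edges whose endpoints share a partition."""
    if not edges:
        return 0.0
    same = sum(1 for c, s in edges if course_part[c] == student_part[s])
    return same / len(edges)


def conditional_partition_probs(edges, course_part, student_part):
    """Estimate P(Student partition | Course partition) from edge counts."""
    counts = {}
    for c, s in edges:
        row = counts.setdefault(course_part[c], {})
        row[student_part[s]] = row.get(student_part[s], 0) + 1
    return {
        cp: {sp: n / sum(row.values()) for sp, n in row.items()}
        for cp, row in counts.items()
    }


# Hypothetical toy instance: two groups of courses and students that only
# link within their own group.
EDGES = [("c1", "s1"), ("c1", "s2"), ("c2", "s2"), ("c3", "s3"), ("c3", "s4")]
GOOD_COURSES = {"c1": "p1", "c2": "p1", "c3": "p2"}
GOOD_STUDENTS = {"s1": "p1", "s2": "p1", "s3": "p2", "s4": "p2"}
# A poorer partitioning of the students, ignoring the relationship.
BAD_STUDENTS = {"s1": "p1", "s2": "p2", "s3": "p1", "s4": "p2"}

print(intra_partition_ratio(EDGES, GOOD_COURSES, GOOD_STUDENTS))
print(intra_partition_ratio(EDGES, GOOD_COURSES, BAD_STUDENTS))
print(conditional_partition_probs(EDGES, GOOD_COURSES, GOOD_STUDENTS))
```

A partitioning with ratio 1 yields the deterministic conditionals shown on the slide; any lower ratio spreads probability mass across partitions.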

Experiments – Protocol – Dataset generation
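[Figure: generic schema and instance used for dataset generation.
Schema: Entity 1 (Att 1, …, Att n) and Entity 2 (Att 1, …, Att n), linked through the relationship R with 1,n / 1 cardinalities on each side.
Instance: Entity 1 and Entity 2 individuals linked by R.]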

Experiments – Protocol – Dataset generation
[Figure: the same generic schema, Entity 1 (Att 1, …, Att n) and Entity 2 (Att 1, …, Att n) linked through R, with two kinds of generated instances: an "Attributes partitioning favorable case" and a "Relationship partitioning favorable case".]

Experiments – Protocol – Learning
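[Figure: fixed structure linking Entity 1 (Att 1, …, Att n) and Entity 2 (Att 1, …, Att n) through Relation, with reference variables E1 and E2 and cluster variables CE1 and CE2.]
• Parameter learning on a fixed structure
• Two PRMs compared:
– One with attributes partitioning
– One with relational partitioning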

Experiments – Protocol – Evaluation
• For each generated dataset D
– Split D into 10 subsets {D1, …, D10}
– Perform 10-fold CV, each fold using one Di for testing and the others for training
• For the PRM with attributes partitioning: store the 10 log-likelihoods in PattsLL[i]
• For the PRM with relationship partitioning: store the 10 log-likelihoods in PrelLL[i]
– Compute the mean and sd of PattsLL[i] and PrelLL[i]
– Evaluate the significance of relationship partitioning over attributes partitioning
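The protocol above can be sketched as follows. The slides do not say which significance test was used; the paired t statistic below is an assumption chosen because both models are scored on the same 10 folds. Function names are illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def summarize(log_likelihoods):
    """Mean and sample standard deviation of the per-fold log-likelihoods."""
    return mean(log_likelihoods), stdev(log_likelihoods)


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the per-fold differences (assumed test, k - 1 d.o.f.)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all per-fold differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Each item of D ends up in exactly one test fold.
splits = list(k_fold_splits(50))
print(len(splits), sorted(len(test) for _, test in splits))
```

Under this assumed test, PrelLL and PattsLL would be passed as `scores_a` and `scores_b`; with 10 folds the statistic is compared against a Student t distribution with 9 degrees of freedom.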

Experiments – Results
[Figure: two result grids, each over k ∈ {2, 4, 16} (columns) and n ∈ {25, 50, 100, 200} (rows).
Panel 1: Random clusters (independent of attributes).
Panel 2: Attributes => Cluster (fully dependent on attributes).
Legend: Green: Relational > Attributes partitioning. Red: Attributes > Relational partitioning. Orange: Partitionings not significantly comparable.]

Experiments – About the NMF choice for partitioning
• NMF
– Finds low-dimensional factor matrices whose product approximates the original matrix
– A relationship between two entities can be represented as an adjacency matrix
• Motivation for using NMF
– (Restrictively) captures latent information from both rows and columns: co-clustering
– Several extensions dedicated to more accurate co-clustering (NMTF)
– Extensions for Laplacian regularization
• Allow capturing both attribute and relationship information for clustering
– Extensions for tensor factorization
• Allow modeling n-ary relationships, n >= 2
– NMF = a good starting choice for the long-term needs?
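A minimal sketch of the idea, assuming the standard multiplicative-update NMF for squared error; the slides do not state which NMF variant or library was used. It is written in plain Python to stay dependency-free, and the toy adjacency matrix and function names are hypothetical. Rows of W give soft cluster memberships for the first entity, columns of H for the second.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) as W (n x rank) times H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


def hard_clusters(factor_rows):
    """Assign each row to the latent dimension with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in factor_rows]


# Hypothetical adjacency matrix: rows are courses, columns are students.
ADJACENCY = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
W, H, ERRORS = nmf(ADJACENCY, rank=2)
print("course clusters:", hard_clusters(W))
print("student clusters:", hard_clusters(transpose(H)))
print("error before/after:", ERRORS[0], ERRORS[-1])
```

The result depends on the random initialization, and the small `eps` in the denominators is only a guard against division by zero.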

Experiments – About the NMF choice for partitioning
• But
– Performance problems in the experiments
– Very sensitive to initialization: crashes whenever a singular state is reached
– Moving toward large-scale methods: graph-based relational clustering?

Conclusion
• PRM-RU defines a probability structure in relational datasets
• This requires partitioning
• PRM-RU uses attribute-oriented partitioning
• We propose to cluster the relationship information instead
• Experiments show that:
– Attribute partitioning does not explain all relationships
– Relational partitioning can explain attribute-oriented relationships

Perspectives
• Experiments on real-life datasets
• Towards large-scale partitioning methods
• PRM-RU Structure Learning using clustering algorithms
• What about other link uncertainty representations?

Questions?
(anthony.coutant | philippe.leray | hoel.lecapitaine)@univ-nantes.fr
