SlideShare a Scribd company logo
Finding All Maximal Cliques in
Very Large Social Networks
16 March 2016, EDBT 2016, Bordeaux, France
Alessio Conte°, Roberto De Virgilio§
, Antonio Maccioni§
,
Maurizio Patrignani§
, Riccardo Torlone§
°Università di Pisa §
Università Roma Tre
Social Network Analysis
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1
Community Detection
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 2
Cliques
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 3
A
J H
F
Z
A
J
A
J
F
A
J H
F
Maximal Clique Enumeration (MCE)
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 4
A
J
H H
F
D D E
S
E Y
E G
S U
S W
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 5
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 6
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 7
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
D
U
S
W
E
E
S D
GY
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 8
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 9
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 10
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S X
GY
D
L
S
W
U
Distributed MCE
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 11
[Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010]
Block size m (Max number of nodes) = 5
A
J
H
F
D
Z R
P
D E
E
S
X
GY
D
L
S
W
U
D E
S
undetected cliques
Hub Nodes Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 12
Block size m (Max number of nodes) = 5
A
J
H
F
Z R
P
D
L
E
S
W
U
X
G
Y
The Block Size Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 13
* Taken from [Cheng et al., KDD 2012]
efficiency vs completeness/correcteness
Overview of the Approach
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 14
G = (N, E)
C = c1
, c2
, ..., cn
1st level decomposition
Induced graph2nd level decomposition
FIND MAX CLIQUES
FIND MAX
CLIQUES
Block analysis
Nf Nh
Block 1 Block z...
U
Cf
Ch
1st Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 15
Separate hubs from the rest of nodes in N
- according to a maximum block size m
1st level decomposition
FIND MAX CLIQUES
Nf Nh
G = (N, E)
A
J
H
F
Z R
P
L W
U
X
G
Y
D E
S
1st Level Decomposition - Lemma
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 16
The set of all maximal cliques C of G can
be obtained by computing Cf
and Ch
alone
- C = Cf
U Ch
- Cf
is the set of cliques with at least one node in Nf
- Ch
is the set of cliques with all the nodes in Nh
- proof of this Lemma is in our paper
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 17
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 18
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 19
Kernel node Border nodeKernel node Visited node
B1
A
J
H
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 20
Kernel node Border nodeKernel node Visited node
B1
A
J
H
F
D
2nd level decomposition
FIND MAX CLIQUES
Nf
Block 1 Block z...
2nd Level Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 21
Kernel node Border nodeKernel node Visited node
Maximal Clique Enumeration
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 22
There are many algorithms for MCE
- no one outperforms the others
- but each has specific advantages
Tomita et al. Eppstein et al....
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
Block Analysis
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 23
We determine the best-fit MCE algorithm on
each block
Block analysis
Select
best-fit
Tomita et al. Eppstein et al....
Cf
Block
FIND MAX CLIQUES
Block analysis
Block 1 Block z...
Cf
Decision Tree
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 24
Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 25
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
Recursion Over Hub Nodes
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 26
The induced graph of the hub nodes is recursively
processed
B12
D
E
S
Kernel node Border nodeKernel node Visited node
Induced graph
FIND MAX CLIQUES
FIND MAX
CLIQUES
Nh
Ch
Convergence Guarantee
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 27
Given a degeneracy of the graph lower than
the block size, the whole process converges
Theorem. Let G be a graph and let Gi
, with i=1, 2, 3, ... be a sequence
of subgraphs of G such that G1
= G and Gi
, for i > 1 is the graph induced
by the nodes of Gi−1
of degree greater or equal than m.
Let the degeneracy d of G be strictly less than m + 1.
1. There is a value q such that all Gj
, with j ≥ q, are empty graphs.
2. There exists a graph with n nodes for which q is Ω(n).
Degeneracy and Sparsity
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 28
A measure of the sparsity of a graph
- it is the highest value d for which the network
contains a d-core. A d-core is obtained by
recursively removing nodes with degree less
than d
- degeneracy is typically < 100 on scale-free real
graphs
- facebook has a degeneracy ~ 54
Experiments: Decomposition
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 29
Experiments: the Block Size Effect
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 30
Experiments: Decision Tree
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 31
Experiments: Effectiveness
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 32
Conclusion
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 33
Approach for computing maximal cliques over an
arbitrarily large graph
- taking advantage of distributed computation
- leveraging on the advantages of different existing
algorithms for MCE
Completeness and correcteness are not
compromised by hub nodes
- unlike state-of-the-art
Future Work
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 34
Take into account the “semantics” of the graph
in order to search for specific kind of cliques
Extend our framework for computing more
relaxed communities
- k-plexes, k-clans, k-cores, etc.
Thanks For The Attention
Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 35

More Related Content

PDF
Accordion - VLDB 2014
PPT
4 Cliques Clusters
PDF
Clustering of graphs and search of assemblages
PPTX
From HadoopDB to Hadapt: A Case Study of Transitioning a VLDB paper into Real...
PDF
Social network analysis part ii
PDF
Personal Information Management Systems - EDBT/ICDT'15 Tutorial
PDF
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
PDF
Graph Analysis Beyond Linear Algebra
Accordion - VLDB 2014
4 Cliques Clusters
Clustering of graphs and search of assemblages
From HadoopDB to Hadapt: A Case Study of Transitioning a VLDB paper into Real...
Social network analysis part ii
Personal Information Management Systems - EDBT/ICDT'15 Tutorial
The Power of Motif Counting Theory, Algorithms, and Applications for Large Gr...
Graph Analysis Beyond Linear Algebra

Similar to Finding All Maximal Cliques in Very Large Social Networks (9)

PPTX
Group and Community Detection in Social Networks
PDF
A Proposed Algorithm to Detect the Largest Community Based On Depth Level
PDF
Netwoks icml09
PDF
Gaps between the theory and practice of large-scale matrix-based network comp...
PDF
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
PDF
ch10-graphs2.pdf
PDF
High-Performance Graph Analysis and Modeling
PPT
Mediapresentation file for social media.
PPTX
Leveraging Multiple GPUs and CPUs for Graphlet Counting in Large Networks
Group and Community Detection in Social Networks
A Proposed Algorithm to Detect the Largest Community Based On Depth Level
Netwoks icml09
Gaps between the theory and practice of large-scale matrix-based network comp...
O N T HE D ISTRIBUTION OF T HE M AXIMAL C LIQUE S IZE F OR T HE V ERTICES IN ...
ch10-graphs2.pdf
High-Performance Graph Analysis and Modeling
Mediapresentation file for social media.
Leveraging Multiple GPUs and CPUs for Graphlet Counting in Large Networks
Ad

Recently uploaded (20)

PPTX
Database Infoormation System (DBIS).pptx
PDF
Transcultural that can help you someday.
PDF
[EN] Industrial Machine Downtime Prediction
PDF
Introduction to Data Science and Data Analysis
PPTX
oil_refinery_comprehensive_20250804084928 (1).pptx
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PPTX
SAP 2 completion done . PRESENTATION.pptx
PPTX
Introduction to Knowledge Engineering Part 1
PPTX
Introduction-to-Cloud-ComputingFinal.pptx
PPTX
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
PDF
Clinical guidelines as a resource for EBP(1).pdf
PDF
Business Analytics and business intelligence.pdf
PDF
Mega Projects Data Mega Projects Data
PPTX
Supervised vs unsupervised machine learning algorithms
PPTX
IB Computer Science - Internal Assessment.pptx
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PDF
annual-report-2024-2025 original latest.
PDF
Optimise Shopper Experiences with a Strong Data Estate.pdf
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
Database Infoormation System (DBIS).pptx
Transcultural that can help you someday.
[EN] Industrial Machine Downtime Prediction
Introduction to Data Science and Data Analysis
oil_refinery_comprehensive_20250804084928 (1).pptx
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
SAP 2 completion done . PRESENTATION.pptx
Introduction to Knowledge Engineering Part 1
Introduction-to-Cloud-ComputingFinal.pptx
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
Clinical guidelines as a resource for EBP(1).pdf
Business Analytics and business intelligence.pdf
Mega Projects Data Mega Projects Data
Supervised vs unsupervised machine learning algorithms
IB Computer Science - Internal Assessment.pptx
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
annual-report-2024-2025 original latest.
Optimise Shopper Experiences with a Strong Data Estate.pdf
STUDY DESIGN details- Lt Col Maksud (21).pptx
Ad

Finding All Maximal Cliques in Very Large Social Networks

  • 1. Finding All Maximal Cliques in Very Large Social Networks 16 March 2016, EDBT 2016, Bordeaux, France Alessio Conte°, Roberto De Virgilio§ , Antonio Maccioni§ , Maurizio Patrignani§ , Riccardo Torlone§ °Università di Pisa § Università Roma Tre
  • 2. Social Network Analysis Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1
  • 3. Community Detection Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 2
  • 4. Cliques Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 3 A J H F Z A J A J F A J H F
  • 5. Maximal Clique Enumeration (MCE) Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 4 A J H H F D D E S E Y E G S U S W
  • 6. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 5 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 7. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 6 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 8. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 7 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D D U S W E E S D GY
  • 9. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 8 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F Z R P D L E S W U X G Y
  • 10. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 9 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U
  • 11. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 10 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U
  • 12. Distributed MCE Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 11 [Cheng et al., KDD 2012][Cheng et al., SIGMOD 2010] Block size m (Max number of nodes) = 5 A J H F D Z R P D E E S X GY D L S W U D E S undetected cliques
  • 13. Hub Nodes Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 12 Block size m (Max number of nodes) = 5 A J H F Z R P D L E S W U X G Y
  • 14. The Block Size Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 13 * Taken from [Cheng et al., KDD 2012] efficiency vs completeness/correcteness
  • 15. Overview of the Approach Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 14 G = (N, E) C = c1 , c2 , ..., cn 1st level decomposition Induced graph2nd level decomposition FIND MAX CLIQUES FIND MAX CLIQUES Block analysis Nf Nh Block 1 Block z... U Cf Ch
  • 16. 1st Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 15 Separate hubs from the rest of nodes in N - according to a maximum block size m 1st level decomposition FIND MAX CLIQUES Nf Nh G = (N, E) A J H F Z R P L W U X G Y D E S
  • 17. 1st Level Decomposition - Lemma Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 16 The set of all maximal cliques C of G can be obtained by computing Cf and Ch alone - C = Cf U Ch - Cf is the set of cliques with at least one node in Nf - Ch is the set of cliques with all the nodes in Nh - proof of this Lemma is in our paper
  • 18. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 17 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 19. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 18 Kernel node Border nodeKernel node Visited node B1 A J H 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 20. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 19 Kernel node Border nodeKernel node Visited node B1 A J H 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 21. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 20 Kernel node Border nodeKernel node Visited node B1 A J H F D 2nd level decomposition FIND MAX CLIQUES Nf Block 1 Block z...
  • 22. 2nd Level Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 21 Kernel node Border nodeKernel node Visited node
  • 23. Maximal Clique Enumeration Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 22 There are many algorithms for MCE - no one outperforms the others - but each has specific advantages Tomita et al. Eppstein et al.... FIND MAX CLIQUES Block analysis Block 1 Block z... Cf
  • 24. Block Analysis Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 23 We determine the best-fit MCE algorithm on each block Block analysis Select best-fit Tomita et al. Eppstein et al.... Cf Block FIND MAX CLIQUES Block analysis Block 1 Block z... Cf
  • 25. Decision Tree Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 24
  • 26. Recursion Over Hub Nodes Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 25 Induced graph FIND MAX CLIQUES FIND MAX CLIQUES Nh Ch
  • 27. Recursion Over Hub Nodes Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 26 The induced graph of the hub nodes is recursively processed B12 D E S Kernel node Border nodeKernel node Visited node Induced graph FIND MAX CLIQUES FIND MAX CLIQUES Nh Ch
  • 28. Convergence Guarantee Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 27 Given a degeneracy of the graph lower than the block size, the whole process converges Theorem. Let G be a graph and let Gi , with i=1, 2, 3, ... be a sequence of subgraphs of G such that G1 = G and Gi , for i > 1 is the graph induced by the nodes of Gi−1 of degree greater or equal than m. Let the degeneracy d of G be strictly less than m + 1. 1. There is a value q such that all Gj , with j ≥ q, are empty graphs. 2. There exists a graph with n nodes for which q is Ω(n).
  • 29. Degeneracy and Sparsity Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 28 A measure of the sparsity of a graph - it is the highest value d for which the network contains a d-core. A d-core is obtained by recursively removing nodes with degree less than d - degeneracy is typically < 100 on scale-free real graphs - facebook has a degeneracy ~ 54
  • 30. Experiments: Decomposition Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 29
  • 31. Experiments: the Block Size Effect Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 30
  • 32. Experiments: Decision Tree Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 31
  • 33. Experiments: Effectiveness Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 32
  • 34. Conclusion Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 33 Approach for computing maximal cliques over an arbitrarily large graph - taking advantage of distributed computation - leveraging on the advantages of different existing algorithms for MCE Completeness and correcteness are not compromised by hub nodes - unlike state-of-the-art
  • 35. Future Work Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 34 Take into account the “semantics” of the graph in order to search for specific kind of cliques Extend our framework for computing more relaxed communities - k-plexes, k-clans, k-cores, etc.
  • 36. Thanks For The Attention Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 35