Higher-order clustering in networks
Austin R. Benson · Cornell
HONS 2018
June 8, 2018 · Paris, France
HONS'18Austin R. Benson 1
Joint work with
Hao Yin · Stanford
Jure Leskovec ·
slides ⟶ bit.ly/arb-HONS-18
code ⟶ github.com/arbenson/HigherOrderClustering.jl
Many networks are globally sparse but locally dense.
Coauthorship network
Brain network [Sporns and Bullmore, Nature Rev. Neuro., 2012]
Networks for real-world systems have modules, clusters, communities.
[Watts-Strogatz 98; Flake 00; Newman 04, 06; many others…]
How do we measure how much a network clusters?
The clustering coefficient is a fundamental measure in network science of how much a network clusters.
C(u) = fraction of length-2 paths centered at node u
that form a triangle.
Average clustering coefficient C = mean of C(u).
• Data insights. Average clustering coefficient is larger than we would expect.
[Watts-Strogatz 98] > 36k citations!
• Domain phenomenon. Triadic closure in sociology.
[Simmel 1908; Rapoport 53; Granovetter 73]
• Statistical Feature. Role discovery, anomaly detection, mental health study.
[Henderson+ 12; La Fond+ 14, 16; Bearman-Moody 2004]
• Modeling tool. Key property for generative models.
[Newman 09; Seshadhri-Kolda-Pinar 12; Roble+ 16]
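The definition of C(u) above can be sketched in a few lines. This is a minimal illustration on a hypothetical toy graph stored as a dict of neighbor sets; the function names are made up for this example, and averaging only over nodes with at least one length-2 path is an assumed convention (some conventions assign such nodes a value of 0 instead).

```python
from itertools import combinations


def local_clustering(adj, u):
    """Fraction of length-2 paths centered at u that close into a triangle."""
    nbrs = adj[u]
    if len(nbrs) < 2:
        return None  # no length-2 path is centered at u, so C(u) is undefined
    pairs = list(combinations(nbrs, 2))
    closed = sum(1 for a, b in pairs if b in adj[a])
    return closed / len(pairs)


def average_clustering(adj):
    """Mean of C(u) over the nodes where C(u) is defined (assumed convention)."""
    values = [local_clustering(adj, u) for u in adj]
    values = [c for c in values if c is not None]
    return sum(values) / len(values)


# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
c0 = local_clustering(adj, 0)   # one closed pair out of three
avg = average_clustering(adj)   # mean over nodes 0, 1, 2
```

Node 0 has three neighbor pairs and only one of them is connected, so adding the pendant node lowers C(0) without removing any triangle.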
The classical clustering coefficient is limited.
The clustering coefficient measures the closure probability of just one simple structure: the triangle.
… but there is lots of evidence that dense “higher-order structure” among more than 3 nodes is also important for clustering.
• 4-cliques reveal community structure in word association and PPI networks [Palla+ 05]
• 4-/5-cliques (+ other structure) identify network type & dimension [Yaveroğlu+ 14, Bonato+ 14]
• 4-node motifs identify community structure in neural systems [Benson-Gleich-Leskovec 16]
We will show that triangles are insufficient to explain clustering. We need larger cliques.
• old idea ⟶ pretty much all real-world networks exhibit clustering.
• new idea ⟶ networks may only cluster up to a certain “order”.
Triangles tell just one part of the story.
How do we measure clustering with
respect to higher-order (clique) closure?
Second order: 1. Find a 2-clique. 2. Attach an adjacent edge. 3. Check for a (2+1)-clique.
C2 = avg. fraction of (2-clique, adjacent edge) pairs that induce a (2+1)-clique.
Increase clique size by 1 to get a higher-order clustering coefficient!
Third order: 1. Find a 3-clique. 2. Attach an adjacent edge. 3. Check for a (3+1)-clique.
C3 = avg. fraction of (3-clique, adjacent edge) pairs that induce a (3+1)-clique.
Fourth order: 1. Find a 4-clique. 2. Attach an adjacent edge. 3. Check for a (4+1)-clique.
C4 = avg. fraction of (4-clique, adjacent edge) pairs that induce a (4+1)-clique.
We view clustering as a clique expansion process.
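The three-step clique expansion process can be written down directly as a brute-force count. The sketch below is not the authors' implementation (their code is the Julia package linked on the title slide); it is an illustration on a hypothetical toy graph that computes the fraction over all (r-clique, adjacent edge) pairs in the graph, i.e. a global version of the coefficient. The function names are assumptions made for this example.

```python
from itertools import combinations


def cliques_of_size(adj, r):
    """All r-node cliques, found by brute force (only sensible for tiny graphs)."""
    return [c for c in combinations(sorted(adj), r)
            if all(b in adj[a] for a, b in combinations(c, 2))]


def global_hocc(adj, r):
    """Fraction of (r-clique, adjacent edge) pairs that induce an (r+1)-clique."""
    total = closed = 0
    for clique in cliques_of_size(adj, r):          # step 1: find an r-clique
        members = set(clique)
        for center in clique:
            for v in adj[center] - members:         # step 2: attach edge (center, v)
                total += 1
                if members <= adj[v]:               # step 3: v sees every clique node
                    closed += 1
    return closed / total if total else None


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} plus node 4 attached to node 0.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
c2, c3, c4 = (global_hocc(adj, r) for r in (2, 3, 4))
```

On this graph the only (4-clique, adjacent edge) pair uses the pendant edge, which cannot close, so the fourth-order value drops to zero while the second- and third-order values stay high.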
We can think of higher-order closure processes in everyday life.
1. Start with a group of 3 friends (Alice, Bob, Charlie).
2. One person in the group befriends someone new (Dave).
3. The group might increase in size.
Image credits: rollingstone.com, oprah.com
Higher-order clustering coefficients offer several advantages.
Theory & analysis.
• Better understanding of small-world and Gn,p random graph models.
• Extremal combinatorics for general graphs.
Data insights.
• old idea ⟶ pretty much all real-world networks exhibit clustering.
• new idea ⟶ real-world networks may only cluster up to a certain order.
Background. Local, average, and global clustering coefficients.
Second-order (classical) local clustering coefficient at node u: fraction of length-2 paths centered at u that form a triangle.
Second-order (classical) global clustering coefficient: fraction of all length-2 paths in the network that form a triangle.
Second-order (classical) average clustering coefficient: mean of the local coefficient over nodes.
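Higher-order (third-order) local, average, and global clustering coefficients.
Third-order local clustering coefficient at node u: fraction of (3-clique, adjacent edge) pairs, with the adjacent edge attached at u, that induce a 4-clique.
Third-order global clustering coefficient: fraction of all (3-clique, adjacent edge) pairs in the network that induce a 4-clique.
Third-order average clustering coefficient: mean of the third-order local coefficient over nodes.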
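The three variants can be written compactly for a general order r. The notation below is an assumption introduced for this sketch, not taken from the slides: W_r(u) is the set of (r-clique, adjacent edge) pairs whose adjacent edge attaches at u, W_r^c(u) is the subset that induces an (r+1)-clique, and V_r is the set of nodes with at least one such pair. Averaging over V_r only is also an assumed convention.

```latex
\[
C_r(u) = \frac{|W_r^{c}(u)|}{|W_r(u)|},
\qquad
\bar{C}_r = \frac{1}{|V_r|} \sum_{u \in V_r} C_r(u),
\qquad
C_r = \frac{\sum_{u} |W_r^{c}(u)|}{\sum_{u} |W_r(u)|}.
\]
```

Setting r = 2 recovers the classical local, average, and global clustering coefficients, since a (2-clique, adjacent edge) pair is a length-2 path.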
We can analyze higher-order clustering with small-world models.
• Start with n nodes and edges to 2k neighbors and then rewire each edge with probability p.
Example: n = 16, k = 3, p = 0.
Theorem [Watts-Strogatz 98]
[Yin-Benson-Leskovec 18] [Watts-Strogatz 98]
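As an illustration of the unrewired case only, the sketch below builds the ring lattice for the slide's example values (n = 16, k = 3, p = 0) and evaluates the second- and third-order local coefficients at one node. Rewiring (p > 0) is deliberately left out, the function names are hypothetical, and the counting formula used is the standard clique-count form of the local coefficient rather than anything recovered from this slide.

```python
from itertools import combinations


def ring_lattice(n, k):
    """n nodes on a ring, each joined to its k nearest neighbors on each side."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def local_hocc(adj, u, r):
    """r-th order local coefficient at u, from clique counts among u's neighbors."""
    nbrs = sorted(adj[u])

    def count(size):
        # Number of cliques with `size` nodes inside the neighborhood of u.
        return sum(all(b in adj[a] for a, b in combinations(c, 2))
                   for c in combinations(nbrs, size))

    k_r, k_r1 = count(r - 1), count(r)        # r-cliques and (r+1)-cliques containing u
    wedges = k_r * (len(nbrs) - r + 1)        # (r-clique, adjacent edge) pairs at u
    return r * k_r1 / wedges if wedges else None


adj = ring_lattice(16, 3)      # the slide's example: n = 16, k = 3, p = 0
c2 = local_hocc(adj, 0, 2)
c3 = local_hocc(adj, 0, 3)
```

With these parameters every node has 6 neighbors, 9 of the 15 neighbor pairs are connected, and the third-order value comes out lower than the second-order one.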
We can also analyze higher-order clustering in Gn,p.
Theorem [Yin-Benson-Leskovec 18]
Everything scales exponentially in the order of the clustering coefficient...
Even if a node’s neighborhood is dense, i.e., C2(u) is large, higher-order clustering still decays exponentially in Gn,p.
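A short counting heuristic shows where the exponential scaling comes from. This is a sketch under the slides' indexing (C2 is the classical coefficient), not the theorem statement itself. Take an r-clique and an adjacent edge ending at an outside node v. The pair closes only if v is also joined to the other r − 1 clique nodes, and in Gn,p those edges are present independently with probability p:

```latex
\[
\Pr\bigl[\text{an } (r\text{-clique, adjacent edge}) \text{ pair induces an } (r+1)\text{-clique}\bigr]
= p^{\,r-1},
\qquad\text{e.g. } p = 0.1:\quad p^{1} = 0.1,\;\; p^{2} = 0.01,\;\; p^{3} = 0.001 .
\]
```

Each increase in order costs one more factor of p, which is the exponential decay the slide refers to.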
Extremal combinatorics show relationships between clustering coefficients of different orders.
Theorem [Yin-Benson-Leskovec 18]
Local higher-order clustering coefficients hierarchically capture clique density in a node’s neighborhood.
Theorem [Yin-Benson-Leskovec 18]
The product of the first r - 1 local higher-order clustering coefficients is
the r-clique density between the neighbors of node u.
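The product statement can be checked with a telescoping argument. The sketch assumes the clique-count form of the local coefficient, which follows from counting (j-clique, adjacent edge) pairs at u: there are K_j(u)(d_u − j + 1) such pairs, and each (j+1)-clique containing u closes j of them. Here K_a(u) is the number of a-cliques containing u and d_u = K_2(u) is the degree of u.

```latex
\[
\prod_{j=2}^{r} C_j(u)
= \prod_{j=2}^{r} \frac{j \, K_{j+1}(u)}{(d_u - j + 1)\, K_j(u)}
= \frac{r! \; K_{r+1}(u)}{d_u (d_u - 1) \cdots (d_u - r + 1)}
= \frac{K_{r+1}(u)}{\binom{d_u}{r}} .
\]
```

Since K_{r+1}(u) equals the number of r-cliques among the neighbors of u, the right-hand side is the r-clique density of the neighborhood, and the product on the left has r − 1 factors (C2 through Cr).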
Computation only requires clique participation counts.
We can compute the rth-order HOCCs by enumerating r- and (r + 1)-cliques.
Ka(u) is the number of a-cliques containing u.
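A minimal sketch of computing a local coefficient from clique participation counts follows. The slide's formula did not survive extraction, so the expression used here, C_r(u) = r K_{r+1}(u) / ((d_u − r + 1) K_r(u)), is an assumption derived by counting (r-clique, adjacent edge) pairs at u. The function names and the toy graph are hypothetical, and the brute-force clique counting is only meant for small examples.

```python
from itertools import combinations
from math import comb


def clique_count_at(adj, u, a):
    """K_a(u): number of a-node cliques containing u (brute force over neighbors)."""
    nbrs = sorted(adj[u])
    return sum(all(y in adj[x] for x, y in combinations(c, 2))
               for c in combinations(nbrs, a - 1))


def local_hocc(adj, u, r):
    """Assumed form: C_r(u) = r * K_{r+1}(u) / ((d_u - r + 1) * K_r(u))."""
    wedges = (len(adj[u]) - r + 1) * clique_count_at(adj, u, r)
    if wedges <= 0:
        return None  # no (r-clique, adjacent edge) pair at u
    return r * clique_count_at(adj, u, r + 1) / wedges


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} plus node 4 attached to node 0.
adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
c2 = local_hocc(adj, 0, 2)
c3 = local_hocc(adj, 0, 3)
# Triangle density among the 4 neighbors of node 0: K_4(0) / C(4, 3).
triangle_density = clique_count_at(adj, 0, 4) / comb(len(adj[0]), 3)
```

On this graph the product of the second- and third-order values at node 0 equals the triangle density among its neighbors, matching the product statement on the previous slide.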
Higher-order clustering coefficients offer several advantages.
Theory & analysis.
• Better understanding of small-world and Gn,p random graph models.
• Extremal combinatorics for general graphs.
Data insights.
• old idea ⟶ pretty much all real-world networks exhibit clustering.
• new idea ⟶ real-world networks may only cluster up to a certain order.
Neural connections (C. elegans): 297 nodes, 2.15k edges
Facebook friendships (Stanford3): 11.6k nodes, 568k edges
Coauthorships (arXiv ca-AstroPh): 18.8k nodes, 198k edges
Image credit: http://www.wormatlas.org/hermaphrodite/neuronalsupport/mainframe.htm
Global clustering patterns vary widely across datasets.
Dataset                 C2     C3     C4     Trend
Neural connections      0.18   0.08   0.06   decreases with order
Facebook friendships    0.16   0.11   0.12   decreases, then increases
Coauthorships           0.32   0.33   0.36   increases with order
Not obviously due to cliques in coauthorship! High-degree nodes in co-authorships exhibit clique + star structure where C3(u) > C2(u).
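The clique + star remark can be made concrete with a small synthetic example. The graph below is hypothetical (a hub that belongs to one 5-node clique and also has 4 degree-1 neighbors), and the local coefficient uses the assumed clique-count form C_r(u) = r K_{r+1}(u) / ((d_u − r + 1) K_r(u)).

```python
from itertools import combinations


def clique_plus_star(m, s):
    """Node 0 sits in an m-node clique and also has s degree-1 neighbors."""
    adj = {u: set() for u in range(m + s)}
    for a, b in combinations(range(m), 2):
        adj[a].add(b)
        adj[b].add(a)
    for leaf in range(m, m + s):
        adj[0].add(leaf)
        adj[leaf].add(0)
    return adj


def clique_count_at(adj, u, a):
    """K_a(u): number of a-node cliques containing u (brute force over neighbors)."""
    nbrs = sorted(adj[u])
    return sum(all(y in adj[x] for x, y in combinations(c, 2))
               for c in combinations(nbrs, a - 1))


def local_hocc(adj, u, r):
    """Assumed form: C_r(u) = r * K_{r+1}(u) / ((d_u - r + 1) * K_r(u))."""
    wedges = (len(adj[u]) - r + 1) * clique_count_at(adj, u, r)
    if wedges <= 0:
        return None
    return r * clique_count_at(adj, u, r + 1) / wedges


adj = clique_plus_star(m=5, s=4)   # hub has degree 8: 4 clique neighbors + 4 leaves
c2_hub = local_hocc(adj, 0, 2)
c3_hub = local_hocc(adj, 0, 3)
```

The star edges create many open length-2 paths at the hub, which pulls the second-order value down. Third-order pairs must start from a triangle, and every triangle at the hub lies inside the clique, so a larger share of them close.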
Average higher-order clustering also varies widely.
Network                               avg. C2   avg. C3
Neural connections                    0.31      0.14
  Random configurations               0.15      0.04
  Random configurations (C2 fixed)    0.31      0.17
  → statistically significantly less clustering
Facebook friendships                  0.25      0.18
  Random configurations               0.03      0.00
  Random configurations (C2 fixed)    0.25      0.14
  → statistically significantly more clustering
Coauthorships                         0.68      0.61
  Random configurations               0.01      0.00
  Random configurations (C2 fixed)    0.68      0.60
  → not significantly different clustering
(using sampling tools from [Bollobás 1980; Milo+ 03; Park-Newman 04; Colomer de Simón+ 13])
Random samples concentrate in neural connections data.
Plot legend: Real network (C. elegans); Random configurations [Bollobás 1980; Milo 2003]; Random configurations with C2 fixed [Park-Newman 2004; Colomer de Simón+ 2013].
Clustering in neural connections is not just due to cliques.
               Original network   Null model
# 4-cliques    2,010              440 ± 68
C3             0.14               0.17 ± 0.004
4-clique count decreases in the null model, but the higher-order clustering coefficient increases.
Key reason. Clustering coefficients are normalized by opportunities to cluster.
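A toy comparison illustrates how clique counts and the normalized coefficient can move in opposite directions. The two graphs below are hypothetical and unrelated to the C. elegans data: a single 4-clique, and two 4-cliques glued at one shared node. The global third-order coefficient is computed by brute force as the fraction of (3-clique, adjacent edge) pairs that induce a 4-clique.

```python
from itertools import combinations


def complete_graph(nodes):
    nodes = list(nodes)
    return {u: {v for v in nodes if v != u} for u in nodes}


def count_cliques(adj, size):
    return sum(all(b in adj[a] for a, b in combinations(c, 2))
               for c in combinations(sorted(adj), size))


def global_c3(adj):
    """Fraction of (3-clique, adjacent edge) pairs that induce a 4-clique."""
    total = closed = 0
    for tri in combinations(sorted(adj), 3):
        if not all(b in adj[a] for a, b in combinations(tri, 2)):
            continue
        members = set(tri)
        for center in tri:
            for v in adj[center] - members:   # adjacent edge (center, v)
                total += 1
                closed += members <= adj[v]   # True counts as 1
    return closed / total if total else None


one_k4 = complete_graph(range(4))

# Two 4-cliques sharing node 0: {0, 1, 2, 3} and {0, 4, 5, 6}.
two_k4 = complete_graph(range(4))
for u, nbrs in complete_graph([0, 4, 5, 6]).items():
    two_k4.setdefault(u, set()).update(nbrs)

cliques_a, c3_a = count_cliques(one_k4, 4), global_c3(one_k4)
cliques_b, c3_b = count_cliques(two_k4, 4), global_c3(two_k4)
```

The second graph has twice as many 4-cliques, but the shared node also gains many opportunities to close (triangles on one side paired with edges to the other side) that never do, so its coefficient is lower.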
Changes in higher-order clustering tend to be independent of the degree.
Panels: Neural connections, Facebook friendships, Coauthorships.
Local higher-order clustering gives a more nuanced view.
Panels: Neural connections, Facebook friendships, Coauthorships.
Legend: • Actual network data • Random configuration with C2 fixed. Reference curves: Gn,p baseline, Upper bound.
Annotations: Dense but nearly random regions; Dense and structured regions; Hitting upper bound.
Panels: Email, Autonomous systems.
Axis: Average third-order clustering.
Annotations: Not significantly different clustering; statistically significantly more clustering.
We should keep higher-order clustering in mind when mining and modeling network data.
1. Only using triangles gives a misleading notion of clustering. Some networks do not even exhibit clustering with respect to larger cliques!
→ Are there models that capture higher-order clustering statistics?
2. Higher-order clustering coefficients and closure coefficients offer additional measures of network clustering.
→ We should plug these features into ML pipelines for network data.
3. We examined higher-order structure from dyadic data.
→ What happens if we use hypergraph data?
Higher-order clustering in networks.
Thanks for your attention!
Austin R. Benson
http://cs.cornell.edu/~arb
@austinbenson
arb@cs.cornell.edu
Yin, Benson, and Leskovec. Higher-order clustering in networks. Physical Review E, 2018.
Code. github.com/arbenson/HigherOrderClustering.jl
Slides. bit.ly/arb-HONS-18

