Spectral graph clustering
with motifs and
higher-order structures
David F. Gleich
Purdue University
Code & Data github.com/arbenson/higher-order-organization-julia
github.com/dgleich/motif-ssbm
Austin Benson (Stanford -> Cornell)
Jure Leskovec (Stanford)
NAConf'17 · David Gleich · Purdue
Graphs and matrices have a long and
intertwined history.
Matrices and graphs represent
relationships among a group of
objects.
To study the relationships
• centrality
• reachability
• clustering
• … and more …
often use matrix computations
• e.g. Estrada & Higham, SIREV
• e.g. Network analysis, Brandes & Erlebach
Helen Bott, Observation of play
activities in a nursery school, 1928
Ax = b    Ax = λx
… a suggestion based on our work …
given a graph G = (V, E)
and its adjacency matrix A
consider using the weighted matrix W = A² ⊙ A   (⊙ = Hadamard / element-wise product)

given a directed graph G = (V, E)
and its non-symmetric
adjacency matrix A
consider using a symmetric
weighted matrix from

Motif  Matrix computations                                  W =
M1     C = (U · U) ⊙ Uᵀ                                     C + Cᵀ
M2     C = (B · U) ⊙ Uᵀ + (U · B) ⊙ Uᵀ + (U · U) ⊙ B        C + Cᵀ
M3     C = (B · B) ⊙ U + (B · U) ⊙ B + (U · B) ⊙ B          C + Cᵀ
M4     C = (B · B) ⊙ B                                      C
M5     C = (U · U) ⊙ U + (U · Uᵀ) ⊙ U + (Uᵀ · U) ⊙ U        C + Cᵀ
M6     C = (U · B) ⊙ U + (B · Uᵀ) ⊙ Uᵀ + (Uᵀ · U) ⊙ B       C
M7     C = (Uᵀ · B) ⊙ Uᵀ + (B · U) ⊙ U + (U · Uᵀ) ⊙ B       C
M8     C = (U · N) ⊙ U + (N · Uᵀ) ⊙ Uᵀ + (Uᵀ · U) ⊙ N       C
Triangles in social
relationships.
Simmel, 1908
Rapoport, 1953
Granovetter, 1973
what
why
how
(A*A).*A Matlab
(A*A).*A Julia
np.dot(A,A)*A Python (A is array)
When clustering based on triangles, we often
have better numerical properties (e.g. eigenvalue
gaps) in model partitioning problems (stochastic
block models) and better real-world results.
The matrix W = A² ⊙ A arises from our motif and
higher-order clustering framework when using
triangles as the motif.
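A minimal sketch of the triangle weighting W = A² ⊙ A in plain Python, mirroring the one-liners above. The helper names (`matmul`, `hadamard`, `triangle_weights`) and the 4-node example graph are illustrative assumptions, not part of the talk's code.

```python
def matmul(X, Y):
    """Dense matrix product of two lists-of-lists."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]

def hadamard(X, Y):
    """Element-wise (Hadamard) product."""
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def triangle_weights(A):
    """W = (A * A) .* A: entry (i, j) counts the triangles containing edge (i, j)."""
    return hadamard(matmul(A, A), A)

# Hypothetical example: triangle 0-1-2 plus a pendant edge 2-3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
W = triangle_weights(A)
for row in W:
    print(row)
```

The pendant edge 2-3 is in no triangle, so it gets weight zero and node 3 drops out of the weighted graph.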
where
when
… a little story …
Networks are sets of nodes and edges (graphs)
that model real world systems
Key insight. [Flake et al., Newman et al., and hundreds more!]
Networks—for real-world systems—have modules, communities, clusters
This structure has traditionally been exposed with node and edge based
clustering metrics. Density, modularity, conductance, cut, ratio cuts, etc.
Co-author network
Background: network clustering is a fundamental network
analysis task for finding coherent groups of nodes based on edges
• Real-world networks have modular organization [Newman 2004, Newman 2006].
• We want to automatically find the modules in the system.
• Old idea: find groups of nodes with high internal edge density and low
external edge density [Newman 2004, Danon 2005, Leskovec+ 2009].
Brain network, de Reus et al., RSTB, 2014.
Similar tools are used to partition
computations for parallelism
Comanche mesh from Alex Pothen,
from Sparse Matrix Collection
There is abundant evidence that higher-order
connectivity patterns drive complex systems.
Thinking beyond nodes and edges
There is abundant evidence that higher-order
connectivity patterns, or network motifs, drive
complex systems [Milo+02, Yaveroğlu+14].
Signed feed-forward loops in genetic transcription.
Mangan et al., 2003; Alon, 2007
Triangles in social relationships.
Simmel, 1908; Rapoport, 1953; Granovetter, 1973
Bi-directed length-2 paths in brain networks.
Sporns-Kötter, 2004; Sporns et al., 2007; Honey et al., 2007
Key Insight. [Milo et al. (Science 2002)]
Certain subgraphs were far more
common than expected.
We call any small subgraph a motif.
Nodes and edges may not be
the basis elements of these networks.
Why should we look for module structure
in terms of nodes and edges?
Idea: find clusters of motifs
Higher-order organization of
complex networks
We generalize spectral clustering, a classic
technique to find clusters or communities in a
graph, to use motifs to cluster the graph.
• Uses motif conductance instead of node & edge conductance
• We also bound the conductance in terms of the optimal solution
Outline
1. So we’ll briefly review how spectral clustering works
2. Then see how to adapt it to work with network motifs
3. Then see this procedure on real-world & model data
We can do motif-based clustering by
generalizing spectral clustering
Spectral clustering is a classic technique to partition
graphs by looking at eigenvectors.
M. Fiedler, 1973,
Algebraic
connectivity of
graphs
Graph A → Laplacian L = D^(-1/2) (D − A) D^(-1/2) → Eigenvector: (D − A) x = λ D x
Earlier work by Simon, Ando, Courtois
dealt with a related decomposability idea
Spectral clustering works based on
conductance with node and edge cuts
Conductance is one of the most important quality scores used to
identify network modules, clusters or communities [Schaeffer 2007];
used in Markov chain theory, bioinformatics, vision, etc.
φ(S) = cut(S) / min(vol(S), vol(S̄))   (conductance)
cut(S) = # edges between S and S̄   (edges leaving the set)
vol(S) = Σ_{i∈S} degree of i   (total edges in the set)
Example:
cut(S) = 7, cut(S̄) = 7
|S| = 15, |S̄| = 20
vol(S) = 85, vol(S̄) = 151
φ(S) = 7/85 = 0.082
Small conductance ⇔ Good set
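The conductance definition can be sketched directly from an edge list. The function name `conductance` and the two-triangle example graph are assumptions made for illustration; the graph is not the one pictured on the slide.

```python
def conductance(edges, S):
    """phi(S) = cut(S) / min(vol(S), vol(S-bar)) for an undirected edge list.

    Assumes every node appears in at least one edge and S is a proper subset.
    """
    S = set(S)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # An edge is cut when exactly one endpoint lies in S.
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    vol_S = sum(d for node, d in degree.items() if node in S)
    vol_Sbar = sum(degree.values()) - vol_S
    return cut / min(vol_S, vol_Sbar)

# Hypothetical example: two triangles joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
phi = conductance(edges, {0, 1, 2})
print(phi)  # one cut edge over volume 7
```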
Spectral clustering has theoretical
guarantees
Cheeger Inequality
Finding the best conductance set
is NP-hard.
• Cheeger realized the eigenvalues of the
Laplacian provided a bound in manifolds
• Alon and Milman independently realized
the same thing for a graph!
J. Cheeger, 1970, A lower bound on the smallest eigenvalue of the Laplacian
N. Alon, V. Milman, 1985, λ1, isoperimetric inequalities for graphs, and superconcentrators
Eigenvalues of the Laplacian: 0 = λ1 ≤ λ2 ≤ ... ≤ λn ≤ 2
φ*²/2 ≤ λ2 ≤ 2φ*
φ* = conductance of the set of smallest conductance
The sweep cut algorithm realizes the
guarantee
We can find a set S that achieves
the Cheeger bound.
1. Compute the eigenvector
associated with λ2 (e.g. ARPACK)
2. Sort the vertices by their values
in the eigenvector: σ1, σ2, … σn
3. Let Sk = {σ1, …, σk} and
compute the conductance of
each Sk: φk = φ(Sk)
4. Pick the minimum φm of φk .
φm ≤ 2√φ*
M. Mihail, 1989, Conductance and convergence of Markov chains
F. C. Graham, 1992, Spectral Graph Theory.
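Steps 2 to 4 of the sweep cut can be sketched as follows. The function name `sweep_cut`, the adjacency-dict layout, and the vector `x` are assumptions; in particular `x` holds made-up values standing in for the eigenvector of step 1, which this sketch does not compute.

```python
def sweep_cut(adj, x):
    """Sort nodes by x, scan the prefix sets S_k, return the one of least conductance.

    adj: dict node -> dict neighbor -> weight (symmetric, no self-loops,
    no isolated nodes). x: dict node -> value.
    """
    order = sorted(adj, key=lambda u: x[u])
    total_vol = sum(sum(nbrs.values()) for nbrs in adj.values())
    in_S = set()
    cut = vol = 0.0
    best_phi, best_k = float("inf"), 0
    for k, u in enumerate(order[:-1], start=1):   # keep the complement non-empty
        for v, w in adj[u].items():
            # Edges from u into S stop being cut; edges leaving S become cut.
            cut += -w if v in in_S else w
        in_S.add(u)
        vol += sum(adj[u].values())
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_k = phi, k
    return order[:best_k], best_phi

# Hypothetical example: two triangles joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, {})[v] = 1
    adj.setdefault(v, {})[u] = 1
x = {0: -1.0, 1: -0.9, 2: -0.5, 3: 0.5, 4: 0.9, 5: 1.0}  # illustrative values only
S, phi = sweep_cut(adj, x)
print(S, phi)
```

Updating `cut` and `vol` incrementally keeps the whole sweep linear in the number of edges after the sort.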
The sweep cut visualized
[Plot: sweep profile, conductance φi against position i in the ordering; the minimum gives the set S]
φ(S) = cut(S) / min(vol(S), vol(S̄))
But current problems are much richer
than those where spectral clustering is justified
Spectral clustering is theoretically justified for undirected graphs
• Various extensions to multiple clusters [Dhillon et al.; Gharan et al.; Jordan et al.]
• Weighted graphs are okay
• Approximate eigenvectors are okay [Mihail]
Current network models are more richly annotated
• directed, signed, colored, layered, multiplex, etc.
R. Milo, 2002, Science
X causes Y to be expressed
Z represses Y
Nice recent work by [Fairbanks
et al. arXiv] on better
numerical stopping criteria!
There is a literature on directed spectral
graph partitioning, but it is hard to interpret
Markov chains
• Stewart (numerical solution to Markov chains)
• Chung (Random walks and cuts in dir graphs )
Nonlinear Laplacian
• Yoshida WSDM2016
Asymmetric Laplacian
• Boley et al. LAA2011 (commute times)
Gleich, Klymko, Kolda ASE BigData 2014
D^(-1) A x = λ x
(D − A) x = λ x
½ Π (D^(-1) A) + ½ (Aᵀ D^(-1)) Π
Σ_{(u,v)∈E} { (xu − xv)² if xu − xv ≥ 0; 0 otherwise }
Our contributions
1. A generalized conductance metric
for motifs
2. A “new” spectral clustering algorithm to
minimize the generalized conductance.
3. AND an associated Cheeger inequality.
(which handles directed graphs)
4. Aquatic layers in food webs
5. Hub structure in transportation
New! Some studies in stochastic block models
(this talk, still preliminary!)
Motif-based conductance generalizes
edge-based conductance
Need notions of cut and volume
Edge-based:
cut(S) = #(edges cut by S)
vol(S) = #(edge end points in S) = Σ_{i∈S} degree of i
φ(S) = cut(S) / min(vol(S), vol(S̄))
Motif-based:
cutM(S) = #(motifs cut by S)
volM(S) = #(motif end points in S)
φM(S) = cutM(S) / min(volM(S), volM(S̄))
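A small sketch of the motif-based definitions, using undirected triangles as the motif. The names `find_triangles` and `motif_conductance` and the 7-node example are assumptions for illustration; the brute-force triangle search is only meant for tiny graphs.

```python
from itertools import combinations

def find_triangles(edges):
    """List all triangles (as sorted node triples) by brute force."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return [(a, b, c) for a, b, c in combinations(sorted(adj), 3)
            if b in adj[a] and c in adj[a] and c in adj[b]]

def motif_conductance(instances, S):
    """phi_M(S) = cut_M(S) / min(vol_M(S), vol_M(S-bar)).

    cut_M counts instances with nodes on both sides of the split;
    vol_M(S) counts motif end points (instance nodes) that lie in S.
    """
    S = set(S)
    cut = sum(1 for m in instances if 0 < len(S.intersection(m)) < len(m))
    vol_S = sum(len(S.intersection(m)) for m in instances)
    vol_Sbar = sum(len(m) for m in instances) - vol_S
    return cut / min(vol_S, vol_Sbar)

# Hypothetical example: triangles 0-1-2, 1-2-3, 3-4-5, 4-5-6.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (4, 6), (5, 6)]
tris = find_triangles(edges)
phi_M = motif_conductance(tris, {0, 1, 2, 3})
print(tris, phi_M)
```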
An example of motif-conductance
[Figure: example graph on nodes 0 to 11 showing the motif, the set S, and its complement S̄]
φM(S) = motifs cut / motif volume = 1/10
How can we optimize motif conductance?
We thought that motif conductance would spark new tensor and
hypermatrix methods based on the motif adjacency tensor.
A(i, j, k) = 1 if motif involves nodes i, j, k; 0 otherwise
Benson, Gleich, Leskovec, SDM 2016
We were wrong!
There is a symmetric matrix that serves as the
appropriate tool to study motif conductance
[Figure: the example graph and the weighted graph W(M) built from it]
W(M)_ij = counts co-occurrences of motif pattern between i, j
Going from motifs back to a matrix for
spectral clustering
[Figure: weighted graph W(M) for the example graph]
W(M)_ij = counts co-occurrences of motif pattern between i, j
KEY INSIGHT
Spectral clustering on
W(M) yields results on
the new motif notion
of conductance
φM(S) = motifs cut / motif volume = 1/10
Here is a quick illustration of how this works.
[Figure: the example graph, its weighted graph W(M), and the set S]
φM(S) = motifs cut / motif volume = 1/10
In the weighted graph W(M): cut(S) = 2 and vol(S) = 6 + 8 + 2 + 2 + 2 = 20,
so cut(S) / vol(S) = 2/20 = 1/10
A motif-based clustering algorithm
1. Form weighted graph W(M)
2. Compute the Fiedler vector associated with λ2 of the
motif-normalized Laplacian
3. Run a sweep cut on f
D = diag(W(M) e)
L(M) = D^(-1/2) (D − W(M)) D^(-1/2)
L(M) z = λ2 z
f(M) = D^(-1/2) z
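The three steps can be sketched end to end in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names (`motif_adjacency`, `fiedler_vector`, `sweep_cut`) and the 7-node triangle list are hypothetical, and the eigenvector is approximated with a shifted, deflated power iteration rather than a solver such as ARPACK.

```python
import math

def motif_adjacency(n, instances):
    """Step 1: W[i][j] = number of motif instances containing both i and j."""
    W = [[0] * n for _ in range(n)]
    for inst in instances:
        for a in inst:
            for b in inst:
                if a != b:
                    W[a][b] += 1
    return W

def fiedler_vector(W, iters=2000):
    """Step 2: approximate f = D^(-1/2) z with L z = lambda_2 z.

    Power iteration on 2I - L = I + D^(-1/2) W D^(-1/2), deflating the known
    leading eigenvector D^(1/2) e. Assumes W is symmetric with positive row sums.
    """
    n = len(W)
    d = [sum(row) for row in W]                     # D = diag(W e)
    s = [1.0 / math.sqrt(di) for di in d]           # diagonal of D^(-1/2)
    norm = math.sqrt(sum(d))
    top = [math.sqrt(di) / norm for di in d]        # normalized D^(1/2) e
    z = [math.sin(i + 1.0) for i in range(n)]       # fixed starting vector
    for _ in range(iters):
        y = [s[j] * z[j] for j in range(n)]
        z = [z[i] + s[i] * sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]
        c = sum(a * b for a, b in zip(z, top))      # deflate, then normalize
        z = [a - c * b for a, b in zip(z, top)]
        nz = math.sqrt(sum(a * a for a in z))
        z = [a / nz for a in z]
    return [s[i] * z[i] for i in range(n)]

def sweep_cut(W, f):
    """Step 3: best-conductance prefix set of the ordering induced by f."""
    n = len(W)
    d = [sum(row) for row in W]
    order = sorted(range(n), key=lambda i: f[i])
    total, in_S = sum(d), [False] * n
    cut = vol = 0.0
    best_phi, best_k = float("inf"), 0
    for k, u in enumerate(order[:-1], start=1):
        for v in range(n):
            if v != u:
                cut += -W[u][v] if in_S[v] else W[u][v]
        in_S[u] = True
        vol += d[u]
        phi = cut / min(vol, total - vol)
        if phi < best_phi:
            best_phi, best_k = phi, k
    return set(order[:best_k]), best_phi

# Hypothetical example: a 4-clique {0,1,2,3} and triangle {4,5,6}, linked by triangle 3-4-5.
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3), (3, 4, 5), (4, 5, 6)]
W = motif_adjacency(7, triangles)
S, phi = sweep_cut(W, fiedler_vector(W))
print(S, phi)
```

The sign of an eigenvector is arbitrary, so the sweep may return either side of the same split.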
The sweep cut results
[Plot: motif conductance of each sweep set (order from the Fiedler vector), with the best and 2nd best higher-order clusters marked]
There are nice matrix computations
for three-node motifs
W = A² ⊙ A
There are nice matrix computations
for three-node directed motifs
Given a (directed) adjacency matrix A, let
B = A ⊙ Aᵀ (bidirectional) and U = A − B (unidirectional)
( · = matrix product, ⊙ = element-wise product)

Motif  Matrix computations                                  W =
M1     C = (U · U) ⊙ Uᵀ                                     C + Cᵀ
M2     C = (B · U) ⊙ Uᵀ + (U · B) ⊙ Uᵀ + (U · U) ⊙ B        C + Cᵀ
M3     C = (B · B) ⊙ U + (B · U) ⊙ B + (U · B) ⊙ B          C + Cᵀ
M4     C = (B · B) ⊙ B                                      C
M5     C = (U · U) ⊙ U + (U · Uᵀ) ⊙ U + (Uᵀ · U) ⊙ U        C + Cᵀ
M6     C = (U · B) ⊙ U + (B · Uᵀ) ⊙ Uᵀ + (Uᵀ · U) ⊙ B       C
M7     C = (Uᵀ · B) ⊙ Uᵀ + (B · U) ⊙ U + (U · Uᵀ) ⊙ B       C
M8     C = (U · N) ⊙ U + (N · Uᵀ) ⊙ Uᵀ + (Uᵀ · U) ⊙ N       C

N = eeᵀ − B − U − Uᵀ
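As a sketch of how one row of the table turns into code, the following forms B and U and then the M1 matrix (M1 is the directed 3-cycle of unidirectional edges). The helper names and the small example graph are assumptions for illustration.

```python
def matmul(X, Y):
    """Dense matrix product."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]

def hadamard(X, Y):
    """Element-wise product."""
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def split_edges(A):
    """B = A .* A' keeps bidirectional edges; U = A - B keeps unidirectional ones."""
    B = hadamard(A, transpose(A))
    U = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return B, U

def motif_m1(A):
    """Row M1 of the table: C = (U * U) .* U', W = C + C'."""
    _, U = split_edges(A)
    C = hadamard(matmul(U, U), transpose(U))
    Ct = transpose(C)
    return [[c + ct for c, ct in zip(rc, rct)] for rc, rct in zip(C, Ct)]

# Hypothetical example: directed cycle 0 -> 1 -> 2 -> 0 plus a bidirectional edge 2 <-> 3.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
B, U = split_edges(A)
W = motif_m1(A)
for row in W:
    print(row)
```

The bidirectional edge 2 <-> 3 lands in B, so it cannot take part in M1 and node 3 gets zero weight.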
The three-node
motif-based Cheeger inequality
THEOREM
If the motif has three nodes, then the
sweep procedure on the weighted graph
finds a set S of nodes for which
φM(S) ≤ 2 √(φ*M)
M(G) = {instances of M in G}
Key Proof Step
With xi = ±1 marking the side of the cut that node i is on:
cutM(S, G) = Σ_{{i,j,k}∈M(G)} Indicator[xi, xj, xk not the same]
Indicator[xi, xj, xk not the same] = ¼ (xi² + xj² + xk² − xi xj − xj xk − xi xk)
= quadratic in x
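The identity behind the key proof step can be checked by enumerating all eight sign patterns, assuming each entry of x is +1 or −1 according to the side of the cut. The names `quad` and `checks` are made up for this check.

```python
from itertools import product

def quad(xi, xj, xk):
    """The quadratic form (1/4)(xi^2 + xj^2 + xk^2 - xi*xj - xj*xk - xi*xk)."""
    return (xi * xi + xj * xj + xk * xk - xi * xj - xj * xk - xi * xk) / 4

# Map every sign pattern to (quadratic value, indicator that the three differ).
checks = {x: (quad(*x), 0 if len(set(x)) == 1 else 1)
          for x in product((-1, 1), repeat=3)}
for x, (q, indicator) in checks.items():
    print(x, q, indicator)
```

Because the indicator is quadratic only for three-node motifs, this is where the three-node assumption of the theorem enters.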
IMPLICATION
Just run spectral clustering
on those weighted matrices.
Awesome advantages
Works for arbitrary non-neg. combos of motifs too
We inherit 40+ years of research!
• Fast algorithms (ARPACK, etc.)!
• Local methods! (Yin, Benson, Leskovec, Gleich, KDD2017)
• Overlapping!
• Easy to implement (20 lines of Matlab/Julia)
• Scalable (1.4B edges graphs are not a prob.)
# (the first part of the listing, covering motifs M1 to M4, is cropped in the slide)
elseif motif == "M5"
    C = (U * U) .* U + (U * U') .* U + (U' * U) .* U
    W = C + C'
elseif motif == "M6"
    W = (U * B) .* U + (B * U') .* U' + (U' * U) .* B
elseif motif == "M7"
    W = (U' * B) .* U' + (B * U) .* U + (U * U') .* B
else
    error("Motif must be one of M1, M2, M3, M4, M5, M6, or M7.")
end

# Get Fiedler eigenvector of the motif-normalized Laplacian
dinvsqrt = spdiagm(1.0 ./ sqrt.(vec(sum(W, 1))))
LM = I - dinvsqrt * W * dinvsqrt
lambdas, evecs = eigs(LM, nev=2, which=:SM)
z = dinvsqrt * real(evecs[:, 2])

# Sweep cut over the ordering given by z
sigma = sortperm(z)
C = W[sigma, sigma]
Csums = sum(C, 1)'
motifvolS = cumsum(Csums)
motifvolSbar = sum(W) * ones(length(sigma)) - motifvolS
# (the next line is cropped at the right edge of the slide)
conductances = cumsum(Csums - 2 * sum(triu(C), 1)') ./ min.(motif
split = indmin(conductances)
# Return the smaller side of the split
if split <= size(A, 1) / 2
    return sigma[1:split]
else
    return sigma[(split + 1):end]
end
end
Figure 2.3 – Julia implementation of the motif-based spectral clusteri
Case study 1
Motifs partition the food webs
Food webs model
energy exchange
in species of an
ecosystem.
i → j means i’s energy
goes to j
(or j eats i)
Via Cheeger, motif
conductance is
better than edge
conductance.
Demo and reproducibility
https://github.com/arbenson/higher-order-organization-julia
# form W0 … W4
sc0 = spectral_cut(W0)
sc1 = spectral_cut(W1)
sc2 = spectral_cut(W2)
sc3 = spectral_cut(W3)
sc4 = spectral_cut(W4)
plt = x -> semilogx(x.sweepcut_profile.conductance)
plt(sc0)
plt(sc1)
plt(sc2)
plt(sc3)
plt(sc4)
Case study 1
Motifs partition the food webs
Motif M6 reveals
aquatic layers
• Micronutrient sources
• Pelagic fishes and benthic prey
• Benthic macroinvertebrates
• Benthic Fishes
61% accuracy vs.
48% with edge-based methods
Case study 2
Hub structure in the air transportation network
North American air
transport network
Nodes are airports
Edges reflect
reachability, and
are unweighted.
(Based on Frey et al., 2007)
The weighted adjacency matrix already
reveals hub-like structure
Accepted pending
Counts length-two walks
The motif embedding shows this structure
and splits into east-west
Top 10
U.S. hubs
East coast non-hubs
West coast non-hubs
Primary spectral coordinate
Atlanta, the top hub, is
next to Salina, a non-hub.
MOTIF SPECTRAL
EMBEDDING
EDGE SPECTRAL
EMBEDDING
Case study 3: the stochastic block model
shows numerical advantages to motif-matrices
Model problems are useful because they are simple and we often
“know” everything about them. They may not reflect real-world issues.
Model problems by field: Biology (mouse, fly), Matrix Computations (∇²u = f), Clustering (stochastic block model)
Mouse picture from Wikipedia Мышь_2.jpg,
Fly from oregonstateuniversity/11179958483
The stochastic block model is extremely well
understood in theory
The symmetric stochastic block model (SSBM)
• k blocks, each of size m-by-m
• within-block edges exist with prob p
• between-block edges with prob q
Example: symmetric stochastic block model with
m = 200, k = 5,
p = 0.3, q = 0.13
Edge probabilities by block (each block is m-by-m):
        m   m   ⋯   m
  m  [  p   q   ⋯   q  ]
  m  [  q   p   ⋱   ⋮  ]
  ⋮  [  ⋮   ⋱   ⋱   q  ]
  m  [  q   ⋯   q   p  ]
Reminiscent of Simon & Ando.
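A minimal sketch of sampling from the SSBM as described above: k blocks of m nodes, within-block edge probability p, between-block probability q. The function name `sample_ssbm`, the `seed` parameter, and the small sizes in the example are assumptions; the dense double loop is only sensible for small n.

```python
import random

def sample_ssbm(m, k, p, q, seed=0):
    """Return (A, block): a symmetric 0/1 adjacency matrix and each node's block id."""
    rng = random.Random(seed)                 # seeded for reproducibility
    n = m * k
    block = [i // m for i in range(n)]        # nodes 0..m-1 form block 0, and so on
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):             # upper triangle only, no self-loops
            prob = p if block[i] == block[j] else q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1
    return A, block

# Hypothetical small instance (the slide uses m = 200, k = 5, p = 0.3, q = 0.13).
A, block = sample_ssbm(m=20, k=3, p=0.3, q=0.13)
n_edges = sum(map(sum, A)) // 2
print(len(A), n_edges)
```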
The task
Given a graph that is an SSBM and
given m, k, p, q
Find the k blocks.
Theory
E. Abbe, community
detection and the
stochastic block model
(In prep, on webpage)
• Necessary p > q
• Exact recovery (get all correct)
• Detectability (find a non-trivial portion)
• Uses non-backtracking random walk.
Figure S8: Higher-order organization of the S. cerevisiae transcriptional regulation network.
A: The four higher-order structures used by our higher-order clustering method, which can
model signed motifs. These are coherent feedforward loop motifs, which act as sign-sensitive
delay elements in transcriptional regulation networks (46). The edge signs refer to activation
(positive) or repression (negative). B: Six higher-order clusters revealed by the motifs in (A).
Clusters show functional modules consisting of several motifs (coherent feedforward loops),
which were previously studied individually (45). The higher-order clustering framework identifies
the functional modules with higher accuracy (97%) than existing methods (68–82%). C–D:
Two higher-order clusters from (B). In these clusters, all edges have positive sign. The functionality
of the motifs in the modules corresponds to drug resistance (C) or cell cycle and mating
type match (D). The clustering suggests that coherent feedforward loops function together as a
single processing unit rather than as independent elements.
Figure 1: Higher-order network structures and the higher-order network clustering
framework. A: Higher-order structures are captured by network motifs. For example, all
13 connected three-node directed motifs are shown here. B: Clustering of a network based on
motif M7. For a given motif M, our framework aims to find a set of nodes S that minimizes
motif conductance, φM(S), which we define as the ratio of the number of motifs cut (filled
triangles cut) to the minimum number of nodes in instances of the motif in either S or S̄ (13).
In this case, there is one motif cut. C: The higher-order network clustering framework. Given a
graph and a motif of interest (in this case, M7), the framework forms a motif adjacency matrix
(WM) by counting the number of times two nodes co-occur in an instance of the motif. An
eigenvector of a Laplacian transformation of the motif adjacency matrix is then computed. The
ordering σ of the nodes provided by the components of the eigenvector (15) produces nested sets
Sr = {σ1, . . . , σr} of increasing size r. We prove that the set Sr with the smallest motif-based
conductance, φM(Sr), is a near-optimal higher-order cluster (13).
Our study looks at the SSBM using
our motif weighting based on triangles.
W = A² ⊙ A
Based also on Tsourakakis, Pachocki, & Mitzenmacher, WWW, 2017,
who showed triangle-conductance < edge-conductance in a model.
Just using the motif-weighting highlights the
blocks for a range of parameters.
We introduce a mixing parameter μ to scale q.
μ = 0 ⇔ q = 0, μ = (k − 1)/k ⇔ q = p
[Figure panels: W = A² ⊙ A and A]
The power method identifies a cluster using
the motif weighting better than the adjacency
Detectability
Exact recovery
Exp. details.
We take the normalized Laplacian and shift
to reverse the spectrum.
Then we deflate given knowledge of
the leading eigenvector.
Accuracy is that of the most accurate block
in the extremal m entries.
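One possible reading of the accuracy measure, sketched under assumptions: take the m smallest and the m largest entries of the computed vector, and report the largest fraction of either group that falls in a single true block. The function name `extremal_block_accuracy` and the toy vectors are hypothetical; the talk does not spell out the exact definition.

```python
def extremal_block_accuracy(x, block, m):
    """Best single-block purity among the m smallest and the m largest entries of x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = 0.0
    for extreme in (order[:m], order[-m:]):
        counts = {}
        for i in extreme:
            counts[block[i]] = counts.get(block[i], 0) + 1
        best = max(best, max(counts.values()) / m)
    return best

# Hypothetical toy data: three blocks of m = 3 nodes each.
block = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x_good = [0.9, 0.8, 0.1, 0.0, 0.05, 0.7, -0.5, -0.6, -0.7]   # lowest 3 are all block 2
x_mixed = [0.9, 0.8, 0.1, 0.0, 0.05, 0.7, -0.5, 0.02, 0.03]  # both ends mix two blocks
acc_good = extremal_block_accuracy(x_good, block, 3)
acc_mixed = extremal_block_accuracy(x_mixed, block, 3)
print(acc_good, acc_mixed)
```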
[Plot: accuracy for W = A² ⊙ A and for A]
We don’t converge faster for the usual
reasons that the power method converges
There is a bigger gap deeper in the spectrum,
that could explain what is going on
The motif weighting shifts all the eigenvalues
down, but the lowest drop the most.
Semi-circle law
Not a Marchenko-Pastur law!
[Plot panels: W = A² ⊙ A and A]
We’d like a numerical understanding of why
we get better results faster with motifs.
Eigenvalues show that we’ll converge to the “cluster subspace” faster.
Conjecture. Higher accuracy for motifs because the eigenvectors are
more localized—or sharper—around the clusters.
• What remains is to understand why they are sharper!
[Plot panels: W = A² ⊙ A and A]
Related work.
• The Laplacian we propose was originally proposed by Rodríguez [2004]
and again by Zhou et al. [2006]
Our new theory (motif Cheeger inequality) explains why these were good ideas.
• Falls under general strategy of encoding hypergraph partitioning
problem as graph clustering problem [Agarwal+ 06]
• Serrour, Arenas, & Gómez, Detecting communities of triangles in
complex networks using spectral optimization, 2011.
• Arenas et al., Motif-based communities in complex networks, 2008.
• Rohe & Qin, Blessing of transitivity …, arXiv, 2013.
• Klymko, Gleich, Kolda (Using triangles & cycles …, ASE BigData 2014)
• Benson, Gleich, Leskovec (Motifs & Tensors, SIAM Data Mining 2015)
Paper
Benson, Gleich, Leskovec
Science, 2016
1. A generalized conductance metric for motifs
2. A new spectral clustering algorithm to
minimize the generalized conductance.
3. AND an associated Cheeger inequality.
4. Aquatic layers in food webs
5. Hub structure in transportation networks
6. Eigenvalues & vectors of motifs in SSBMs.
7. Lots of cool stuff on signed networks.
Joint work with
Austin Benson and Jure
Leskovec, Stanford
Supported by NSF CAREER
CCF-1149756, IIS-1422918
IIS- DARPA SIMPLEX
Code & Data
snap.stanford.edu/higher-order
github.com/arbenson/higher-order-organization-julia
github.com/dgleich/motif-ssbm
Open questions
• What is the distribution law
for the Laplacian of A² ⊙ A?
• How to work with element-wise
prods like matvecs for
N = eeᵀ − B − U − Uᵀ?
Thank you!

More Related Content

PDF
Higher-order organization of complex networks
PDF
Correlation clustering and community detection in graphs and networks
PDF
Spacey random walks and higher-order data analysis
PDF
Localized methods in graph mining
PDF
Engineering Data Science Objectives for Social Network Analysis
PDF
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
PDF
Localized methods for diffusions in large graphs
PDF
Spacey random walks and higher order Markov chains
Higher-order organization of complex networks
Correlation clustering and community detection in graphs and networks
Spacey random walks and higher-order data analysis
Localized methods in graph mining
Engineering Data Science Objectives for Social Network Analysis
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Localized methods for diffusions in large graphs
Spacey random walks and higher order Markov chains

What's hot (20)

PDF
Anti-differentiating Approximation Algorithms: PageRank and MinCut
PDF
Big data matrix factorizations and Overlapping community detection in graphs
PDF
Iterative methods with special structures
PDF
Non-exhaustive, Overlapping K-means
PDF
Personalized PageRank based community detection
PDF
PageRank Centrality of dynamic graph structures
PDF
Fast relaxation methods for the matrix exponential
PDF
Higher-order clustering coefficients
PDF
Spacey random walks from Householder Symposium XX 2017
PDF
Using Local Spectral Methods to Robustify Graph-Based Learning
PDF
Gaps between the theory and practice of large-scale matrix-based network comp...
PPTX
"Principal Component Analysis - the original paper" presentation @ Papers We ...
PDF
A new generalized lindley distribution
PDF
A lattice-based consensus clustering
PDF
Pattern-based classification of demographic sequences
PDF
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
ODP
Minimizing cost in distributed multiquery processing applications
PDF
Neural Networks: Principal Component Analysis (PCA)
PDF
Ijciras1101
PDF
High-Performance Approach to String Similarity using Most Frequent K Characters
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Big data matrix factorizations and Overlapping community detection in graphs
Iterative methods with special structures
Non-exhaustive, Overlapping K-means
Personalized PageRank based community detection
PageRank Centrality of dynamic graph structures
Fast relaxation methods for the matrix exponential
Higher-order clustering coefficients
Spacey random walks from Householder Symposium XX 2017
Using Local Spectral Methods to Robustify Graph-Based Learning
Gaps between the theory and practice of large-scale matrix-based network comp...
"Principal Component Analysis - the original paper" presentation @ Papers We ...
A new generalized lindley distribution
A lattice-based consensus clustering
Pattern-based classification of demographic sequences
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Minimizing cost in distributed multiquery processing applications
Neural Networks: Principal Component Analysis (PCA)
Ijciras1101
High-Performance Approach to String Similarity using Most Frequent K Characters
Ad

Spectral clustering with motifs and higher-order structures

  • 1. Spectral graph clustering with motifs and higher-order structures. David F. Gleich, Purdue University. Austin Benson (Stanford -> Cornell), Jure Leskovec (Stanford). Code & Data: github.com/arbenson/higher-order-organization-julia and github.com/dgleich/motif-ssbm. NAConf'17 · David Gleich · Purdue
  • 2. Graphs and matrices have a long and intertwined history. Matrices and graphs represent relationships among a group of objects. To study the relationships (centrality, reachability, clustering, … and more …) we often use matrix computations such as $Ax = b$ and $Ax = \lambda x$; e.g. Estrada & Higham, SIREV; e.g. Network analysis, Brandes & Erlebach. Figure: Helen Bott, Observation of play activities in a nursery school, 1928.
  • 3. … a suggestion based on our work …
  • 4. Given a graph G = (V, E) and its adjacency matrix A, consider using the weighted matrix $W = A^2 \odot A$, where $\odot$ is the Hadamard (element-wise) product.
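  • 5. Given a directed graph G = (V, E) and its non-symmetric adjacency matrix A, consider using a symmetric weighted matrix W from the table below, where · is the matrix product and $\odot$ is the Hadamard (element-wise) product.
      Motif | Matrix computations | W =
      M1 | $C = (U \cdot U) \odot U^T$ | $C + C^T$
      M2 | $C = (B \cdot U) \odot U^T + (U \cdot B) \odot U^T + (U \cdot U) \odot B$ | $C + C^T$
      M3 | $C = (B \cdot B) \odot U + (B \cdot U) \odot B + (U \cdot B) \odot B$ | $C + C^T$
      M4 | $C = (B \cdot B) \odot B$ | $C$
      M5 | $C = (U \cdot U) \odot U + (U \cdot U^T) \odot U + (U^T \cdot U) \odot U$ | $C + C^T$
      M6 | $C = (U \cdot B) \odot U + (B \cdot U^T) \odot U^T + (U^T \cdot U) \odot B$ | $C$
      M7 | $C = (U^T \cdot B) \odot U^T + (B \cdot U) \odot U + (U \cdot U^T) \odot B$ | $C$
      M8 | $C = (U \cdot N) \odot U + (N \cdot U^T) \odot U^T + (U^T \cdot U) \odot N$ | $C$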
  • 6. Triangles in social relationships (Simmel, 1908; Rapoport, 1953; Granovetter, 1973). The matrix $W = A^2 \odot A$ arises from our motif and higher-order clustering framework when using triangles as the motif. When clustering based on triangles, we often have better numerical properties (e.g. eigenvalue gaps) in model partitioning problems (stochastic block models) and better real-world results. How to compute it: (A*A).*A in Matlab; (A*A).*A in Julia; np.dot(A,A)*A in Python (A is an array).
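The one-line recipes on this slide can be unpacked into a small, dependency-free sketch. It assumes an undirected, unweighted graph stored as a dense list-of-lists adjacency matrix; the helper names and the four-node example graph are illustrative and do not come from the talk.

```python
def matmul(X, Y):
    """Dense matrix product of two square list-of-lists matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def hadamard(X, Y):
    """Element-wise (Hadamard) product."""
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def triangle_weights(A):
    """W = (A A) ⊙ A, so W[i][j] is the number of triangles containing edge (i, j)."""
    return hadamard(matmul(A, A), A)

# Hypothetical example: triangle 0-1-2 plus a pendant edge 2-3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
W = triangle_weights(A)
```

The pendant edge 2-3 lies in no triangle, so it gets weight zero and node 3 is isolated in the weighted graph.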
  • 8. … a little story …
  • 9. Networks are sets of nodes and edges (graphs) that model real-world systems. Key insight [Flake et al., Newman et al., and hundreds more!]: networks for real-world systems have modules, communities, clusters. This structure has traditionally been exposed with node- and edge-based clustering metrics: density, modularity, conductance, cut, ratio cuts, etc. Background: network clustering is a fundamental network analysis for finding coherent groups of nodes based on edges. § Real-world networks have modular organization [Newman 2004, Newman 2006]. § We want to automatically find the modules in the system. § Old idea: find groups of nodes with high internal edge density and low external edge density [Newman 2004, Danon 2005, Leskovec+ 2009]. Figures: co-author network; brain network (de Reus et al., RSTB, 2014).
  • 10. Similar tools are used to partition computations for parallelism. Figure: Comanche mesh from Alex Pothen, from the Sparse Matrix Collection.
  • 11. Thinking beyond nodes and edges. There is abundant evidence that higher-order connectivity patterns, or network motifs, drive complex systems [Milo+02, Yaveroğlu+14]. Triangles in social networks (Simmel, 1908; Rapoport, 1953; Granovetter, 1973). Bi-directed length-2 paths in brain networks (Sporns-Kötter, 2004; Sporns et al., 2007; Honey et al., 2007). Signed feed-forward loops in genetic transcription (Mangan et al., 2003; Alon, 2007). Key insight [Milo et al. (Science 2002)]: certain subgraphs were far more common than expected. We call any small subgraph a motif.
  • 12. Nodes and edges may not be the basis elements of these networks. Why should we look for module structure in terms of nodes and edges?
  • 13. Idea: find clusters of motifs.
  • 14. Higher-order organization of complex networks We generalize spectral clustering, a classic technique to find clusters or communities in a graph, to use motifs to cluster the graph. • Uses motif conductance instead of node & edge conductance • We also bound the conductance in terms of the optimal solution Outline 1. So we’ll briefly review how spectral clustering works 2. Then see how to adapt it to work with network motifs 3. Then see this procedure on real-world & model data
  • 15. We can do motif-based clustering by generalizing spectral clustering. Spectral clustering is a classic technique to partition graphs by looking at eigenvectors (M. Fiedler, 1973, Algebraic connectivity of graphs). Graph Laplacian: $L = D^{-1/2}(D - A)D^{-1/2}$. Eigenvector: $(D - A)x = \lambda D x$. Earlier work by Simon, Ando, Courtois dealt with a related decomposability idea.
  • 16. Spectral clustering works based on conductance with node and edge cuts. Conductance is one of the most important quality scores used to identify network modules, clusters or communities [Schaeffer 2007]; it is used in Markov chain theory, bioinformatics, vision, etc. Conductance: $\phi(S) = \frac{\mathrm{cut}(S)}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$, i.e. (edges leaving the set) / (total edges in the set), where $\mathrm{cut}(S)$ = # edges between $S$ and $\bar{S}$ and $\mathrm{vol}(S) = \sum_{i \in S} \text{degree of } i$.
  • 17. Spectral clustering works based on conductance with node and edge cuts. Conductance is one of the most important quality scores used to identify network modules, clusters or communities [Schaeffer 2007]; it is used in Markov chain theory, bioinformatics, vision, etc. Conductance: $\phi(S) = \frac{\mathrm{cut}(S)}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$, i.e. (edges leaving the set) / (total edges in the set). Example: $\mathrm{cut}(S) = 7$, $\mathrm{cut}(\bar{S}) = 7$, $|S| = 15$, $|\bar{S}| = 20$, $\mathrm{vol}(S) = 85$, $\mathrm{vol}(\bar{S}) = 151$, so $\phi(S) = 7/85 = 0.082$. Small conductance ⇔ good set.
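The definition above can be checked with a few lines of code. This is a minimal sketch assuming a symmetric 0/1 adjacency matrix without self-loops; the `adjacency` helper and the six-node graph are hypothetical, and only the numbers 7, 85 and 151 come from the slide.

```python
def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

def conductance(A, S):
    """phi(S) = cut(S) / min(vol(S), vol(S-bar))."""
    S = set(S)
    rest = [j for j in range(len(A)) if j not in S]
    cut = sum(A[i][j] for i in S for j in rest)      # edges between S and S-bar
    vol_S = sum(sum(A[i]) for i in S)                # sum of degrees in S
    vol_rest = sum(sum(A[j]) for j in rest)
    return cut / min(vol_S, vol_rest)

# Hypothetical graph: two triangles joined by the single edge 2-3.
A = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
phi = conductance(A, {0, 1, 2})      # cut = 1, vol(S) = vol(S-bar) = 7

# The slide's own numbers: cut = 7, vol(S) = 85, vol(S-bar) = 151.
phi_slide = 7 / min(85, 151)
```

Cutting the bridge between the two triangles gives conductance 1/7, and the slide's numbers reproduce the quoted 0.082 after rounding.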
  • 18. Spectral clustering has theoretical guarantees: the Cheeger inequality. Finding the best conductance set is NP-hard. • Cheeger realized the eigenvalues of the Laplacian provided a bound in manifolds • Alon and Milman independently realized the same thing for a graph! (J. Cheeger, 1970, A lower bound for the smallest eigenvalue of the Laplacian; N. Alon, V. Milman, 1985, λ1, isoperimetric inequalities for graphs, and superconcentrators.) Eigenvalues of the Laplacian: $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$. Cheeger inequality: $\phi_*^2/2 \le \lambda_2 \le 2\phi_*$, where $\phi_*$ is the conductance of the set of smallest conductance.
  • 19. The sweep cut algorithm realizes the guarantee. We can find a set S that achieves the Cheeger bound, $\phi_m \le 2\sqrt{\phi_*}$. 1. Compute the eigenvector associated with λ2 (e.g. ARPACK). 2. Sort the vertices by their values in the eigenvector: σ1, σ2, …, σn. 3. Let Sk = {σ1, …, σk} and compute the conductance of each Sk: φk = φ(Sk). 4. Pick the minimum φm of the φk. (M. Mihail, 1989, Conductance and convergence of Markov chains; F. C. Graham, 1992, Spectral Graph Theory.)
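Steps 2 to 4 of the sweep can be sketched as follows. The sketch assumes a symmetric adjacency matrix with no self-loops and takes the vector `x` as given (in the talk it is the eigenvector for λ2); the example graph and the values in `x` are made up for illustration.

```python
def sweep_cut(A, x):
    """Return (best prefix set, its conductance) over the ordering induced by x."""
    n = len(A)
    order = sorted(range(n), key=lambda i: x[i])    # step 2: sort vertices by value
    total_vol = sum(sum(row) for row in A)
    in_S = [False] * n
    vol = cut = 0
    best_phi, best_k = float("inf"), 0
    for k, v in enumerate(order[:-1], start=1):     # step 3: prefixes S_1 .. S_{n-1}
        in_S[v] = True
        deg = sum(A[v])
        inside = sum(A[v][u] for u in range(n) if in_S[u] and u != v)
        vol += deg
        cut += deg - 2 * inside    # edges to S stop being cut, the others start
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:                          # step 4: keep the minimum
            best_phi, best_k = phi, k
    return order[:best_k], best_phi

# Hypothetical graph: two triangles joined by edge 2-3, with an illustrative vector.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
S, phi = sweep_cut(A, [-1.0, -0.9, -0.5, 0.5, 0.9, 1.0])
```

Updating `cut` and `vol` incrementally keeps the whole sweep at one pass over the adjacency rows instead of recomputing each conductance from scratch.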
  • 20. The sweep cut visualized. Figure: conductance $\phi_i$ plotted against the sweep position $i$ for the sets $S_i$, with $\phi(S) = \frac{\mathrm{cut}(S)}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$.
  • 21. But current problems are much richer than the setting where spectral clustering is justified. Spectral clustering is theoretically justified for undirected graphs • Various extensions to multiple clusters [Dhillon et al.; Gharan et al.; Jordan et al.] • Weighted graphs are okay • Approximate eigenvectors are okay [Mihail]. Nice recent work by [Fairbanks et al. arXiv] on better numerical stopping criteria! Current network models are more richly annotated • directed, signed, colored, layered, multiplex, etc. Example (R. Milo, 2002, Science): X causes Y to be expressed (+); Z represses Y (–).
  • 22. There is a literature on directed spectral graph partitioning, but it is hard to interpret. Markov chains • Stewart (numerical solution to Markov chains) • Chung (Random walks and cuts in dir graphs). Nonlinear Laplacian • Yoshida WSDM2016. Asymmetric Laplacian • Boley et al. LAA2011 (commute times). Gleich, Klymko, Kolda ASE BigData 2014. Formulations shown on the slide: $D^{-1}Ax = \lambda x$; $(D - A)x = \lambda x$; $\tfrac{1}{2}\Pi(D^{-1}A) + \tfrac{1}{2}(A^T D^{-1})\Pi$; $\sum_{(u,v) \in E} \begin{cases} (x_u - x_v)^2 & x_u - x_v \ge 0 \\ 0 & \text{otherwise} \end{cases}$.
  • 23. Our contributions. 1. A generalized conductance metric for motifs. 2. A “new” spectral clustering algorithm to minimize the generalized conductance. 3. AND an associated Cheeger inequality (which handles directed graphs). 4. Aquatic layers in food webs. 5. Hub structure in transportation. New! Some studies in stochastic block models (this talk, still preliminary!).
  • 24. Motif-based conductance generalizes edge-based conductance. We need notions of cut and volume. Edges: $\mathrm{cut}(S)$ = #(edges cut by S), $\mathrm{vol}(S)$ = #(edge end points in S) $= \sum_{i \in S} \text{degree of } i$, $\phi(S) = \frac{\mathrm{cut}(S)}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$. Motifs: $\mathrm{cut}_M(S)$ = #(motifs cut by S), $\mathrm{vol}_M(S)$ = #(motif end points in S), $\phi_M(S) = \frac{\mathrm{cut}_M(S)}{\min(\mathrm{vol}_M(S), \mathrm{vol}_M(\bar{S}))}$.
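For the triangle motif, these definitions can be evaluated by brute-force enumeration. The sketch below assumes an undirected graph small enough to list all node triples; the eight-node graph (two 4-cliques bridged by one extra triangle) and the function names are hypothetical.

```python
from itertools import combinations

def triangles(A):
    """All node triples that form a triangle in the undirected graph A."""
    return [(i, j, k) for i, j, k in combinations(range(len(A)), 3)
            if A[i][j] and A[j][k] and A[i][k]]

def motif_conductance(A, S):
    """phi_M(S) = cut_M(S) / min(vol_M(S), vol_M(S-bar)) for the triangle motif."""
    S = set(S)
    tris = triangles(A)
    # A motif instance is cut when it has nodes on both sides of the split.
    cut = sum(1 for t in tris if 0 < sum(v in S for v in t) < 3)
    vol_S = sum(sum(v in S for v in t) for t in tris)   # motif end points in S
    vol_rest = 3 * len(tris) - vol_S
    return cut / min(vol_S, vol_rest)

# Hypothetical graph: 4-cliques on {0,1,2,3} and {4,5,6,7}, bridged by triangle {3,4,5}.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4), (3, 5)]
A = [[0] * 8 for _ in range(8)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
phi_M = motif_conductance(A, {0, 1, 2, 3})   # one triangle cut, vol_M(S) = 13
```

Enumeration costs O(n^3) and is only meant to make the definition concrete; it is not how one would handle large graphs.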
  • 25. An example of motif conductance: $\phi_M(S) = \frac{\text{motifs cut}}{\text{motif volume}} = \frac{1}{10}$. Figure: a 12-node graph (nodes 0 to 11) split into $S$ and $\bar{S}$, shown next to the motif.
  • 26. How can we optimize motif conductance? We thought that motif conductance would spark new tensor and hypermatrix methods based on the motif adjacency tensor, $A(i, j, k) = \begin{cases} 1 & \text{if motif involves nodes } i, j, k \\ 0 & \text{otherwise} \end{cases}$ (Benson, Gleich, Leskovec, SDM 2016). We were wrong!
  • 27. There is a symmetric matrix that serves as the appropriate tool to study motif conductance: $W^{(M)}_{ij}$ counts co-occurrences of the motif pattern between i and j. Figure: the example graph and its motif-weighted version $W^{(M)}$, with edge weights 1, 2 and 3.
  • 28. Going from motifs back to a matrix for spectral clustering. $W^{(M)}_{ij}$ counts co-occurrences of the motif pattern between i and j. KEY INSIGHT: spectral clustering on $W^{(M)}$ yields results on the new motif notion of conductance, $\phi_M(S) = \frac{\text{motifs cut}}{\text{motif volume}} = \frac{1}{10}$.
  • 29. Here is a quick illustration of how this works. In the original graph, $\phi_M(S) = \frac{\text{motifs cut}}{\text{motif volume}} = \frac{1}{10}$. In the weighted graph, $\mathrm{cut}(S) = 2$ and $\mathrm{vol}(S) = 6 + 8 + 2 + 2 + 2 = 20$, so $\frac{\mathrm{cut}(S)}{\mathrm{vol}(S)} = \frac{2}{20} = \frac{1}{10}$.
  • 30. A motif-based clustering algorithm. 1. Form the weighted graph $W^{(M)}$. 2. Compute the Fiedler vector $f^{(M)}$ associated with λ2 of the motif-normalized Laplacian: $D = \mathrm{diag}(W^{(M)} e)$, $L^{(M)} = D^{-1/2}(D - W^{(M)})D^{-1/2}$, $L^{(M)} z = \lambda_2 z$, $f^{(M)} = D^{-1/2} z$. 3. Run a sweep cut on $f^{(M)}$.
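Steps 1 and 2 can be sketched end to end without a linear algebra library. Two things here are assumptions rather than the authors' method: the eigenvector is found with a deflated power iteration on $2I - L^{(M)}$ instead of ARPACK, and every node is assumed to take part in at least one motif so that $D$ is invertible. The example graph and function names are hypothetical.

```python
import math

def triangle_motif_matrix(A):
    """Step 1 for the triangle motif: W[i][j] = number of triangles containing edge (i, j)."""
    n = len(A)
    return [[A[i][j] * sum(A[i][k] * A[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def motif_fiedler_vector(W, iters=300):
    """Step 2: f = D^{-1/2} z with L z = lambda_2 z, via power iteration on 2I - L."""
    n = len(W)
    s = [math.sqrt(sum(row)) for row in W]          # diagonal of D^{1/2}
    nrm = math.sqrt(sum(v * v for v in s))
    lead = [v / nrm for v in s]                     # known eigenvector D^{1/2} e
    x = [float(i + 1) for i in range(n)]            # deterministic start vector
    for _ in range(iters):
        c = sum(a * b for a, b in zip(x, lead))
        x = [a - c * b for a, b in zip(x, lead)]    # deflate the known eigenvector
        # (2I - L) x = x + D^{-1/2} W D^{-1/2} x
        y = [x[i] + sum(W[i][j] * x[j] / (s[i] * s[j]) for j in range(n))
             for i in range(n)]
        nrm = math.sqrt(sum(v * v for v in y))
        x = [v / nrm for v in y]
    return [x[i] / s[i] for i in range(n)]

# Hypothetical graph: 4-cliques on {0,1,2,3} and {4,5,6,7}, bridged by triangle {3,4,5}.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4), (3, 5)]
A = [[0] * 8 for _ in range(8)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
f = motif_fiedler_vector(triangle_motif_matrix(A))
negative_side = {i for i in range(8) if f[i] < 0}
```

Splitting by the sign of `f` already separates the two cliques here; step 3 of the algorithm would instead sweep over the sorted entries of `f` and keep the prefix with the smallest conductance.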
  • 31. The sweep cut results. Figure: conductance along the sweep (order from the Fiedler vector), with the best and 2nd best higher-order clusters marked on the example graph.
  • 32. There are nice matrix computations for three-node motifs. For triangles, $W = A^2 \odot A$. Figure: all 13 connected three-node directed motifs (from Figure 1, higher-order network structures and the higher-order network clustering framework).
  • 33. There are nice matrix computations for three-node directed motifs. Given a (directed) adjacency matrix A, let $B = A \odot A^T$ (bidirectional), $U = A - B$ (unidirectional), and $N = ee^T - B - U - U^T$.
      Motif | Matrix computations | W =
      M1 | $C = (U \cdot U) \odot U^T$ | $C + C^T$
      M2 | $C = (B \cdot U) \odot U^T + (U \cdot B) \odot U^T + (U \cdot U) \odot B$ | $C + C^T$
      M3 | $C = (B \cdot B) \odot U + (B \cdot U) \odot B + (U \cdot B) \odot B$ | $C + C^T$
      M4 | $C = (B \cdot B) \odot B$ | $C$
      M5 | $C = (U \cdot U) \odot U + (U \cdot U^T) \odot U + (U^T \cdot U) \odot U$ | $C + C^T$
      M6 | $C = (U \cdot B) \odot U + (B \cdot U^T) \odot U^T + (U^T \cdot U) \odot B$ | $C$
      M7 | $C = (U^T \cdot B) \odot U^T + (B \cdot U) \odot U + (U \cdot U^T) \odot B$ | $C$
      M8 | $C = (U \cdot N) \odot U + (N \cdot U^T) \odot U^T + (U^T \cdot U) \odot N$ | $C$
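Two rows of the table, M1 and M4, are enough to show how the formulas turn into code. This is a minimal sketch assuming a dense 0/1 directed adjacency matrix with `A[i][j] = 1` for an edge i → j; the helper names and both example graphs are hypothetical.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def hadamard(X, Y):
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def motif_matrix(A, motif):
    """Motif adjacency matrix for M1 or M4, following the table's formulas."""
    B = hadamard(A, transpose(A))                   # bidirectional edges
    U = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]   # unidirectional
    if motif == "M1":
        C = hadamard(matmul(U, U), transpose(U))
        return add(C, transpose(C))                 # W = C + C^T
    if motif == "M4":
        return hadamard(matmul(B, B), B)            # W = C
    raise ValueError("this sketch only covers M1 and M4")

# Hypothetical graph: directed cycle 0 -> 1 -> 2 -> 0 plus a bidirectional pair 2 <-> 3.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
W1 = motif_matrix(A, "M1")

# Hypothetical graph: a triangle in which every edge is bidirectional.
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
W4 = motif_matrix(K3, "M4")
```

In the first graph only the directed 3-cycle counts for M1, so the bidirectional pair 2 ↔ 3 contributes nothing to `W1`.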
  • 34. The three-node motif-based Cheeger inequality. THEOREM: if the motif has three nodes, then the sweep procedure on the weighted graph finds a set S of nodes for which $\phi_M(S) \le 2\sqrt{\phi_M^*}$. Here $M(G)$ = {instances of M in G}. Key proof step: $\mathrm{cut}_M(S, G) = \sum_{\{i,j,k\} \in M(G)} \mathrm{Indicator}[x_i, x_j, x_k \text{ not the same}]$, and each indicator equals $\tfrac{1}{4}(x_i^2 + x_j^2 + x_k^2 - x_i x_j - x_j x_k - x_i x_k)$, which is quadratic in x. IMPLICATION: just run spectral clustering on those weighted matrices.
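The key proof step can be spelled out in a few lines. The derivation below assumes the usual encoding $x_i = +1$ for $i \in S$ and $x_i = -1$ otherwise (the slide does not state the encoding), and writes $W^{(M)}_{ij}$ for the number of motif instances containing both $i$ and $j$.

```latex
\begin{align*}
\tfrac14\left(x_i^2 + x_j^2 + x_k^2 - x_i x_j - x_j x_k - x_i x_k\right)
  &= \tfrac18\left[(x_i - x_j)^2 + (x_j - x_k)^2 + (x_i - x_k)^2\right] \\
  &= \begin{cases}
       0 & \text{if } x_i = x_j = x_k, \\
       \tfrac18\,(4 + 4 + 0) = 1 & \text{otherwise,}
     \end{cases} \\[4pt]
\mathrm{cut}_M(S, G)
  &= \tfrac18 \sum_{\{i,j,k\} \in M(G)}
     \left[(x_i - x_j)^2 + (x_j - x_k)^2 + (x_i - x_k)^2\right] \\
  &= \tfrac18 \sum_{i < j} W^{(M)}_{ij}\,(x_i - x_j)^2
   = \tfrac18\, x^T \left(D - W^{(M)}\right) x .
\end{align*}
```

Since the ordinary weighted cut is $\tfrac14 \sum_{i<j} W^{(M)}_{ij}(x_i - x_j)^2$, each cut motif instance contributes exactly 2 to the cut of the weighted graph, which matches the illustration on slide 29 (one motif cut, weighted cut 2).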
  • 35. Awesome advantages. Works for arbitrary non-neg. combos of motifs too. We inherit 40+ years of research! • Fast algorithms (ARPACK, etc.)! • Local methods! (Yin, Benson, Leskovec, Gleich, KDD2017) • Overlapping! • Easy to implement (20 lines of Matlab/Julia) • Scalable (graphs with 1.4B edges are not a problem). Code excerpt shown on the slide, cut off by the slide edge (Figure 2.3 – Julia implementation of the motif-based spectral clusteri…):
        elseif motif == "M5"
            C = (U * U) .* U + (U * U') .* U + (U' * U) .* U
            W = C + C'
        elseif motif == "M6"
            W = (U * B) .* U + (B * U') .* U' + (U' * U) .* B
        elseif motif == "M7"
            W = (U' * B) .* U' + (B * U) .* U + (U * U') .* B
        else
            error("Motif must be one of M1, M2, M3, M4, M5, M6, or M7.")
        end

        # Get Fiedler eigenvector
        dinvsqrt = spdiagm(1.0 ./ sqrt.(vec(sum(W, 1))))
        LM = I - dinvsqrt * W * dinvsqrt
        lambdas, evecs = eigs(LM, nev=2, which=:SM)
        z = dinvsqrt * real(evecs[:, 2])

        # Sweep cut
        sigma = sortperm(z)
        C = W[sigma, sigma]
        Csums = sum(C, 1)'
        motifvolS = cumsum(Csums)
        motifvolSbar = sum(W) * ones(length(sigma)) - motifvolS
        conductances = cumsum(Csums - 2 * sum(triu(C), 1)') ./ min.(motif…
        split = indmin(conductances)
        if split <= length(size(A, 1) / 2)
            return sigma[1:split]
        else
            return sigma[(split + 1):end]
        end
    end
  • 36. Case study 1: Motifs partition the food webs. Food webs model energy exchange in species of an ecosystem. An edge i → j means i’s energy goes to j (or j eats i).
  • 37. Case study 1: Motifs partition the food webs. Food webs model energy exchange in species of an ecosystem. An edge i → j means i’s energy goes to j (or j eats i). Via Cheeger, motif conductance is better than edge conductance.
  • 38. Demo and reproducibility: https://github.com/arbenson/higher-order-organization-julia
        # form W0 … W4
        sc0 = spectral_cut(W0)
        sc1 = spectral_cut(W1)
        sc2 = spectral_cut(W2)
        sc3 = spectral_cut(W3)
        sc4 = spectral_cut(W4)
        plt = x -> semilogx(x.sweepcut_profile.conductance)
        plt(sc0)
        plt(sc1)
        plt(sc2)
        plt(sc3)
        plt(sc4)
  • 39. Case study 1: Motifs partition the food webs. Motif M6 reveals aquatic layers: micronutrient sources; pelagic fishes and benthic prey; benthic macroinvertebrates; benthic fishes. 61% accuracy vs. 48% with edge-based methods.
  • 40. Case study 2: Hub structure in the air transportation network. North American air transport network. Nodes are airports. Edges reflect reachability, and are unweighted. (Based on Frey et al.’s 2007)
  • 41. The weighted adjacency matrix already reveals hub-like structure. Accepted pending. Counts length-two walks. (Figure panels A and B.)
  • 42. The motif embedding shows this structure and splits into east-west. Panels: motif spectral embedding; edge spectral embedding (axis: primary spectral coordinate). Legend: top 10 U.S. hubs; East coast non-hubs; West coast non-hubs. Atlanta, the top hub, is next to Salina, a non-hub.
  • 43. Case study 3: the stochastic block model shows numerical advantages to motif-matrices. Model problems are useful because they are simple and we often “know” everything about them. They may not reflect real-world issues. Panels: Biology; Matrix Computations ($\nabla^2 u = f$); Clustering. Mouse picture from Wikipedia Мышь_2.jpg, Fly from oregonstateuniversity/11179958483.
  • 44. The stochastic block model is extremely well understood in theory. The symmetric stochastic block model (SSBM) • k blocks, each of size m-by-m • within-block edges exist with prob p • between-block edges with prob q. Edge probabilities by block (each block is m-by-m): $\begin{bmatrix} p & q & \cdots & q \\ q & p & \ddots & \vdots \\ \vdots & \ddots & \ddots & q \\ q & \cdots & q & p \end{bmatrix}$. Example: m = 200, k = 5, p = 0.3, q = 0.13. Reminiscent of Simon & Ando.
  • 45. The stochastic block model is extremely well understood in theory. The symmetric stochastic block model (SSBM) • k blocks, each of size m-by-m • within-block edges exist with prob p • between-block edges with prob q. Edge probabilities by block (each block is m-by-m): $\begin{bmatrix} p & q & \cdots & q \\ q & p & \ddots & \vdots \\ \vdots & \ddots & \ddots & q \\ q & \cdots & q & p \end{bmatrix}$. Example: m = 200, k = 5, p = 0.3, q = 0.13. The task: given a graph that is an SSBM and given m, k, p, q, find the k blocks. Theory (E. Abbe, community detection and the stochastic block model; in prep, on webpage) • Necessary p > q • Exact recovery (get all correct) • Detectability (find a non-trivial portion) • Uses non-backtracking random walk.
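A sampler for this model takes only a few lines. The sketch below assumes nodes are numbered so that block b holds nodes b·m … (b+1)·m − 1, and uses a seeded standard-library generator for reproducibility; the function names and the smaller sizes (m = 50, k = 4) are illustrative choices, while p = 0.3 and q = 0.13 are the slide's values.

```python
import random

def sample_ssbm(m, k, p, q, seed=0):
    """Sample a symmetric 0/1 adjacency matrix with k blocks of m nodes each."""
    rng = random.Random(seed)
    n = m * k
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if i // m == j // m else q     # same block: p, otherwise q
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1
    return A

def block_densities(A, m):
    """Empirical edge density within blocks and between blocks."""
    n = len(A)
    within = [A[i][j] for i in range(n) for j in range(i + 1, n) if i // m == j // m]
    between = [A[i][j] for i in range(n) for j in range(i + 1, n) if i // m != j // m]
    return sum(within) / len(within), sum(between) / len(between)

m, k = 50, 4
A = sample_ssbm(m, k, p=0.3, q=0.13)
d_in, d_out = block_densities(A, m)
```

With these sizes the two empirical densities should sit close to p and q, which is a quick sanity check before feeding the sample into a clustering routine.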
  • 46. Our study is to look at the SSBM model using our motif weighting based on triangles, $W = A^2 \odot A$. Based also on Tsourakakis, Pachocki, & Mitzenmacher, WWW, 2017, who showed triangle-conductance < edge-conductance in a model. Embedded figure captions (each shown once):
      Figure S8: Higher-order organization of the S. cerevisiae transcriptional regulation network. A: The four higher-order structures used by our higher-order clustering method, which can model signed motifs. These are coherent feedforward loop motifs, which act as sign-sensitive delay elements in transcriptional regulation networks (46). The edge signs refer to activation (positive) or repression (negative). B: Six higher-order clusters revealed by the motifs in (A). Clusters show functional modules consisting of several motifs (coherent feedforward loops), which were previously studied individually (45). The higher-order clustering framework identifies the functional modules with higher accuracy (97%) than existing methods (68–82%). C–D: Two higher-order clusters from (B). In these clusters, all edges have positive sign. The functionality of the motifs in the modules correspond to drug resistance (C) or cell cycle and mating type match (D). The clustering suggests that coherent feedforward loops function together as a single processing unit rather than as independent elements.
      Figure 1: Higher-order network structures and the higher-order network clustering framework. A: Higher-order structures are captured by network motifs. For example, all 13 connected three-node directed motifs are shown here. B: Clustering of a network based on motif M7. For a given motif M, our framework aims to find a set of nodes S that minimizes motif conductance, $\phi_M(S)$, which we define as the ratio of the number of motifs cut (filled triangles cut) to the minimum number of nodes in instances of the motif in either $S$ or $\bar{S}$ (13). In this case, there is one motif cut. C: The higher-order network clustering framework. Given a graph and a motif of interest (in this case, M7), the framework forms a motif adjacency matrix ($W_M$) by counting the number of times two nodes co-occur in an instance of the motif. An eigenvector of a Laplacian transformation of the motif adjacency matrix is then computed. The ordering $\sigma$ of the nodes provided by the components of the eigenvector (15) produces nested sets $S_r = \{\sigma_1, \dots, \sigma_r\}$ of increasing size r. We prove that the set $S_r$ with the smallest motif-based conductance, $\phi_M(S_r)$, is a near-optimal higher-order cluster (13).
  • 47. Just using the motif weighting highlights the blocks for a range of parameters. We introduce a mixing parameter μ to scale q: $\mu = 0 \Leftrightarrow q = 0$, $\mu = \frac{k-1}{k} \Leftrightarrow q = p$. Panels: $W = A^2 \odot A$ and $A$.
  • 48. The power method identifies a cluster using the motif weighting better than the adjacency. Panels: accuracy for $W = A^2 \odot A$ and for $A$, with the detectability and exact recovery thresholds marked. Exp. details: we take the normalized Laplacian and shift to reverse the spectrum. Then we deflate given knowledge of the leading eigenvector. Accuracy is the most accurate block in the extremal m entries.
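The experimental setup can be written out as a short derivation. It assumes the shift is $2I$ (the slide only says "shift to reverse the spectrum") and uses the normalized Laplacian $L = D^{-1/2}(D - W)D^{-1/2}$ with eigenpairs $L v_i = \lambda_i v_i$, $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n \le 2$.

```latex
\begin{align*}
(2I - L)\,v_i &= (2 - \lambda_i)\,v_i,
  \qquad 2 = 2 - \lambda_1 \ge 2 - \lambda_2 \ge \dots \ge 2 - \lambda_n \ge 0, \\
v_1 &= \frac{D^{1/2} e}{\lVert D^{1/2} e \rVert}
  \quad \text{(known in advance, so it can be deflated)}, \\
x_{t+1} &= \frac{(2I - L)\,(I - v_1 v_1^T)\,x_t}
               {\lVert (2I - L)\,(I - v_1 v_1^T)\,x_t \rVert}
  \;\longrightarrow\; \pm v_2, \\
\text{asymptotic error factor per step} &= \frac{2 - \lambda_3}{2 - \lambda_2}.
\end{align*}
```

Under this reading, the classical power-method rate depends only on the gap between $\lambda_2$ and $\lambda_3$, which is the quantity the following slides examine when asking why the motif weighting behaves better.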
  • 49. The power method identifies a cluster using the motif weighting better than the adjacency. Panels: $W = A^2 \odot A$ and $A$.
  • 50. We don’t converge faster for the usual reasons that the power method converges.
  • 51. There is a bigger gap deeper in the spectrum that could explain what is going on.
  • 52. The motif weighting shifts all the eigenvalues down, but the lowest drop the most. Semi-circle law. Not a Marchenko-Pastur law! Panels: $W = A^2 \odot A$ and $A$.
  • 53. We’d like a numerical understanding of why we get better results faster with motifs. Eigenvalues show that we’ll converge to the “cluster subspace” faster. Conjecture: higher accuracy for motifs because the eigenvectors are more localized, or sharper, around the clusters. • What remains is to understand why they are sharper! Panels: $W = A^2 \odot A$ and $A$.
  • 54. Related work. • Laplacian we propose was originally proposed by Rodríguez [2004] and again by Zhou et al. [2006]. Our new theory (motif Cheeger inequality) explains why these were good ideas. • Falls under general strategy of encoding hypergraph partitioning problem as graph clustering problem [Agarwal+ 06] • Serrour, Arenas, & Gómez, Detecting communities of triangles in complex networks using spectral optimization, 2011. • Arenas et al., Motif-based communities in complex networks, 2008. • Rohe & Qin, Blessing of transitivity …, arXiv, 2013. • Klymko, Gleich, Kolda (Using triangles & cycles …, ASE BigData 2014) • Benson, Gleich, Leskovec (Motifs & Tensors, SIAM Data Mining 2015)
  • 55. Paper: Benson, Gleich, Leskovec, Science, 2016. 1. A generalized conductance metric for motifs. 2. A new spectral clustering algorithm to minimize the generalized conductance. 3. AND an associated Cheeger inequality. 4. Aquatic layers in food webs. 5. Hub structure in transportation networks. 6. Eigenvalues & vectors of motifs in SSBMs. 7. Lots of cool stuff on signed networks. Joint work with Austin Benson and Jure Leskovec, Stanford. Supported by NSF CAREER CCF-1149756, IIS-1422918 IIS- DARPA SIMPLEX. Code & Data: snap.stanford.edu/higher-order, github.com/arbenson/higher-order-organization-julia, github.com/dgleich/motif-ssbm. Open questions • What is the distribution law for the Laplacian of $A^2 \odot A$? • How to work with element-wise products like matvecs for $N = ee^T - B - U - U^T$? Thank you!