Paper digest
“Large-Scale Spectral Clustering on Graphs”
Akisato Kimura
akisato@ieee.org, @_akisato
One-page abstract
• Approx. acceleration of spectral clustering
– by introducing additional nodes that enable us to
compress the original graph,
– resulting in a bipartite graph which is
computationally efficient for spectral clustering.
• Note
– Targets large-scale spectral clustering;
works especially well for dense graphs.
– Not suitable for large-scale graph clustering,
since such graphs are sparse in nature.
Spectral clustering [Shi & Malik 1997]
• Notations
– Undirected weighted graph 𝐺 = (𝑉, 𝐸)
– Num. nodes 𝑛 = |𝑉|; num. edges 𝑚 = |𝐸|
– Adjacency matrix 𝑊 = (𝑊_{𝑖,𝑗}), 𝑖, 𝑗 = 1, 2, …, 𝑛
• Objective function
– Solved by eigen-decomposition (EVD)
min_{𝑋 ∈ ℝ^{𝑛×𝑘}} Tr(𝑋^𝑇 𝐷^{−1/2} 𝐿 𝐷^{−1/2} 𝑋) s.t. 𝑋^𝑇 𝑋 = 𝐼
(𝐿 = 𝐷 − 𝑊: graph Laplacian of 𝑊, 𝐷: diagonal degree matrix, 𝑘: num. clusters)
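As a reference point, here is a minimal sketch of this baseline in NumPy/scikit-learn; it is my illustrative rendition, not the paper's code, and `spectral_clustering` is a hypothetical name. Solving the objective via a full EVD is exactly the step the paper identifies as the bottleneck.

```python
# A minimal sketch of normalized spectral clustering (not the authors' code).
# Assumes a dense, symmetric, nonnegative adjacency matrix W with no
# isolated nodes (so all degrees are positive).
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, k):
    deg = W.sum(axis=1)                         # node degrees
    L = np.diag(deg) - W                        # graph Laplacian L = D - W
    D_isqrt = np.diag(1.0 / np.sqrt(deg))       # D^{-1/2}
    L_sym = D_isqrt @ L @ D_isqrt               # normalized Laplacian
    _, vecs = eigh(L_sym)                       # full EVD: the O(n^3) step
    X = vecs[:, :k]                             # k smallest eigenvectors
    X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-12  # row-normalize
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
```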
Main contribution of this work
• SC needs 𝑂(𝑛³) computations due to EVD.
• Several improvements so far:
– Compressing the adjacency matrix by the Nyström method [Fowlkes+ 2004]
– Reducing samples (= nodes) [Shinnou & Sasaki 2008] [Yan+ 2009] [Sakai & Imiya 2009] [Chen & Cai 2011]
– Early stopping of EVD [Chen+ 2006] [Liu+ 2007]
• In contrast, this work
– reduces the size of the graph itself.
Introducing supernodes
• Why supernodes? Intuition from co-clustering:
– A partition of the supernodes can induce a partition of the observed nodes, and vice versa.
• Generating a set of 𝑑 ≪ 𝑛 supernodes
[Figure: original graph, with regular nodes and supernodes marked]
How to generate supernodes
1. Randomly choosing 𝑑 regular nodes as seeds.
2. Calculating the shortest paths from the seeds
to the other regular nodes.
i. Converting adjacencies to distances.
ii. Applying Dijkstra’s algorithm.
3. Partitioning all the regular nodes into 𝑑
disjoint subsets based on the shortest paths.
4. (Each subset corresponds to a supernode.)
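A minimal sketch of these steps follows; it assumes a scipy.sparse similarity matrix W, converts similarity w to distance 1/w (one simple choice, since the slide does not fix the conversion), and `generate_supernodes` is a hypothetical helper name.

```python
# A minimal sketch of supernode generation (steps 1-4 above).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import dijkstra

def generate_supernodes(W, d, seed=0):
    n = W.shape[0]
    rng = np.random.default_rng(seed)
    seeds = rng.choice(n, size=d, replace=False)  # 1. random seed nodes
    dist = W.tocsr(copy=True)
    dist.data = 1.0 / dist.data                   # 2-i. adjacency -> distance
    paths = dijkstra(dist, directed=False, indices=seeds)  # 2-ii. (d, n)
    assign = np.argmin(paths, axis=0)             # 3. nearest seed per node
    # 4. each of the d disjoint subsets becomes one supernode; R is the
    # binary d-by-n bipartite (assignment) matrix used on the next slide
    R = sp.csr_matrix((np.ones(n), (assign, np.arange(n))), shape=(d, n))
    return R
```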
After generating supernodes
• 𝑊̃ = 𝑅𝑊, where
– 𝑅 ∈ ℤ^{𝑑×𝑛}: binary bipartite (assignment) graph between the 𝑛 regular nodes and the 𝑑 supernodes,
– 𝑊̃ ∈ ℝ^{𝑑×𝑛}: bipartite graph, called a "reduced graph",
obtained by propagating edge weights between regular nodes and supernodes.
[Figure: 𝑛 regular nodes, 𝑑 supernodes, and the matrices 𝑊, 𝑅, 𝑊̃]
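Given 𝑅 from the hypothetical `generate_supernodes` sketch above, propagating the edge weights is a single sparse product:

```python
# The reduced graph: for each supernode, sum the edge weights of the
# regular nodes assigned to it.
R = generate_supernodes(W, d=64)   # hypothetical helper from the last sketch
W_tilde = R @ W                    # d-by-n reduced graph, W~ = R W
```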
Spectral clustering on reduced graphs
• Consider another representation of the reduced graph: 𝑊′, defined over the 𝑛 regular nodes and the 𝑑 supernodes together.
• Spectral clustering on 𝑊′.
[Figure: the bipartite graph of 𝑛 regular nodes and 𝑑 supernodes, and the result of spectral clustering on 𝑊′]
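One plausible reading of 𝑊′ (my assumption; the slide's figure is not reproduced here) is the symmetric adjacency of the bipartite graph between the regular nodes and the supernodes:

```python
# Symmetric (n+d)-by-(n+d) bipartite adjacency built from the reduced graph.
import scipy.sparse as sp

W_prime = sp.bmat([[None, W_tilde.T],
                   [W_tilde, None]], format='csr')
```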
Spectral clustering on reduced graphs
• Spectral clustering on 𝑊′ exhibits a co-clustering structure:
– 𝑥 and 𝑦 are left & right singular vectors of 𝑍 ∈ ℝ^{𝑑×𝑛}.
• It can be simplified further:
– 𝑦 is also an eigenvector of 𝑍𝑍^𝑇 ∈ ℝ^{𝑑×𝑑},
∵ 𝑍𝑍^𝑇 𝑦 = 𝑍(1 − 𝜆)𝑥 = (1 − 𝜆)² 𝑦.
– (𝑍𝑍^𝑇 looks like a compressed representation of 𝑊.)
[Figure: co-clustering over 𝑛 regular nodes and 𝑑 supernodes]
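A minimal sketch of this shortcut: take the top singular vectors of 𝑍 with a truncated SVD (equivalently, an EVD of the small 𝑑×𝑑 matrix 𝑍𝑍^𝑇) instead of an EVD of the full 𝑛×𝑛 graph. The normalization 𝑍 = 𝐷_s^{−1/2} 𝑊̃ 𝐷_r^{−1/2} and the helper name `cluster_reduced` are my assumptions:

```python
# Cluster the n regular nodes through the d-by-n reduced graph.
# Assumes W_tilde has no all-zero rows or columns.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds
from sklearn.cluster import KMeans

def cluster_reduced(W_tilde, k):
    dr = np.asarray(W_tilde.sum(axis=0)).ravel()   # regular-node degrees
    ds = np.asarray(W_tilde.sum(axis=1)).ravel()   # supernode degrees
    Z = sp.diags(1.0 / np.sqrt(ds)) @ W_tilde @ sp.diags(1.0 / np.sqrt(dr))
    y, s, xt = svds(Z, k=k)          # y: left vectors (d,k); xt: right (k,n)
    U = xt.T                         # embedding of the n regular nodes
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)
```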
In summary
[Figure: the overall pipeline, separating the steps described so far from the additional steps below]
Regenerating supernodes
• Intuitions
1. The matrix 𝑈 ∈ ℝ^{𝑛×𝑘} implies the current clustering.
2. Most of the nodes in the same cluster are expected to be densely connected.
• Method
– Selecting 𝑘 − 1 right vectors (= those with large eigenvalues) as supernodes.
[Figure: 𝑈, linking the 𝑛 regular nodes, 𝑑 supernodes, and 𝑘 cluster nodes over 𝑊]
In detail
• New regular-super links, based on the average affiliation score over all the samples.
• Resulting in (𝑘 − 1) edges from every regular node.
• Every edge stands for a binarized affiliation score.
• So, this idea can be easily extended to quantized affiliation scores of arbitrary sizes.
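A minimal sketch of one plausible binarization rule (my assumption: threshold each affiliation column at its average over all samples, as hinted above; `regenerate_links` is a hypothetical name):

```python
# Rebuild the regular-super links from the current embedding U (n-by-k):
# keep k-1 informative columns as supernodes and binarize each column at
# its mean, giving k-1 binary edges per regular node.
import numpy as np
import scipy.sparse as sp

def regenerate_links(U):
    A = U[:, 1:]                            # drop the trivial first vector
    B = (A > A.mean(axis=0)).astype(float)  # binarized affiliation scores
    return sp.csr_matrix(B.T)               # new (k-1)-by-n link matrix
```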
Finally, the algorithm is as follows
[Figure: the full algorithm, alternating between generating or updating supernodes and small-size spectral clustering; the latter's size can be replaced by a function of 𝑡, 𝑙_𝑡]
Computational costs
• Alg. 1: 𝑂(𝑛𝑑 log 𝑛 + 𝑚𝑑 + 𝑛𝑑²) in total
– Steps 1-2: 𝑂(𝑛𝑑 log 𝑛)
– Steps 3-4: 𝑂(𝑚𝑑)
– Step 5: 𝑂(𝑛𝑑)
– Step 6: 𝑂(𝑛𝑑²) + 𝑂(𝑑³)
– Steps 7-9: 𝑂(𝑛𝑑𝑘)
• Alg. 2: 𝑂(𝑚𝑘) in total
– Step 5: 𝑂(𝑚𝑘)
• Alg. 3: 𝑂(𝑛𝑑 log 𝑛 + 𝑚𝑑 + 𝑚𝑘𝑡 + 𝑛(𝑑² + 𝑘²𝑡)) in total
– Step 3: 𝑂(𝑛𝑑 log 𝑛 + 𝑚(𝑑 + 1))
• If 𝑑² ≈ 𝑘²𝑡 ≈ log² 𝑛, this becomes 𝑂(𝑛 log² 𝑛)
(= the same order as modularity-based clustering).
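A quick sanity check of the last line (my arithmetic; it additionally assumes a sparse graph with 𝑚 = 𝑂(𝑛), so that the 𝑚𝑑 and 𝑚𝑘𝑡 terms stay within the bound):

```latex
% Substitute d \approx \log n and k^2 t \approx \log^2 n (hence kt \le \log^2 n):
O\bigl(nd\log n + md + mkt + n(d^2 + k^2 t)\bigr)
  = O\bigl(n\log^2 n + n\log n + n\log^2 n + 2\,n\log^2 n\bigr)
  = O\bigl(n\log^2 n\bigr).
```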
Data sets for experiments
• 2 synthetic, 2 real-world.
– Syn-1k: kNN graph; 100k: 100-ins & 40-outs
– DBLP: author network with co-conference links.
– IMDB: movie network with co-director links.
• These look like moderate-scale (not large-scale) graphs…
Experimental results
[Figure: comparison of Shortest Path (see "How to generate supernodes"), Proposed (Alg. 1), Proposed (Alg. 3), Spectral Clustering, RESC [Khoa & Chawla 2012], and Nyström [Fowlkes+ 2004]]
• The proposed method is suitable for dense graphs.
(If sparse, modularity-based clustering would be better: 𝑂(𝑛 log 𝑛) ∼ 𝑂(𝑛 log² 𝑛).)
Detailed results
• Performance of the proposed methods w.r.t. the parameter 𝑑 (num. supernodes).
– Why is it not monotonically increasing?
• Performance of the proposed methods w.r.t. the parameter 𝑡 (num. iterations).
Qualitative evaluations
• Toy example on Syn-1K
[Figure: ground truth, k-NN graph, SP, Proposed 1, Proposed 2 (5 iterations), SC, RESC, Nyström]
Comments
• The idea and technique are interesting and possibly versatile.
• Both serial and parallel implementations would be quite simple.
– Matlab code is available at http://jialu.cs.illinois.edu/publication
• Might be suitable only for dense graph clustering (with features).