SSumM: Sparse Summarization of Massive Graphs
Kyuhan Lee*, Hyeonsoo Jo*, Jihoon Ko, Sungsu Lim, Kijung Shin
Graphs are Everywhere
Citation networks, Subway networks, Internet topologies
Introduction | Problem | Algorithms | Experiments | Conclusion
Designed by Freepik from Flaticon
Massive Graphs Appeared
• Social networks: 2.49 billion active users
• Purchase histories: 0.5 billion products
• World Wide Web: 5.49 billion web pages
Difficulties in Analyzing Massive Graphs
• Computational cost (number of nodes & edges)
• Input graph: cannot fit
Solution: Graph Summarization
• Summary graph: can fit
Advantages of Graph Summarization
• Many graph compression techniques are available
◦ The WebGraph Framework [BV04]
◦ BFS encoding [AD09]
◦ SlashBurn [KF11]
◦ VoG [KKVF14]
• Graph summarization stands out because it is
◦ Elastic: the size of outputs can be reduced as much as we want
◦ Analyzable: existing graph analysis methods and tools can be applied
◦ Combinable for additional compression: the output can be further compressed
Example of Graph Summarization
Input Graph (nodes 1 to 9)
Adjacency Matrix
     1  2  3  4  5  6  7  8  9
1    0  1  0  0  0  1  0  0  1
2    1  0  1  0  0  1  1  0  0
3    0  1  0  1  1  1  0  0  1
4    0  0  1  0  1  0  0  0  0
5    0  0  1  1  0  1  1  0  0
6    1  1  1  0  1  0  0  0  0
7    0  1  0  0  1  0  0  1  1
8    0  0  0  0  0  0  1  0  1
9    1  0  1  0  0  0  1  1  0
Summary Graph
Legend: subnode, subedge, supernode, superedge
Supernodes:
• $V_1 = \{1, 2\}$
• $V_3 = \{3, 4, 5, 6\}$
• $V_7 = \{7, 8, 9\}$
Superedge weights (number of subedges between the two supernodes):
• $\omega(\{V_1, V_1\}) = 1$
• $\omega(\{V_1, V_3\}) = 3$
• $\omega(\{V_1, V_7\}) = 2$
• $\omega(\{V_3, V_3\}) = 5$
• $\omega(\{V_3, V_7\}) = 2$
• $\omega(\{V_7, V_7\}) = 3$
Summary Graph without the superedge $\{V_3, V_7\}$
• $\omega(\{V_1, V_1\}) = 1$
• $\omega(\{V_1, V_3\}) = 3$
• $\omega(\{V_1, V_7\}) = 2$
• $\omega(\{V_3, V_3\}) = 5$
• $\omega(\{V_7, V_7\}) = 3$
Reconstructed Adjacency Matrix
     1    2    3    4    5    6    7    8    9
1    0    1    3/8  3/8  3/8  3/8  1/3  1/3  1/3
2    1    0    3/8  3/8  3/8  3/8  1/3  1/3  1/3
3    3/8  3/8  0    5/6  5/6  5/6  0    0    0
4    3/8  3/8  5/6  0    5/6  5/6  0    0    0
5    3/8  3/8  5/6  5/6  0    5/6  0    0    0
6    3/8  3/8  5/6  5/6  5/6  0    0    0    0
7    1/3  1/3  0    0    0    0    0    1    1
8    1/3  1/3  0    0    0    0    1    0    1
9    1/3  1/3  0    0    0    0    1    1    0
Example entry: $\omega(\{V_1, V_3\}) = 3$ and the maximum number of subedges between $V_1$ and $V_3$ is 8, so each entry between them is 3/8.
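The example above can be reproduced with a few lines of Python. This is only a sketch: the edge list is read off the adjacency matrix, and the helper names (`superedge_weights`, `max_subedges`, `reconstructed_entry`) are illustrative choices, not names from the SSumM code.

```python
# Edge list of the 9-node example graph (read off the adjacency matrix).
EDGES = [(1, 2), (1, 6), (1, 9), (2, 3), (2, 6), (2, 7), (3, 4), (3, 5),
         (3, 6), (3, 9), (4, 5), (5, 6), (5, 7), (7, 8), (7, 9), (8, 9)]

# Supernodes of the example summary graph.
SUPERNODES = {"V1": {1, 2}, "V3": {3, 4, 5, 6}, "V7": {7, 8, 9}}


def superedge_weights(edges, supernodes):
    """Count the subedges between every pair of supernodes."""
    owner = {v: name for name, members in supernodes.items() for v in members}
    weights = {}
    for u, v in edges:
        key = tuple(sorted((owner[u], owner[v])))
        weights[key] = weights.get(key, 0) + 1
    return weights


def max_subedges(a, b, supernodes):
    """Largest possible number of subedges between supernodes a and b."""
    na, nb = len(supernodes[a]), len(supernodes[b])
    return na * (na - 1) // 2 if a == b else na * nb


def reconstructed_entry(u, v, weights, supernodes):
    """Entry (u, v) of the adjacency matrix restored from the summary graph."""
    if u == v:
        return 0.0
    owner = {x: name for name, members in supernodes.items() for x in members}
    key = tuple(sorted((owner[u], owner[v])))
    if key not in weights:  # superedge absent, e.g. dropped by sparsification
        return 0.0
    return weights[key] / max_subedges(key[0], key[1], supernodes)


weights = superedge_weights(EDGES, SUPERNODES)
# Summary graph without the superedge {V3, V7}, as in the example.
sparse = {k: w for k, w in weights.items() if k != ("V3", "V7")}
print(weights)
print(reconstructed_entry(1, 3, sparse, SUPERNODES))
```

Dropping a superedge sets the whole corresponding block of the reconstructed matrix to 0, which is why the entries between nodes 3 to 6 and nodes 7 to 9 vanish.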
Road Map
• Introduction
• Problem <<
• Proposed Algorithm: SSumM
• Experimental Results
• Conclusions
Problem Definition: Graph Summarization
• Given: a graph $G$ and the target number of nodes $K$
• Find: a summary graph $\overline{G}$
• To minimize: the difference between graph $G$ and the graph restored from $\overline{G}$
• Subject to: the number of supernodes in $\overline{G}$ $\le K$
Shouldn't we consider sizes?
Problem Definition: Graph Summarization
• Given: a graph $G$ and the desired size $K$ (in bits)
• Find: a summary graph $\overline{G}$
• To minimize: the difference between graph $G$ and the graph restored from $\overline{G}$
• Subject to: size of $\overline{G}$ in bits $\le K$
Details: Size in Bits of a Graph
Input graph $G$
• $V$: set of nodes
• $E$: set of edges
• Each node is encoded using $\log_2 |V|$ bits
Size of graph: $2 |E| \log_2 |V|$
Details: Size in Bits of a Summary Graph
Summary graph $\overline{G}$
• $S$: set of supernodes
• $P$: set of superedges
• $\omega_{max}$: maximum superedge weight
Size of summary graph: $|P| \left( 2 \log_2 |S| + \log_2 \omega_{max} \right) + |V| \log_2 |S|$
• $2 \log_2 |S|$ per superedge: its two supernode endpoints
• $\log_2 \omega_{max}$ per superedge: its weight
• $\log_2 |S|$ per node: the supernode it belongs to
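As a rough check of the two size formulas, the sketch below evaluates them on the 9-node example. The function names are illustrative, and the numbers plugged in (16 edges; 3 supernodes, 5 superedges, maximum weight 5 for the summary without the superedge between $V_3$ and $V_7$) are read off that example.

```python
import math


def graph_size_bits(num_nodes, num_edges):
    """Size of the input graph: 2 |E| log2 |V| bits."""
    return 2 * num_edges * math.log2(num_nodes)


def summary_size_bits(num_nodes, num_supernodes, num_superedges, max_weight):
    """Size of a summary graph: |P| (2 log2 |S| + log2 w_max) + |V| log2 |S| bits."""
    per_superedge = 2 * math.log2(num_supernodes) + math.log2(max_weight)
    membership = num_nodes * math.log2(num_supernodes)
    return num_superedges * per_superedge + membership


# Example graph: |V| = 9, |E| = 16.
input_bits = graph_size_bits(9, 16)
# Example summary graph: |S| = 3, |P| = 5, maximum superedge weight 5.
summary_bits = summary_size_bits(9, 3, 5, 5)
print(f"input: {input_bits:.1f} bits, summary: {summary_bits:.1f} bits")
```

On a graph this small the membership term $|V| \log_2 |S|$ is a large share of the summary size; on massive graphs the superedge term usually dominates, which is what sparsification targets.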
Details: Error Measurement
The error compares the adjacency matrix $A$ of the input graph with the reconstructed adjacency matrix $\hat{A}$ (both shown in the example above):
$RE_p(A, \hat{A}) = \left( \sum_{i=1}^{|V|} \sum_{j=1}^{|V|} \left| A(i, j) - \hat{A}(i, j) \right|^p \right)^{1/p}$
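To make the error measure concrete, here is a small pure-Python sketch that evaluates $RE_p$ on the 9-node example. The edge list and block densities are read off the two matrices of the example; the function names are illustrative only.

```python
# Edges of the example input graph and the grouping into supernodes.
EDGES = [(1, 2), (1, 6), (1, 9), (2, 3), (2, 6), (2, 7), (3, 4), (3, 5),
         (3, 6), (3, 9), (4, 5), (5, 6), (5, 7), (7, 8), (7, 9), (8, 9)]
GROUPS = [[1, 2], [3, 4, 5, 6], [7, 8, 9]]
# Block values of the reconstructed matrix, keyed by group index pair.
# Missing pairs (here the block between groups 1 and 2) are reconstructed as 0.
DENSITY = {(0, 0): 1.0, (0, 1): 3 / 8, (0, 2): 1 / 3, (1, 1): 5 / 6, (2, 2): 1.0}
N = 9


def adjacency(edges, n):
    """Symmetric 0/1 adjacency matrix A of an undirected graph."""
    a = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        a[u - 1][v - 1] = a[v - 1][u - 1] = 1.0
    return a


def reconstructed(groups, density, n):
    """Reconstructed adjacency matrix A_hat (zero diagonal)."""
    group_of = {v: g for g, members in enumerate(groups) for v in members}
    a_hat = [[0.0] * n for _ in range(n)]
    for u in range(1, n + 1):
        for v in range(1, n + 1):
            if u != v:
                key = tuple(sorted((group_of[u], group_of[v])))
                a_hat[u - 1][v - 1] = density.get(key, 0.0)
    return a_hat


def reconstruction_error(a, a_hat, p):
    """RE_p: the entrywise l_p distance between A and A_hat."""
    total = sum(abs(x - y) ** p
                for row, row_hat in zip(a, a_hat)
                for x, y in zip(row, row_hat))
    return total ** (1 / p)


A = adjacency(EDGES, N)
A_HAT = reconstructed(GROUPS, DENSITY, N)
re1 = reconstruction_error(A, A_HAT, 1)
re2 = reconstruction_error(A, A_HAT, 2)
print(re1, re2)
```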
Road Map
• Introduction
• Problem
• Proposed Algorithm: SSumM <<
• Experimental Results
• Conclusions
Main ideas of SSumM
• Combines node grouping and edge sparsification
• Prunes search space
• Balances error and size of the summary graph using MDL principle
• Practical graph summarization problem
◦ Given: a graph $G$
◦ Find: a summary graph $\overline{G}$
◦ To minimize: the difference between $G$ and the graph restored from $\overline{G}$
◦ Subject to: size of $\overline{G}$ in bits $\le K$
Main Idea: Combining Two Strategies
Node Grouping + Sparsification
Main Idea: MDL Principle
Graph Summarization is a Search Problem
(Figure: a search tree over summary graphs of nodes 1 to 6; each branch is a merge action such as Merge(1, 2), Merge(1, 3), Merge(5, 6), or Merge(1, {5,6}).)
How to choose the next action?
• Criterion: summary graph size + information loss
• MDL Principle: $\arg\min_{\overline{G}} \; Cost(\overline{G}) + Cost(G \mid \overline{G})$
◦ $Cost(\overline{G})$: # bits for $\overline{G}$
◦ $Cost(G \mid \overline{G})$: # bits for representing $G$ using $\overline{G}$
Overview: SSumM
• Given:
◦ (1) An input graph $G$, (2) the desired size $K$, (3) the number $T$ of iterations
• Outputs:
◦ Summary graph $\overline{G}$
Procedure
• Initialization phase
• $t = 1$
• While $t{+}{+} \le T$ and $K <$ size of $\overline{G}$ in bits
◦ Candidate generation phase
◦ Merge and sparsification phase
• Further sparsification phase
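The control flow of the procedure can be sketched as below. This is a skeleton only: the four callables (`size_in_bits`, `generate_candidates`, `merge_and_sparsify`, `further_sparsify`) are hypothetical placeholders for the phases named above, and the summary is treated as an opaque value that each phase returns in updated form.

```python
def ssumm_skeleton(summary, target_bits, max_iters, size_in_bits,
                   generate_candidates, merge_and_sparsify, further_sparsify):
    """Control flow of the procedure; every phase is supplied by the caller.

    summary: the initial summary graph (one supernode per node)
    target_bits: the desired size K in bits
    max_iters: the number T of iterations
    """
    t = 1
    # "While t++ <= T and K < size of summary in bits"
    while t <= max_iters and target_bits < size_in_bits(summary):
        for candidate_set in generate_candidates(summary, t):
            summary = merge_and_sparsify(summary, candidate_set, t)
        t += 1
    # Final phase: runs after the loop (assumed to do nothing if the size already fits).
    return further_sparsify(summary, target_bits)


# Toy usage: the "summary" is just its size in bits and each merge saves 1 bit.
result = ssumm_skeleton(
    summary=10, target_bits=7, max_iters=20,
    size_in_bits=lambda s: s,
    generate_candidates=lambda s, t: [None],
    merge_and_sparsify=lambda s, c, t: s - 1,
    further_sparsify=lambda s, k: min(s, k),
)
print(result)
```

The loop stops early once the summary fits in the budget, so the iteration count acts as an upper bound rather than a fixed amount of work.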
Initialization Phase
Input graph $G$ with nodes a, b, c, d, e, f, g, h, i. Each node starts as its own supernode:
A = {a}, B = {b}, C = {c}, D = {d}, E = {e}, F = {f}, G = {g}, H = {h}, I = {i}
Candidate Generation Phase
The supernodes are grouped into candidate sets (the following slides use the candidate set {A, B, C, D, E}).
Merging and Sparsification Phase
For each candidate set $C$:
• Among the possible candidate pairs, sample $\log_2 |C|$ pairs
(A, B) (B, C) (C, D) (D, E)
(A, C) (B, D) (C, E)
(A, D) (B, E)
(A, E)
Sampled pairs in the example: (A, B) (A, D) (C, D)
• Select the pair with the greatest (relative) reduction in the cost function
• if reduction(C, D) > $\theta$:
merge(C, D)
else:
sample $\log_2 |C|$ pairs again
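The sample, select, and threshold steps above can be sketched as follows. Several details are assumptions made only so the sketch terminates and runs: the cost reduction is a caller-supplied function `relative_reduction`, the sample size is rounded up, and the loop gives up after `max_failures` consecutive rejected samples (the slides do not state a stopping rule).

```python
import math
import random


def merge_within_candidate_set(candidates, relative_reduction, theta, rng,
                               max_failures=10):
    """Greedily merge supernodes (frozensets of nodes) inside one candidate set."""
    candidates = list(candidates)
    failures = 0
    while len(candidates) > 1 and failures < max_failures:
        pairs = [(a, b) for i, a in enumerate(candidates)
                 for b in candidates[i + 1:]]
        # Sample about log2 |C| pairs (rounded up: an assumption).
        k = min(len(pairs), max(1, math.ceil(math.log2(len(candidates)))))
        sampled = rng.sample(pairs, k)
        # Keep the sampled pair with the greatest reduction in cost.
        best = max(sampled, key=lambda pair: relative_reduction(*pair))
        if relative_reduction(*best) > theta:
            a, b = best
            candidates.remove(a)
            candidates.remove(b)
            candidates.append(a | b)  # merged supernode
            failures = 0
        else:
            failures += 1  # "sample log2 |C| pairs again"
    return candidates


# Toy usage with a made-up reduction function that always favours merging.
start = [frozenset({x}) for x in "abcde"]
merged = merge_within_candidate_set(start, lambda a, b: 1.0, theta=0.5,
                                    rng=random.Random(0))
print(merged)
```

Sampling only about $\log_2 |C|$ pairs instead of scoring all $|C|(|C|-1)/2$ pairs is what keeps each step cheap; the threshold $\theta$ guards against merging a pair that was merely the best of a poor sample.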
Merging and Sparsification Phase (cont.)
Summary graph $\overline{G}$
• Before the merge: {a}, {b}, C = {c}, D = {d}, {e}, {f}, {g}, {h}, {i}
• After merge(C, D): {a}, {b}, C = {c, d}, {e}, {f}, {g}, {h}, {i}
• Sparsify or not according to total description cost
Repetition
Different candidate sets and decreasing threshold $\theta$ over iterations
Summary graph $\overline{G}$ over successive iterations:
• {a}, {b}, {c, d}, {e}, {f}, {g}, {h}, {i}
• {a}, {b}, {c, d}, {e}, {f}, {h}, {g, i}
• {a}, {b}, {c, d}, {e}, {f}, {g, h, i}
• {a, e}, {b}, {c, d}, {f}, {g, h, i}
Further Sparsification Phase
Summary graph $\overline{G}$ with supernodes A = {a, e}, B = {b}, C = {c, d}, F = {f}, G = {g, h, i}
• Superedges sorted by $\Delta RE_p$: (A, A), (B, A), (F, A), (G, G), (C, C)
• Target: size of $\overline{G}$ in bits $\le K$
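One way to read this phase is as a budgeted pruning loop, sketched below. The removal order (smallest $\Delta RE_p$ first, i.e. the superedge whose removal increases the error least) is an assumption, as are the function names and the made-up numbers in the usage example.

```python
def further_sparsify(superedges, delta_error, size_in_bits, target_bits):
    """Drop superedges until the summary fits in target_bits.

    delta_error(e): assumed increase in RE_p caused by dropping superedge e
    size_in_bits(kept): size of the summary that keeps the given superedges
    """
    # Sort so that the cheapest-to-drop superedge sits at the end of the list.
    kept = sorted(superedges, key=delta_error, reverse=True)
    while kept and size_in_bits(kept) > target_bits:
        kept.pop()  # removes the superedge with the smallest delta_error
    return kept


# Toy usage with hypothetical error increases and a hypothetical size model
# (20 bits of membership information plus 10 bits per superedge).
DELTA = {("A", "A"): 0.1, ("B", "A"): 0.2, ("F", "A"): 0.3,
         ("G", "G"): 0.4, ("C", "C"): 0.5}
kept = further_sparsify(list(DELTA), DELTA.get,
                        lambda edges: 20 + 10 * len(edges), target_bits=45)
print(kept)
```

Because the loop only checks the size after each removal, the result is the largest prefix of the sorted superedges that fits within the budget.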
Road Map
• Introduction
• Problem
• Proposed Algorithm: SSumM
• Experimental Results <<
• Conclusions
Experiment Settings
• 10 datasets from 6 domains (up to 0.8B edges)
◦ Social, Internet, Email, Co-purchase, Collaboration, Hyperlinks
• Three competitors for graph summarization
◦ k-Gs [LT10]
◦ S2L [RSB17]
◦ SAA-Gs [BAZK18]
SSumM Gives Concise and Accurate Summary
(Figure: one panel per dataset: Email-Enron, Caida, Ego-Facebook, Web-UK-05, Web-UK-02, LiveJournal, DBLP, Amazon-0302, Skitter. Methods compared: SSumM, k-Gs, S2L, SAA-Gs, SAA-Gs (linear sample). o.o.t.: > 12 hours; o.o.m.: > 64GB.)
SSumM is Fast
(Figure: one panel per dataset: Email-Enron, Caida, Ego-Facebook, Web-UK-05, Web-UK-02, LiveJournal, Amazon-0601, DBLP, Skitter. Methods compared: SSumM, k-Gs, S2L, SAA-Gs, SAA-Gs (linear sample).)
SSumM is Scalable
SSumM Converges Fast
Road Map
• Introduction
• Problem
• Proposed Algorithm: SSumM
• Experimental Results
• Conclusions <<
Conclusions
• Practical Problem Formulation
• Scalable and Effective Algorithm Design
• Extensive Experiments on 10 real-world graphs: Concise & Accurate, Fast, Scalable
Code available at https://github.com/KyuhanLee/SSumM
"SSumM: Sparse Summarization of Massive Graphs", KDD 2020

  • 1. SSumM : Sparse Summarization of Massive Graphs Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin
  • 2. Graphs are Everywhere: citation networks, subway networks, Internet topologies. Introduction Problem Algorithms Experiments Conclusion. Designed by Freepik from Flaticon
  • 3. Massive Graphs Appeared: social networks (2.49 billion active users), purchase histories (0.5 billion products), World Wide Web (5.49 billion web pages)
  • 4. Difficulties in Analyzing Massive Graphs: computational cost (number of nodes & edges)
  • 5. Difficulties in Analyzing Massive Graphs. Input graph: cannot fit
  • 6. Solution: Graph Summarization. Summary graph: can fit
  • 7. Advantages of Graph Summarization • Many graph compression techniques are available ◦ The WebGraph Framework [BV04] ◦ BFS encoding [AD09] ◦ SlashBurn [KF11] ◦ VoG [KKVF14] • Graph summarization stands out because it is ◦ Elastic: the output size can be reduced as much as we want ◦ Analyzable: existing graph analysis methods and tools can be applied ◦ Combinable for additional compression: the output can be further compressed
  • 8. Example of Graph Summarization. Input graph with nodes 1 to 9 and its adjacency matrix:
          1 2 3 4 5 6 7 8 9
        1 0 1 0 0 0 1 0 0 1
        2 1 0 1 0 0 1 1 0 0
        3 0 1 0 1 1 1 0 0 1
        4 0 0 1 0 1 0 0 0 0
        5 0 0 1 1 0 1 1 0 0
        6 1 1 1 0 1 0 0 0 0
        7 0 1 0 0 1 0 0 1 1
        8 0 0 0 0 0 0 1 0 1
        9 1 0 1 0 0 0 1 1 0
  • 9. Example of Graph Summarization (input graph and adjacency matrix as on slide 8). Summary graph: the subnodes are grouped into the supernodes $V_1 = \{1,2\}$, $V_3 = \{3,4,5,6\}$, $V_7 = \{7,8,9\}$. Legend: subnode, subedge, supernode
  • 10. Example of Graph Summarization (input graph and adjacency matrix as on slide 8). Summary graph with supernodes $V_1 = \{1,2\}$, $V_3 = \{3,4,5,6\}$, $V_7 = \{7,8,9\}$ and a weighted superedge: $\omega(\{V_1, V_3\}) = 3$. Legend: subnode, subedge, superedge, supernode
  • 11. Example of Graph Summarization (input graph and adjacency matrix as on slide 8). Summary graph with supernodes $V_1 = \{1,2\}$, $V_3 = \{3,4,5,6\}$, $V_7 = \{7,8,9\}$ and superedge weights $\omega(\{V_1, V_7\}) = 2$, $\omega(\{V_1, V_3\}) = 3$, $\omega(\{V_3, V_7\}) = 2$, $\omega(\{V_1, V_1\}) = 1$, $\omega(\{V_7, V_7\}) = 3$, $\omega(\{V_3, V_3\}) = 5$
  • 12. Example of Graph Summarization (input graph and adjacency matrix as on slide 8). Summary graph with supernodes $V_1 = \{1,2\}$, $V_3 = \{3,4,5,6\}$, $V_7 = \{7,8,9\}$; the superedge $\{V_3, V_7\}$ no longer appears, leaving $\omega(\{V_1, V_7\}) = 2$, $\omega(\{V_1, V_3\}) = 3$, $\omega(\{V_1, V_1\}) = 1$, $\omega(\{V_7, V_7\}) = 3$, $\omega(\{V_3, V_3\}) = 5$
  • 13. Example of Graph Summarization. Summary graph: $V_1 = \{1,2\}$, $V_3 = \{3,4,5,6\}$, $V_7 = \{7,8,9\}$ with $\omega(\{V_1, V_7\}) = 2$, $\omega(\{V_1, V_3\}) = 3$, $\omega(\{V_1, V_1\}) = 1$, $\omega(\{V_7, V_7\}) = 3$, $\omega(\{V_3, V_3\}) = 5$. Reconstructed adjacency matrix:
            1    2    3    4    5    6    7    8    9
        1   0    1   3/8  3/8  3/8  3/8  1/3  1/3  1/3
        2   1    0   3/8  3/8  3/8  3/8  1/3  1/3  1/3
        3  3/8  3/8   0   5/6  5/6  5/6   0    0    0
        4  3/8  3/8  5/6   0   5/6  5/6   0    0    0
        5  3/8  3/8  5/6  5/6   0   5/6   0    0    0
        6  3/8  3/8  5/6  5/6  5/6   0    0    0    0
        7  1/3  1/3   0    0    0    0    0    1    1
        8  1/3  1/3   0    0    0    0    1    0    1
        9  1/3  1/3   0    0    0    0    1    1    0
  • 14. Example of Graph Summarization (summary graph and reconstructed adjacency matrix as on slide 13). Each reconstructed entry is the superedge weight divided by the maximum number of subedges between the two supernodes, e.g. for nodes in $V_1$ and $V_3$: $\omega(\{V_1, V_3\}) / (\text{maximum number of subedges}) = 3/8$
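The worked example on slides 8 to 14 can be reproduced with a short sketch. It assumes, as the slides suggest, that a superedge weight counts the subedges between two supernodes and that a reconstructed entry is that weight divided by the maximum possible number of subedges. The function names and the edge list (read off the adjacency matrix) are illustrative and not taken from the authors' code.

```python
from fractions import Fraction
from itertools import combinations

# Undirected edges of the 9-node example, read off the adjacency matrix.
EDGES = [(1, 2), (1, 6), (1, 9), (2, 3), (2, 6), (2, 7), (3, 4), (3, 5),
         (3, 6), (3, 9), (4, 5), (5, 6), (5, 7), (7, 8), (7, 9), (8, 9)]
SUPERNODES = {"V1": {1, 2}, "V3": {3, 4, 5, 6}, "V7": {7, 8, 9}}


def superedge_weights(edges, supernodes):
    """Count the subedges between (and within) each pair of supernodes."""
    owner = {v: name for name, members in supernodes.items() for v in members}
    weights = {}
    for u, v in edges:
        key = frozenset((owner[u], owner[v]))  # one element for a self-loop
        weights[key] = weights.get(key, 0) + 1
    return weights


def max_subedges(a, b, supernodes):
    """Maximum possible number of subedges between supernodes a and b."""
    na, nb = len(supernodes[a]), len(supernodes[b])
    return na * (na - 1) // 2 if a == b else na * nb


def reconstruct(weights, supernodes):
    """Reconstructed entry for each node pair: weight / max number of subedges."""
    owner = {v: name for name, members in supernodes.items() for v in members}
    recon = {}
    for u, v in combinations(sorted(owner), 2):
        a, b = owner[u], owner[v]
        w = weights.get(frozenset((a, b)), 0)  # missing superedge counts as 0
        recon[(u, v)] = Fraction(w, max_subedges(a, b, supernodes))
    return recon


weights = superedge_weights(EDGES, SUPERNODES)
# Drop the superedge {V3, V7}, as on slide 12.
sparse = {k: w for k, w in weights.items() if k != frozenset(("V3", "V7"))}
recon = reconstruct(sparse, SUPERNODES)
print(recon[(1, 3)], recon[(1, 7)], recon[(3, 4)], recon[(3, 7)])
```

Exact fractions are used so the values can be compared directly with the reconstructed matrix on slide 13.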
  • 15. Road Map • Introduction • Problem << • Proposed Algorithm: SSumM • Experimental Results • Conclusions
  • 16. Problem Definition: Graph Summarization • Given: a graph $G$ and the target number of nodes $K$ • Find: a summary graph $\overline{G}$ • To minimize: the difference between graph $G$ and the restored graph $\overline{G}$ • Subject to: the number of supernodes in $\overline{G} \le K$
  • 17. Problem Definition: Graph Summarization • Given: a graph $G$ and the target number of nodes $K$ • Find: a summary graph $\overline{G}$ • To minimize: the difference between graph $G$ and the restored graph $\overline{G}$ • Subject to: the number of supernodes in $\overline{G} \le K$ • Shouldn't we consider sizes?
  • 18. Problem Definition: Graph Summarization • Given: a graph $G$ and the desired size $K$ (in bits) • Find: a summary graph $\overline{G}$ • To minimize: the difference between graph $G$ and the restored graph $\overline{G}$ • Subject to: size of $\overline{G}$ in bits $\le K$
  • 19. Details: Size in Bits of a Graph • Input graph $G$ ◦ $E$: set of edges ◦ $V$: set of nodes, each encoded using $\log_2 |V|$ bits • Size of graph: $2 |E| \log_2 |V|$
  • 20. Details: Size in Bits of a Summary Graph • Summary graph $\overline{G}$ ◦ $S$: set of supernodes ◦ $P$: set of superedges ◦ $\omega_{max}$: maximum superedge weight • Size of summary graph: $|P| \left( 2 \log_2 |S| + \log_2 \omega_{max} \right) + |V| \log_2 |S|$
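The two size formulas can be evaluated directly. The sketch below plugs in the counts of the 9-node example from the earlier slides (16 edges; 3 supernodes, 5 superedges, maximum weight 5). The function names are illustrative, and reading the last term as the cost of storing each node's supernode membership is an interpretation, not a statement from the slides.

```python
import math


def graph_size_bits(num_edges, num_nodes):
    """2|E| log2|V|: two endpoints per edge, log2|V| bits per endpoint."""
    return 2 * num_edges * math.log2(num_nodes)


def summary_size_bits(num_superedges, num_supernodes, max_weight, num_nodes):
    """|P|(2 log2|S| + log2 w_max) + |V| log2|S|.

    Assumes at least one superedge, so that max_weight is defined.
    """
    per_superedge = 2 * math.log2(num_supernodes) + math.log2(max_weight)
    membership = num_nodes * math.log2(num_supernodes)  # assumed meaning
    return num_superedges * per_superedge + membership


input_bits = graph_size_bits(num_edges=16, num_nodes=9)
summary_bits = summary_size_bits(num_superedges=5, num_supernodes=3,
                                 max_weight=5, num_nodes=9)
print(f"input graph: {input_bits:.1f} bits, summary graph: {summary_bits:.1f} bits")
```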
  • 24. Details: Error Measurement • Adjacency matrix $A$ of the input graph (slide 8) and reconstructed adjacency matrix $\hat{A}$ (slide 13) • $RE_p(A, \hat{A}) = \left( \sum_{i=1}^{|V|} \sum_{j=1}^{|V|} \left| A(i,j) - \hat{A}(i,j) \right|^p \right)^{1/p}$
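A minimal sketch of the $RE_p$ formula on plain nested lists follows. The function name and the tiny 2-node matrices are made up for illustration; they are not the matrices from the slides.

```python
def reconstruction_error(a, a_hat, p=1):
    """RE_p: p-norm of the entrywise difference between A and its reconstruction."""
    total = sum(abs(x - y) ** p
                for row, row_hat in zip(a, a_hat)
                for x, y in zip(row, row_hat))
    return total ** (1 / p)


# Hypothetical 2-node example: one edge reconstructed with weight 1/2.
a = [[0, 1], [1, 0]]
a_hat = [[0, 0.5], [0.5, 0]]
re1 = reconstruction_error(a, a_hat, p=1)
re2 = reconstruction_error(a, a_hat, p=2)
print(re1, re2)
```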
  • 25. Road Map • Introduction • Problem • Proposed Algorithm: SSumM << • Experimental Results • Conclusions
  • 26. Main Ideas of SSumM • Practical graph summarization problem ◦ Given: a graph $G$ ◦ Find: a summary graph $\overline{G}$ ◦ To minimize: the difference between $G$ and the restored graph $\overline{G}$ ◦ Subject to: size of $\overline{G}$ in bits $\le K$ • Combines node grouping and edge sparsification • Prunes the search space • Balances error and size of the summary graph using the MDL principle
  • 27. Main Idea: Combining Two Strategies: node grouping and sparsification
  • 31. Main Idea: MDL Principle. [Figure: search tree of candidate actions on a 6-node graph, e.g. Merge(1, 2), Merge(1, 3), Merge(5, 6), Merge(1, {5,6})] How to choose the next action?
  • 32. Main Idea: MDL Principle. [Figure: search tree of candidate merge actions, as on slide 31] Graph summarization is a search problem. How to choose the next action?
  • 33. Main Idea: MDL Principle. [Figure: search tree of candidate merge actions, as on slide 31] Graph summarization is a search problem. How to choose the next action? Summary graph size + information loss
  • 34. Main Idea: MDL Principle. [Figure: search tree of candidate merge actions, as on slide 31] Graph summarization is a search problem. How to choose the next action? Summary graph size + information loss, following the MDL principle: $\arg\min_{\overline{G}} \; Cost(\overline{G}) + Cost(G \mid \overline{G})$, where $Cost(\overline{G})$ is the number of bits for $\overline{G}$ and $Cost(G \mid \overline{G})$ is the number of bits for representing $G$ using $\overline{G}$
  • 35. Overview: SSumM • Given: ◦ (1) an input graph $G$, (2) the desired size $K$, (3) the number $T$ of iterations • Outputs: ◦ summary graph $\overline{G}$ • Procedure: ◦ Initialization phase ◦ $t = 1$ ◦ While $t{+}{+} \le T$ and $K <$ size of $\overline{G}$ in bits: candidate generation phase, then merge and sparsification phase ◦ Further sparsification phase
  • 36. Initialization Phase (procedure step: initialization phase). Input graph $G$ with nodes $a, b, c, d, e, f, g, h, i$; in the summary graph $\overline{G}$ each node starts as its own supernode: $A = \{a\}$, $B = \{b\}$, $C = \{c\}$, $D = \{d\}$, $E = \{e\}$, $F = \{f\}$, $G = \{g\}$, $H = \{h\}$, $I = \{i\}$
  • 37. Candidate Generation Phase (procedure step: candidate generation phase). [Figure: input graph $G$ with the singleton supernodes $A = \{a\}$, ..., $I = \{i\}$]
  • 38. Merging and Sparsification Phase (procedure step: merge and sparsification phase). For each candidate set $C$, here with supernodes $A = \{a\}$, $B = \{b\}$, $C = \{c\}$, $D = \{d\}$, $E = \{e\}$: among the possible candidate pairs (A, B), (A, C), (A, D), (A, E), (B, C), (B, D), (B, E), (C, D), (C, E), (D, E)
  • 39. Merging and Sparsification Phase (procedure step: merge and sparsification phase). For each candidate set $C$, here with supernodes $A = \{a\}$, $B = \{b\}$, $C = \{c\}$, $D = \{d\}$, $E = \{e\}$: among the possible candidate pairs (A, B), (A, C), (A, D), (A, E), (B, C), (B, D), (B, E), (C, D), (C, E), (D, E), sample $\log_2 |C|$ pairs
  • 40. Merging and Sparsification Phase (procedure step: merge and sparsification phase). Sampled pairs: (A, B), (A, D), (C, D). Select the pair with the greatest (relative) reduction in the cost function: if reduction(C, D) $> \theta$: merge(C, D); else: sample $\log_2 |C|$ pairs again
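The sample-and-select step on this slide can be sketched as below. The relative cost reduction is passed in as a function, because the slides do not spell out how it is computed. Rounding $\log_2 |C|$ up, and the `max_rounds` cap on resampling, are assumptions added so the sketch terminates; all names are hypothetical.

```python
import math
import random


def pick_merge(candidate_set, relative_reduction, theta, rng, max_rounds=10):
    """Return the pair of supernodes to merge, or None if no sampled pair passes theta."""
    pairs = [(a, b) for i, a in enumerate(candidate_set)
             for b in candidate_set[i + 1:]]
    if not pairs:
        return None
    # Sample about log2|C| pairs per round (rounded up: an assumption).
    sample_size = max(1, min(len(pairs), math.ceil(math.log2(len(candidate_set)))))
    for _ in range(max_rounds):  # the cap on resampling is an assumption
        sampled = rng.sample(pairs, sample_size)
        best = max(sampled, key=relative_reduction)
        if relative_reduction(best) > theta:
            return best
    return None


def toy_reduction(pair):
    """Made-up stand-in for the cost reduction: only (C, D) is worth merging."""
    return 0.9 if pair == ("C", "D") else 0.1


candidates = ["A", "B", "C", "D", "E"]
chosen = pick_merge(candidates, toy_reduction, theta=0.5,
                    rng=random.Random(0), max_rounds=1000)
print(chosen)
```

Sampling a logarithmic number of pairs avoids evaluating the cost reduction for all $|C|(|C|-1)/2$ pairs of a candidate set.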
  • 42. Merging and Sparsification Phase (cont.) Summary graph $\overline{G}$: $\{a\}$, $\{b\}$, $C = \{c\}$, $D = \{d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$, with $C$ and $D$ selected for merging
  • 43. Merging and Sparsification Phase (cont.) Summary graph $\overline{G}$ with $C$ and $D$ taken out: $\{a\}$, $\{b\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$
  • 44. Merging and Sparsification Phase (cont.) Summary graph $\overline{G}$ with the merged supernode: $\{a\}$, $\{b\}$, $C = \{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$
  • 46. Merging and Sparsification Phase (cont.) Summary graph $\overline{G}$: $\{a\}$, $\{b\}$, $C = \{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$. Sparsify or not according to the total description cost
  • 49. Repetition (procedure step: the loop condition $t{+}{+} \le T$). Summary graph $\overline{G}$: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$
  • 50. Repetition. Summary graph $\overline{G}$: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$. Different candidate sets and a decreasing threshold $\theta$ over iterations
  • 51. Repetition. Different candidate sets and a decreasing threshold $\theta$ over iterations. Summary graph $\overline{G}$ shown twice (previous and current iteration), both: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$
  • 52. Repetition. Different candidate sets and a decreasing threshold $\theta$ over iterations. Previous summary graph: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$. Current summary graph: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{h\}$, $\{g, i\}$
  • 55. Repetition. Different candidate sets and a decreasing threshold $\theta$ over iterations. Previous summary graph: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$. Current summary graph: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g, h, i\}$
  • 57. Repetition. Different candidate sets and a decreasing threshold $\theta$ over iterations. Earlier summary graphs: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$, then $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g, h, i\}$. Current summary graph: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g, h, i\}$
  • 58. Repetition. Different candidate sets and a decreasing threshold $\theta$ over iterations. Earlier summary graphs: $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g\}$, $\{h\}$, $\{i\}$, then $\{a\}$, $\{b\}$, $\{c, d\}$, $\{e\}$, $\{f\}$, $\{g, h, i\}$. Current summary graph: $\{a, e\}$, $\{b\}$, $\{c, d\}$, $\{f\}$, $\{g, h, i\}$
  • 60. Further Sparsification Phase (procedure step: further sparsification phase). Summary graph $\overline{G}$ with supernodes $A = \{a, e\}$, $B = \{b\}$, $C = \{c, d\}$, $F = \{f\}$, $G = \{g, h, i\}$. [Figure: list of superedges among $A$, $B$, $C$, $F$, $G$, sorted by $\Delta RE_p$] Target: size of $\overline{G}$ in bits $\le K$
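One way to read this phase is sketched below: superedges are visited in order of $\Delta RE_p$ and dropped until the summary fits in $K$ bits, using the summary-size formula from slide 20. Dropping the superedges with the smallest error increase first is an assumption (the slide only says they are sorted), and the function names, weights, and $\Delta RE_p$ values are made up for illustration.

```python
import math


def summary_size_bits(superedges, num_supernodes, num_nodes):
    """|P|(2 log2|S| + log2 w_max) + |V| log2|S|, with P possibly empty."""
    membership = num_nodes * math.log2(num_supernodes)
    if not superedges:
        return membership
    max_weight = max(weight for weight, _ in superedges.values())
    per_superedge = 2 * math.log2(num_supernodes) + math.log2(max_weight)
    return len(superedges) * per_superedge + membership


def further_sparsify(superedges, num_supernodes, num_nodes, budget_bits):
    """Drop superedges in ascending order of error increase until the size fits.

    `superedges` maps a supernode pair to (weight, delta_re), where delta_re is
    the (hypothetical, precomputed) increase in RE_p if that superedge is dropped.
    """
    kept = dict(superedges)
    for key in sorted(kept, key=lambda k: kept[k][1]):
        if summary_size_bits(kept, num_supernodes, num_nodes) <= budget_bits:
            break
        del kept[key]
    return kept


# Made-up weights and error increases for a 5-supernode, 9-node summary.
superedges = {("A", "A"): (1, 0.5), ("A", "B"): (2, 0.1),
              ("C", "C"): (1, 0.9), ("G", "G"): (3, 2.0)}
kept = further_sparsify(superedges, num_supernodes=5, num_nodes=9, budget_bits=40)
print(sorted(kept))
```

If even the empty set of superedges exceeds the budget, this sketch returns an empty dictionary, since the membership term cannot be reduced by sparsification alone.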
  • 61. Road Map • Introduction • Problem • Proposed Algorithm: SSumM • Experimental Results << • Conclusions
  • 62. Experiment Settings • 10 datasets from 6 domains (up to 0.8B edges): social, Internet, email, co-purchase, collaboration, hyperlinks • Three competitors for graph summarization ◦ k-Gs [LT10] ◦ S2L [RSB17] ◦ SAA-Gs [BAZK18]
  • 63. SSumM Gives Concise and Accurate Summary. [Figure: one panel per dataset (Email-Enron, Caida, Ego-Facebook, Web-UK-05, Web-UK-02, LiveJournal, DBLP, Amazon-0302, Skitter); methods: SSumM, k-Gs, S2L, SAA-Gs, SAA-Gs (linear sample); o.o.t.: > 12 hours; o.o.m.: > 64 GB]
  • 65. SSumM is Fast. [Figure: one panel per dataset (Email-Enron, Caida, Ego-Facebook, Web-UK-05, Web-UK-02, LiveJournal, Amazon-0601, DBLP, Skitter); methods: SSumM, k-Gs, S2L, SAA-Gs, SAA-Gs (linear sample)]
  • 67. SSumM is Scalable
  • 68. SSumM Converges Fast
  • 69. Road Map • Introduction • Problem • Proposed Algorithm: SSumM • Experimental Results • Conclusions <<
  • 70. Conclusions • Practical problem formulation • Scalable and effective algorithm design • Extensive experiments on 10 real-world graphs: concise & accurate, fast, scalable • Code available at https://github.com/KyuhanLee/SSumM
  • 71. SSumM : Sparse Summarization of Massive Graphs Kyuhan Lee* Hyeonsoo Jo* Jihoon Ko Sungsu Lim Kijung Shin