Ho-Beom Kim
Network Science Lab
Dept. of Mathematics
The Catholic University of Korea
E-mail: hobeom2001@catholic.ac.kr
2023 / 08 / 07
HAMILTON, William; YING, Zhitao; LESKOVEC, Jure. Inductive representation learning on large graphs.
Advances in Neural Information Processing Systems, 2017.
Introduction
Problem Statements
1. The basic idea behind node embedding approaches is to use dimensionality reduction techniques to
distill the high-dimensional information about a node’s graph neighborhood into a dense vector
embedding.
2. Previous works have focused on embedding nodes from a single fixed graph, but many real-world
applications require embeddings to be generated quickly for unseen nodes, or for entirely new
(sub)graphs.
3. The inductive node embedding problem is especially difficult, compared to the transductive setting,
because generalizing to unseen nodes requires “aligning” newly observed subgraphs to the node
embeddings that the algorithm has already optimized on.
4. Most existing approaches to generating node embeddings are inherently transductive.
Introduction
Contributions
1. They propose a general framework, called GraphSAGE, for inductive node embedding
2. They leverage node features in order to learn an embedding function that generalizes to unseen nodes
3. Their algorithm can be applied to graphs without node features
4. Across domains, their supervised approach improves classification F1-scores by an average of 51%
compared to using node features alone, and GraphSAGE consistently outperforms a strong transductive baseline
Methodology
Visual illustration of the GraphSAGE sample and aggregate approach
Methodology
GraphSAGE embedding generation algorithm
Methodology
GraphSAGE embedding generation algorithm
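The forward pass of the embedding generation algorithm can be sketched as follows. This is an illustrative NumPy version assuming a mean aggregator and a ReLU nonlinearity; the `features`, `neighbors`, and `weights` inputs are hypothetical names, not the authors' implementation.

```python
import numpy as np

def graphsage_forward(features, neighbors, weights, K):
    """Sketch of GraphSAGE's embedding generation (Algorithm 1),
    minibatching omitted. Assumes a mean aggregator.

    features  : dict node -> np.ndarray, input feature vectors x_v
    neighbors : dict node -> list of (sampled) neighbor nodes
    weights   : list of K weight matrices W^k
    """
    h = {v: x for v, x in features.items()}  # h^0_v = x_v
    for k in range(K):
        h_next = {}
        for v in h:
            # AGGREGATE: mean of sampled neighbors' previous representations
            agg = np.mean([h[u] for u in neighbors[v]], axis=0)
            # CONCAT current representation with aggregated neighborhood,
            # then apply the layer's weight matrix and nonlinearity
            z = np.concatenate([h[v], agg]) @ weights[k]
            z = np.maximum(z, 0.0)  # ReLU
            # L2-normalize the representation
            h_next[v] = z / (np.linalg.norm(z) + 1e-12)
        h = h_next
    return h  # z_v = h^K_v
```

At inference time the same loop runs for an unseen node using its sampled neighborhood, which is what makes the approach inductive.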
Methodology
Neighborhood definition
• They uniformly sample a fixed-size set of neighbors, instead of using full neighborhood sets
• They define N(v) as a fixed-size, uniform draw from the set {u ∈ V : (u, v) ∈ E}, and they draw a
different uniform sample at each iteration k
• Without this sampling, the memory and expected runtime of a single batch are unpredictable and in the
worst case O(|V|)
• With sampling, the per-batch space and time complexity of GraphSAGE is fixed at O(∏_{i=1}^{K} S_i),
where the S_i, i ∈ {1, ..., K}, and K are user-specified constants
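A minimal sketch of this fixed-size uniform neighborhood draw; the `adj` adjacency dict and the with-replacement fallback when a node has fewer than S neighbors are illustrative assumptions.

```python
import random

def sample_neighbors(adj, v, S, rng=random):
    """Fixed-size uniform draw N(v) of S neighbors of node v.

    adj : dict node -> list of neighbor nodes
    When the true neighborhood is smaller than S, sample with
    replacement so that every call returns exactly S neighbors.
    """
    nbrs = adj[v]
    if len(nbrs) >= S:
        return rng.sample(nbrs, S)                    # without replacement
    return [rng.choice(nbrs) for _ in range(S)]      # with replacement
```

Calling this afresh at each iteration k yields the different uniform samples described above.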
Methodology
Learning the parameters of GraphSAGE
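The graph-based unsupervised loss, J(z_u) = −log σ(z_u⊤ z_v) − Q · E_{v_n ∼ P_n}[log σ(−z_u⊤ z_{v_n})], can be sketched numerically. The function below is an illustrative NumPy version of this objective, not the authors' code; `z_neg` stands in for embeddings of negative samples drawn from P_n.

```python
import numpy as np

def unsupervised_loss(z_u, z_v, z_neg, Q=1.0):
    """Graph-based unsupervised loss for a node pair (u, v), where v
    co-occurs with u on a fixed-length random walk and z_neg holds
    the embeddings of Q-weighted negative samples."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    # Pull co-occurring nodes together...
    pos = -np.log(sigmoid(z_u @ z_v))
    # ...and push negative samples apart
    neg = -Q * np.mean([np.log(sigmoid(-z_u @ zn)) for zn in z_neg])
    return pos + neg
```

The loss is smaller when z_u aligns with z_v and points away from the negative samples, which is what gradient descent on the aggregator parameters optimizes.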
Methodology
Aggregator Architectures
• Mean aggregator
• LSTM aggregator
• Pooling aggregator
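The mean and pooling aggregators can be sketched as simple NumPy functions; the parameters `W` and `b` are illustrative, and the LSTM aggregator is omitted since it requires a trained recurrent network applied to a random permutation of the neighbors.

```python
import numpy as np

def mean_aggregator(h_nbrs):
    """Element-wise mean of the neighbor vectors: symmetric and
    parameter-free (in this simplified sketch)."""
    return np.mean(h_nbrs, axis=0)

def pooling_aggregator(h_nbrs, W, b):
    """Each neighbor vector is independently passed through a
    one-layer MLP with ReLU, then element-wise max-pooled, so the
    result is symmetric in the neighbor ordering."""
    transformed = np.maximum(h_nbrs @ W + b, 0.0)
    return transformed.max(axis=0)
```

Both functions are permutation invariant, which is the property the LSTM aggregator lacks without the random-permutation trick.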
Experiments
Datasets
• Citation data
• Reddit data
Experiments
Prediction results for the three datasets
Experiments
Prediction results for the three datasets
Conclusion
Conclusion
• They introduced a novel approach that allows embeddings to be efficiently generated for unseen
nodes.
• GraphSAGE consistently outperforms state-of-the-art baselines, effectively trades off performance and
runtime by sampling node neighborhoods, and their theoretical analysis provides insight into how their
approach can learn about local graph structures.
• A number of extensions and potential improvements are possible, such as extending GraphSAGE to
incorporate directed or multi-modal graphs.
• A particularly interesting direction for future work is exploring non-uniform neighborhood sampling
functions, and perhaps even learning these functions as part of the GraphSAGE optimization.

NS-CUK Seminar: H.B.Kim, Review on "Inductive Representation Learning on Large Graphs", NIPS2017


Editor's Notes

  • #5: Suppose the red node in the middle of the figure is a newly added node whose embedding we need to find. First, a fixed number of neighborhood nodes are sampled based on the distance k. Then, through the aggregation functions learned by GraphSAGE, the embedding of the red node is computed from the features of the surrounding nodes. The inferred embedding of the new node is used for the downstream task.
  • #6: Concatenate the current representation h_v^(k−1) with the aggregated neighborhood vector h_N(v)^(k−1). To extend Algorithm 1 to the minibatch setting, given a set of input nodes, we first forward-sample the required neighborhood sets (up to depth K) and then run the inner loop (line 3 in Algorithm 1), but instead of iterating over all nodes, we compute only the representations that are necessary to satisfy the recursion at each depth (Appendix A contains complete minibatch pseudocode).
  • #7: In the process of learning GraphSAGE, operations need to be performed in batch units. Therefore, in this paper, the following batch sampling algorithm (lines 1 to 7) is utilized.
  • #9: v is a node that co-occurs near u on a fixed-length random walk, σ is the sigmoid function, P_n is a negative sampling distribution, and Q defines the number of negative samples. The representations z_u that are fed into this loss function are generated from the features contained within a node's local neighborhood, rather than training a unique embedding for each node.
  • #10: The mean aggregator takes the element-wise average of the neighbor vectors. The LSTM aggregator has an advantage in expressiveness, but it is not permutation invariant because it is not inherently symmetric; the authors therefore apply the LSTM to a random permutation of each node's neighbors, so it works well even on unordered vector sets. Pooling aggregators are both symmetric and trainable: each neighbor vector is independently fed into a fully-connected neural network, and an element-wise max-pooling operation is then applied to the neighbor set to integrate the information. In theory, multiple layers can be stacked before max-pooling, but this paper uses only a single layer, which is effective in terms of efficiency. By applying max-pooling to each computed feature, the model effectively captures different aspects of the neighbor set; of course, any symmetric vector function can be used instead of the max operator. The paper found no significant difference between max-pooling and mean-pooling, and subsequent papers unified the process by applying max-pooling.
  • #11: Citation data: undirected graph where nodes are papers and edges are citations between papers; the task is classifying the category of each paper (node classification), with node degree and paper abstracts (GloVe embeddings) as node features. This is an evolving graph in which new nodes are added over time: GraphSAGE is trained on the 2000–2004 graphs and evaluated on the 2005 graph. Reddit data: undirected graph where nodes are Reddit posts, and two posts are connected if the same user comments on both; the task is classifying the community of each post (node classification), with node features built from embeddings of the post title and comments, the score of the post, and the number of comments on the post. This graph consists of 232,965 posts from September 2014, of which the first 20 days were used for GraphSAGE training and the remaining 10 days for evaluation.
  • #12: The paper evaluates GraphSAGE on a total of three benchmark tasks: (1) classifying academic papers into different categories using the Web of Science citation dataset, (2) identifying the community to which Reddit posts belong, and (3) distinguishing protein functions across various biological protein-protein interaction graphs. For the details of this chapter, please refer to the paper directly; only a few points are summarized here. Four methodologies were used as a comparison group for GraphSAGE: a completely random classifier, logistic regression using only raw features without considering graph structure, DeepWalk, and DeepWalk + raw features. GraphSAGE was also tested in four variants: the GCN structure, the mean aggregator, the LSTM aggregator, and the pooling aggregator. The Adam optimizer was applied to all models except DeepWalk, which used the vanilla gradient descent optimizer. For fair comparison, all models were run in the same mini-batch environment.
  • #13: The aggregators based on LSTM and pooling showed the best performance. Setting K = 2 gave a good trade-off in terms of efficiency, and sub-sampling of neighbors is a necessary step: although it increases variance, it greatly shortens runtime.