Ho-Beom Kim
Network Science Lab
Dept. of Mathematics
The Catholic University of Korea
E-mail: hobeom2001@catholic.ac.kr
2023 / 07 / 24
DONG, Yuxiao; CHAWLA, Nitesh V.; SWAMI, Ananthram.
ACM SIGKDD
Introduction
Problem Statements
• Neural network-based learning models can represent latent embeddings that capture the internal relations
of rich, complex data across various modalities, such as image, audio, and language.
• Social and information networks are similarly rich and complex data that encode the dynamics and types of
human interactions, and are similarly amenable to representation learning using neural networks.
• Recent research publications have proposed word2vec-based network representation learning frameworks
• DeepWalk, LINE, node2vec
→ These works have so far focused on representation learning for homogeneous networks, i.e., networks with a
single type of nodes and relationships.
• A large number of social and information networks are heterogeneous in nature, involving a diversity of node
types and/or relationships between nodes
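As a rough illustration of the homogeneous setting these frameworks assume, the sketch below generates DeepWalk-style uniform random walks over a small graph; such walks are then treated as "sentences" for a skip-gram model. The toy graph, the function name `uniform_random_walk`, and the walk length are assumptions for illustration, not taken from the slides.

```python
import random

# Toy homogeneous graph (hypothetical): one node type, one relation type.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

def uniform_random_walk(graph, start, length, rng):
    """Return a walk of up to `length` nodes, picking each next node uniformly
    among the neighbors of the current node (node types play no role)."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk

rng = random.Random(0)
walks = [uniform_random_walk(GRAPH, node, 5, rng) for node in GRAPH]
for walk in walks:
    print(" ".join(walk))
```

Because the next node is chosen without regard to node or relation type, the same procedure applied to a heterogeneous network would mix all types indiscriminately.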
Introduction
Problem Statements
• The latent heterogeneous network embeddings can be further applied to various network mining tasks,
such as node classification, clustering, and similarity search
• In contrast to conventional meta-path-based methods, the advantage of latent-space representation
learning lies in its ability to model similarities between nodes without connected meta-paths.
• Cases in which meta-path-based similarity is zero, because no meta-path instance connects the two nodes,
are naturally overcome by network representation learning
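To make the zero-similarity case concrete, here is a small sketch using PathSim, one common meta-path-based similarity measure (the slides do not name a specific measure, so this choice is an assumption). The author names, venue names, and paper counts are invented toy data, and the meta-path used is author-paper-venue-paper-author.

```python
# Papers per (author, venue); toy counts, assumed for illustration.
PAPER_COUNTS = {
    "author_1": {"venue_x": 10},
    "author_2": {"venue_y": 10},
    "author_3": {"venue_x": 5, "venue_y": 5},
}

def path_count(a, b):
    """Number of A-P-V-P-A path instances between authors a and b."""
    return sum(n * PAPER_COUNTS[b].get(venue, 0)
               for venue, n in PAPER_COUNTS[a].items())

def pathsim(a, b):
    """PathSim: 2 * paths(a, b) / (paths(a, a) + paths(b, b))."""
    return 2 * path_count(a, b) / (path_count(a, a) + path_count(b, b))

# author_1 and author_2 share no venue, so no path instance connects them.
print(pathsim("author_1", "author_2"))
# author_3 publishes in both venues, so it is connected to author_1.
print(round(pathsim("author_1", "author_3"), 3))
```

The first pair scores exactly zero however related the two authors may be in other respects, whereas a latent-space method can still place them near each other through shared context.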
Introduction
Contributions
1. Formalizes the problem of heterogeneous network representation learning and identifies its unique
challenges resulting from network heterogeneity
2. Develops effective and efficient network embedding frameworks, metapath2vec & metapath2vec++, for
preserving both structural and semantic correlations of heterogeneous networks
3. Through extensive experiments, demonstrates the efficacy and scalability of the presented methods in
various heterogeneous network mining tasks, such as node classification and node clustering.
4. Demonstrates the automatic discovery of internal semantic relationships between different types of nodes
in heterogeneous networks by metapath2vec & metapath2vec++, not discoverable by existing work.
Related Works
PTE
• Deep neural networks fully leverage labeled information that is available for a task when they learn the
representations of the data
• Most text embedding methods are not able to consider labeled information when learning the
representations
• Previous embedding models learn unsupervised text embeddings
→ Generalizable to different tasks but with weaker predictive power for a particular task
• Compared with deep neural networks, text embedding methods are much more efficient, are much easier to
tune, and naturally accommodate unlabeled data
Related Works
PTE
• PTE first learns a low-dimensional embedding for words through a heterogeneous text network.
• The network encodes different levels of co-occurrence information between words and words, words and
documents, and words and labels.
• The network is embedded into a low dimensional vector space that preserves the second-order proximity
between the vertices in the network.
• Proposed to learn predictive text embeddings in a semi-supervised manner.
• Unlabeled data and labeled information are integrated into a heterogeneous text network, which
incorporates different levels of co-occurrence information in text
• Propose PTE, which learns a distributed representation of text through embedding the heterogeneous text
network into a low dimensional space
• PTE uses the average value of word embeddings
• PTE considers the linear relationship between words and labels
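The heterogeneous text network described above can be sketched as three edge-weight tables: word-word, word-document, and word-label. This is a minimal illustration, assuming a fixed context window for word-word co-occurrence; the toy corpus and the function name `build_text_networks` are hypothetical, and the embedding step itself is not shown.

```python
from collections import Counter

# Toy corpus (hypothetical): (tokens, label); label is None for unlabeled docs.
DOCS = [
    (["graph", "embedding", "model"], "ml"),
    (["graph", "random", "walk"], None),
    (["stock", "market", "model"], "finance"),
]

def build_text_networks(docs, window=1):
    """Return edge-weight counters for the word-word, word-document,
    and word-label networks."""
    word_word, word_doc, word_label = Counter(), Counter(), Counter()
    for doc_id, (tokens, label) in enumerate(docs):
        for i, word in enumerate(tokens):
            word_doc[(word, doc_id)] += 1
            if label is not None:  # only labeled documents add label edges
                word_label[(word, label)] += 1
            # co-occurrence with words up to `window` positions to the right
            for other in tokens[i + 1 : i + 1 + window]:
                word_word[tuple(sorted((word, other)))] += 1
    return word_word, word_doc, word_label

word_word, word_doc, word_label = build_text_networks(DOCS)
print(len(word_word), len(word_doc), len(word_label))
```

Unlabeled documents still contribute word-word and word-document edges, which is how labeled and unlabeled data end up in the same network.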
Problem Definition
Definition
A heterogeneous network is defined as a graph $G = (V, E, T)$
Each node $v$ is associated with a mapping function $\phi(v) : V \rightarrow T_V$
Each link $e$ is associated with a mapping function $\varphi(e) : E \rightarrow T_E$
$T_V$ and $T_E$ denote the sets of object and relation types, where $|T_V| + |T_E| > 2$
Problem Definition
Definition
Node types
A : authors
P : papers
V : venues
O : organizations
Link types
A-A : coauthor
A-P, P-V : publish
O-A : affiliation
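The definition and the academic-network example can be put together in a short sketch: a node-to-type mapping, a link-to-type mapping, and the check that the number of node types plus the number of relation types exceeds 2. The node identifiers, edge list, and helper names are hypothetical.

```python
# Node -> node type (the node mapping function); toy academic network.
NODE_TYPE = {
    "a1": "A", "a2": "A",  # authors
    "p1": "P",             # paper
    "v1": "V",             # venue
    "o1": "O",             # organization
}

# Relation type for each unordered pair of endpoint node types.
RELATION_BY_TYPES = {
    frozenset(["A"]): "coauthor",        # A-A
    frozenset(["A", "P"]): "publish",    # A-P
    frozenset(["P", "V"]): "publish",    # P-V
    frozenset(["O", "A"]): "affiliation",  # O-A
}

EDGES = [("a1", "a2"), ("a1", "p1"), ("a2", "p1"), ("p1", "v1"), ("o1", "a1")]

def edge_type(u, v):
    """Link mapping function: look up the relation type from the endpoint types."""
    return RELATION_BY_TYPES[frozenset([NODE_TYPE[u], NODE_TYPE[v]])]

node_types = set(NODE_TYPE.values())
edge_types = {edge_type(u, v) for u, v in EDGES}
is_heterogeneous = len(node_types) + len(edge_types) > 2
print(sorted(node_types), sorted(edge_types), is_heterogeneous)
```

A homogeneous network has exactly one node type and one relation type, so the sum is 2 and the check fails.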
Problem Definition
Problem1 : Heterogeneous Network Representation Learning
• The task is to learn d-dimensional latent representations $X \in \mathbb{R}^{|V| \times d}$, $d \ll |V|$, that are able to
capture the structural and semantic relations among the nodes
• The output : the low-dimensional matrix X, whose v-th row, a d-dimensional vector $X_v$, is the
representation of node v
• Although there are different types of nodes in V, their representations are mapped into the same latent space
• The learned node representations can benefit various mining tasks: the embedding vector of each node can
be used as the feature input for node classification, clustering, and similarity search
• The main challenge of this problem comes from the network heterogeneity, wherein it is difficult to directly
apply homogeneous language and network embedding methods
• The premise of network embedding models is to preserve the proximity between a node and its
neighborhood.
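As a sketch of how the output matrix X is consumed downstream, the example below stores one d-dimensional vector per node, with all node types sharing one space, and runs a cosine-based similarity search. The vectors are random placeholders standing in for learned embeddings, and the node identifiers, `D`, and `most_similar` are assumptions for illustration.

```python
import math
import random

NODES = ["a1", "a2", "p1", "v1", "o1"]  # mixed node types, hypothetical ids
D = 4  # embedding dimension d; in practice d << |V|

# Placeholder for a learned |V| x d matrix X: random values only,
# standing in for the output of an embedding method.
rng = random.Random(42)
X = {node: [rng.gauss(0.0, 1.0) for _ in range(D)] for node in NODES}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, k=2):
    """Return the k nodes closest to `query`, regardless of node type."""
    scores = [(cosine(X[query], X[n]), n) for n in NODES if n != query]
    return [n for _, n in sorted(scores, reverse=True)[:k]]

print(most_similar("a1"))
```

Because every node type lives in the same latent space, an author can be compared directly with a venue or an organization; the same rows of X could equally serve as feature vectors for a classifier or a clustering algorithm.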

NS-CUK Seminar: H.B.Kim, Review on "metapath2vec: Scalable representation learning for heterogeneous networks", KDD 2017
