NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks", KDD 2015
Hyo Eun Lee
Network Science Lab
Dept. of Biotechnology
The Catholic University of Korea
E-mail: gydnsml@gmail.com
2023.08.14
• Problem definition
• Predictive text embedding
  • Bipartite Network Embedding
  • Heterogeneous Text Network Embedding
  • Text Embedding
• Experiments
• Discussion and conclusion
1. Problem definition
Definition
• Definition 1. (Word-Word Network)
• Captures word co-occurrence information in local contexts of the unlabeled data
𝐺𝑊𝑊 = (𝑉 , 𝐸𝑊𝑊)
• The same information exploited by traditional word embedding approaches such as skip-gram
• Definition 2. (Word-Document Network)
• Capture connections between words and documents in a corpus
𝐺𝑊𝐷 = (𝑉 ∪ 𝐷 , 𝐸𝑊𝐷)
• Definition 3. (Word-Label Network)
• Captures category-level word co-occurrences between words and class labels
𝐺𝑊𝑙 = (𝑉 ∪ 𝐿 , 𝐸𝑊𝑙)
• The edge weight 𝑤𝑖𝑗 between word 𝑣𝑖 and label 𝑗 is the term frequency of 𝑣𝑖 summed over all documents with label 𝑗: 𝑤𝑖𝑗 = Σ(𝑑: 𝑙𝑑 = 𝑗) 𝑛𝑑𝑖 (see the construction sketch below)
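To make Definitions 1-3 concrete, here is a minimal construction sketch, not the authors' code; the function name, the window size of 5, and the input format (token lists plus a per-document label or None) are illustrative assumptions.

from collections import defaultdict

def build_text_networks(docs, labels, window=5):
    """Build the word-word, word-document, and word-label networks as weighted edge dicts."""
    ww = defaultdict(int)  # word-word: co-occurrence counts within a local context window
    wd = defaultdict(int)  # word-document: term frequency of word w in document d
    wl = defaultdict(int)  # word-label: term frequencies aggregated over documents of each label
    for d, tokens in enumerate(docs):
        for i, w in enumerate(tokens):
            wd[(w, d)] += 1
            if labels[d] is not None:                  # only labeled documents contribute to G_WL
                wl[(w, labels[d])] += 1
            for c in tokens[max(0, i - window):i]:     # count each co-occurring pair symmetrically
                ww[(w, c)] += 1
                ww[(c, w)] += 1
    return ww, wd, wl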
1. Problem definition
Definition
• Definition 4. (Heterogeneous Text Network)
• The union of the three networks defined above (word-word, word-document, word-label)
• Captures co-occurrences at multiple levels and includes both labeled and unlabeled data
• Definition 5. (Predictive Text Embedding)
• Learn low-dimensional word representations by embedding the heterogeneous text network, so that the embeddings are optimized to be predictive for the given classification task
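As a short math sketch of Definitions 4 and 5 (notation follows the slides; d denotes the embedding dimension, an assumption of this sketch):

G_{het} = G_{WW} \cup G_{WD} \cup G_{WL},
\qquad
\text{learn } \vec{u}_i \in \mathbb{R}^{d} \text{ for every word } v_i \in V, \quad d \ll |V|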
2. Predictive text embedding
Bipartite Network Embedding
• The LINE model was introduced for embedding large-scale information networks, but it targets homogeneous
networks; in a heterogeneous text network, the weights of different edge types are not directly comparable
• The authors therefore first adapt LINE to the bipartite case, modeling the second-order proximity between
vertices (a formulation sketch follows)
• A bipartite network 𝐺 = (𝑉𝐴 ∪ 𝑉𝐵, 𝐸), where 𝑉𝐴 and 𝑉𝐵 are two disjoint sets of vertices
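A sketch of this bipartite formulation, in the LINE-style second-order proximity setting: the probability that vertex v_i in V_A is generated by vertex v_j in V_B, and the objective that weights each edge by w_ij, take the form

p(v_i \mid v_j) = \frac{\exp(\vec{u}_i^{\top} \vec{u}_j)}{\sum_{i' \in V_A} \exp(\vec{u}_{i'}^{\top} \vec{u}_j)},
\qquad
O = -\sum_{(i,j) \in E} w_{ij} \, \log p(v_i \mid v_j)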
2. Predictive text embedding
Bipartite Network Embedding
• The objective function is optimized with stochastic gradient descent, combined with edge sampling and
negative sampling.
• At each step, an edge is sampled with probability proportional to its weight and treated as a binary edge;
negative vertices are drawn from a noise distribution 𝑃𝑛(𝑣) (see the update sketch below).
• Once each bipartite network can be embedded this way, an overall objective over the heterogeneous text
network can be defined (next slides).
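A rough sketch of one such update step, not the authors' implementation; the numpy arrays U_a and U_b for the two vertex sets, the learning rate, and the function names are assumptions of this sketch.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(U_a, U_b, i, j, neg_ids, lr=0.025):
    """One edge-sampling / negative-sampling update for a sampled edge (v_i in V_A, v_j in V_B)."""
    grad_j = np.zeros_like(U_b[j])
    # the positive vertex gets label 1, the K sampled noise vertices get label 0
    for v, label in [(i, 1.0)] + [(v, 0.0) for v in neg_ids]:
        g = label - sigmoid(U_a[v] @ U_b[j])   # gradient of the log-sigmoid term
        grad_j += g * U_a[v]
        U_a[v] += lr * g * U_b[j]              # update the V_A-side vector
    U_b[j] += lr * grad_j                      # update the V_B-side vector

In practice the edges would be drawn with an alias table so that sampling proportional to the weights takes constant time per step.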
2. Predictive text embedding
Heterogeneous Text Network Embedding
• The three bipartite networks (word-word, word-document, word-label) share the same set of word vertices,
so they can be embedded collectively
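Collective embedding minimizes the sum of the three bipartite objectives; as a math sketch consistent with the formulation above, each term has the same form as the bipartite objective O:

O_{pte} = O_{WW} + O_{WD} + O_{WL},
\qquad
O_{xy} = -\sum_{(i,j) \in E_{xy}} w_{ij} \, \log p(v_i \mid v_j)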
2. Predictive text embedding
Heterogeneous Text Network Embedding
• Option 1, joint training: the unlabeled and labeled networks are trained together, alternately sampling
edges from the three edge sets at each step (see the sketch below)
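A minimal sketch of the joint training loop, reusing the sgd_step sketch above; the net object with edges, weights, U_a, and U_b attributes and the simple uniform negative sampler are assumptions (a real implementation would sample negatives from a unigram^0.75 noise distribution).

import random

def sample_edge(net):
    """Draw an edge with probability proportional to its weight."""
    return random.choices(net.edges, weights=net.weights, k=1)[0]

def sample_negatives(net, K):
    """Draw K noise vertices from the V_A side (uniform here, for simplicity)."""
    return [random.randrange(net.U_a.shape[0]) for _ in range(K)]

def train_jointly(networks, steps, K=5):
    """Joint training: alternate among the word-word, word-document, and word-label networks."""
    for _ in range(steps):
        net = random.choice(networks)              # pick one of G_WW, G_WD, G_WL for this step
        i, j = sample_edge(net)
        negs = sample_negatives(net, K)
        sgd_step(net.U_a, net.U_b, i, j, negs)     # update from the earlier sketch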
2. Predictive text embedding
Heterogeneous Text Network Embedding
• Option 2, pre-training + fine-tuning: first train on the unlabeled networks (word-word, word-document),
then refine the shared word vectors with the labeled word-label network (see the sketch below)
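The same helpers give a short sketch of this second scheme; the step counts are arbitrary placeholders.

def train_pretrain_finetune(g_ww, g_wd, g_wl, pretrain_steps, finetune_steps):
    train_jointly([g_ww, g_wd], pretrain_steps)   # unsupervised pre-training stage
    train_jointly([g_wl], finetune_steps)         # supervised fine-tuning with the labels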
2. Predictive text embedding
Text Embedding
• After the word vectors are trained, the representation of an arbitrary piece of text is obtained by
averaging the vectors of its words (see the sketch below).
• Equivalently, this average is what one obtains by minimizing, with gradient descent, a loss defined
through the Euclidean distance between the text embedding and each of its word embeddings.
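A small sketch of this averaging step; the trained embedding matrix U and the word2id vocabulary lookup are assumptions of the sketch.

import numpy as np

def embed_text(tokens, U, word2id):
    """Average the embeddings of the in-vocabulary words of a piece of text."""
    ids = [word2id[w] for w in tokens if w in word2id]
    return U[ids].mean(axis=0) if ids else np.zeros(U.shape[1])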
3. Experiments
• (Experimental results were presented as tables and figures in the original slides 11-18.)
4. Discussion and conclusion
Discussion and conclusion
• Unsupervised embeddings use either local context-level or document-level word co-occurrences;
document-level co-occurrences are more useful for long documents, while local context-level co-occurrences
are more useful for short documents.
• PTE trains jointly on both labeled and unlabeled data, and outperforms CNNs when more labeled data is
available.
• PTE still has room for improvement, e.g., taking the order of words into account.