Hyo Eun Lee
Network Science Lab
Dept. of Biotechnology
The Catholic University of Korea
E-mail: gydnsml@gmail.com
2023.08.07
KDD 2015
 Introduction
• Limitation of previous study
• Skip-gram
• BOW
• Paragraph Vector
 Related work
• Distributed Text Embedding
• Information Network Embedding
1. Introduction
Limitation of previous studies
• Learning word representations suffers from problems such as word sparsity, polysemy, and synonymy within documents
• Distributed representation, in which similar words and documents lie close together
in a low-dimensional space, is an effective solution to these problems
• Skip-gram and Paragraph Vectors are representations based on unsupervised learning
• They are more effective than Brown clustering or nearest neighbors because they exploit similarity to context words
1. Introduction
Skip-gram
• Represents words as low-dimensional vectors
using their relationships to neighboring words
• Computes the probability of each context word occurring
around a given center word
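The skip-gram probability above can be sketched with toy vectors (the vocabulary, dimension, and random values here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["network", "embedding", "text", "label", "graph"]
dim = 4
# Skip-gram keeps separate center ("input") and context ("output") vectors
center_vecs = rng.normal(size=(len(vocab), dim))
context_vecs = rng.normal(size=(len(vocab), dim))

def p_context_given_center(center_idx):
    """Softmax over the vocabulary: probability of each context word."""
    scores = context_vecs @ center_vecs[center_idx]
    exp = np.exp(scores - scores.max())  # subtract max for numerical stability
    return exp / exp.sum()

probs = p_context_given_center(vocab.index("embedding"))
assert abs(probs.sum() - 1.0) < 1e-9  # a valid probability distribution
```

Training adjusts both vector tables so observed (center, context) pairs get high probability; in practice the full softmax is approximated, e.g. with negative sampling.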
1. Introduction
BOW
• Measures the frequency of each word in a document
• Does not account for word order
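Order insensitivity is easy to see in a minimal sketch (toy sentences, standard library only):

```python
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the mat sat on the cat",  # same words, different order
]
bows = [Counter(d.split()) for d in docs]
# BOW keeps only word frequencies, so the two documents look identical:
assert bows[0] == bows[1]
```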
1. Introduction
Paragraph Vectors
• Utilize co-occurrence information between nearby
context words, similar to skip-grams
• PV-DM
: learns a vector for the entire paragraph and
combines it with word vectors to capture both the
context and the meaning of the paragraph
1. Introduction
Paragraph Vectors
• Utilize co-occurrence information between nearby
context words, similar to skip-grams
• PV-DBOW
: uses the paragraph vector alone to predict
which words appear in the paragraph
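A minimal numpy sketch of the PV-DBOW objective (toy vocabulary and random initial vectors, purely illustrative): the paragraph vector alone scores every vocabulary word, and training would lower the negative log-likelihood of the words that actually appear in the paragraph.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["text", "network", "embedding", "label"]
paragraphs = [["text", "embedding"], ["network", "label"]]
dim = 3
para_vecs = rng.normal(size=(len(paragraphs), dim))  # one vector per paragraph
word_out = rng.normal(size=(len(vocab), dim))        # output word vectors

def pv_dbow_loss(p_idx):
    """Negative log-likelihood of a paragraph's words given its vector."""
    scores = word_out @ para_vecs[p_idx]
    log_probs = scores - np.log(np.exp(scores).sum())  # log-softmax
    return -sum(log_probs[vocab.index(w)] for w in paragraphs[p_idx])

loss = pv_dbow_loss(0)
assert loss > 0
```

PV-DM differs only in the input: it combines the paragraph vector with surrounding word vectors to predict the next word.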
1. Introduction
Conclusion
• Embedding methods based on unsupervised learning are commonly used for
classification, clustering, and ranking
• However, compared to deep learning approaches,
they show weaker predictive performance on certain tasks
→ deep learning approaches incorporate the data's label information into the embedding
• On the other hand, unsupervised methods are less computationally expensive than deep learning approaches
and do not require pre-training
• They also require fewer parameters to tune
• This paper proposes PTE, a method that combines the advantages of
unsupervised embedding with label information
1. Introduction
PTE
• Uses a heterogeneous text network to encode
word-word, word-document, and word-label co-occurrence information
• Learns low-dimensional embeddings from the heterogeneous text network in a semi-supervised manner
• Learns distributed representations based on the embedded information
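The three bipartite networks can be sketched as co-occurrence counts over a toy labeled corpus (the document ids, tokens, and labels below are made up for illustration):

```python
from collections import Counter
from itertools import combinations

# Toy labeled corpus: (document id, tokens, class label)
corpus = [
    ("d1", ["graph", "embedding", "network"], "ML"),
    ("d2", ["text", "embedding", "label"], "NLP"),
]

ww = Counter()  # word-word co-occurrences (here: within the whole document)
wd = Counter()  # word-document edges
wl = Counter()  # word-label edges
for doc_id, tokens, label in corpus:
    for a, b in combinations(tokens, 2):
        ww[frozenset((a, b))] += 1
    for t in tokens:
        wd[(t, doc_id)] += 1
        wl[(t, label)] += 1

# "embedding" is connected to both labels through the word-label network
assert wl[("embedding", "ML")] == 1 and wl[("embedding", "NLP")] == 1
```

PTE embeds all three edge sets jointly, so labeled and unlabeled co-occurrence information shape the same word vectors.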
2. Related work
Distributed Text Embedding
• Distributed representations can be categorized into unsupervised and supervised learning
• Unsupervised learning: embeddings learned from word co-occurrences,
scalable to millions of documents
• Supervised learning: embeddings generally learned with neural networks
• The main difference is that unsupervised methods use labels only to train the classifier,
while supervised methods also use labels to train the representation,
falling back to pre-training when no labels are available
2. Related work
• PTE is a training algorithm that uses a semi-supervised method.
Distributed Text Embedding
• Supervised learning: uses label information, e.g. CNN and RNTN
RNTN
• Predicts labels by embedding words or documents as vectors
• A sentence is parsed as a binary tree and represented as a vector
using the same tensor-based composition function at every node
• Computes each parent vector bottom-up
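The bottom-up tensor composition can be sketched as follows (the dimension, random parameters, and example phrase are illustrative; a real RNTN learns V and W from labeled parse trees):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
# One composition function shared by every node: bilinear tensor V + linear W
V = rng.normal(size=(d, 2 * d, 2 * d)) * 0.1
W = rng.normal(size=(d, 2 * d)) * 0.1

def compose(left, right):
    """Parent vector from two child vectors (same function at every node)."""
    c = np.concatenate([left, right])
    bilinear = np.einsum("i,kij,j->k", c, V, c)  # tensor interaction term
    return np.tanh(bilinear + W @ c)

# Binary parse of "not very good": (not (very good)), composed bottom-up
leaves = {w: rng.normal(size=d) for w in ["not", "very", "good"]}
root = compose(leaves["not"], compose(leaves["very"], leaves["good"]))
assert root.shape == (d,)
```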
2. Related work
Information Network Embedding
• Uses a heterogeneous text network to encode word-word, word-document, and word-label co-occurrence
information
• Learns low-dimensional embeddings from the heterogeneous text network in a semi-supervised manner
• Learns distributed representations based on the embedded information
2. Related work
• Unsupervised methods can only handle homogeneous networks
• PTE extends LINE to analyze heterogeneous networks (networks with multiple types of nodes and edges)
Information Network Embedding
• Training on a heterogeneous text network makes PTE relevant to network embedding problems
→ useful in areas such as node classification and link prediction, e.g. DeepWalk, LINE
DeepWalk
• Uses truncated random walks
• Only applicable to networks with binary (unweighted) edges
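A truncated random walk, the core of DeepWalk, in a few lines (toy graph; the collected walks are then fed to skip-gram as if they were sentences):

```python
import random

# Toy undirected network with binary (unweighted) edges
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}

def random_walk(start, length, rng):
    """Uniform walk over neighbors; uniformity is why weighted edges are not handled."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

rng = random.Random(0)
walk = random_walk(1, 5, rng)
assert len(walk) == 5 and all(n in adj for n in walk)
```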
2. Related work
Information Network Embedding
LINE
• Learns node embeddings
by jointly considering first-order proximity and
second-order proximity (neighborhood similarity)
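First-order proximity in LINE models each observed edge directly from the node embeddings; a minimal sketch (random toy vectors, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
emb = {n: rng.normal(size=4) for n in ["u", "v", "w"]}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def first_order_prob(a, b):
    """LINE's first-order proximity: edge probability from the dot product."""
    return sigmoid(emb[a] @ emb[b])

p = first_order_prob("u", "v")
assert 0.0 < p < 1.0
```

Second-order proximity instead gives each node an additional "context" vector so that nodes with similar neighborhoods become similar; PTE reuses this second-order objective on each bipartite network.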

NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks", KDD 2015
