Improving Data Quality
with Active Learning
for Emotion Analysis
Lu Chen
Justin Martineau
Doreen Cheng
Amit Sheth
Oh, my summer
• May 20 – 26: Read code; understand the problem and algorithms; implement the SVM baselines
• May 27 – June 2: Tune SVM parameters
• June 3 – 16: Try to improve the algorithms; debug the code
• June 17 – 30: Re-define the problem
• July: Re-annotate the data; improve the algorithm
• Aug. 15 / Aug. 23: Evaluation; Showcase
The First Week (May 20 – 26, 2013)
• A clearly defined problem: Emotion Identification --
funny, happy, sad, exciting, boring, angry, fear, and
heartwarming
 Multiclass classification – can be implemented by combining eight binary classifiers.
• A labeled dataset: Tweets talking about TV shows collected from Twitter, annotated through Amazon Mechanical Turk.
• A clear research direction: Supervised Learning
 Designing novel feature weighting methods
A Clearly Defined Problem
More Than 500K Labeled Data
|         | Funny  | Happy  | Sad    | Exciting | Boring | Angry  | Fear   | Heartwarming |
|---------|--------|--------|--------|----------|--------|--------|--------|--------------|
| # Pos.  | 1,324  | 405    | 618    | 313      | 209    | 92     | 164    | 24           |
| # Neg.  | 88,782 | 95,639 | 84,212 | 79,902   | 82,443 | 57,326 | 46,746 | 15,857       |
| # Total | 90,106 | 96,044 | 84,830 | 80,215   | 82,652 | 57,418 | 46,910 | 15,881       |
Preliminary Work – A Clear Direction
• Text Classification Problem
• Supervised Learning Algorithm: e.g., Support Vector
Machines (SVMs)
• Imbalanced Dataset -- Undersampling
• Feature Weighting
 Delta-IDF
 A new idea: Emotion Spread
Feature Weighting
• Bag-of-words model, n-gram model:
Term weighting: Each word or n-gram (e.g., unigram,
bigram) is associated with a value. It is common to weigh
terms by
 Term presence
 Term Frequency (TF)
 Term Frequency–Inverse Document Frequency (TF-IDF)
TF-IDF(t, D) = (# of occurrences of term t in document D) × log((total # of documents) / (# of documents containing term t))
An Example (1)
Family Guy be having me rolling .
Family Guy and Modern Family always raise my mood XD
BOW: {Family-0, Guy-1, be-2, having-3, me-4, rolling-5, and-6,
Modern-7, always-8, raise-9, my-10, mood-11, XD-12}
• Term presence
<1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0>
<1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1>
• Term frequency
<1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0>
<2, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1>
• TF-IDF
<log1, log1, log2, log2, log2, log2, 0, 0, 0, 0, 0, 0, 0>
<2log1, log1, 0, 0, 0, 0, log2, log2, log2, log2, log2, log2, log2>
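To make these schemes concrete, the following minimal Python sketch reproduces the vectors above. The two tweets and the vocabulary order come from the example; the base-2 logarithm and all variable names are our own assumptions.

```python
import math
from collections import Counter

docs = [
    "Family Guy be having me rolling .".split(),
    "Family Guy and Modern Family always raise my mood XD".split(),
]
# Vocabulary in first-appearance order, punctuation dropped, as in the slide
vocab = ["Family", "Guy", "be", "having", "me", "rolling",
         "and", "Modern", "always", "raise", "my", "mood", "XD"]

n_docs = len(docs)
df = {t: sum(t in d for d in docs) for t in vocab}  # document frequency

for doc in docs:
    tf = Counter(doc)
    presence = [int(tf[t] > 0) for t in vocab]
    frequency = [tf[t] for t in vocab]
    # TF-IDF: tf(t, D) * log(N / df(t)); in base 2, log1 = 0 and log2 = 1
    tfidf = [tf[t] * math.log2(n_docs / df[t]) for t in vocab]
    print(presence, frequency, tfidf, sep="\n")
```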
Delta-IDF
• Basic idea of Delta-IDF: Treat the positive and negative
training points as two different corpora. Term counts
are weighted by how biased the terms are to one
corpus using the difference of that term's IDF scores in
the two corpora.
$$V_t = \log_2\left(\frac{|P_t| + 1}{|N_t| + 1}\right)$$

 $V_t$ – feature value for term t
 $|P_t|$ ($|N_t|$) – the number of positively (negatively) labeled training points containing term t
An Example (2)
Family Guy be having me rolling . (funny)
Family Guy and Modern Family always raise my mood XD
(not funny)
BOW: {Family-0, Guy-1, be-2, having-3, me-4, rolling-5, and-6, Modern-7, always-8, raise-9, my-10, mood-11, XD-12}
• Delta-IDF
{0, 0, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1}
• TF * Delta-IDF
<0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0>
<0, 0, 0, 0, 0, 0, -1, -1, -1, -1, -1, -1, -1>
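The same example in code: a minimal sketch of Delta-IDF with the add-one smoothing from the formula. The funny/not-funny labels come from the example; the function and variable names are our own.

```python
import math
from collections import Counter

pos_docs = ["Family Guy be having me rolling .".split()]                      # funny
neg_docs = ["Family Guy and Modern Family always raise my mood XD".split()]   # not funny

vocab = ["Family", "Guy", "be", "having", "me", "rolling",
         "and", "Modern", "always", "raise", "my", "mood", "XD"]

def doc_count(docs, term):
    """Number of documents in the corpus containing the term."""
    return sum(term in d for d in docs)

# V_t = log2((|P_t| + 1) / (|N_t| + 1))
delta_idf = {t: math.log2((doc_count(pos_docs, t) + 1) /
                          (doc_count(neg_docs, t) + 1)) for t in vocab}

for doc in pos_docs + neg_docs:
    tf = Counter(doc)
    print([tf[t] * delta_idf[t] for t in vocab])  # TF * Delta-IDF vector
```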
Emotion Spread
• Basic idea of Emotion Spread: Utilize corpora of
different emotion types to identify emotion-specific
features and adjust their weights accordingly.
[Chart: weight of the feature ":D" across the emotions funny, happy, sad, boring, and exciting — a measure of distribution spread]
Experimental Setup
• Delta-IDF weights for Dot Product Classification (Delta-IDF)
• Emotion Spread for Dot Product Classification (Emo-Spread)
• Delta-IDF weights for SVMs (SVM-Delta-IDF)
• Emo-Spread weights for SVMs (SVM-Emo-Spread)
• SVM baseline (SVM-TF)
 Topic-based data folds, Cross-Validation
 Undersampling
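A minimal sketch of the undersampling step, assuming simple random undersampling of the negative class; the ratio and random seed are illustrative, as the slides do not specify them.

```python
import random

def undersample(pos, neg, ratio=1.0, seed=13):
    """Randomly keep at most ratio * len(pos) negative examples."""
    rng = random.Random(seed)
    keep = min(len(neg), int(ratio * len(pos)))
    return pos + rng.sample(neg, keep)

# e.g., funny: 1,324 positives vs. 88,782 negatives -> keep only 1,324 negatives
```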
The Second Week (May 27 – June 2, 2013)
• LIBLINEAR: an SVM library for large sparse data with a huge number of instances and features.
 Selection of solvers: we selected the support vector regression (SVR) solver instead of the classification model.
 Tune SVM parameters: grid search over the penalty factor C, e.g., C = 2^-6 … 2^10; the selected value was C = 1.0.
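A sketch of this tuning step, assuming scikit-learn's LIBLINEAR-backed LinearSVR as the wrapper (the slides only name LIBLINEAR itself); the grid C = 2^-6 … 2^10 is from the slide, everything else is illustrative.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVR  # LIBLINEAR-backed support vector regression

# Grid of penalty factors C = 2^-6 ... 2^10, as on the slide
param_grid = {"C": [2.0 ** k for k in range(-6, 11)]}
search = GridSearchCV(LinearSVR(max_iter=10000), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
# search.fit(X_train, y_train)  # X_train: sparse bag-of-words features (hypothetical)
# print(search.best_params_)    # the slides settled on C = 1.0
```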
Evaluation Metrics
• F1 score
• Mean Average Precision (MAP)
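Both metrics are standard; a minimal sketch computing them with scikit-learn on illustrative labels and scores:

```python
from sklearn.metrics import f1_score, average_precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # gold labels (illustrative)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]                   # hard predictions -> F1
y_score = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7, 0.6, 0.3]  # ranking scores -> AP

print("F1:", f1_score(y_true, y_pred))
print("AP:", average_precision_score(y_true, y_score))
# MAP = the mean of average precision over the eight emotion classifiers
```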
How Dirty was the Dataset?
Dirty dataset (from Amazon Mechanical Turk):

|         | Funny  | Happy  | Sad    | Exciting | Boring | Angry  | Fear   | Heartwarming |
|---------|--------|--------|--------|----------|--------|--------|--------|--------------|
| # Pos.  | 1,324  | 405    | 618    | 313      | 209    | 92     | 164    | 24           |
| # Neg.  | 88,782 | 95,639 | 84,212 | 79,902   | 82,443 | 57,326 | 46,746 | 15,857       |
| # Total | 90,106 | 96,044 | 84,830 | 80,215   | 82,652 | 57,418 | 46,910 | 15,881       |

Clean dataset (after manual re-annotation*):

|         | Funny  | Happy  | Sad    | Exciting | Boring | Angry  | Fear   | Heartwarming |
|---------|--------|--------|--------|----------|--------|--------|--------|--------------|
| # Pos.  | 1,781  | 4,847  | 788    | 1,613    | 216    | 763    | 285    | 326          |
| # Neg.  | 88,277 | 91,075 | 84,031 | 78,573   | 82,416 | 56,584 | 46,622 | 15,542       |
| # Total | 90,058 | 95,922 | 84,819 | 80,186   | 82,632 | 57,347 | 46,907 | 15,868       |

* Some off-topic tweets were removed from the dataset during re-annotation.
Re-define the Problem
What is the problem?
• Identifying eight different emotions – funny, happy,
sad, exciting, boring, angry, fear, and
heartwarming from tweets talking about TV shows
• A low-quality dataset with noisy labels provided by non-expert annotators recruited through Amazon Mechanical Turk
Why is it important?
• The performance of the classifiers can be
significantly affected by the quality of the data labels.
• Re-annotation is very time-consuming and
expensive.
Re-shape the Research Topic
Exploring active learning approaches based on Delta-IDF and Emotion Spread to improve the label quality, at reduced annotation cost, for emotion analysis.
Active Learning
• Active learning is a form of iterative supervised learning.
• The primary motivation for active learning is the time and expense of obtaining labeled training examples.
• Definition: the learner iteratively selects the instances whose labels would be most valuable and queries an oracle (e.g., a human annotator) for them.
Emotion Spread
• Basic idea of Emotion Spread: utilize corpora of different emotion types to identify emotion-specific features and make their weights more extreme, so as to counteract the subdued weights these features receive due to noisy labels.
 $V_t^i$ – Delta-IDF value for term t on emotion i
 $E$ – a set of emotions
 $N$ – the number of emotions in $E$
 $s$ – specifies the spread

$$W_t^e = V_t^e \times \frac{\sum_{i \in E \setminus \{e\}} \left(V_t^e - V_t^i\right)^s}{N - 1}$$
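In code, the adjustment might look as follows; a sketch of the formula above, assuming per-emotion Delta-IDF scores are already computed and that s is a positive integer (its actual value is not reported on the slides):

```python
def emotion_spread(delta_idf, emotion, s=1):
    """W_t^e = V_t^e * sum over i != e of (V_t^e - V_t^i)^s, divided by N - 1.

    delta_idf maps each emotion to its {term: Delta-IDF score} dict.
    s (the spread exponent) is assumed to be a positive integer;
    the value used on the slides is not reported.
    """
    emotions = list(delta_idf)
    n = len(emotions)
    weights = {}
    for term, v_e in delta_idf[emotion].items():
        spread = sum((v_e - delta_idf[i].get(term, 0.0)) ** s
                     for i in emotions if i != emotion)
        weights[term] = v_e * spread / (n - 1)
    return weights
```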
Experimental Setup
• Features: bag-of-words (unigram and bigram)
• Active learning selection strategy: in each iteration, select the top k most certain instances that are misclassified (see the sketch after this list).
• Approaches:
o Delta-IDF weights for Dot Product Classification (Delta-IDF)
o Emotion Spread for Dot Product Classification (Emo-Spread)
o Delta-IDF weights for SVMs (SVM-Delta-IDF)
o SVM baseline (SVM-TF)
 Topic-based data folds, Cross-Validation
 Undersampling
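A minimal sketch of this selection strategy for one iteration, assuming signed classifier scores and ±1 noisy labels; the function name and interface are hypothetical.

```python
def select_for_reannotation(scores, noisy_labels, k):
    """One iteration: pick the k instances the current model classifies
    most confidently but whose predictions disagree with the noisy labels.
    scores: signed classifier scores; noisy_labels: +1 / -1."""
    misclassified = [i for i, (s, y) in enumerate(zip(scores, noisy_labels))
                     if (s > 0) != (y > 0)]
    # "most certain" = largest absolute score
    return sorted(misclassified, key=lambda i: abs(scores[i]), reverse=True)[:k]

# Illustrative usage: instances 0 and 3 are confidently misclassified
print(select_for_reannotation([2.3, -0.4, 1.7, -1.9], [-1, -1, 1, 1], k=2))  # [0, 3]
```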
Evaluation (1)
[Figure: eight MAP plots, one per emotion (Funny, Happy, Sad, Exciting, Boring, Angry, Fear, Heartwarming); y-axis: MAP, x-axis: 300–9,600 instances]
Evaluation (2)
[Bar chart: Average Training Time (s) on Eight Emotions, scale 0–3,000 s]
Evaluation (3)
[Figure: Accumulated Average Percentage of Fixed Labels on Eight Emotions; y-axis: percentage of fixed labels (0–90%), x-axis: the number of selected instances in each iteration; curves: Delta-IDF, Emo-Spread, SVM-Delta-IDF, SVM-TF, Random]
Observations
• On the emotions funny, sad, boring, angry, fear, and heartwarming, SVM-Delta-IDF significantly outperforms SVM-TF; on happy and exciting, SVM-Delta-IDF is competitive with SVM-TF. On boring, angry, fear, and heartwarming, Emo-Spread significantly outperforms SVM-TF.
• Training SVM-TF classifiers takes twice as long as training SVM-Delta-IDF classifiers, and 17 times as long as training Emo-Spread classifiers. Active learning with Emo-Spread or the two SVM classifiers significantly reduces the annotation effort.
Thank you !
Editor's Notes
  • #15: The factor C in (3.15) is a parameter that allows one to trade off training error vs. model complexity. A small value for C will increase the number of training errors, while a large C will lead to a behavior similar to that of a hard-margin SVM. However, it is critical here, as in any regularization scheme, that a proper value is chosen for C, the penalty factor. If it is too large, we have a high penalty for nonseparable points and we may store many support vectors and overfit. If it is too small, we may have underfitting.