A Framework for Emotion Mining from
    Text in Online Social Networks

                    Published in:
     Data Mining Workshops (ICDMW), 2010 IEEE
              International Conference
               Mohamed Yassine, Hazem Hajj
      Department of Electrical and Computer Engineering
                American University of Beirut
                       Beirut, Lebanon
             E-mail: {mmy18, hh63}@aub.edu.lb
                   Advisors: Yin-Fu Huang
                   Student: Chen-Ting Huang
Abstract & Introduction
   A new framework is proposed for characterizing emotional interactions in
    social networks, and then using these characteristics to distinguish friends
    from acquaintances.

   The framework includes a model for data collection, database schemas,
    data processing and data mining steps.

   The paper presents a new perspective for studying friendship relations and
    emotional expression in online social networks.

   Text mining techniques are performed on comments retrieved from a
    social network.

   The purpose is not to identify specific emotions but rather to tell if the text
    contains emotions or not.

   In other words, the goal is to decide whether the text is subjective,
    reflecting the writer’s affect and emotional state, or factual and objective.
Related Work


   SentiWordNet: a publicly available lexical resource for opinion mining

   NTUSD (National Taiwan University Sentiment Dictionary)

   Facebook
Social Network – text example
   [Figure: example of a post and a comment on a social network]
Raw Data Collection
   Posts (the texts to be mined for emotions)
   Comments
   User information
   Friends
Lexicons Development
   This step deals with the informal language of social networks.
    a.   Social acronyms
    b.   Emoticons
    c.   Foreign-language lexicons: it is also necessary to take the use of
         foreign languages into consideration.
Feature generation
   Computes new features from available raw data collected in step 1 to
    assess subjectivity of text.

   It uses word matches against existing affective lexicons (SentiWordNet)
    and employs the new lexicons developed in step 2.

   Comments collected in step 1, along with the features computed in this
    step, are stored in the Sentiment Mining Database.

   To assess the subjectivity of the text, several groups of features are
    computed:

    a.   Features based on the affective lexicon (SentiWordNet)

    b.   Features based on intentional misspellings and grammatical
         markers (such as punctuation and capitalized letters)

    c.   Features based on social acronyms, interjections and emoticons
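
The feature groups above can be sketched as simple counts over one comment. This is only an illustration: the lexicons below are tiny hypothetical stand-ins (the paper's lexicons and its SentiWordNet matching are not reproduced here), and the feature names are assumptions.

```python
import re

# Hypothetical mini-lexicons for illustration only.
ACRONYMS = {"lol", "omg", "brb"}
EMOTICONS = [":)", ":(", ":P", ":D", ";)"]
AFFECTIVE_WORDS = {"love", "hate", "happy", "sad"}  # stand-in for SentiWordNet matches


def extract_features(comment: str) -> dict:
    """Count simple surface cues of subjectivity in one comment."""
    tokens = re.findall(r"[A-Za-z']+", comment)
    lowered = [t.lower() for t in tokens]
    return {
        # Words containing a character repeated three or more times, e.g. "sooo".
        "repeated_letters": sum(1 for t in lowered if re.search(r"(.)\1{2,}", t)),
        # Exclamation and question marks as grammatical markers.
        "exclaim_question": len(re.findall(r"[!?]", comment)),
        # Fully capitalized words of at least two letters.
        "all_caps_words": sum(1 for t in tokens if len(t) > 1 and t.isupper()),
        "acronyms": sum(1 for t in lowered if t in ACRONYMS),
        "emoticons": sum(comment.count(e) for e in EMOTICONS),
        "affective_words": sum(1 for t in lowered if t in AFFECTIVE_WORDS),
    }


features = extract_features("OMG I looove this!!! :) lol")
```

Each count becomes one attribute of the comment, which the later preprocessing steps normalize and discretize.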
Data Preprocessing
   Feature selection was performed based on a correlation measure to
    remove redundant attributes.
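
A minimal sketch of correlation-based feature selection, assuming Pearson correlation and a greedy rule that drops an attribute when it is strongly correlated with one already kept. The slides do not name the correlation measure or a cut-off, so both the measure and the `threshold` value are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(vx * vy)


def select_features(columns, threshold=0.9):
    """Keep a column only if it is not strongly correlated with a kept one.

    `columns` maps attribute name -> list of values; `threshold` is an
    illustrative cut-off, not a value taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 0, 1, 0]}
kept = select_features(columns)  # "b" duplicates "a" and is dropped
```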

   The goal was to deduce whether the text is subjective or not.

   Example: knowing whether the writer uses few or many punctuation
    marks can indicate some degree of subjectivity.

   Discretization was done using clustering. The k-means algorithm was
    run on each attribute with k=3 or k=4.

   The values in each attribute were normalized using Min-Max normalization,
    which mapped them to the range [0, 1].
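
The two operations can be sketched for a single attribute as follows. Min-Max normalization maps a value v to (v - min) / (max - min). The one-dimensional k-means below uses evenly spaced initial centroids, which is a deterministic choice made for this sketch and not necessarily what the paper did; the order in which the two operations are applied here is also an assumption.

```python
def min_max(values):
    """Map values linearly onto [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def kmeans_1d(values, k=3, iterations=100):
    """Discretize one attribute with 1-D k-means; returns a bin index per value.

    Initial centroids are evenly spaced over the value range (an arbitrary,
    deterministic choice for this sketch).
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels


punctuation_counts = [0, 1, 0, 5, 6, 12, 11]  # made-up counts for illustration
bins = kmeans_1d(min_max(punctuation_counts), k=3)
```

With k=3 the bins can be read as "few", "some" and "many" punctuation marks, which matches the example given on the slide.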
Training Model for Text Subjectivity
   Training was done in step 5 using the k-means clustering algorithm
    with k = 3, giving one cluster per subjectivity level:

    a.   Objective or factual texts

    b.   Moderately subjective texts

    c.   Subjective texts

   The output of the model is the centroids of the three clusters, which are
    used in step 6 to classify comments with respect to subjectivity.
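
A minimal sketch of the training step, assuming plain k-means over the preprocessed feature vectors. The initialization and the ordering of the returned centroids (lowest total feature value first, read here as "objective") are assumptions of this sketch; the slides do not say how clusters were mapped to the three levels.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def train_centroids(vectors, k=3, iterations=100):
    """Plain k-means; returns k centroids sorted by their component sum.

    Initial centroids are picked at evenly spaced ranks after sorting the
    vectors by component sum, a deterministic choice made for this sketch.
    """
    ranked = sorted(vectors, key=sum)
    centroids = [list(ranked[(2 * i + 1) * len(ranked) // (2 * k)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            clusters[nearest].append(v)
        updated = [mean_vector(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=sum)


# Made-up two-feature vectors, already scaled to [0, 1].
training = [[0.0, 0.1], [0.1, 0.0], [0.5, 0.4], [0.4, 0.5], [1.0, 0.9], [0.9, 1.0]]
centroids = train_centroids(training)
```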
Text subjectivity classification
   This step uses the centroids generated in the previous step (step 5).

   It employs the k-nearest neighbor algorithm with k=1 to classify all
    comments into one of three subjectivity levels.

   The manual annotations were then compared with the results predicted by
    the model.
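
With k=1 and the centroids as the only reference points, classification reduces to picking the closest centroid. The sketch below assumes the centroids are ordered from objective to subjective (an assumption, as is the `LEVELS` naming) and also shows how predictions could be compared with manual annotations.

```python
LEVELS = ["objective", "moderately subjective", "subjective"]


def classify(vector, centroids):
    """1-nearest-neighbor over the centroids: return the closest centroid's level."""
    distances = [
        sum((x - c) ** 2 for x, c in zip(vector, centroid)) for centroid in centroids
    ]
    return LEVELS[distances.index(min(distances))]


def accuracy(predicted, annotated):
    """Fraction of comments where the prediction matches the manual label."""
    matches = sum(1 for p, a in zip(predicted, annotated) if p == a)
    return matches / len(annotated)


# Hypothetical centroids, ordered objective -> subjective.
centroids = [[0.05, 0.05], [0.45, 0.45], [0.95, 0.95]]
predicted = [classify(v, centroids) for v in [[0.0, 0.0], [0.5, 0.5], [0.9, 0.8]]]
```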
Model Evaluation
   E.g., Comment 1
    •   Repeated letters: ‘o’ ‘i’ ‘l’ ‘u’ ‘.’ ‘o’ ‘h’
    •   Emoticons: ‘:P’ ‘:)’
    •   Affective words:
    •   Acronyms: mwahhh
   [Figure: k-means algorithm and Min-Max normalization]

   The model was evaluated by applying the same procedure to the sample of
    850 comments.

   [Figure: evaluation results; the values 84% and 81% appear, but their
    labels did not survive extraction]
Friendship Classification

   We collected the comments exchanged by 1213 pairs of users.

   We used the output of step 6 (text subjectivity classification) to
    obtain the subjectivity measure of all comments shared between each
    pair.

   An SVM was used to predict whether a pair consists of close friends or
    just acquaintances.

   Every user reported a list of his/her close friends, which was used to
    label the pairs. Using 10-fold cross-validation, the SVM model reported
    an accuracy of 87%.
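
A rough sketch of the evaluation protocol, assuming a plain linear SVM trained by subgradient descent on the hinge loss. The slides do not state the kernel, the parameters, or the exact per-pair features, so the two-value feature vector (share of objective and of subjective comments), the toy data, and the values of `lam`, `eta` and `epochs` are all hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Linear SVM fitted by subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. `lam`, `eta` and `epochs` are illustrative values.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Shrink the weights (regularization), then correct margin violations.
            w = [wj * (1 - eta * lam) for wj in w]
            if margin < 1:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def cross_validate(X, y, folds=10):
    """Accuracy under k-fold cross-validation with round-robin fold assignment."""
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        for i in test:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            predicted = 1 if score >= 0 else -1
            correct += int(predicted == y[i])
    return correct / len(X)


# Toy pairs: [share of objective comments, share of subjective comments].
X, y = [], []
for s in [0.7, 0.8, 0.9, 0.75, 0.85] * 2:
    X.append([1 - s, s])  # close friends: mostly subjective comments
    y.append(1)
    X.append([s, 1 - s])  # acquaintances: mostly objective comments
    y.append(-1)

print(cross_validate(X, y))
```

Each fold is held out once while the model is trained on the other nine, and the reported accuracy is the share of held-out pairs classified correctly.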
Conclusion
   The paper presents a new perspective for studying friendship relations and
    emotional expression in online social networks.

   We developed new lexicons that cover common expressions used by
    online users.

   The model predicted the right class with 88% accuracy. When the
    resulting model was used to predict relationship strength between two
    users, the prediction reported an accuracy of 87%.

Editor's Notes

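  • #3: By means of
  • #4: The main goal of this paper is to extract emotional content from the text of social networks.
  • #5: SentiWordNet gives each WordNet synset three sentiment scores: positivity, negativity, objectivity.
  • #7: Word match
  • #11: Some local foreign-language expressions are common on social networks; the paper takes them into account in an attempt to handle foreign-language content.
  • #22: Three comments are given as examples, representing "subjective", "moderately subjective", and "objective" respectively.
  • #27: An SVM is used to predict friendship from the subjectivity measures of each pair.
  • #28: Through emotion exploration, this paper provides: 1. coverage of emotion, acronyms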