Entity-oriented sentiment analysis
of tweets: results and problems
Natalia Loukachevitch
Lomonosov Moscow State University
Yuliya Rubtsova
A.P. Ershov Institute of Informatics Systems
Entity-oriented analysis of tweets: reputation monitoring
Sentiment analysis
• In general
• Entity-oriented
SentiRuEval 2014-2015
Testing of sentiment analysis systems of Russian texts
Aspect-oriented analysis of reviews
• Restaurants
• Cars
Entity-oriented analysis of tweets: reputation monitoring
• Banks [8]
• Telecom companies [7]
SentiRuEval: Entity-oriented analysis of tweets
A reputation-oriented tweet may express
• a positive or negative opinion about a company
• a positive or negative fact concerning a company
Task: to determine the sentiment towards the mentioned company
Participation: 9 participants, 33 runs
SentiRuEval: Entity-oriented analysis of tweets
Training collection
• 5000 banking tweets
• 5000 telecom tweets
Test collection
• 4549 banking tweets
• 3845 telecom tweets
[Timeline figure: December 2013, February 2014, July 2014, August 2014; periods labelled "Test collection" and "Train collection"]
Expert annotation
• 0: tweet considered neutral
• 1: positive fact or opinion
• -1: negative fact or opinion
• +-: positive and negative sentiments in the same tweet
• --: meaningless
Annotation problem
Test data were annotated using a voting scheme
Agreement between 2 or 3 annotators
Domain    Same label from at least 2 assessors   Full agreement   Final number of tweets in the test collection
Telecom   4 503 (90.06%)                         2 233 (44.66%)   3 845
Banks     4 915 (98.3%)                          3 818 (76.36%)   4 549
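As a rough illustration of the voting scheme, the Python sketch below gives each tweet the label chosen by at least two assessors and records whether the agreement was full. The function name `vote` and the sample annotations are assumptions made for illustration; they are not taken from the SentiRuEval materials.

```python
from collections import Counter

def vote(labels):
    """Return (label, full_agreement) for one tweet.

    Returns (None, False) when no label was chosen by at least two assessors.
    """
    label, count = Counter(labels).most_common(1)[0]
    if count < 2:
        return None, False
    return label, count == len(labels)

# Hypothetical annotations: three assessors per tweet, labels as on the slide.
annotations = [
    ["-1", "-1", "-1"],  # full agreement
    ["0", "1", "0"],     # two of three assessors agree
    ["1", "-1", "0"],    # no agreement, the tweet stays unlabelled
]
results = [vote(a) for a in annotations]
```

Tweets for which `vote` returns `None` would need another assessor or would be dropped; which of the two the organizers did is not stated on the slide.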
Distribution of messages in collections according to sentiment classes
Telecom                         Neutral   Positive   Negative
Training collection             2397      973        1667
Gold standard test collection   2816      413        944
Banks                           Neutral   Positive   Negative
Training collection             3569      410        2138
Gold standard test collection   3592      350        670
Quality measure
Macro-average F-measure over the positive and negative classes:
$F_{macro} = \frac{F_{pos} + F_{neg}}{2}$
The F-measure of the neutral class is ignored. This does not reduce the task to two-class prediction, because neutral tweets assigned to a sentiment class still lower that class's precision
Additionally, micro-average F-measures were calculated for the two sentiment classes
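The measure can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function names `f1` and `f_scores` are hypothetical, and the organizers' evaluation script may differ in detail. The usage example scores an "always negative" prediction against the bank gold standard distribution (3592 neutral, 350 positive, 670 negative).

```python
def f1(tp, fp, fn):
    """F1 from raw counts; 0.0 when the class is never predicted or present."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f_scores(gold, pred, classes=("1", "-1")):
    """Macro- and micro-averaged F1 over the sentiment classes only.

    Neutral labels still matter: a neutral tweet predicted as positive
    or negative counts as a false positive for that class.
    """
    per_class = []
    tp_sum = fp_sum = fn_sum = 0
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        per_class.append(f1(tp, fp, fn))
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    macro = sum(per_class) / len(per_class)
    micro = f1(tp_sum, fp_sum, fn_sum)
    return macro, micro

# Majority sentiment class baseline on the bank test distribution.
gold = ["0"] * 3592 + ["1"] * 350 + ["-1"] * 670
pred = ["-1"] * len(gold)
macro, micro = f_scores(gold, pred)  # roughly 0.127 and 0.238
```

Both values land close to the bank baseline row of the results (0.1267 and 0.2377), which suggests the sketch is at least near the measure that was actually used.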
Results
Top 3 results for telecom tweets
Run id     Macro F   Micro F
Baseline   0.1823    0.337
2          0.4882    0.5355
3          0.4804    0.5094
4          0.467     0.506
Top 3 results for bank tweets
Run id     Macro F   Micro F
Baseline   0.1267    0.2377
4          0.3598    0.343
10         0.352     0.337
2          0.3354    0.3656
Manual labeling by a participant for the telecom domain: Macro-F 0.703, Micro-F 0.7487
Classification methods
• Run 2: lemmas and syntactic links presented as triples (head word, dependent word, type of relation)
• Run 3: rule-based approach accounting for syntactic relations between sentiment words and the target entities
• Run 4: maximum entropy method based on word n-grams, symbol n-grams, and topic modeling results
• Run 10: word n-grams, letter n-grams, emoticons, punctuation marks, smilies, a manual sentiment vocabulary, and an automatically generated sentiment list based on the pointwise mutual information (PMI) of word occurrences in the positive or negative training subsets
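One common way to build a PMI-based sentiment list is sketched below in Python. The slides do not give the participant's exact formulation, so everything here is an assumption: the function name `pmi_lexicon`, the whitespace tokenization, the add-one smoothing, and the `min_count` threshold. Each word is scored by PMI(word, positive) minus PMI(word, negative), which simplifies to the log-ratio of the word's smoothed probabilities in the two subsets.

```python
import math
from collections import Counter

def pmi_lexicon(pos_docs, neg_docs, min_count=2):
    """Score words by PMI(w, positive) - PMI(w, negative).

    Positive scores suggest a positive word, negative scores a negative one.
    """
    pos_counts = Counter(w for d in pos_docs for w in d.split())
    neg_counts = Counter(w for d in neg_docs for w in d.split())
    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + len(vocab)  # add-one smoothing
    neg_total = sum(neg_counts.values()) + len(vocab)
    scores = {}
    for w in vocab:
        if pos_counts[w] + neg_counts[w] < min_count:
            continue  # skip rare words, whose scores are unreliable
        p_w_pos = (pos_counts[w] + 1) / pos_total
        p_w_neg = (neg_counts[w] + 1) / neg_total
        scores[w] = math.log2(p_w_pos / p_w_neg)
    return scores

# Hypothetical toy training subsets.
pos_docs = ["great tariff great service", "fast service"]
neg_docs = ["terrible service no connection", "terrible support"]
scores = pmi_lexicon(pos_docs, neg_docs)
```

On this toy data only "great", "service", and "terrible" pass the frequency threshold; "great" gets a positive score and "terrible" a negative one.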
Classification methods
• SVM + syntactic relations
• Linguistic syntax-based patterns (without machine learning)
• Maxent, SVM using various features
Explaining the difference in performance between the two domains
The best results in the banking and telecom domains differ: 0.36 vs. 0.488
Difference between the training and test collections: measured with the Kullback-Leibler divergence
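A minimal Python sketch of such a comparison follows, assuming unigram word distributions, whitespace tokenization, and add-alpha smoothing. None of these details are specified on the slides, and the function name `kl_divergence` and the sample texts are hypothetical. It computes the divergence of the test distribution from the training distribution.

```python
import math
from collections import Counter

def kl_divergence(test_docs, train_docs, alpha=1.0):
    """D_KL(P_test || P_train) over smoothed unigram distributions."""
    test_counts = Counter(w for d in test_docs for w in d.split())
    train_counts = Counter(w for d in train_docs for w in d.split())
    vocab = set(test_counts) | set(train_counts)
    test_total = sum(test_counts.values()) + alpha * len(vocab)
    train_total = sum(train_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for w in vocab:
        p = (test_counts[w] + alpha) / test_total    # smoothed test probability
        q = (train_counts[w] + alpha) / train_total  # smoothed train probability
        kl += p * math.log(p / q)
    return kl

# Hypothetical collections: the second one has shifted to a new topic.
train = ["bank raised rates", "bank opened office"]
shifted = ["bank under sanctions", "sanctions hit bank"]
same = kl_divergence(train, train)       # identical distributions
drift = kl_divergence(shifted, train)    # vocabulary has moved
```

Smoothing is needed because words that occur only in the test collection would otherwise have zero probability under the training distribution and make the divergence infinite.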
Explaining the difference in performance between the two domains
The topics of reputation-oriented tweets depend heavily on positive or negative events involving the target entities
Problems of reputation analysis of tweets
At any moment, events influencing reputation can occur, so they are absent from the training data
Test collections. December 2013-February 2014. Ukraine events did not influence target entities
Train collections in both domains. July-August 2014, after the Ukraine events of 2013-2014. Sanctions against banks. Problems with communication in Crimea
Analyzing difficult tweets
• 71 tweets in the banking domain were wrongly classified by all participants
• 85 tweets in the telecom domain were difficult for almost all participants (at most 2 systems were correct)
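Selecting such tweets amounts to counting, for each tweet, how many submitted runs matched the gold label. The Python sketch below shows the idea; the function name `difficult_tweets`, the dictionary layout, and the sample runs are assumptions made for illustration.

```python
def difficult_tweets(gold, runs, max_correct=0):
    """Return ids of tweets that at most `max_correct` runs labelled correctly.

    gold: {tweet_id: label}
    runs: {run_id: {tweet_id: label}}
    """
    hard = []
    for tweet_id, label in gold.items():
        n_correct = sum(run.get(tweet_id) == label for run in runs.values())
        if n_correct <= max_correct:
            hard.append(tweet_id)
    return hard

# Hypothetical gold labels and three hypothetical runs.
gold = {"t1": "-1", "t2": "1", "t3": "0"}
runs = {
    "run_a": {"t1": "-1", "t2": "0", "t3": "0"},
    "run_b": {"t1": "0", "t2": "0", "t3": "0"},
    "run_c": {"t1": "-1", "t2": "-1", "t3": "1"},
}
missed_by_all = difficult_tweets(gold, runs)                 # banking-style criterion
nearly_all = difficult_tweets(gold, runs, max_correct=2)     # telecom-style criterion
```

With `max_correct=0` the function mirrors the banking criterion (no system correct); with `max_correct=2` it mirrors the telecom criterion.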
First group. 1.1
Tweets contain evident sentiment words (such as понравиться – to like) that were absent from the training set
A general vocabulary of Russian sentiment words could help
First group. 1.2
Tweets contain words expressing well-known positive or negative situations, such as theft or murder, that were absent from the training collection
A general vocabulary of connotative words would be useful
First group. 1.3
Tweets contain words and phrases describing current events from the current news flow
Parallel analysis of the current news, revealing correlations between tweet words and general sentiment and connotation vocabularies in news texts, can help
Second group
Misclassified tweets that are really complicated:
• mention more than one entity with different attitudes
• contain several sentiment words with different polarity orientation
• contain irony
[Diagram: vocabularies integrated into the machine-learning (M-L) framework; 30% of tweets in the Bank collection, 15% of tweets in the Telecom collection]
Were systems entity-oriented?
Test tweets mentioning two or more entities
• 58 tweets in the banking domain (15 tweets with different polarity labels)
• 232 tweets in the telecom domain (71 tweets with different polarity labels)
3 of 9 participants treated the task as an entity-oriented one
• Other participants always assigned the same polarity class to all entities mentioned in a tweet
Performance
• Worse than for all tweets on average
• Entity-oriented approaches did not achieve better results
Conclusion
We described the tasks, approaches, and results of the SentiRuEval testing
– High dependence on the training collections
– High impact of current dramatic events
– The capability to do entity-oriented analysis is quite restricted
– Integrating a general sentiment vocabulary and a general vocabulary of connotative words can greatly improve results
– Most participants solved the general task of tweet classification
– Entity-oriented approaches did not achieve better results
All prepared materials are accessible for research purposes
http://goo.gl/qHeAVo
Thank you!
You can help us to assess
tweets for SentiRuEval-2016
http://sentimeter.ru/assess/texts/
Yuliya Rubtsova

Editor's Notes

  • #3: In general: sentiment of the whole document, fragment, or sentence. Entity-oriented: sentiment about a specific entity (a politician, a political party, a company, etc.). Sentiment about specific parts or properties of an entity (aspects). Example tweet: "Switch to Beeline. 'Everything for 300' («Все за 300») is a great tariff!"
  • #5: The goal of the Twitter sentiment analysis at SentiRuEval was to find tweets influencing the reputation of a company in two domains
  • #6: The datasets were collected with the Twitter Streaming API.
  • #7: To prepare the datasets, 20,000 messages were labeled, including 5,000 messages in each domain for the training and test collections. Each collection was labeled by at least two assessors. The gold standard test collections were labeled by three assessors. Irrelevant or unclear messages were removed from the training and test sets.
  • #8: To avoid inconsistency and disputes, a voting scheme was applied to the labeling of the test collections.
  • #9: We noticed that sometimes users do not want to be rude and add positive emoticons to clearly negative or ironic messages. That is why simple methods based on extraction of emoticons, which are used for classification on the whole tweet level, do not work well
  • #10: Main quality measure:
  • #11: The baselines are based on the majority reputation-oriented category (the negative one in this case). One of the participants performed independent expert labeling of the telecom tweets, which can be considered the maximum possible performance of automated systems in this task.
  • #12: Most participants used the SVM classification method.
  • #13: Most participants used the SVM classification method.
  • #14: We computed the Kullback-Leibler divergence to compare the word probability distributions of the test collections with those of the training collections.
  • #18: includes tweets that were misclassified because of the restricted size of the training collection, which did not contain appropriate training
  • #19: These words are usually considered as neutral, not-opinionated, but having positive or negative associations (so called connotations). For solving these problems, a general vocabulary of connotative words would be useful because the appearance of these words in connection with a company influences its reputation.
  • #20: Problematic tweets contain words and phrases describing current events from the current news flow. The appearance of some events and their influence on the company's reputation are very difficult to predict, so mentions of them will always be absent from the training collection. In this case, parallel analysis of the current news, revealing correlations between tweet words and general sentiment and connotation vocabularies in news texts, can help.
  • #22: It means that integration of various vocabularies into the machine-learning framework can improve the performance of reputation-oriented automatic systems