International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1307
A Survey on Analysis of Twitter Opinion Mining Using Sentiment
Analysis
Anusha K S1 , Radhika A D2
1M Tech, CSE Dept. of Computer Science and Engineering VVCE, Mysuru
2Assistant Professor, CSE Dept. of Computer Science and Engineering VVCE, Mysuru
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Online communication has grown rapidly in recent years.
Many social networking sites and related mobile applications exist,
and more are still emerging. These sites generate a huge amount of
data every day, and this data can serve as a source for many kinds
of analysis. Twitter is one of the most popular networking sites,
with millions of users who hold different views and post a wide
variety of reviews in the form of tweets. Opinion mining has become
an emerging research topic because of the large amount of
opinionated data available on blogs and social networking sites.
Tracking and summarizing these opinions can give valuable insight
to users who rely on social networking sites for reviews of a
product, service or topic. Analyzing opinions and classifying them
by polarity (positive, negative, neutral) is a challenging task. A
lot of work has been done on sentiment analysis of Twitter data,
and a lot still needs to be done.
In this paper we discuss the levels and approaches of sentiment
analysis, sentiment analysis of Twitter data, existing tools for
sentiment analysis and the steps involved. Two approaches are
discussed with an example each: one based on machine learning and
one based on a lexicon.
Keywords—Twitter data, opinion mining, sentiment
analysis.
I. INTRODUCTION
Sentiment analysis is an effective means of discovering public
opinion. Companies often use online or paper-based surveys to
collect customer comments. With the emergence of social networking
sites and applications, people now tend to post their comments on
Facebook or Twitter instead, so the paper-based approach is no
longer efficient: only a very small customer base can be reached,
and there is no guarantee that the survey answers are honest. This
is where social media comes into play. Facebook, Twitter and other
social media sites are full of people’s opinions about the products
and services they use, comments about popular personalities and
much more [1]. Nasukawa and Yi introduced the term sentiment
analysis in 2003, the same year in which the term opinion mining
first appeared. Opinion mining tools help summarize the opinions
expressed about a product. Sentiment analysis of tweet data
involves data collection, extraction, classification and
interpretation of the opinions expressed in the tweets [2].
Opinion mining and sentiment analysis are carried out at three
levels [3]:
• Document level: The whole document is analyzed and classified as
expressing positive or negative sentiment.
• Sentence level: The sentiment polarity of individual sentences is
determined. This level is closely related to subjectivity
classification.
• Entity/aspect level: A finer-grained analysis whose aim is to find
the sentiment expressed about specific entities or their aspects.
Two approaches to sentiment analysis
• Supervised (machine learning) approach:
Machine learning is one of the most prominent techniques and is
gaining researchers’ interest [10] because of its adaptability and
accuracy. The method comprises three stages: (i) data collection,
(ii) pre-processing and (iii) training data classification [9].
• Unsupervised (lexicon-based) approach:
Lexical analysis estimates sentiment from the semantic orientation
of the words [8] or phrases that occur in a text. The approach uses
a dictionary of positive and negative words, which are matched
against the words contained in a tweet. These techniques depend
entirely on lexical resources [6] that map words [7] to a
categorical (positive, negative, neutral) or numerical sentiment
score. In this method, the unigrams found in the lexicon [9] are
assigned a polarity score.
Sentiment analysis of Twitter data
Twitter is a popular online social networking service launched in
March 2006. It enables users to send and read tweets of about 140
characters in length. Twitter currently acts as an opinionated data
bank, with a large amount of data available for sentiment analysis.
It is very convenient for research because there are very large
numbers of messages, many of which are publicly available, and
obtaining them is technically simple compared to scraping blogs
from the web.
Twitter data is collected for analysis using the Twitter API.
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
II. TOOLS AVAILABLE FOR SENTIMENT ANALYSIS
S. No | Tool | Application
1 | NLTK | The NLTK toolkit is widely used for sentiment analysis tasks. Its main features used in the sentiment analysis process are tokenization, stop word removal, stemming and tagging. It is written in Python and can be downloaded from www.nltk.org.
2 | GATE | General Architecture for Text Engineering (GATE) provides an information extraction system, a stemmer and a part-of-speech tagger. It is written in Java. https://gate.ac.uk/
3 | Red Opal | This tool is aimed at users who want to buy products based on particular features. Users can search for a product by the selected feature and get reviews related to their search.
4 | Opinion Finder | Opinion Finder is used to analyze subjective sentences on any topic and to classify the sentences by polarity. It is written in Java and is platform independent.
Table-1: Sentiment Analysis Tools
III. TWITTER SENTIMENT ANALYSIS PROCEDURE
The framework used for this analysis is depicted in Figure-1. Each
processing step has its own important role, and all steps are
discussed below.
Figure-1: Sentiment analysis framework
A. Data Collection:
Data collection is an important part of sentiment analysis. Data
sources such as blogs, review sites, online posts and microblogging
or social networking services like Twitter and Facebook are used.
B. Data Preprocessing:
Before sentiment analysis, the collected data is processed using
the following steps:
1) Stemming- Suffixes such as “ing” and “tion” are removed from
each word.
2) Tokenization- This step is very important for data
pre-processing, as it includes several sub-steps: removal of extra
spaces; replacement of emoticons with their actual meaning (such as
Happy or Sad) using an emoticon data set available on the Internet;
replacement of abbreviations such as OMG and WTF with their full
forms; and pragmatics handling, for example mapping “hapyyyyyyy” to
“happy” and “guddddd” to “good”.
3) Stop Word Removal- Stop words that are of no use in the
analysis, such as articles (a, an), conjunctions (and) and
prepositions (between), are removed.
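The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the surveyed work: the lookup tables (EMOTICONS, EXPANSIONS, STOP_WORDS), the suffix list and the function names are all assumptions made for the example, and the suffix stripping is deliberately naive compared with a real stemmer.

```python
import re

# Illustrative lookup tables (assumptions, not taken from the paper).
EMOTICONS = {":)": "happy", ":-)": "happy", ":(": "sad", ":-(": "sad"}
EXPANSIONS = {"omg": "oh my god", "gud": "good", "hapy": "happy"}
STOP_WORDS = {"a", "an", "the", "and", "between", "i", "am", "is", "this"}
SUFFIXES = ("ing", "tion")


def tokenize(text):
    """Replace emoticons, squeeze elongated letters, split into word tokens."""
    for emoticon, meaning in EMOTICONS.items():
        text = text.replace(emoticon, f" {meaning} ")
    # Collapse runs of three or more identical letters: "hapyyyyyyy" -> "hapy".
    text = re.sub(r"([a-z])\1{2,}", r"\1", text.lower())
    tokens = []
    # Splitting on word characters also discards extra spaces.
    for raw in re.findall(r"[a-z0-9']+", text):
        # Expand abbreviations and known misspellings into their full forms.
        tokens.extend(EXPANSIONS.get(raw, raw).split())
    return tokens


def stem(token):
    """Strip a known suffix if enough of the word remains (naive stemming)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]
```

Calling preprocess on a raw tweet returns a list of cleaned tokens. Collapsing repeated letters only yields a candidate form such as "hapy"; mapping it to a dictionary word still needs a lookup table or a spelling corrector, which is why the sketch routes it through EXPANSIONS.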
C. Feature Extraction:
Feature extraction specifies the type of features used for opinion
mining [6]. Different types of features are used, such as:
1) Term Frequency- The frequency of a term in a document carries
weight. [6]
2) Term Co-occurrence- The repeated occurrence of words together,
as unigrams, bigrams or n-grams in general. [6]
3) Part of Speech- For each tweet, features count the number of
verbs, adjectives and nouns. [7]
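As a rough illustration of these three feature types, the following Python sketch computes term frequencies, n-grams and part-of-speech counts for one tokenized tweet. The function names and the coarse tag names ("VERB", "ADJ", "NOUN") are assumptions made for the example; the part-of-speech tags are taken as input, since producing them requires a tagger such as the ones listed in Table-1.

```python
from collections import Counter


def term_frequency(tokens):
    """Map each term to the number of times it occurs in the token list."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return all contiguous n-token windows (n=1 unigrams, n=2 bigrams, ...)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pos_counts(tagged_tokens, wanted=("VERB", "ADJ", "NOUN")):
    """Count the tags of interest in a list of (token, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_tokens)
    return {tag: counts[tag] for tag in wanted}
```

The three results can be concatenated into one feature vector per tweet before classification.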
D. Sentiment Analysis & Polarity Classification:
Emotions, opinions and sentiments play an important role in human
life. Mining such opinions is termed sentiment analysis [10].
Sentiment analysis and polarity classification are challenging
tasks. SentiWordNet is a standard dictionary used by many
researchers for sentiment analysis. Polarity classification means
that the collected reviews are classified as positive, negative or
neutral according to the emotions expressed.
IV. APPROACHES USED FOR SENTIMENT
ANALYSIS ON TWITTER
A. SENTIMENT ANALYSIS ON TWITTER USING
SUPERVISED APPROACH (MACHINE LEARNING)
This approach extracts data from social networking services (SNS)
using Twitter’s Streaming API. The extracted tweets are loaded into
Hadoop and preprocessed using MapReduce. This is followed by
classification, which uses NLP and machine learning techniques. The
classifier used here is a uni-word (unigram) naïve Bayes classifier.
FIGURE 2: Real-time sentiment analysis on Twitter
The training phase provides the number of positive tweets and the
positive and negative scores of the words. First, the probability
of a tweet being positive is calculated:
P(C) = No. of Positive Tweets / Total Number of Tweets - (1)
Each word in each streamed tweet is then checked for the
probability of its being used given that the tweet is positive:
P(D/C) = Positive score of the word / Total number of
Positive Words - (2)
Next, the probability of the word being used at all, irrespective
of whether or not it is positive, is calculated:
P(D) = (Positive score of the word + Negative score of the
word) / Total Number of words - (3)
The probability of the word being positive given that it is used in
a tweet then follows from Bayes’ rule:
P(C/D) = P(C) * P(D/C) / P(D) - (4)
This probability is passed to the Sentiment function, which
classifies the word as positive if the probability is greater than
0.6, as neutral if it is between 0.4 and 0.6, and as negative if it
is less than 0.4.
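Equations (1) to (4) and the threshold rule can be traced with a short Python sketch. It assumes that the positive and negative scores are per-word occurrence counts from training, held in two dictionaries, and that the lower threshold is 0.4, the complement of the stated 0.6. The names word_probability and classify are hypothetical and do not come from the cited system.

```python
def word_probability(word, pos_score, neg_score, n_pos_tweets, n_tweets):
    """Return P(C/D) for one word, or None if the word was never seen.

    pos_score and neg_score map each word to its positive and negative
    score (assumed here to be occurrence counts from training).
    """
    total_pos = sum(pos_score.values())
    total = total_pos + sum(neg_score.values())
    p_c = n_pos_tweets / n_tweets                                    # equation (1)
    p_d_given_c = pos_score.get(word, 0) / total_pos                 # equation (2)
    p_d = (pos_score.get(word, 0) + neg_score.get(word, 0)) / total  # equation (3)
    if p_d == 0:
        return None  # unseen word: equation (4) is undefined
    return p_c * p_d_given_c / p_d                                   # equation (4)


def classify(probability, low=0.4, high=0.6):
    """Apply the threshold rule; the 0.4 lower bound is an assumption."""
    if probability > high:
        return "positive"
    if probability < low:
        return "negative"
    return "neutral"
```

With the illustrative counts pos_score = {"good": 8, "phone": 2} and neg_score = {"bad": 6, "good": 2, "phone": 2}, and 5 positive tweets out of 10, the word "good" gets P(C) = 0.5, P(D/C) = 0.8 and P(D) = 0.5, so P(C/D) = 0.8 and the word is classified as positive.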
B. SENTIMENT ANALYSIS ON TWITTER USING
UNSUPERVISED APPROACH (LEXICON METHOD)
Data is collected through the Twitter API and pre-processed to
eliminate unwanted information and to replace emoticons. A lexical
method is used for classification, following a dictionary-based
approach: words are extracted from the tweets and matched against
the dictionary. If a word matches a positive entry, the positive
score is incremented or the word is tagged as positive. If it
matches a negative entry, the negative score is incremented or the
word is tagged as negative. Otherwise the word is tagged as neutral.
FIGURE 3: Working of a lexical technique
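The dictionary-matching step can be sketched as follows. The two word sets are tiny placeholders standing in for a real sentiment lexicon, and the rule of comparing the two counts to label the whole tweet is an assumption added for illustration; the text above only specifies the per-word tagging.

```python
# Placeholder lexicon (assumption); a real system would load a full dictionary.
POSITIVE_WORDS = {"good", "great", "happy", "love"}
NEGATIVE_WORDS = {"bad", "poor", "sad", "hate"}


def tag_word(token):
    """Tag one token by dictionary lookup."""
    if token in POSITIVE_WORDS:
        return "positive"
    if token in NEGATIVE_WORDS:
        return "negative"
    return "neutral"


def tweet_polarity(tokens):
    """Count positive and negative matches and compare the two scores."""
    tags = [tag_word(token) for token in tokens]
    positive_score = tags.count("positive")
    negative_score = tags.count("negative")
    if positive_score > negative_score:
        return "positive"
    if negative_score > positive_score:
        return "negative"
    return "neutral"
```

For example, the tokens of "love this phone bad battery great" contain two positive matches and one negative match, so the tweet is labelled positive.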
Let a sentence s contain a set of entities {e1, e2, …, er} and a
subset of their aspects {a1, …, am}, expressed by a set of opinion
holders [2] {h1, h2, …, hp} at some particular point in time,
together with a set of sentiment words or phrases {sw1, …, swn}
with their sentiment scores. The sentiment orientation for each
aspect ai in s is determined by the following aggregation function:
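The aggregation function itself is not reproduced in this text. A form commonly used in the lexicon-based literature, and consistent with the definitions of dist and so given here, is written out below in LaTeX as an assumption rather than as the authors’ exact formula:

```latex
\[
  \mathrm{score}(a_i, s) \;=\; \sum_{sw_j \in s} \frac{sw_j.so}{\mathrm{dist}(sw_j, a_i)}
\]
```

Each sentiment word contributes its score divided by its distance from the aspect, so words close to the aspect weigh more than words far from it.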
where dist(swj, ai) is the distance between aspect ai and
sentiment word swj in s, and swj.so is the sentiment score of swj.
If the final score is positive, the opinion on aspect ai in s is
positive. If the final score is negative, the opinion on the aspect
is negative. Otherwise it is neutral.
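A distance-weighted aggregation of this kind can be sketched in Python. The sketch assumes that the score is the sum of each sentiment word’s score divided by its distance from the aspect, and that distance is measured as the difference in token positions. Both choices, as well as the function names and the sample lexicon, are illustrative assumptions.

```python
def aspect_score(tokens, aspect_index, lexicon):
    """Sum so / dist over every sentiment word found in the sentence.

    lexicon maps a sentiment word to its score (so); dist is taken to be
    the number of token positions between the word and the aspect.
    """
    score = 0.0
    for position, token in enumerate(tokens):
        so = lexicon.get(token)
        if so is None or position == aspect_index:
            continue  # not a sentiment word, or the aspect term itself
        score += so / abs(position - aspect_index)
    return score


def aspect_orientation(tokens, aspect_index, lexicon):
    """Map the aggregated score to positive, negative or neutral."""
    score = aspect_score(tokens, aspect_index, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For the tokens of "great screen but the battery is terrible" with scores great = 1.0 and terrible = -1.0, the aspect "screen" (position 1) scores 1.0/1 - 1.0/5 = 0.8 and is positive, while "battery" (position 4) scores 1.0/4 - 1.0/2 = -0.25 and is negative. The nearer sentiment word dominates each aspect.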
CONCLUSION
In this paper we presented a short survey of sentiment analysis
and opinion mining. We discussed the three major levels of
sentiment analysis, the two approaches to sentiment analysis and
the sentiment analysis of Twitter data. We also listed some of the
tools available for sentiment analysis and described the general
procedure. The worked examples of both approaches illustrate the
two approaches in detail.
REFERENCES
[1] Syed Akib Anwar Hridoy, M. Tahmid Ekram, Mohammad Samiul
Islam, Faysal Ahmed and Rashedur M. Rahman, “Localized twitter
opinion mining using sentiment analysis”.
[2] Roshan Fornandes, Dr. Rio D’Souza, “Analysis of product
twitter data through opinion mining”, © 2016 IEEE.
[3] M. Trupthi, Suresh Pabboju, G. Narasimha, “Sentiment analysis
on twitter using streaming API”, 2017 IEEE 7th International
Advance Computing Conference.
[4] Prerna Mishra, Dr. Ranjana Rajnish, Dr. Pankaj Kumar,
“Sentiment analysis of twitter data: Case study on digital india”,
2016 (InCITe).
[5] Paramita Ray, Amlan Chakrabarti, “Twitter sentiment analysis
for product reviews using Lexicon Method”, 2017 (ICDMAI).
[6] A Kowcika and Aditi Guptha, “Sentiment analysis for social
media”, International Journal of Advanced Research in Computer
Science and Software Engineering, 216-221, Volume 3, Issue 7,
July 2013.
[7] G. Vinodini and RM. Chandrashekaran, “Sentiment analysis and
opinion mining: A survey”, International Journal of Advanced
Research in Computer Science and Software Engineering, 283-294,
Volume 2, Issue 6, June 2012.
[8] Cataldo Musto, Giovanni Semeraro, Marco Polignano, “A
comparison of Lexicon-based approaches for Sentiment Analysis of
microblog posts”, Department of Computer Science, University of
Bari Aldo Moro, Italy.
[9] James Spencer and Gulden Uchyigit, “Sentimentor: Sentiment
Analysis of Twitter Data”, School of Computing, Engineering and
Mathematics, University of Brighton.
[10] Anna Jurek, Maurice D. Mulvenna and Yaxin Bi, “Improved
lexicon-based sentiment analysis for social media analytics”,
ScienceDirect, published 9 December 2015.
[11] Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, Rebecca
Passonneau, “Sentiment Analysis of Twitter Data”, Columbia
University, New York.
[12] Sang-Hyun Cho and Hang-Bong Kang, “Text Sentiment
Classification for SNS-based Marketing Using Domain Sentiment
Dictionary”, IEEE International Conference on Consumer
Electronics (ICCE), p. 717-718, 2012.
[13] Patricia L V Ribeiro, Li Weigang and Tiancheng Li, “A Unified
Approach for Domain-Specific Tweet Sentiment Analysis”, FUSION,
2015.
[14] Tiara, Mira Kania Sabariah, Veronikha Effendy, “Sentiment
Analysis on Twitter Using the Combination of Lexicon-Based and
Support Vector Machine for Assessing the Performance of a
Television Program”, 3rd International Conference on Information
and Communication Technology (ICoICT), 2015.
[15] Asmita Dhokrat, Sunil Khillare, C. Namrata Mahender, “Review
on Techniques and Tools used for Opinion Mining”, IJCAT, 2015.
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072

More Related Content

PDF
IRJET - Sentiment Analysis of Posts and Comments of OSN
PDF
IRJET- Sentiment Analysis of Twitter Data using Python
PDF
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
PDF
Methods for Sentiment Analysis: A Literature Study
PDF
project sentiment analysis
PDF
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
PDF
IRJET- Analytic System Based on Prediction Analysis of Social Emotions from U...
DOCX
295B_Report_Sentiment_analysis
IRJET - Sentiment Analysis of Posts and Comments of OSN
IRJET- Sentiment Analysis of Twitter Data using Python
IRJET- A Real-Time Twitter Sentiment Analysis and Visualization System: Twisent
Methods for Sentiment Analysis: A Literature Study
project sentiment analysis
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
IRJET- Analytic System Based on Prediction Analysis of Social Emotions from U...
295B_Report_Sentiment_analysis

What's hot (20)

PDF
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
PDF
Project sentiment analysis
PDF
A Paper on Web Data Segmentation for Terrorism Detection using Named Entity R...
PDF
Project report
PDF
Ijmer 46067276
PDF
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff...
PPTX
Tweet sentiment analysis (Data mining)
PDF
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
ODP
Sentiment Analysis on Twitter
PDF
IRJET- A Survey on Graph based Approaches in Sentiment Analysis
PDF
A Survey Of Collaborative Filtering Techniques
DOCX
Twitter sentiment analysis project report
PDF
Multi-lingual Twitter sentiment analysis using machine learning
PDF
IRJET - Twitter Sentiment Analysis using Machine Learning
PDF
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
PDF
IRJET- Sentimental Analysis of Twitter Data for Job Opportunities
PDF
IRJET- Product Aspect Ranking
PDF
IRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s Opinion
PDF
Vol 7 No 1 - November 2013
PDF
Zomato eda report
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
Project sentiment analysis
A Paper on Web Data Segmentation for Terrorism Detection using Named Entity R...
Project report
Ijmer 46067276
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff...
Tweet sentiment analysis (Data mining)
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites
Sentiment Analysis on Twitter
IRJET- A Survey on Graph based Approaches in Sentiment Analysis
A Survey Of Collaborative Filtering Techniques
Twitter sentiment analysis project report
Multi-lingual Twitter sentiment analysis using machine learning
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
IRJET- Sentimental Analysis of Twitter Data for Job Opportunities
IRJET- Product Aspect Ranking
IRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s Opinion
Vol 7 No 1 - November 2013
Zomato eda report
Ad

Similar to A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis (20)

PDF
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
PDF
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
PDF
IRJET- Review Analyser with Bot
PDF
IRJET - Twitter Sentimental Analysis
PDF
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
PDF
IRJET- Twitter Sentimental Analysis for Predicting Election Result using ...
PDF
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
PDF
Sentiment Analysis of Twitter tweets using supervised classification technique
PDF
IRJET - Election Result Prediction using Sentiment Analysis
PDF
Detailed Investigation of Text Classification and Clustering of Twitter Data ...
PDF
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
PDF
Sentiment Analysis on Twitter data using Machine Learning
PDF
Sentiment Analysis of Twitter Data
PDF
Emotion Recognition By Textual Tweets Using Machine Learning
PDF
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
PDF
Sentiment Analysis on Twitter Data
PDF
PDF
Twitter Sentiment Analysis
PDF
IRJET- Sentiment Analysis using Twitter Data
PDF
Election Result Prediction using Twitter Analysis
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
IRJET- Review Analyser with Bot
IRJET - Twitter Sentimental Analysis
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
IRJET- Twitter Sentimental Analysis for Predicting Election Result using ...
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
Sentiment Analysis of Twitter tweets using supervised classification technique
IRJET - Election Result Prediction using Sentiment Analysis
Detailed Investigation of Text Classification and Clustering of Twitter Data ...
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
Sentiment Analysis on Twitter data using Machine Learning
Sentiment Analysis of Twitter Data
Emotion Recognition By Textual Tweets Using Machine Learning
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
Sentiment Analysis on Twitter Data
Twitter Sentiment Analysis
IRJET- Sentiment Analysis using Twitter Data
Election Result Prediction using Twitter Analysis
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
UNIT 4 Total Quality Management .pptx
PPTX
Internet of Things (IOT) - A guide to understanding
PDF
PPT on Performance Review to get promotions
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PDF
composite construction of structures.pdf
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
additive manufacturing of ss316l using mig welding
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPT
Project quality management in manufacturing
PDF
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
UNIT 4 Total Quality Management .pptx
Internet of Things (IOT) - A guide to understanding
PPT on Performance Review to get promotions
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Foundation to blockchain - A guide to Blockchain Tech
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
composite construction of structures.pdf
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
additive manufacturing of ss316l using mig welding
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Project quality management in manufacturing
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
Automation-in-Manufacturing-Chapter-Introduction.pdf
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx

A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 © 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1307 A Survey on Analysis of Twitter Opinion Mining Using Sentiment Analysis Anusha K S1 , Radhika A D2 1M Tech, CSE Dept. of Computer Science and Engineering VVCE, Mysuru 2Assistant Professor, CSE Dept. of Computer Science and Engineering VVCE, Mysuru ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - In recent years, there is a rapid growth in online communication. There are many social networking sites and related mobile applications, and some more are still emerging. Huge amount of data is generated by these sites everyday and this data can be used as a source for various analysis purposes. Twitter is one of the most popular networking sites with millions of users. There are users with different views and varieties of reviews in the form of tweets are generated by them. Nowadays Opinion Mining has become an emerging topic of research due to lot of opinionated data available on Blogs & social networking sites. Tracking different types of opinions & summarizing them can provide valuable insight to different types of opinions to users who use Social networking sites to get reviews about any product, service or any topic. Analysis of opinions & its classification on the basis of polarity (positive, negative, neutral) is a challenging task. Lot of work has been done on sentiment analysis of twitterdata andlotneedsto be done. In this paper we discuss the levels, approaches of sentiment analysis, sentiment analysis of twitter data, existing tools available for sentiment analysis and the steps involved for same. Two approaches are discussed with an example which works on machinelearningandlexiconbased respectively. Keywords—Twitter data, opinion mining, sentiment analysis. I. 
INTRODUCTION Sentimentanalysistechniqueisaneffectivemeansof discovering public opinions. Various companies often use online or paper based surveys tocollectcustomercomments. Due to the emergence of social networking sites and applications, people tend to comment on their facebook or tweet profile. Therefore the paper based approach is not an efficient approach. Only a very small customer base can be reached and there is no guarantee that their answers in the survey are honest or not. Here social media comes into play. Facebook, Twitter and all other social media sites are full of people’s opinions about products/services they use, comments about popular personalities and much more [1]. Nasukawa & YI first introduced the term Sentiment Analysis & Opinion Mining in the year 2003. Opinion mining toolswill help in providing the opinion about the product. Sentiment analysis on tweet data involves data collection, extraction, classification, understanding and providing the opinion that are expressed in various tweets [2]. Opinion Mining and Sentiment analysis is done on three levels [3]  Document Level: Analysis is done on the whole document and then expresses whether the document is positive or negative sentiment.  Sentence Level: It is related to find sentiment polarity from short sentences. Sentence level is merely close to subjectivity classification.  Entity /Aspect Level: sentiment analysis performs augmented analysis. The aim is to find sentiment on entities or aspects. Two approaches of Sentiment Analysis  Supervised approaches or machine learning method: Machine learning is one of the most prominent techniques gaining researchers interest [10] due to its adaptability and accuracy. This method comprises of three stages:(i)Data collection(ii)Pre- processing and(iii)Trainingdata Classification [9].  Unsupervised (or lexicon-based): Lexical analysis estimates the sentiment from the semantic orientation of words [8] or phrases that occur in a text. 
In this approach a dictionary containing positive and negative words that are matched with the words containing in tweet. However, these techniques totally depend onlexical resources [6] which are concerned with mapping words [7] to a categorical (positive, negative, neutral) or numerical sentiment score. In this method the unigrams, whicharefoundinthelexicon [9] are assigned a polarity score. Sentiment Analysis of twitter data Twitter is popular online social networking service launched in March 2006.It enables users to send and read tweets with about 140 characters length. Currently Twitter acts as opinionated Data Bank with large amount of data available used for sentiment analysis. Twitter is very convenient for research because there are very large numbers of messages, many of which are publicly available, and obtaining them is technically simple compared to scarping blogs from the web. Twitter data is collected for analysis using Twitter API. Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 © 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1308 II.Tools Available for sentiment analysis S. No Tools Applications 1 NLTK NLTK toolkit is widely used nowadays for sentiment analysis task. Main features of NLTK used in Sentiment analysis process are Tokenization, Stop Word removal, Stemming and tagging. This tool is written in Python language and can be downloaded from www.nltk.org. 2 GATE General Architecture for Text Engineering (GATE) is information Extraction System, Stemming and Partof speech tagger. This tool is written in Java language. https://gate.ac.uk/ 3 Red Opal This tool is widely used for users who want to buy any products based on different features. Users can search for any product depending upon the feature selected and can get reviews related to their search. 4 Opinion finder Opinion Finder is used for analysis of different Subjective sentences related to any topic & classification of sentences is done based on their polarity. It’s written in Java and is platform Independent tool. Table-1: Sentiment Analysis Tools III.TWITTER SENTIMENT ANALYSIS PROCEDURE The framework used for this analysis is depicted in below figure. Different processing steps had their own important role. We discussed about all steps below. Figure-1 Sentiment analysis Framework A. Data Collection: Collection of data is an important part of Sentiment Analysis. Various data Sources like Blogs, Review Sites, Online Posts & Micro Blogging like Twitter, Facebook are used for Data Collection. B. Data Preprocessing: Now before Sentiment Analysis we need to process the collected data using the following steps of data processing- 1) Stemming- In this process thepostfixfromeachwords like “ing”,“tion” etc are removed. 
2) Tokenization- This process is very important for data preprocessing, as it includes several sub-steps: removal of extra spaces; replacement of emoticons with their actual meaning, such as Happy or Sad, using an emoticon data set available on the Internet; replacement of abbreviations like OMG and WTF with their actual meanings; and pragmatics handling, which normalizes elongated words such as “hapyyyyyyy” to “happy” and “guddddd” to “good”.
3) Stop Word Removal- Stop words that are of no use in the analysis, such as articles (a, an), conjunctions (and) and prepositions (between), are removed.

C. Feature Extraction:
Feature extraction specifies the type of features used for opinion mining [6]. Different types of features are used, such as:
1) Term Frequency- The frequency of a term in a document carries weight. [6]
2) Term Co-occurrence- The repeated occurrence of a word or word sequence, such as a unigram, bigram or n-gram. [6]
3) Part of Speech- For each tweet, there are features for the counts of verbs, adjectives and nouns. [7]

D. Sentiment Analysis & Polarity Classification:
Emotions, opinions and sentiments play an important role in human life. Mining such opinions is termed sentiment analysis [10]. Performing sentiment analysis and polarity classification is a challenging task. SentiWordNet is a standard dictionary used by many researchers today for sentiment analysis. Polarity classification means that the collected reviews are classified as positive, negative or neutral depending on the emotions expressed.
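The preprocessing and feature extraction steps of Sections B and C can be sketched in a few lines of Python. This is a minimal sketch, not the pipeline of any surveyed system: the lookup tables EMOTICONS, ABBREVIATIONS and STOP_WORDS are tiny hypothetical stand-ins for full resources, stemming is reduced to stripping the two suffixes named in the stemming step, and the order tokenization, stop word removal, stemming is an assumption. Part-of-speech counts are omitted because they need a tagger such as the one in NLTK.

```python
import re
from collections import Counter

# Hypothetical, tiny lookup tables; a real system would load full resources.
EMOTICONS = {":)": "happy", ":(": "sad"}
ABBREVIATIONS = {"omg": "oh my god"}
STOP_WORDS = {"a", "an", "the", "and", "between", "is", "at", "my"}
SUFFIXES = ("ing", "tion")  # the two suffixes named in the stemming step


def tokenize(tweet):
    """Lower-case, expand emoticons and abbreviations, squeeze elongated words."""
    words = []
    for raw in tweet.lower().split():  # split() also discards extra spaces
        if raw in EMOTICONS:
            words.append(EMOTICONS[raw])
            continue
        word = re.sub(r"[^a-z]", "", raw)         # keep letters only
        word = re.sub(r"(.)\1{2,}", r"\1", word)  # 3+ repeated letters -> 1
        if word in ABBREVIATIONS:
            words.extend(ABBREVIATIONS[word].split())
        elif word:
            words.append(word)
    return words


def remove_stop_words(words):
    return [w for w in words if w not in STOP_WORDS]


def stem(word):
    """Crude suffix stripping; not a full stemming algorithm."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def extract_features(words):
    """Term frequencies (unigrams) and adjacent-word (bigram) counts."""
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    return unigrams, bigrams


tweet = "OMG   loving the new phone :) and the camera is guddddd"
tokens = [stem(w) for w in remove_stop_words(tokenize(tweet))]
unigrams, bigrams = extract_features(tokens)
print(tokens)
```

For the sample tweet the tokens come out as oh, god, lov, new, phone, happy, camera, gud. Squeezing repeated letters only shortens “guddddd” to “gud”; mapping it to “good” would additionally need a spelling dictionary.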
IV. APPROACHES USED FOR SENTIMENT ANALYSIS ON TWITTER

A. SENTIMENT ANALYSIS ON TWITTER USING SUPERVISED APPROACH (MACHINE LEARNING)
This approach extracts data from SNS services using the Streaming API of Twitter. The extracted tweets are loaded into Hadoop and preprocessed using MapReduce. This task is followed by classification, which uses NLP and machine learning techniques. The classification used here is uni-word naïve Bayes classification.
FIGURE 2: real time sentiment analysis on twitter
Consider the number of positive tweets, positive words and negative words from the training phase. First, the probability of a tweet being positive is calculated:
P(C) = No. of Positive Tweets / Total Number of Tweets - (1)
Each word in each streamed tweet is checked for the probability of it being used given that the tweet is positive:
P(D|C) = Positive score of the word / Total number of Positive Words - (2)
Then the probability of the word being used, irrespective of whether or not it is positive, is calculated:
P(D) = (Positive score of the word + Negative score of the word) / Total Number of words - (3)
The probability of a word being positive given that it is used in a tweet then follows as:
P(C|D) = P(C) * P(D|C) / P(D) - (4)
The probability of a word is then passed to the Sentiment function, which classifies the word as positive if the probability is greater than 0.6, as neutral if the probability is between 0.4 and 0.6, and as negative if it is less than 0.4.

B. SENTIMENT ANALYSIS ON TWITTER USING UNSUPERVISED APPROACH (LEXICON METHOD)
The data is collected from the Twitter API and pre-processed to eliminate all unwanted information and to replace the emoticons. Here the lexical method is used for classification, working with a dictionary-based approach.
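Before turning to the details of the lexicon method, the word-level calculation of the supervised approach in Section A, equations (1) to (4), can be made concrete. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, the counts are invented, a word never seen in training is treated as neutral, the negative threshold is taken to be 0.4 (the lower bound of the neutral band), and a probability of exactly 0.4 or 0.6 is assigned to the neutral class.

```python
def word_positive_probability(pos_tweets, total_tweets,
                              word_pos, word_neg,
                              total_pos_words, total_words):
    """Return P(C|D) for one word, following equations (1) to (4)."""
    p_c = pos_tweets / total_tweets              # (1) prior of a positive tweet
    p_d_given_c = word_pos / total_pos_words     # (2) word given positive
    p_d = (word_pos + word_neg) / total_words    # (3) word overall
    if p_d == 0:
        return 0.5  # assumption: an unseen word is treated as neutral
    return p_c * p_d_given_c / p_d               # (4) Bayes' rule


def sentiment(probability):
    """Threshold rule; the boundary handling is an assumption."""
    if probability > 0.6:
        return "positive"
    if probability >= 0.4:
        return "neutral"
    return "negative"


# Invented training counts, for illustration only.
p = word_positive_probability(pos_tweets=500, total_tweets=1000,
                              word_pos=6, word_neg=2,
                              total_pos_words=512, total_words=1024)
print(p, sentiment(p))
```

With these invented counts the word scores 0.75 and is classified as positive. Because equation (1) counts tweets while equations (2) and (3) count words, the value is not guaranteed to stay below 1 for arbitrary counts.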
The dictionary-based approach finds the words in a tweet and matches each word against the dictionary. If the word matches a positive entry, the positive score is incremented or the word is tagged as positive. If it matches a negative entry, the negative score is incremented or the word is tagged as negative. Otherwise the word is tagged as neutral.
FIGURE 3: working of a lexical technique
Let a sentence s contain a set of entities {e1, e2, …, er} and a subset of their aspects {a1, …, am}, from a set of opinion holders [2] {h1, h2, …, hp} at some particular time point, and a set of sentiment words or phrases {sw1, …, swn} with their sentiment scores. The sentiment orientation for each aspect ai in s is determined by the following aggregation function:
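The aggregation can be illustrated with a short Python sketch. It assumes the commonly used distance-weighted form, score(ai, s) = sum over the sentiment words swj in s of swj.so / dist(swj, ai), so that sentiment words closer to the aspect count more. This form, the toy LEXICON, the function names and the use of token positions as the distance measure are all assumptions made for illustration.

```python
# Hypothetical lexicon: sentiment word -> score (sw.so); values are illustrative.
LEXICON = {"great": 1.0, "good": 1.0, "poor": -1.0, "terrible": -1.0}


def aspect_score(tokens, aspect_index):
    """Sum sw.so / dist(sw, aspect) over all lexicon words in the sentence."""
    score = 0.0
    for position, token in enumerate(tokens):
        if token in LEXICON and position != aspect_index:
            distance = abs(position - aspect_index)  # distance in tokens
            score += LEXICON[token] / distance
    return score


def orientation(score):
    """Positive, negative or neutral, depending on the sign of the score."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = "the camera is great but the battery is poor".split()
camera = aspect_score(tokens, tokens.index("camera"))
battery = aspect_score(tokens, tokens.index("battery"))
print(orientation(camera), orientation(battery))
```

In this sentence “great” is nearer to “camera” and “poor” is nearer to “battery”, so the two aspects receive opposite orientations even though both sentiment words occur in the same sentence.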
Here, dist(swj, ai) is the distance between aspect ai and sentiment word swj in s, and swj.so is the sentiment score of swj. If the final score is positive, then the opinion on aspect ai in s is positive. If the final score is negative, then the sentiment on the aspect is negative. It is neutral otherwise.

CONCLUSION
In this paper, we have presented a short survey of sentiment analysis and opinion mining. We have discussed three major levels of sentiment analysis, two approaches to sentiment analysis, and sentiment analysis of Twitter data. Further, we studied and listed some of the tools available for sentiment analysis and the general procedure for sentiment analysis. Analyzing an example of each approach helped us understand the two approaches in detail.

REFERENCES
[1] Syed Akib Anwar Hridoy, M. Tahmid Ekram, Mohammad Samiul Islam, Faysal Ahmed and Rashedur M. Rahman, “Localized twitter opinion mining using sentiment analysis”.
[2] Roshan Fornandes, Dr. Rio D’Souza, “Analysis of product twitter data through opinion mining”, © 2016 IEEE.
[3] M. Trupthi, Suresh Pabboju, G. Narasimha, “Sentiment analysis on twitter using streaming API”, 2017 IEEE 7th International Advance Computing Conference.
[4] Prerna Mishra, Dr. Ranjana Rajnish, Dr. Pankaj Kumar, “Sentiment analysis of twitter data: Case study on digital india”, 2016 (InCITe).
[5] Paramita Ray, Amlan Chakrabarti, “Twitter sentiment analysis for product reviews using Lexicon Method”, 2017 (ICDMAI).
[6] A Kowcika and Aditi Guptha, “Sentiment Analysis for social media”, International Journal of Advanced Research in Computer Science and Software Engineering, 216-221, Volume 3, Issue 7, July 2013.
[7] G. Vinodini and RM. Chandrashekaran, “Sentiment analysis and opinion mining: A survey”, International Journal of Advanced Research in Computer Science and Software Engineering, 283-294, Volume 2, Issue 6, June 2012.
[8] Cataldo Musto, Giovanni Semeraro, Marco Polignano, “A comparison of Lexicon-based approaches for Sentiment Analysis of microblog posts”, Department of Computer Science, University of Bari Aldo Moro, Italy.
[9] James Spencer and Gulden Uchyigit, “Sentimentor: Sentiment Analysis of Twitter Data”, School of Computing, Engineering and Mathematics, University of Brighton.
[10] Anna Jurek, Maurice D. Mulvenna and Yaxin Bi, “Improved lexicon-based sentiment analysis for social media analytics”, ScienceDirect, Published: 9 December 2015.
[11] Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, Rebecca Passonneau, “Sentiment Analysis of Twitter Data”, Columbia University, New York.
[12] Sang-Hyun Cho and Hang-Bong Kang, “Text Sentiment Classification for SNS-based Marketing Using Domain Sentiment Dictionary”, IEEE International Conference on Consumer Electronics (ICCE), p. 717-718, 2012.
[13] Patricia L V Ribeiro, Li Weigang and Tiancheng Li, “A Unified Approach for Domain-Specific Tweet Sentiment Analysis”, FUSION, 2015.
[14] Tiara, Mira Kania Sabariah, Veronikha Effendy, “Sentiment Analysis on Twitter Using the Combination of Lexicon-Based and Support Vector Machine for Assessing the Performance of a Television Program”, 3rd International Conference on Information and Communication Technology (ICoICT), 2015.
[15] Asmita Dhokrat, Sunil Khillare, C. Namrata Mahender, “Review on Techniques and Tools used for Opinion Mining”, IJCAT, 2015.