International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 02 | Feb 2023 www.irjet.net p-ISSN: 2395-0072
© 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 95
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT
ANALYSIS
Akanksha Srivastava1, Mr. Sambhav Agarwal2
1M.Tech, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow, India
2Associate Professor, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Applications in many domains make Sentiment
Analysis an exciting area of study. The use of online polls and
surveys to gather public feedback on goods, current events,
and societal or political issues is on the rise. Both the public
and stakeholders benefit from hearing the thoughts and
feelings of the general public when important choices must
be made. Opinion mining is the practice of gleaning insights
from online sources, including web search engines, blogs,
micro-blogs, Twitter, and social networks, to produce
meaningful conclusions. Twitter's user base provides a wealth
of material from which to gain insight into the public's
perspective. Because tweets form a massive volume of
unstructured text, it is impractical to summarize the
information manually. Consequently, extracting and
condensing tweets from corpora calls for computational
methods, which in turn require knowledge of terms that
convey emotion. Sentiment analysis of unstructured text may
be accomplished using a wide variety of computational
methodologies, models, and algorithms. The vast majority are
based on machine learning methods that use the
Bag-of-Words (BoW) representation. In this research, we used
a lexicon-based strategy to automatically identify sentiment
in tweets gathered from the Twitter public domain. To
further investigate the efficacy of alternative feature
combinations, we used three machine learning algorithms
for tweet sentiment identification: Naive Bayes (NB),
Maximum Entropy (ME), and Support Vector Machines
(SVM). Our results suggest that both NB with Laplace
smoothing and SVM are successful in categorizing the
tweets. The features used for NB are unigrams and
Part-of-Speech (POS) tags, while unigrams are used for SVM.
Key Words: Bag-of-Words, Lexicon, Machine Learning
Algorithms, Laplace Smoothing, Part-of-Speech.
1. INTRODUCTION
Two separate polls of over 2000 American adults found that
81% of Internet users (or 60% of Americans) have done
product research online at least once and that 20% of
Internet users (15% of Americans) do so on a typical day.
Consumption of goods and services is not the only reason
people seek information and share opinions online. The need
for access to current political information is another critical
factor. At present, individuals may take part in political
campaigns by email, sharing information and discussing
candidates and issues online. Users trust online advice and
recommendations, most of which are expressions of opinion.
Despite the generally pleasant experiences of American
Internet users during online product research, Horrigan [1]
found that 58% of users reported that online information was
missing, difficult to find, confusing, or overwhelming.
Therefore, there is a significant need for improved
information-access technologies to aid shoppers and
researchers. Web 2.0 sites such as blogs, message boards,
and other kinds of social media have made it easier than ever
for customers to voice their thoughts and views on the
brands they use. In recent years, businesses have begun to
acknowledge the power that user reviews have in shaping
the perceptions of others and the standing of certain brands.
Companies are beginning to monitor social media so that
they can react to customer feedback and adjust their
marketing, brand positioning, product development, and
other strategies accordingly.
1.1. Opinion Mining and Sentiment Analysis
Extracting opinions from text is called "Opinion Mining" (OM).
Opinion mining is a new field at the intersection of
information retrieval, text mining, and computational
linguistics that seeks to detect the opinions expressed in
natural language texts, as described by Pang et al. [3].
Opinion mining is a subfield of Knowledge Discovery in
Databases (KDD) that employs Natural Language Processing
(NLP) and statistical machine learning methods to identify
and distinguish between opinionated and factual content.
Tasks in opinion mining include locating opinions, labeling
them as positive, negative, or neutral, determining where
those opinions originated, and summarizing them. The
primary goal of the opinion mining task is to automatically
extract a summary of the opinions about an entity from a
large body of unstructured text.
Opinion Mining and Sentiment Analysis (SA) are two names
for the same thing: the study of how people feel about
something. An individual's thoughts, feelings, and
impressions about a matter, as expressed in the form of an
opinion, are inherently subjective. Individuals, groups, and
societies may benefit greatly from the advice and counsel of
others throughout the decision-making process, as
concluded by the work of Liu et al. [2]. To act swiftly and
wisely, people need information that is both precise and
brief. When making a choice, people often seek advice from
friends, family, and experts, who have developed an opinion
or point of view based on their own
experiences, observations, conceptions, and beliefs (which
may be positive or negative).
2. SENTIMENT TARGET IDENTIFICATION
Identifying sentiment (opinion) targets is a crucial part of SA
work. The target might be anything from the subject of a
statement to its object. Everyone involved in making and
selling a product needs to evaluate it thoroughly in light of
public and buyer feedback. Automatically identifying and
extracting the aspects mentioned in reviews is a key step in
comparing reviews. Opinion mining and summarization thus
rely heavily on product feature mining [10]. Sentiment
analysis is a difficult field of study. This is because, to
correctly identify opinion targets in a phrase or document, a
system has to recognize evaluative expressions as well as
attributes that are not stated explicitly and must be inferred
from the semantics of the terms. Previous studies on
sentiment target identification have shown that several
Natural Language Processing (NLP) steps, including
preprocessing, Part-of-Speech tagging, noise reduction,
feature selection, and classification, are all necessary stages
in the extraction process.
3. METHODOLOGY
Collecting research data is more complex than it may seem,
since the data must support meaningful and relevant
inferences. Three types of data were gathered: test data,
subjective training data, and objective (neutral) training
data. The Twitter API is covered first.
3.1. Twitter API
Developers may access Tweets, DMs, media, and other
Twitter data using the Twitter API, which provides a
collection of programming interfaces. Through the API,
programmers may create products that communicate with
the Twitter service and carry out actions such as publishing
Tweets, getting user information, and viewing trending
topics. The API comes in several flavors, including REST
(Representational State Transfer), streaming, and
advertising, each with its own endpoints, authentication
mechanisms, and usage constraints. A Twitter developer
account and API credentials (API keys and access tokens)
are prerequisites for interacting with the API.
3.2. Twython
Twython is a Python library for accessing the Twitter API. It
provides a simple and convenient way for Python developers
to interact with the Twitter platform and perform tasks such
as posting Tweets, retrieving user information, and accessing
timelines. Twython abstracts many of the complexities of the
Twitter API and provides a simple, Pythonic interface for
accessing the API's resources. To use Twython, you will need
to obtain API keys and access tokens from a Twitter developer
account, and then use these credentials to initialize a
Twython client object, which you can use to make API
requests. The library supports both REST and Streaming
APIs and includes functionality for OAuth 1.0a and OAuth 2.0
authentication.
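The client setup described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the helper names `build_client` and `extract_texts` are hypothetical, the search query is made up, and the response layout assumes the v1.1 search endpoint, which returns a dictionary with a `statuses` list. Access to that endpoint depends on the current Twitter API terms.

```python
def build_client(app_key, app_secret, oauth_token, oauth_token_secret):
    """Create a Twython client from developer-account credentials."""
    # Imported lazily so the helper below can be used without Twython installed.
    from twython import Twython
    return Twython(app_key, app_secret, oauth_token, oauth_token_secret)


def extract_texts(search_response):
    """Pull the tweet text out of a v1.1-style search response dictionary."""
    return [status["text"] for status in search_response.get("statuses", [])]


# Typical use (requires valid credentials and network access):
#   client = build_client(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
#   response = client.search(q="#happy", lang="en", count=100)
#   tweets = extract_texts(response)

# Offline illustration with a hand-written response of the same shape.
sample_response = {
    "statuses": [
        {"text": "Loving the new phone!"},
        {"text": "Worst service ever."},
    ]
}
print(extract_texts(sample_response))
```

Keeping the network call separate from the parsing helper makes the parsing step easy to check without credentials.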
3.3. Data Preprocessing in Twitter
Data preprocessing in Twitter involves cleaning and
transforming Twitter data into a format that is suitable for
further analysis or modeling. This may include tasks such as:
1. Data Collection: Collecting raw data from the Twitter
API, such as tweets, user profiles, and trends.
2. Data Cleaning: Removing irrelevant information,
correcting errors, handling missing values, and
removing duplicates from the collected data.
3. Text Processing: Processing textual data from
tweets, such as removing stop words, stemming,
and converting text to lowercase.
4. Sentiment Analysis: Classifying tweets into positive,
negative, or neutral sentiment categories.
5. Data Transformation: Converting the data into a
format that is suitable for analysis, such as
converting textual data into numerical
representations.
6. Data Reduction: Reducing the dimensionality of the
data, such as aggregating data by user or period.
These steps ensure that the data is in a clean, consistent, and
usable format, and help improve the accuracy and reliability
of any subsequent analysis or modeling.
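The cleaning and text-processing steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the stop-word list is a tiny hypothetical one, the regular expressions are simple approximations, stemming is omitted, and the sample tweets are invented.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "at", "it", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z']+")


def clean_tweet(text):
    """Lowercase a tweet, strip URLs and @mentions, tokenize, and drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOP_WORDS]


def deduplicate(tweets):
    """Remove exact duplicate tweets while preserving their order."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet not in seen:
            seen.add(tweet)
            unique.append(tweet)
    return unique


raw = [
    "@bob This phone is AMAZING!!! http://t.co/xyz #happy",
    "@bob This phone is AMAZING!!! http://t.co/xyz #happy",
    "The battery is terrible :(",
]
cleaned = [clean_tweet(t) for t in deduplicate(raw)]
print(cleaned)
```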
3.4. Lexicon-Based Approach
The lexicon-based approach is a method used in sentiment
analysis and opinion mining to classify the sentiment of a
piece of text, such as a tweet, into positive, negative, or
neutral categories. The approach involves using a predefined
lexicon, or list of words, that are associated with specific
sentiments.
In a lexicon-based approach, the sentiment of a piece of text
is determined by finding the words in the text that match
words in the lexicon and then aggregating the sentiment
scores associated with these words. The resulting sentiment
score is then used to classify the text as positive, negative,
or neutral.
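The scoring procedure just described can be sketched as follows. The lexicon, its scores, and the neutrality threshold are hypothetical values chosen for illustration; they are not taken from any published lexicon or from this study.

```python
# Hypothetical mini-lexicon: word -> polarity score in [-1, 1].
LEXICON = {
    "good": 0.7, "great": 0.9, "love": 0.8,
    "bad": -0.7, "terrible": -0.9, "hate": -0.8,
}


def lexicon_score(tokens, lexicon=LEXICON):
    """Sum the scores of tokens found in the lexicon; unknown words contribute nothing."""
    return sum(lexicon[t] for t in tokens if t in lexicon)


def classify(tokens, threshold=0.1):
    """Map the aggregated score to a sentiment class (threshold is an assumption)."""
    score = lexicon_score(tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(classify(["i", "love", "this", "great", "phone"]))
print(classify(["terrible", "battery"]))
print(classify(["the", "phone", "arrived"]))
```

Words missing from the lexicon are silently ignored, which is exactly the coverage limitation of lexicon-based methods.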
There are many different lexicons available for use in
sentiment analysis, each with its own strengths and
weaknesses. Some popular lexicons include SentiWordNet,
the Harvard IV dictionary, and the AFINN lexicon.
The lexicon-based approach is simple to implement and has
been widely used in sentiment analysis. However, it has
some limitations, such as being limited to the words in the
lexicon and not taking into account the context in which
words are used. To overcome these limitations, other
approaches such as machine learning and deep learning
models have been developed.
3.5. SentiWordNet
SentiWordNet is a lexicon for sentiment analysis and opinion
mining. It is a lexical resource for the English language,
derived automatically from WordNet, that provides sentiment
scores for WordNet synsets (groups of words that share a
sense).
SentiWordNet assigns sentiment scores along three
dimensions: positivity, negativity, and objectivity. Each
synset in the lexicon is associated with three scores,
representing its positivity, negativity, and objectivity, and
the three scores sum to 1. The scores are based on the
collective sentiment of words that are semantically similar
to the word being scored.
SentiWordNet can be used as a resource in sentiment
analysis and opinion mining to classify the sentiment of a
piece of text into positive, negative, or neutral categories. To
do this, the sentiment scores of the words in the text are
aggregated to determine the overall sentiment of the text.
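A small sketch may make the aggregation concrete. It assumes that each sense of a word carries a positive and a negative score, that objectivity is the remainder so the three sum to 1, and that a word's polarity is the average of (positive - negative) over its senses. That averaging rule is one common choice, not necessarily the one used in this study, and the scores below are hypothetical rather than copied from SentiWordNet. NLTK ships a `sentiwordnet` corpus reader for the real resource, which is not used here.

```python
# Hypothetical (positive, negative) scores for the senses of each word.
SENSES = {
    "good": [(0.75, 0.0), (0.5, 0.0), (1.0, 0.0)],
    "poor": [(0.0, 0.75), (0.0, 0.5)],
    "phone": [(0.0, 0.0)],
}


def objectivity(pos, neg):
    """Objectivity is whatever remains once positivity and negativity are accounted for."""
    return 1.0 - (pos + neg)


def word_polarity(word, senses=SENSES):
    """Average (positive - negative) over all listed senses; 0.0 for unknown words."""
    scores = senses.get(word)
    if not scores:
        return 0.0
    return sum(p - n for p, n in scores) / len(scores)


def text_polarity(tokens):
    """Aggregate word polarities into a single score for the text."""
    return sum(word_polarity(t) for t in tokens)


print(word_polarity("good"), word_polarity("poor"))
print(text_polarity(["good", "phone"]))
```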
SentiWordNet has been widely used in sentiment analysis
and has been shown to perform well in comparison to other
lexicons and machine learning models. It is a valuable
resource for researchers and practitioners in the field of
sentiment analysis.
4. RESULTS AND ANALYSIS
4.1. Naive Bayes
Naive Bayes is a simple probabilistic classifier based on
Bayes' Theorem. It is a popular algorithm in the field of
machine learning and is widely used for tasks such as text
classification, sentiment analysis, and spam filtering.
The basic idea behind Naive Bayes is to use Bayes' Theorem
to calculate the probability of a class (e.g., positive, negative,
or neutral sentiment) given a set of features (e.g., words in a
text). The algorithm assumes that the features are
conditionally independent given the class, meaning that,
within a class, the presence of one feature does not affect
the presence of another. This is the "naive" part of the
algorithm, hence its name.
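To illustrate the idea, the following is a minimal from-scratch sketch of a multinomial Naive Bayes classifier with Laplace (add-one) smoothing. The class name, the toy training tweets, and the choice to skip words never seen in training are assumptions made for this example; it is not the implementation used in the experiments.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token lists with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # alpha = 1.0 is Laplace smoothing

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_posterior(self, tokens, label):
        """log P(label) + sum of log P(token | label), up to a shared constant."""
        log_prob = math.log(self.class_counts[label] / self.total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocabulary)
        for token in tokens:
            if token not in self.vocabulary:
                continue  # ignore words never seen in training
            count = self.word_counts[label][token]
            log_prob += math.log((count + self.alpha) / denom)
        return log_prob

    def predict(self, tokens):
        return max(self.class_counts, key=lambda label: self.log_posterior(tokens, label))


docs = [
    ["love", "great", "phone"],
    ["great", "service"],
    ["terrible", "battery"],
    ["hate", "terrible", "service"],
]
labels = ["positive", "positive", "negative", "negative"]
model = MultinomialNB().fit(docs, labels)
print(model.predict(["great", "phone"]))
print(model.predict(["terrible", "battery"]))
```

Smoothing matters because "great" never occurs in a negative training tweet. Without it, the likelihood for the negative class would be zero for any tweet containing that word.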
There are several variants of the Naive Bayes algorithm,
including Multinomial Naive Bayes (MNB), Bernoulli Naive
Bayes, and Gaussian Naive Bayes. Each variant is suited to
different types of data and different classification tasks.
Naive Bayes is a fast and effective algorithm for text
classification and sentiment analysis. It is simple to
implement and requires little data preparation. However, its
performance can be limited by the "naive" assumption of
independence between features, which is not always
accurate in practice. Despite this, Naive Bayes remains a
popular and widely used algorithm in the field of text
classification and sentiment analysis.
4.2. For Twitter Dataset
We investigated a range of features that have a significant
impact on sentiment analysis. We made use of n-gram
features such as unigrams (n = 1) and bigrams (n = 2), which
are often used in text classification tasks, including
sentiment analysis. In the course of our research, we
experimented with Boolean features using both unigrams
and bigrams. Each n-gram feature has an associated Boolean
value, which is set to true if and only if the corresponding
n-gram appears in the tweet [12].
The features that we employed are outlined in Table 1, along
with the accuracy obtained from each classifier. We also
compared this dataset with the one that Pang et al. used for
their research on movie reviews. According to Table 1, with
unigrams as features, the NB classifier with Laplace
smoothing achieved higher classification accuracy on tweets
than on movie reviews; with the MaxEnt classifier, however,
accuracy on movie reviews was higher than on tweets.
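The Boolean n-gram features described above can be sketched as follows. The function names and the small feature vocabulary are hypothetical; in practice the vocabulary would be collected from the training tweets.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def boolean_features(tokens, vocabulary):
    """Map each n-gram in the vocabulary to True if and only if it occurs in the tweet."""
    present = set(ngrams(tokens, 1)) | set(ngrams(tokens, 2))
    return {feature: feature in present for feature in vocabulary}


# Hypothetical feature vocabulary mixing unigrams and bigrams.
vocabulary = [("good",), ("bad",), ("not", "a"), ("good", "phone")]
tweet = ["not", "a", "good", "phone"]
print(boolean_features(tweet, vocabulary))
```

Because the value records presence rather than frequency, a word repeated several times in a tweet contributes no more than a word used once.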
Table 1: Accuracy of tweets using different features
Table 2: F1 score of MNB classifier
The effectiveness of POS features for sentiment analysis has
also been evaluated. As a general rule, adjectives are
regarded as useful features for sentiment analysis since they
serve as reliable indicators of a subject's feelings. Using
adjectives alone provides results comparable to those
produced by unigrams and bigrams, as can be seen in row (5)
of the results table. Row (4) of the results table shows that
when unigrams and POS tags are used together as features,
all three classifiers produce better results. The first row of
the results table shows that SVM with unigram features
yields the best result of all the feature sets considered. The
detailed results of the MNB classifier are given in Table 2,
which reports the F1 score. The Receiver Operating
Characteristic (ROC) curve of the MNB classifier for manually
annotated tweets is shown in Figure 1.
Figure-1: ROC curve of MNB classifier for tweets
4.3. Emotion Dataset
Hashtags are often used as a means for people to
communicate their thoughts and feelings. Therefore, a
considerable amount of emotion and sentiment can be
gleaned from hashtagged phrases. We used these hashtags
to provide our machine-learning algorithm with more data.
Figure 2 depicts a snapshot of the confusion matrix for the
unigram features of our emotion dataset. The F1 score of
each class for the unigram feature is also shown in this
figure. Figure 3 shows the ROC curve that was generated by
our classifier.
Figure 2: Snapshot of emotion dataset
Table 3: Accuracy of emotion dataset using different
features
Table 4: F1 score of MNB classifier for unigram feature
Figure 3: ROC curve of MNB classifier for an emotion
data set
One of the findings of our experiment is that constructing a
dataset by automatically collecting tweets via hashtags has
a clear advantage over a dataset generated by manually
annotating tweets. This is because authors know their own
feelings, whereas the conventional method of annotation
requires annotators to infer the writers' feelings from the
text, which cannot always be done accurately.
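Hashtag-based labeling of this kind can be sketched as follows. The hashtag-to-emotion mapping and the rule of discarding tweets with no emotion hashtag or with conflicting ones are assumptions made for this example; the paper does not list the hashtags it used.

```python
import re

# Hypothetical mapping from emotion hashtags to class labels.
HASHTAG_TO_EMOTION = {
    "#happy": "joy", "#joy": "joy",
    "#sad": "sadness", "#angry": "anger", "#scared": "fear",
}

HASHTAG_RE = re.compile(r"#\w+")


def label_by_hashtag(tweet):
    """Return (text_without_emotion_hashtags, label), or None if no single emotion is tagged."""
    tags = [t.lower() for t in HASHTAG_RE.findall(tweet)]
    emotions = {HASHTAG_TO_EMOTION[t] for t in tags if t in HASHTAG_TO_EMOTION}
    if len(emotions) != 1:
        return None  # no emotion hashtag, or conflicting ones
    # Remove the label-bearing hashtags so the classifier cannot simply read them.
    text = HASHTAG_RE.sub(
        lambda m: "" if m.group(0).lower() in HASHTAG_TO_EMOTION else m.group(0),
        tweet,
    )
    return " ".join(text.split()), emotions.pop()


print(label_by_hashtag("Finally got my results #happy #exam"))
print(label_by_hashtag("Mixed day #happy #sad"))
```

Stripping the emotion hashtag from the text keeps the label out of the features, so the classifier has to learn from the remaining words.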
5. CONCLUSION
As part of our study, we examined the difficulties of
Sentiment Analysis and the many approaches used in this
area. Identifying sentiment in social media data is
notoriously challenging because of the data's richness and
subtlety. To determine which features are most useful for
Sentiment Analysis, we experimented with tweets collected
from the public domain. We used both machine learning and
lexicon-based algorithms for SA. The goal of our project was
to make the most efficient use of the SentiWordNet lexicon
to develop a Twitter Sentiment Analysis platform. Using the
SentiWordNet lexicon, we obtained an accuracy of 75.20
percent on our dataset, although we observed that this
number varied significantly from one domain to the next.
Although the current lexicon contains a huge number of
terms with their sentiment scores, it lacks specific words
that are common in particular domains, so it is preferable to
construct a lexicon from the test corpus and use it for
classification. Our model, which uses the Google search
engine to determine a term's score through pointwise
mutual information, outperforms the SentiWordNet lexicon
on our dataset and can deal with one of the difficulties of
Sentiment Analysis: the unexpected shift from positive to
negative sentiment.
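The paper does not give the formula used for the search-engine-based score, so the following is only a sketch of one well-known formulation: a term's semantic orientation is its pointwise mutual information (PMI) with a positive seed word minus its PMI with a negative seed word, with probabilities estimated from hit counts. The function names, the smoothing constant, and all hit counts are hypothetical, and no search engine is queried.

```python
import math


def pmi(joint_hits, hits_a, hits_b, total):
    """PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with P estimated as hits / total."""
    return math.log2((joint_hits / total) / ((hits_a / total) * (hits_b / total)))


def semantic_orientation(term_hits, pos_hits, neg_hits,
                         term_pos_hits, term_neg_hits, total, smoothing=0.01):
    """PMI with the positive seed minus PMI with the negative seed.

    A small smoothing constant is added to the joint counts to avoid log(0).
    """
    pmi_pos = pmi(term_pos_hits + smoothing, term_hits, pos_hits, total)
    pmi_neg = pmi(term_neg_hits + smoothing, term_hits, neg_hits, total)
    return pmi_pos - pmi_neg


# Hypothetical hit counts: the term co-occurs four times as often with the
# positive seed as with the negative seed.
score = semantic_orientation(
    term_hits=1000, pos_hits=50_000, neg_hits=50_000,
    term_pos_hits=400, term_neg_hits=100, total=1_000_000, smoothing=0.0,
)
print(score)
```

A positive score suggests the term leans positive and a negative score suggests the opposite. Because the counts come from a corpus or the web rather than a fixed word list, domain-specific terms missing from a general lexicon can still be scored.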
REFERENCES
[1] C. Alm, D. Roth, and R. Sproat, “Emotions from text:
machine learning for text-based emotion prediction,” in
Proceedings of HLT and EMNLP. ACL, 2005, pp. 579–586.
[2] S. Aman and S. Szpakowicz, “Using Roget’s thesaurus for
fine-grained emotion recognition,” in Proceedings of IJCNLP,
2008, pp. 296–302.
[3] P. Chesley, B. Vincent, L. Xu, and R. K. Srihari, “Using
verbs and adjectives to automatically classify blog
sentiment,” in AAAI Spring Symposium: Computational
Approaches to Analyzing Weblogs, 2006, pp. 27–29.
[4] M. D. Choudhury, S. Counts, and M. Gamon, “Not all
moods are created equal! exploring human emotional states
in social media,” in Proceedings of ICWSM, 2012.
[5] R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, “Liblinear:
A library for large linear classification,” The Journal of
Machine Learning Research, vol. 9, pp. 1871–1874, 2008.
[6] K. Gimpel, N. Schneider, B. O’Connor, D. Das, D. Mills, J.
Eisenstein, M. Heilman, D. Yogatama, J. Flanigan, and N. A.
Smith, “Part-of-speech tagging for Twitter: annotation,
features, and experiments,” in Proceedings of HLT: short
papers, ser. HLT ’11. Stroudsburg, PA, USA: ACL, 2011, pp.
42–47.
[7] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann,
and I. Witten, “The weka data mining software: an update,”
ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–
18, 2009.
[8] G. Mishne, “Experiments with mood classification in blog
posts,” in Proceedings of ACM SIGIR 2005 Workshop on
Stylistic Analysis of Text for Information Access.
[9] S. Mohammad, “#emotional tweets,” in Proceedings of the
Sixth International Workshop on Semantic Evaluation. ACL,
7-8 June 2012, pp. 246–255.
[10] A. Neviarouskaya, H. Prendinger, and M. Ishizuka,
“Affect analysis model: A novel rule-based approach to affect
sensing from text,” Natural Language Engineering, vol. 17,
no. 1, pp. 95–135, 2011.
[11] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up?:
sentiment classification using machine learning techniques,”
in Proceedings of EMNLP. ACL, 2002, pp. 79–86.
[12] P. Shaver, J. Schwartz, D. Kirson, and C. O’Connor,
“Emotion knowledge: Further exploration of a prototype
approach,” Journal of Personality and Social Psychology, vol.
52, no. 6, pp. 1061–1086, 1987.
[13] C. Strapparava and R. Mihalcea, “Learning to identify
emotions in text,” in Proceedings of the 2008 ACM
symposium on Applied computing. ACM, 2008, pp. 1556–
1560.
[14] C. Strapparava and A. Valitutti, “Wordnet-affect: an
affective extension of wordnet,” in Proceedings of LREC, vol.
4. Citeseer, 2004, pp. 1083–1086.
[15] C. Strapparava and R. Mihalcea, “Semeval-2007 task 14:
affective text,” in Proceedings of the 4th International
Workshop on Semantic Evaluations, ser. SemEval ’07, 2007,
pp. 70–74.
[16] R. Tokuhisa, K. Inui, and Y. Matsumoto, “Emotion
classification using massive examples extracted from the
web,” in Proceedings of COLING. ACL, 2008, pp. 881–888.
[17] T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing
contextual polarity in phrase-level sentiment analysis,” in
Proceedings of HLT and EMNLP. ACL, 2005, pp. 347–354.
[18] I. Witten, E. Frank, and M. Hall, Data Mining: Practical
machine learning tools and techniques. Morgan Kaufmann,
2011.
[19] C. Yang, K. Lin, and H. Chen, “Emotion classification
using web blog corpora,” in IEEE/WIC/ACM International
Conference on Web Intelligence. IEEE, 2007, pp. 275–278.

More Related Content

PDF
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
PDF
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
PDF
IRJET - Twitter Sentiment Analysis using Machine Learning
PDF
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
PDF
IRJET - Twitter Sentimental Analysis
PDF
Emotion Recognition By Textual Tweets Using Machine Learning
PDF
Estimating the Efficacy of Efficient Machine Learning Classifiers for Twitter...
PDF
IRJET - Sentiment Analysis and Rumour Detection in Online Product Reviews
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
IRJET- Interpreting Public Sentiments Variation by using FB-LDA Technique
IRJET - Twitter Sentiment Analysis using Machine Learning
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
IRJET - Twitter Sentimental Analysis
Emotion Recognition By Textual Tweets Using Machine Learning
Estimating the Efficacy of Efficient Machine Learning Classifiers for Twitter...
IRJET - Sentiment Analysis and Rumour Detection in Online Product Reviews

Similar to UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS (20)

PDF
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
PDF
Sentiment Analysis on Twitter data using Machine Learning
PDF
BANKING CHATBOT USING NLP AND MACHINE LEARNING ALGORITHMS
PDF
Review on Opinion Targets and Opinion Words Extraction Techniques from Online...
PDF
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
PDF
IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...
PDF
IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...
PDF
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
PDF
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
PDF
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
PDF
Sentiment Analysis of Product Reviews and Trustworthiness Evaluation using TRS
PDF
Sentiment Analysis of Product Reviews and Trustworthiness Evaluation using TRS
PDF
An Opinion Mining and Sentiment Analysis Techniques: A Survey
PDF
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
PDF
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
DOCX
Python report on twitter sentiment analysis
PDF
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
PDF
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
PDF
Sentiment Analysis on Product Reviews Using Supervised Learning Techniques
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
Sentiment Analysis on Twitter data using Machine Learning
BANKING CHATBOT USING NLP AND MACHINE LEARNING ALGORITHMS
Review on Opinion Targets and Opinion Words Extraction Techniques from Online...
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
IRJET- Strength and Workability of High Volume Fly Ash Self-Compacting Concre...
IRJET- Implementing Social CRM System for an Online Grocery Shopping Platform...
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Sentiment Analysis of Product Reviews and Trustworthiness Evaluation using TRS
Sentiment Analysis of Product Reviews and Trustworthiness Evaluation using TRS
An Opinion Mining and Sentiment Analysis Techniques: A Survey
THE ANALYSIS FOR CUSTOMER REVIEWS THROUGH TWEETS, BASED ON DEEP LEARNING
[IJET V2I4P9] Authors: Praveen Jayasankar , Prashanth Jayaraman ,Rachel Hannah
Python report on twitter sentiment analysis
IRJET- The Sentimental Analysis on Product Reviews of Amazon Data using the H...
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
Sentiment Analysis on Product Reviews Using Supervised Learning Techniques
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
UTILIZING TWITTER TO PERFORM AUTONOMOUS SENTIMENT ANALYSIS

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 10 Issue: 02 | Feb 2023 www.irjet.net p-ISSN: 2395-0072 © 2023, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 95

Akanksha Srivastava1, Mr. Sambhav Agarwal2
1M.Tech, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow, India
2Associate Professor, Computer Science and Engineering, SR Institute of Management & Technology, Lucknow

Abstract - Applications in many domains make Sentiment Analysis an exciting area for study. The use of online polls and surveys to get feedback from the public regarding goods, current events, and societal or political issues is on the rise. The public and the stakeholders benefit from hearing the thoughts and feelings of the general public when important choices must be made. Opinion mining is the practice of gleaning insights from online sources including web search engines, blogs, micro-blogs, Twitter, and social networks to produce meaningful conclusions. Twitter's user base provides a wealth of material from which to get insight into the public's perspective. The massive volume of tweets as unstructured text makes it challenging to manually delineate the information. Consequently, extracting and condensing the tweets from corpora calls for expert computational methodologies, which in turn necessitates familiarity with terms that convey emotion. Sentiment analysis of unstructured text may be accomplished using a wide variety of computational methodologies, models, and algorithms. The vast majority are based on machine learning methods that use the Bag-of-Words (BoW) representation.
In this research, we used a lexicon-based strategy to automatically identify sentiment for tweets gathered from the Twitter public domain. To further investigate the efficacy of alternative feature combinations, we used three distinct machine learning algorithms for the task of tweet sentiment identification: Naive Bayes (NB), Maximum Entropy (ME), and Support Vector Machines (SVM). Our results suggest that both NB with Laplace smoothing and SVM are successful in categorizing the tweets. The features used for NB are unigrams and Part-of-Speech (POS) tags, while unigrams are used for SVM.

Key Words: Bag-of-Words, Lexicon, Machine Learning Algorithms, Laplace Smoothing, Part-of-Speech.

1. INTRODUCTION

Two separate polls of over 2000 American adults found that 81% of Internet users (or 60% of Americans) have done product research online at least once and that 20% of Internet users (15% of Americans) do so on a typical day. The consumption of goods and services is not the only motivation for people's online information-seeking and opinion-sharing activities. The need for access to current political information is another critical factor. At the moment, individuals may use email for political campaigns by sharing information and discussing candidates and issues online. Users trust online advice and suggestions since these deal mostly with opinion. Despite the generally pleasant experiences of American Internet users during online product research, Horrigan [1] found that 58% of users reported that online information was missing, difficult to find, confusing, or overwhelming. Therefore, there is a significant need for improved information-access technologies to aid shoppers and researchers. Web 2.0 sites like blogs, message boards, and other kinds of social media have made it easier than ever for customers to voice their thoughts and views on the brands they use.
In recent years, businesses have begun to acknowledge the power that user reviews have in shaping the perceptions of others and the standing of certain brands. Companies are beginning to monitor social media to react to customer feedback and adjust their marketing, brand positioning, product development, and other strategies accordingly.

1.1. Opinion Mining and Sentiment Analysis

Extracting views from text is called "Opinion Mining" (OM). Opinion mining is a new field at the intersection of information retrieval, text mining, and computational linguistics that seeks to detect the opinion expressed in natural language texts, as described by Pang et al. [3]. Opinion mining is a subfield of Knowledge Discovery in Databases (KDD) that employs Natural Language Processing (NLP) and statistical machine learning methods to identify and distinguish between opinionated and factual content. Tasks in opinion mining include locating opinions, labeling them as positive, negative, or neutral, determining where those opinions originated, and summarising them. The primary goal of the Opinion Mining task is to automatically extract a summary of the opinions about an entity from a large body of unstructured text. Opinion Mining and Sentiment Analysis (SA) are two names for the same thing: the study of how people feel about something. An individual's thoughts, feelings, and impressions about a matter, as expressed in the form of an opinion, are deeply personal and confidential. Individuals, groups, and societies may benefit greatly from the advice and counsel of others throughout the decision-making process, as concluded by the work of Liu et al. [2]. To act swiftly and wisely, humans demand information that is both precise and brief. While making a choice, people often seek advice from friends, family, and experts who have developed an opinion or point of view based on their own
experiences, observations, conceptions, and beliefs (which may be positive or negative).

2. SENTIMENT TARGET IDENTIFICATION

Identifying sentiment (opinion) targets is a crucial part of SA work. The target here might be anything from the subject of the statement to the object of that statement. Everyone involved in making and selling a product has to evaluate it thoroughly in light of public and buyer feedback. Automatically identifying and extracting aspects mentioned in reviews is a key step in conducting a review comparison. Opinion mining and summarization thus rely heavily on product feature mining [10]. Sentiment analysis is a difficult field of study. This is because, to correctly identify opinion targets in a phrase or document, a system has to be able to discern evaluative expressions as well as qualities that are not overtly present and must be inferred from the semantics of the terms. Previous studies on sentiment target identification have shown that several Natural Language Processing (NLP) methods, including text preprocessing, Part-of-Speech tagging, noise reduction, feature selection, and classification, are all necessary stages in the extraction process.

3. METHODOLOGY

Research data collection is more complex than it may seem since it requires drawing important and relevant inferences. Three types of data have been gathered: test data, subjective training data, and objective (neutral) training data. The Twitter API is covered first.

3.1. Twitter API

Developers may access Tweets, DMs, media, and other Twitter data using the Twitter API, which provides a collection of programming interfaces.
Through the API, programmers may create products that communicate with the Twitter service and carry out actions like publishing Tweets, getting user information, and viewing trending topics, among other things. Different endpoints, authentication mechanisms, and usage constraints apply to the API's several flavors, which include REST (Representational State Transfer), streaming, and advertising. A Twitter developer account, API keys, and access tokens are prerequisites for interacting with the API.

3.2. Twython

Twython is a Python library for accessing the Twitter API. It provides a simple and convenient way for Python developers to interact with the Twitter platform and perform tasks such as posting Tweets, retrieving user information, and accessing timelines. Twython abstracts many of the complexities of the Twitter API and provides a simple, Pythonic interface for accessing the API's resources. To use Twython, you will need to obtain API keys and access tokens from a Twitter developer account, and then use these credentials to initialize a Twython client object, which you can use to make API requests. The library supports both the REST and Streaming APIs and includes functionality for OAuth 1.0a and OAuth 2.0 authentication.

3.3. Data Preprocessing in Twitter

Data preprocessing in Twitter involves cleaning and transforming Twitter data into a format that is suitable for further analysis or modeling. This may include tasks such as:
1. Data Collection: Collecting raw data from the Twitter API, such as tweets, user profiles, and trends.
2. Data Cleaning: Removing irrelevant information, correcting errors, handling missing values, and removing duplicates from the collected data.
3. Text Processing: Processing textual data from tweets, such as removing stop words, stemming, and converting text to lowercase.
4. Sentiment Analysis: Classifying tweets into positive, negative, or neutral sentiment categories.
5. Data Transformation: Converting the data into a format that is suitable for analysis, such as converting textual data into numerical representations.
6. Data Reduction: Reducing the dimensionality of the data, such as aggregating data by user or time period.
These steps ensure that the data is in a clean, consistent, and usable format, and help improve the accuracy and reliability of any subsequent analysis or modeling.

3.4. Lexicon-Based Approach

The lexicon-based approach is a method used in sentiment analysis and opinion mining to classify the sentiment of a piece of text, such as a tweet, into positive, negative, or neutral categories. The approach involves using a predefined lexicon, or list of words, each associated with a specific sentiment. In a lexicon-based approach, the sentiment of a piece of text is determined by finding the words in the text that match entries in the lexicon and then aggregating the sentiment scores associated with these words. The resulting sentiment score is then used to classify the text as positive, negative, or neutral. There are many different lexicons available for use in sentiment analysis, each with its strengths and weaknesses. Some popular lexicons include SentiWordNet, the Harvard IV dictionary, and the AFINN lexicon. The lexicon-based approach is simple to implement and has been widely used in sentiment analysis. However, it has some limitations, such as being limited to the words in the lexicon and not taking into account the context in which words are used. To overcome these limitations, other
approaches such as machine learning and deep learning models have been developed.

3.5. SentiWordNet

SentiWordNet is a lexicon for sentiment analysis and opinion mining. It is a lexical resource for the English language, derived from WordNet, that provides sentiment scores for word senses. SentiWordNet assigns sentiment scores along three dimensions: positivity, negativity, and objectivity. Each entry in the lexicon is associated with three scores, one for each dimension. The scores are based on the collective sentiment of words that are semantically similar to the word being scored. SentiWordNet can be used as a resource in sentiment analysis and opinion mining to classify the sentiment of a piece of text into positive, negative, or neutral categories. To do this, the sentiment scores of the words in the text are aggregated to determine the overall sentiment of the text. SentiWordNet has been widely used in sentiment analysis and has been shown to perform well in comparison to other lexicons and machine learning models. It is a valuable resource for researchers and practitioners in the field of sentiment analysis.

4. RESULTS AND ANALYSIS

4.1. Naive Bayes

Naive Bayes is a simple probabilistic classifier based on Bayes' Theorem. It is a popular algorithm in the field of machine learning and is widely used for tasks such as text classification, sentiment analysis, and spam filtering. The basic idea behind Naive Bayes is to use Bayes' Theorem to calculate the probability of a class (e.g., positive, negative, or neutral sentiment) given a set of features (e.g., words in a text).
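In symbols, for a class c and features x_1, ..., x_n (this notation is assumed here for illustration and does not appear in the paper), Bayes' Theorem can be written as:

```latex
\[
P(c \mid x_1, \dots, x_n)
  = \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)}
\]
```

The classifier then picks the class c with the largest posterior probability; the denominator is the same for every class, so it can be ignored when comparing classes.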
The algorithm assumes that the features are conditionally independent given the class, meaning that the presence of one feature does not affect the presence of another feature. This is the "naive" part of the algorithm, hence its name. There are several variants of the Naive Bayes algorithm, including Multinomial Naive Bayes, Bernoulli Naive Bayes, and Gaussian Naive Bayes. Each variant is suited to different types of data and different classification tasks. Naive Bayes is a fast and effective algorithm for text classification and sentiment analysis. It is simple to implement and requires little data preparation. However, its performance can be limited by the "naive" assumption of independence between features, which is not always accurate in practice. Despite this, Naive Bayes remains a popular and widely used algorithm in the field of text classification and sentiment analysis.

4.2. For Twitter Dataset

We investigate a wide range of features that have a significant impact on sentiment analysis. We have made use of n-gram features such as unigrams (n = 1) and bigrams (n = 2), which are often used in a variety of text classification tasks including sentiment analysis. In the course of our research, we experimented with boolean features using both unigrams and bigrams. Each n-gram feature has an associated boolean value, which is set to true if and only if the corresponding n-gram appears in the tweet [12]. The features that we have employed are outlined in Table 1, along with the accuracy results obtained from each classifier. This dataset has been compared with the one that Pang, Lee, and Vaithyanathan [11] used for their research on movie reviews.
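The boolean n-gram features and the NB classifier with Laplace smoothing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ngram_features` and `BooleanNaiveBayes` and the four toy tweets are hypothetical, tokenization is a plain whitespace split, and the likelihoods use a multinomial-style estimate over the n-grams present in a tweet, since the paper does not give these details.

```python
import math
from collections import Counter, defaultdict


def ngram_features(text, n_values=(1, 2)):
    """Return the set of n-grams present in a tweet (boolean presence features)."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in n_values
        for i in range(len(tokens) - n + 1)
    }


class BooleanNaiveBayes:
    """Naive Bayes over n-gram presence with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # alpha = 1.0 is classic add-one smoothing

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngram_features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        # N-grams never seen in training are ignored.
        feats = ngram_features(text) & self.vocab
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total = sum(self.feature_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            for feat in feats:
                count = self.feature_counts[label][feat]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the paper's dataset.
train_tweets = [
    "i love this phone",
    "what a great day",
    "i hate this phone",
    "what a terrible day",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = BooleanNaiveBayes(alpha=1.0).fit(train_tweets, train_labels)
print(model.predict("love this great day"))
```

Smoothing matters here because an n-gram that never occurs with a class would otherwise get probability zero and veto that class outright; adding alpha to every count keeps the estimate finite.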
According to Table 1, with the NB classifier with Laplace smoothing, unigram features gave higher classification accuracy on tweets than on movie reviews; with the MaxEnt classifier, however, accuracy on movie reviews was higher than on tweets.

Table 1: Accuracy of tweets using different features
Table 2: F1 score of MNB classifier

The effectiveness of POS features has been validated in sentiment analysis. As a general rule, adjectives are regarded as useful components for sentiment analysis since they serve as reliable indicators of a subject's feelings. Taking into account solely adjectives provides results that are comparable to those produced by employing unigrams and bigrams, as can be seen in Line (5) of the results table. Line (4) of the results table demonstrates that when unigrams and POS are used as features, all three classifiers generate superior results. The first line of the results table demonstrates that using SVM with unigrams as the feature yields the best result out of all the features that were considered. The detailed results of the MNB classifier are given in Table 2, which displays the F1 score. The Receiver Operating Characteristic (ROC) curve of the MNB classifier is shown in Figure 1. This curve is for tweets that have been manually annotated.

Figure 1: ROC curve of MNB classifier for tweets

4.3. Emotion Dataset

Hashtags are often used as a means for people to communicate their thoughts and feelings.
Therefore, a satisfactory amount of feelings and sentiments may be gleaned from these hashtagged phrases. These hashtags have been included in our machine-learning algorithm to provide it with more data. Figure 2 depicts a snapshot of the confusion matrix for our emotion dataset's unigram features. Additionally, the F1 score of each class for the unigram feature is shown in this figure. Figure 3 shows the ROC curve that was generated by our classifier.

Figure 2: Snapshot of emotion dataset
Table 3: Accuracy of emotion dataset using different features
Table 4: F1 score of MNB classifier for unigram feature
Figure 3: ROC curve of MNB classifier for the emotion dataset

One finding of our experiment is that constructing a dataset by automatically collecting tweets via hashtags has a clear advantage over a dataset generated by manually annotating tweets. This is because authors know their own feelings, whereas the conventional method of annotating material requires annotators to infer the writers' feelings from the text, which cannot be done accurately.

5. CONCLUSION

As part of our study, we looked at the difficulties of Sentiment Analysis and the many approaches used in this area. Identification of sentiment in social media data is notoriously challenging due to the data's richness and subtlety. To determine which features are most useful for Sentiment Analysis, we experimented with tweets collected from the public domain. We have used both machine learning and lexicon-based algorithms for SA. The goal of our project was to make the most efficient use of the SentiWordNet lexicon to develop a Twitter Sentiment Analysis platform. Using the SentiWordNet lexicon, we obtained an accuracy of 75.20 percent for our dataset, although we observed that this number varied significantly from one domain to the next. Although the current lexicon contains a huge number of terms with sentiment scores, it lacks specific words that are common in a given domain; it is therefore preferable to construct a lexicon from the test corpus and use it for classification.
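As a rough illustration of the lexicon-based classification discussed in Section 3.4, the sketch below sums per-word scores and thresholds the total. Everything specific in it is an assumption: `TOY_LEXICON`, its scores, the function names, and the 0.1 threshold are made up for the example and are not actual SentiWordNet values or the authors' settings.

```python
# Toy lexicon mapping a word to (positive, negative) scores.
# These numbers are invented for illustration; they are NOT real
# SentiWordNet entries, which are attached to WordNet word senses.
TOY_LEXICON = {
    "good": (0.6, 0.0),
    "great": (0.8, 0.0),
    "bad": (0.0, 0.6),
    "terrible": (0.0, 0.9),
}


def tweet_score(text, lexicon):
    """Sum (positive - negative) over the words found in the lexicon."""
    score = 0.0
    for token in text.lower().split():
        pos, neg = lexicon.get(token, (0.0, 0.0))  # unknown words add nothing
        score += pos - neg
    return score


def classify(text, lexicon, threshold=0.1):
    """Map the aggregate score to positive, negative, or neutral."""
    score = tweet_score(text, lexicon)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for tweet in ["What a great day", "terrible service", "the phone arrived today"]:
    print(tweet, "->", classify(tweet, TOY_LEXICON))
```

The third tweet comes out neutral only because none of its words are in the lexicon, which is the coverage limitation that motivates building a lexicon from the target corpus.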
Our model, which uses the Google search engine to determine a term's score using pointwise mutual information, outperforms the SentiWordNet lexicon on our dataset and can deal with one of the difficulties of Sentiment Analysis: the unexpected shift from positive to negative sentiment.

REFERENCES

[1] C. Alm, D. Roth, and R. Sproat, “Emotions from the text: machine learning for text-based emotion prediction,” in Proceedings of HLT and EMNLP. ACL, 2005, pp. 579–586.
[2] S. Aman and S. Szpakowicz, “Using Roget’s thesaurus for fine-grained emotion recognition,” in Proceedings of IJCNLP, 2008, pp. 296–302.
[3] P. Chesley, B. Vincent, L. Xu, and R. K. Srihari, “Using verbs and adjectives to automatically classify blog sentiment,” in AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs, 2006, pp. 27–29.
[4] M. D. Choudhury, S. Counts, and M. Gamon, “Not all moods are created equal! exploring human emotional states in social media,” in Proceedings of ICWSM, 2012.
[5] R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, “Liblinear: A library for large linear classification,” The Journal of Machine Learning Research, vol. 9, pp. 1871–1874, 2008.
[6] K. Gimpel, N. Schneider, B. O’Connor, D. Das, D. Mills, J. Eisenstein, M. Heilman, D. Yogatama, J. Flanigan, and N. A. Smith, “Part-of-speech tagging for Twitter: annotation, features, and experiments,” in Proceedings of HLT: short papers, ser. HLT ’11. Stroudsburg, PA, USA: ACL, 2011, pp. 42–47.
[7] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. Witten, “The weka data mining software: an update,” ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.
[8] G. Mishne, “Experiments with mood classification in blog posts,” in Proceedings of ACM SIGIR 2005 Workshop on Stylistic Analysis of Text for Information Access.
[9] S. Mohammad, “#emotional tweets,” in Proceedings of the Sixth International Workshop on Semantic Evaluation. ACL, 7-8 June 2012, pp. 246–255.
[10] A. Neviarouskaya, H. Prendinger, and M. Ishizuka, “Affect analysis model: A novel rule-based approach to affect sensing from text,” Natural Language Engineering, vol. 17, no. 1, pp. 95–135, 2011.
[11] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up?: sentiment classification using machine learning techniques,” in Proceedings of EMNLP. ACL, 2002, pp. 79–86.
[12] P. Shaver, J. Schwartz, D. Kirson, and C. O’Connor, “Emotion knowledge: Further exploration of a prototype approach,” Journal of Personality and Social Psychology, vol. 52, no. 6, pp. 1061–1086, 1987.
[13] C. Strapparava and R. Mihalcea, “Learning to identify emotions in text,” in Proceedings of the 2008 ACM symposium on Applied computing. ACM, 2008, pp. 1556–1560.
[14] C. Strapparava and A. Valitutti, “Wordnet-affect: an affective extension of wordnet,” in Proceedings of LREC, vol. 4. Citeseer, 2004, pp. 1083–1086.
[15] C. Strapparava and R. Mihalcea, “Semeval-2007 task 14: affective text,” in Proceedings of the 4th International Workshop on Semantic Evaluations, ser. SemEval ’07, 2007, pp. 70–74.
[16] R. Tokuhisa, K. Inui, and Y. Matsumoto, “Emotion classification using massive examples extracted from the web,” in Proceedings of COLING. ACL, 2008, pp. 881–888.
[17] T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing contextual polarity in phrase-level sentiment analysis,” in Proceedings of HLT and EMNLP. ACL, 2005, pp. 347–354.
[18] I. Witten, E. Frank, and M. Hall, Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, 2011.
[19] C. Yang, K. Lin, and H. Chen, “Emotion classification using web blog corpora,” in IEEE/WIC/ACM International Conference on Web Intelligence. IEEE, 2007, pp. 275–278.