International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1303
Sentiment Analysis on Twitter data using Machine Learning
Madikonda Jagadish1, Cholleti Shiva Kumar2, Dobbala Sandeep3, 4G Bhargavi
1,2,3B. Tech Scholars, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India
4Assistant Professor, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Twitter is a popular platform for people to
express their thoughts and emotions on various
occasions. Sentiment analysis is a method of analyzing
data in order to extract the sentiment that it contains.
Twitter sentiment analysis is the application of sentiment
analysis to data from Twitter (tweets) in order to extract
user sentiments. Research in this field has increased
steadily over the past decade or so. One reason is the
format of tweets, which makes them difficult to process.
Because tweets are so short, they raise a distinct set of
issues, such as the use of slang and abbreviations. In this
paper, we demonstrate the use of sentiment analysis as well
as how to connect to Twitter and run sentiment analysis
queries. We conduct tests on several topics, ranging from
politics to humanity, and report the findings. We
discovered that the level of neutral sentiment for tweets
is very high, which points to shortcomings in the
existing works.
I. INTRODUCTION
Twitter has emerged as a major microblogging website, with
over 100 million daily users sending out over 500 million
tweets. Twitter's large audience has consistently drawn
users to express their opinions and perspectives on any
issue, brand, company, or other topic of interest. As a
result, many organizations, institutions, and businesses use
Twitter as a source of information.
Twitter users express themselves in tweets, which were
originally limited to 140 characters (280 since 2017). As a
result, people condense their statements by using slang,
abbreviations, emoticons, short forms, and so on. In
addition, people express themselves through sarcasm and
polysemy. The term "unstructured" is therefore appropriate
for the language used on Twitter. To elicit emotion from
such text, sentiment analysis determines the sentiment of a
specific remark or sentence. It is a categorization
technique that extracts opinion from tweets and assigns a
sentiment, which depends on the topic of interest. It is our
responsibility to determine which characteristics decide the
feeling a tweet conveys.
The class of entities that the person conducting sentiment
analysis intends to find in the tweets is referred to as
sentiment in the programming model. The dimension of the
sentiment class (the number of categories) is a key factor
in determining the model's effectiveness. For instance, we
may classify tweet sentiment into two categories (positive
and negative) or three categories (positive, negative and
neutral).
Many businesses and organizations now use sentiment
analysis to evaluate customer feedback on a product or their
response to an event without the need for surveys or other
costly and time-consuming methods. Twitter, one of the
biggest social networking sites, is the one considered in
this paper. According to the data, there are around 316
million monthly active users, and on average 500 million
tweets are sent each day.
II. LITERATURE SURVEY
Sentiment analysis involves determining the sentiment of a
specific remark or sentence. It is a categorization
technique that extracts opinion from tweets and assigns a
sentiment, which depends on the topic of interest. It is our
responsibility to determine which characteristics decide the
feeling a tweet conveys. The class of entities that the
person conducting sentiment analysis intends to find in the
tweets is referred to as sentiment in the programming
model. The dimension of the sentiment class (the number of
categories) is a key factor in determining the model's
effectiveness.
For instance, we may classify tweet sentiment into two
categories (positive and negative) or three categories
(positive, negative and neutral). A machine learning
approach uses feature extraction while training the model
with a feature set and dataset.
III. DESIGN AND IMPLEMENTATION
This paper documents an implementation of Twitter sentiment
analysis built on Twitter's own APIs. Excellent resources
and tools exist for text mining on social networks, and all
of the libraries used in this project are readily available.
We use the following steps to extract sentiment from the
tweets.
1. Download and cache the sentiment dictionary.
2. Download the Twitter test data sets and load them into
the program.
3. Clean the tweets by removing the stop words.
4. Tokenize all of the words in the dataset before feeding
them to the program.
5. Compare each term against the dictionary's lists of words
with positive and negative connotations, then increment
either the positive or the negative count.
6. To determine the polarity, calculate the result as a
percentage based on the positive and negative counts.
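The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the word lists, the stop-word set and the function names are hypothetical stand-ins for a real sentiment dictionary, and tokenization is done before stop-word filtering for simplicity.

```python
import re

# Hypothetical miniature lexicon and stop-word list, for illustration only.
POSITIVE = {"good", "great", "love", "win"}
NEGATIVE = {"bad", "poor", "hate", "lose"}
STOP_WORDS = {"a", "an", "the", "is", "was", "to", "of", "and", "i", "this"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity_counts(tweets):
    """Count positive and negative lexicon hits over all tweets."""
    positive = negative = 0
    for tweet in tweets:
        for token in tokenize(tweet):
            if token in STOP_WORDS:
                continue  # step 3: drop stop words
            if token in POSITIVE:
                positive += 1  # step 5: raise the positive count
            elif token in NEGATIVE:
                negative += 1  # step 5: raise the negative count
    return positive, negative


def polarity_percentages(positive, negative):
    """Step 6: express both counts as percentages of all lexicon hits."""
    total = positive + negative
    if total == 0:
        return 0.0, 0.0
    return 100.0 * positive / total, 100.0 * negative / total


tweets = ["I love this great team", "This was a bad, poor match", "Hate to lose"]
pos, neg = polarity_counts(tweets)
print(pos, neg, polarity_percentages(pos, neg))
```

With the sample tweets, two words match the positive list and four match the negative list, so roughly one third of the hits are positive.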
3.1 IMPLEMENTATION
Python was used in this study to implement sentiment
analysis. Several packages were used, notably textblob and
tweepy. The commands listed below can be used to install
the necessary libraries:
pip install tweepy
pip install textblob
TextBlob:
TextBlob is a Python (2 and 3) package for processing
textual data. It offers a straightforward API for common
natural language processing (NLP) tasks such as
part-of-speech tagging, noun phrase extraction, sentiment
analysis, classification, translation, and others.
Tweepy:
Tweepy is an open-source library that gives convenient
access to the Twitter API from Python. It contains a
collection of classes and methods that represent the models
and API endpoints of Twitter. Users need to go to
apps.twitter.com/app/new and generate the API keys.
With the following steps, we can connect the Twitter API
with Python:
1. Create a free RapidAPI user account (or log in).
2. Open the Twitter API page on RapidAPI.
3. Click "Connect to API" and enter the parameters and
fields for your API key.
4. Start testing the Twitter API endpoints.
The figure below shows the API keys used to connect Python
to the Twitter API.
Fig 1. User API Keys
3.2 Dataset:
The data is retrieved from Twitter using the API and stored
in a CSV file for data visualization. The file holds the
tweets retrieved from the Twitter API, so the user can
inspect the data directly.
Fig 2. Dataset with tweets
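Storing retrieved tweets in a CSV file can be sketched with Python's standard csv module. The column names and the sample rows below are assumptions made for illustration; the paper does not list the columns of its dataset.

```python
import csv
import io


def write_tweets_csv(rows, handle):
    """Write (tweet_id, text) pairs as CSV rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(["id", "text"])
    for tweet_id, text in rows:
        # Collapse newlines and repeated spaces so each tweet stays on one row.
        writer.writerow([tweet_id, " ".join(text.split())])


# Hypothetical sample rows standing in for tweets returned by the API.
sample = [(1, "Great innings today!"), (2, "Not his best\nmatch, sadly")]
buffer = io.StringIO()
write_tweets_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, the same function can be given a file opened with `open(path, "w", newline="", encoding="utf-8")`.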
IV. TWITTER SENTIMENT ANALYSIS WITH PYTHON
4.1 Python:
Python is a preferred programming language because of its
extensive capabilities, applicability, and simplicity.
Because it is platform independent and widely used in the
programming community, Python is well suited to machine
learning. Practical problems increasingly call for
intelligent solutions, which requires further development of
AI to automate laborious processes that would be difficult
to program without it. Python is considered a good fit for
automating these processes because it is more
straightforward and consistent than many other programming
languages. Additionally, the vibrant Python community
makes it simple for developers to discuss projects and offer
suggestions for improving their code.
4.2 Tweepy:
Tweepy is an open-source library that gives convenient
access to the Twitter API from Python. It contains a
collection of classes and methods that represent the models
and API endpoints of Twitter.
It also handles details such as data encoding and
decoding.
The following figure shows the setup of the Tweepy API.
Fig 3. Tweepy API connection
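A connection of this kind might look like the sketch below. It is not a reproduction of the figure: it assumes Tweepy 4.x, a bearer token for Twitter API v2, and the hypothetical helper names `build_query` and `fetch_recent_tweets`. The original code may well have used the older OAuth 1.0a interface, and access to the endpoint depends on the API plan in force.

```python
def build_query(topic):
    """Build a recent-search query that skips retweets and keeps English tweets."""
    return f"{topic} -is:retweet lang:en"


def fetch_recent_tweets(bearer_token, topic, limit=100):
    """Return the text of up to `limit` recent tweets about `topic`."""
    # Imported here so the rest of the module loads without Tweepy installed.
    import tweepy

    client = tweepy.Client(bearer_token=bearer_token)
    response = client.search_recent_tweets(
        query=build_query(topic),
        max_results=limit,  # the endpoint accepts values from 10 to 100
    )
    # response.data is None when the search returns no tweets.
    return [tweet.text for tweet in (response.data or [])]


print(build_query("Virat Kohli"))
```

Calling `fetch_recent_tweets` needs valid credentials and network access, so only the query builder is exercised here.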
4.3 TextBlob:
TextBlob is a Python library for natural language processing
(NLP). It builds on the Natural Language Toolkit (NLTK) to
complete its tasks. NLTK enables users to do
categorization, classification, and a variety of other tasks
while providing quick access to a large number of lexical
resources. TextBlob is a straightforward library that
supports sophisticated textual analysis and processing.
For lexicon-based techniques, a sentiment is identified by
the semantic orientation and the strength of each word in
the sentence. This calls for a pre-defined dictionary that
divides words into negative and positive categories. A text
message is typically represented as a bag of words. After
each word is scored individually, the final sentiment is
determined by a pooling operation, such as averaging all
the scores.
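The pooling step can be illustrated with a short sketch. The word scores below are invented for the example and do not come from any real lexicon; averaging is only one possible pooling operation.

```python
from statistics import mean

# Hypothetical word scores in [-1, 1]; a real lexicon would be far larger.
WORD_SCORES = {"excellent": 0.9, "good": 0.6, "slow": -0.3, "terrible": -0.9}


def pooled_sentiment(text):
    """Average the scores of the words found in the lexicon (0.0 if none)."""
    words = text.lower().split()  # bag of words: word order is ignored
    scores = [WORD_SCORES[w] for w in words if w in WORD_SCORES]
    return mean(scores) if scores else 0.0


print(pooled_sentiment("good batting but terrible fielding"))
```

Here "good" and "terrible" are the only scored words, so the pooled value is slightly negative.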
TextBlob returns a sentence's polarity and subjectivity.
Polarity lies in the range [-1, 1], where -1 represents a
negative sentiment and 1 represents a positive sentiment.
Negation words reverse the polarity. Semantic labels in
TextBlob facilitate fine-grained analysis; emoticons and
exclamation points are a few examples. Subjectivity lies in
the range [0, 1] and measures how much of the text is
personal opinion rather than factual information: the
higher the subjectivity, the more the content reflects
personal opinion. TextBlob has one more parameter,
intensity, which it uses when determining subjectivity. A
word's intensity determines whether it modifies the next
word. In English, adverbs act as modifiers, as in
"extremely good."
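A short sketch shows how polarity and subjectivity can be read from TextBlob and turned into a three-way label. The cut-off used to separate neutral from positive and negative is an assumption; the paper does not state which threshold it applied.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to one of three sentiment labels."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def analyze(text):
    """Return (polarity, subjectivity, label) for `text` using TextBlob."""
    # Imported here so the labelling helper works without TextBlob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment  # namedtuple: polarity, subjectivity
    label = label_from_polarity(sentiment.polarity)
    return sentiment.polarity, sentiment.subjectivity, label


print(label_from_polarity(0.5), label_from_polarity(-0.2), label_from_polarity(0.0))
```

Raising `threshold` above zero widens the neutral band, which changes the share of tweets reported as neutral.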
4.4 NLTK:
The Natural Language Toolkit (NLTK) is a Python library for
building applications for statistical natural language
processing (NLP).
Steven Bird, Edward Loper, and Ewan Klein created NLTK as
an open-source library for Python, intended for both
development and education.
It comes with a hands-on guide that introduces topics in
computational linguistics as well as Python programming
fundamentals, which makes it suitable for linguists without
extensive programming experience, engineers and researchers
who need to delve into computational linguistics, students,
and educators.
NLTK also provides built-in machine learning procedures for
gaining insights from linguistic data. Tasks such as
tokenization, stemming, lemmatization, punctuation handling,
character counting, and word counting can be accomplished
with this library. These steps prepare the text for the
sentiment analysis.
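As an illustration of these tasks, the sketch below tokenizes and stems a sentence with NLTK and computes word and character counts in plain Python. The function names are hypothetical, and the choice of `RegexpTokenizer` and `PorterStemmer` is an assumption made because neither needs an extra corpus download; the paper does not say which NLTK components it used.

```python
def text_stats(tokens):
    """Return (word_count, character_count) for a list of tokens."""
    return len(tokens), sum(len(token) for token in tokens)


def tokenize_and_stem(text):
    """Split `text` into word tokens and reduce each one to its Porter stem."""
    # Imported here so text_stats works without NLTK installed.
    from nltk.stem import PorterStemmer
    from nltk.tokenize import RegexpTokenizer

    tokens = RegexpTokenizer(r"\w+").tokenize(text.lower())
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokens]


print(text_stats(["kohli", "played", "well"]))  # (3, 15)
```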
4.5 Matplotlib:
Matplotlib is a cross-platform package for graphical data
visualization and charting in Python, built on the numerical
extension NumPy. This makes it a strong open-source
alternative to MATLAB. Developers can also use Matplotlib's
APIs (Application Programming Interfaces) to embed plots in
GUI applications.
A Matplotlib script is structured so that in most cases a
visual data plot can be created with just a few lines of
code. The Matplotlib scripting layer overlays two APIs:
The pyplot API, a hierarchy of Python code objects topped by
matplotlib.pyplot.
An object-oriented (OO) API, a collection of objects that
can be assembled with more flexibility than pyplot. This
API gives direct access to Matplotlib's backend layers.
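A pie chart of sentiment classes can be drawn with the object-oriented API as sketched below. The helper names, chart title and output path are illustrative assumptions; the counts in the example correspond to the percentages reported for 100 tweets in the results.

```python
def pie_inputs(counts):
    """Return (labels, sizes) for the non-empty sentiment classes."""
    items = [(label, n) for label, n in counts.items() if n > 0]
    return [label for label, _ in items], [n for _, n in items]


def save_pie_chart(counts, path):
    """Draw the class shares with the object-oriented API and save to `path`."""
    # Imported here so pie_inputs works without Matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no display needed
    import matplotlib.pyplot as plt

    labels, sizes = pie_inputs(counts)
    fig, ax = plt.subplots()  # Figure and Axes objects (OO API)
    ax.pie(sizes, labels=labels, autopct="%1.1f%%")
    ax.set_title("Tweet sentiment")
    fig.savefig(path)
    plt.close(fig)


print(pie_inputs({"positive": 23, "negative": 69, "neutral": 8}))
```

Calling `save_pie_chart(counts, "sentiment.png")` would write the chart to a file; the pyplot equivalent would call `plt.pie(...)` directly instead of working through `fig` and `ax`.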
V. RESULT
The tweets are retrieved from Twitter using the API and
analyzed, and the results are shown in the pie chart below.
The figure below shows the tweets retrieved from Twitter.
Fig 4. Tweets retrieved from Twitter
The findings of the analysis, which show different people's
opinions on numerous issues, are summarized in the pie
chart below. Businesses can analyze such tweets and record
the results in order to evaluate their product or business.
The findings are based on a variety of queries, including
those related to movies, politics, fashion, and more. The
pie chart in Figure 5 illustrates the data based on the
tweets retrieved. Because the results depend on the tweets
retrieved, running the program at another time can give
slightly different results.
Fig 5. Output of the tweets analysis
This shows the analysis of tweets about Virat Kohli based
on the latest 100 tweets; more tweets can be retrieved
depending on the user's needs.
Three categories are defined: positive, negative and
neutral tweets.
In the above pie chart, the results are as follows:
Positive tweets percentage: 23.0 %
Negative tweets percentage: 69.0 %
Neutral tweets percentage: 8.0 %
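Percentages of this kind follow directly from the per-tweet labels. The sketch below assumes 100 labelled tweets split 23, 69 and 8, which matches the figures above; the function name is hypothetical.

```python
from collections import Counter

CLASSES = ("positive", "negative", "neutral")


def class_percentages(labels):
    """Return the share of each sentiment class as a percentage."""
    total = len(labels)
    if total == 0:
        return {name: 0.0 for name in CLASSES}
    counts = Counter(labels)
    return {name: 100.0 * counts[name] / total for name in CLASSES}


labels = ["positive"] * 23 + ["negative"] * 69 + ["neutral"] * 8
print(class_percentages(labels))
# {'positive': 23.0, 'negative': 69.0, 'neutral': 8.0}
```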
The fraction of neutral tweets is notably high, as shown in
the pie chart. It is also important to note that different
data can lead to different conclusions, because people's
opinions change in response to external factors.
VI. CONCLUSION
Twitter sentiment analysis comes under the category of text
and opinion mining. It focuses on examining the sentiments
of the tweets and feeding the data to a machine learning
model in order to train it and then test its precision, so
that the model can be used going forward based on the
results. It entails steps including gathering data, text
pre-processing, sentiment categorization, sentiment
detection, model training, and testing. The models used in
this area of research have improved over the past ten
years, attaining efficiencies of roughly 85% to 90%.
However, variety in the data is still lacking. In addition,
slang and abbreviations cause numerous problems in
practice. The performance of many analyzers suffers as the
number of classes rises. There is therefore considerable
room for further advances in sentiment analysis.
REFERENCES
[1] Pak, A., & Paroubek, P. (2010, May). Twitter as a corpus
for sentiment analysis and opinion mining.InLREc(Vol.
10, No. 2010).
[2] TextBlob, 2017,
https://textblob.readthedocs.io/en/dev/
[3] Liu, B. (2012). Sentiment analysis and opinion mining.
Synthesis lectures on human language technologies,
5(1), 1-167.
[4] Neethu MS and Rajashree R,” Sentiment Analysis in
Twitter using Machine Learning Techniques” 4th
ICCCNT 2013, at Tiruchengode, India. IEEE – 31661
[5] Kiritchenko, S., Zhu, X., & Mohammad, S. M. (2014).
Sentiment analysis of short informal texts. Journal of
Artificial Intelligence Research, 50, 723-762.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1307
[6] Agarwal, A., Xie, B., Vovsha, I., Rambow, O., &
Passonneau, R. (2011, June). Sentiment analysis of
twitter data. In Proceedings of the workshop on
languages in social media (pp. 30-38). Association for
Computational Linguistics.
[7] Pang, B.and Lee, L. “A sentimental education: Sentiment
analysis using subjectivity summarization based on
minimum cuts”. 42nd Meeting of the Association for
Computational Linguistics[C] (ACL-04). 2004, 271-278.
[8] Rosenthal, S., Farra, N., & Nakov, P. (2017). SemEval-
2017 task 4: Sentiment analysis in Twitter. In
Proceedings of the 11th International Workshop on
Semantic Evaluation (SemEval-2017) (pp. 502-518).
[9] Nehal Mamgain, Ekta Mehta, Ankush Mittal and Gaurav
Bhatt, “Sentiment AnalysisofTopCollegesinIndia Using
Twitter Data”, (IEEE) ISBN -978-1-5090-0082-1, 2016.

More Related Content

PDF
IRJET - Twitter Sentimental Analysis
PDF
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
PDF
IRJET- An Effective Analysis of Anti Troll System using Artificial Intell...
PDF
IRJET - Sentiment Analysis of Posts and Comments of OSN
PDF
Twitter Sentiment Analysis
PDF
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
DOCX
Python report on twitter sentiment analysis
PDF
Emotion Recognition By Textual Tweets Using Machine Learning
IRJET - Twitter Sentimental Analysis
Combining Lexicon based and Machine Learning based Methods for Twitter Sentim...
IRJET- An Effective Analysis of Anti Troll System using Artificial Intell...
IRJET - Sentiment Analysis of Posts and Comments of OSN
Twitter Sentiment Analysis
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
Python report on twitter sentiment analysis
Emotion Recognition By Textual Tweets Using Machine Learning

Similar to Sentiment Analysis on Twitter data using Machine Learning (20)

PDF
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
PDF
IRJET - Twitter Sentiment Analysis using Machine Learning
PDF
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
PDF
IRJET- Sentiment Analysis of Twitter Data using Python
PDF
IRJET- Twitter Sentimental Analysis for Predicting Election Result using ...
PDF
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
PDF
Estimating the Efficacy of Efficient Machine Learning Classifiers for Twitter...
PDF
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
PDF
Sentiment Analysis of Twitter Data
PDF
Sentiment Analysis of Twitter tweets using supervised classification technique
PDF
IRJET- Sentimental Analysis of Twitter Data for Job Opportunities
PDF
Detection and Analysis of Twitter Trending Topics via Link-Anomaly Detection
PDF
Sentiment Analysis on Twitter Data
PDF
IRJET - Suicidal Text Detection using Machine Learning
PDF
Sentimental Emotion Analysis using Python and Machine Learning
PDF
Twitter Sentiment Analysis
PDF
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci...
PDF
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
PDF
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
PDF
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET- Improved Real-Time Twitter Sentiment Analysis using ML & Word2Vec
IRJET- Sentiment Analysis of Twitter Data using Python
IRJET- Twitter Sentimental Analysis for Predicting Election Result using ...
A Survey on Analysis of Twitter Opinion Mining using Sentiment Analysis
Estimating the Efficacy of Efficient Machine Learning Classifiers for Twitter...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter tweets using supervised classification technique
IRJET- Sentimental Analysis of Twitter Data for Job Opportunities
Detection and Analysis of Twitter Trending Topics via Link-Anomaly Detection
Sentiment Analysis on Twitter Data
IRJET - Suicidal Text Detection using Machine Learning
Sentimental Emotion Analysis using Python and Machine Learning
Twitter Sentiment Analysis
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci...
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
IRJET- Sentimental Analysis of Twitter for Stock Market Investment
Analysis and Prediction of Sentiments for Cricket Tweets using Hadoop
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Ad

Recently uploaded (20)

PPTX
additive manufacturing of ss316l using mig welding
PPT
Project quality management in manufacturing
PDF
composite construction of structures.pdf
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
Construction Project Organization Group 2.pptx
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Sustainable Sites - Green Building Construction
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPT
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
PPTX
UNIT 4 Total Quality Management .pptx
additive manufacturing of ss316l using mig welding
Project quality management in manufacturing
composite construction of structures.pdf
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Construction Project Organization Group 2.pptx
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Sustainable Sites - Green Building Construction
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Operating System & Kernel Study Guide-1 - converted.pdf
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
Foundation to blockchain - A guide to Blockchain Tech
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
CRASH COURSE IN ALTERNATIVE PLUMBING CLASS
UNIT 4 Total Quality Management .pptx

Sentiment Analysis on Twitter data using Machine Learning

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1303 Sentiment Analysis on Twitter data using Machine Learning Madikonda Jagadish1, Cholleti Shiva Kumar2, Dobbala Sandeep3, 4G Bhargavi 1,2,3B. Tech Scholars, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India 4Assistant Professor, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Twitter is a popular platform for people to express their thoughts and emotions on various occasions. Sentiment analysis is a method of analyzing data in order to extract the sentiment that it contains. Twitter sentiment analysis is theapplication of sentiment analysis to data from Twitter (tweets) in order to extract user sentiments. Over the last few decades, research in this field has steadily increased. The reason for this is the difficult format of the tweets, which makes processing difficult. Because the tweet format is so small, it creates a whole new set of issues, such as the use of slang and abbreviations. In this paper, we demonstrate the use of sentimental analysis as well as how to connect to Twitter and execute queries usingsentimentanalysis.Weconduct tests on several issues, ranging from politicstohumanity, and provide the intriguing findings. We discovered that the level of neutral sentiment for tweets is very high, which amply demonstrates the shortcomings of the existing works. 1. INTRODUCTION Twitter has emerged as a major microbloggingwebsite,with over 100 million users daily sending out over 500 million tweets. 
Twitter's large audience has consistently drawn users to express their opinions and perspectives on any issue, brand, company, or another topic of interest. As a result, many organizations, institutions, and businesses use Twitter as a source of information. Twitter users can express themselves in the form of tweets, which are limited to 140 characters. As a result, people condense their statements by using slang, abbreviations, emoticons, short forms, and so on. Along with this, people express themselves through sarcasm and polysemy. As a result, the term "unstructured" isappropriateforthe Twitter language. To elicit emotion fromSentimentanalysisinvolves determining the sentiment of a specific remark or sentence. It's a categorization technique that extracts opinion from tweets and creates a sentiment, which is individualized depending on the topic of interest. It's our responsibility to determine what characteristics will determine the feeling it conveys. The class of entities that the person conducting sentiment analysis intends to find in the tweets is referred to as sentiment in the programming model. The sentiment class's dimension is a key aspect in determining the model's effectiveness. For instance, we may classify tweet sentiment into two categories—positive and negative—or three categories (positive, negative and neutral). The class of entities that the person conducting sentiment analysis intends to find in the tweets is referredtoassentimentinthe programming model. The sentiment class's dimension is a key aspect in determining the model's effectiveness. Many businesses and organizations now utilize sentiment analysis to evaluate customer feedback on a productortheir response to an event without the need for surveys or other pricey and time-consuming methods. One such social networking site, Twitter, one of thebiggest networkingsites, is considered in this thesis. 
According to the data, there are around 316 million active users monthly, and on average, 500 million tweets are sent each day. II. LITERATURE SURVEY Sentiment analysis involves determining the sentiment of a specific remark or sentence. It's a categorization technique that extracts opinion from tweets and creates a sentiment, which is individualized depending on the topic of interest. It's our responsibility to determine what characteristicswill determine the feeling it conveys. Theclassofentitiesthat the person conducting sentiment analysis intends to find in the tweets is referred to as sentiment in the programming model. The sentiment class's dimension is a key aspect in determining the model's effectiveness. For instance, we may classify tweet sentiment into two categories—positive and negative—or three categories (positive, negative and neutral). The class of entitiesthatthe person conducting sentiment analysis intends to find in the tweets is referred to as sentiment in the programming model. The sentiment class's dimension is a key aspect in determining the model's effectiveness. learning approach uses feature extraction while training the model with a feature set and dataset. III. DESIGN AND IMPLEMENTATION Through the use of Twitter's own APIs, this technical paper documents the implementation of Twitter sentiment analysis. For text mining on social networks, there are excellent resources and tools. The full range of the libraries utilized in this project has been available. We use following approaches to extract sentiment from the tweets.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1304 1. Download and cache the sentiment dictionary first. 2. Download the testing data sets for Twitter and enter them into the software. 3. Remove the stop words from the tweets to clean them up. 4. Tokenize all of the dataset's words before feeding them to the program. 5. For each term, contrast it with the dictionary's definitions of words having positive and negative connotations. Then raise either the positive or negative count. 6. In order to determine the polarity, we can calculate the outcome percentage based on the positive and negative counts. 3.1 IMPLIMENTATION Python was used in this study to implement sentimental analysis. Several packages have used it, notablytextbloband tweepy. The commands listed below can be used to install the necessary libraries: Install tweepy via pip. Install textblob via pip. Textblob: A Python (2 and 3) package called TextBlob is used to process textual data. It offers a straightforward API for getting started with typical natural language processing (NLP) activities like part-of-speech tagging, noun phrase extraction, sentimentanalysis,classification,translation,and others. Tweepy: You may access the Twitter API with Python quite conveniently using the open source Tweepy library.Tweepy contains a collection of classes and methods that represent the models and API endpoints of Twitter.User’sneedtogoto the apps.twitter.com/app/new and generate the API keys. With following steps, we can connect twitter API with python: Create a free Rapid API user account (or log in). Open the Twitter API page on Rapid API. As soon as you click "Connect to API," you can start entering the parameters and fields for your API Key. Start the Twitter API Endpoints testing. 
Figure 1 shows the API keys used to connect Python with the Twitter API.
Fig 1. User API Keys
3.2 Dataset:
The data is retrieved from Twitter using the API and stored in a CSV file for data visualization. The file holds the tweets returned by the Twitter API so that the user can inspect the data directly.
Fig 2. Dataset with tweets
IV. TWITTER SENTIMENT ANALYSIS WITH PYTHON
4.1 Python:
Python is a preferred programming language because of its extensive capabilities, applicability, and simplicity. Because it is platform independent and widely used in the programming community, Python is well suited to machine learning. Practical problems increasingly call for intelligent solutions, which drives the further development of AI to automate laborious processes that would be difficult to program without it. Python is considered well suited to automating these processes because it is more straightforward and consistent than many other programming languages. Additionally, the vibrant Python community makes it simple for developers to discuss projects and offer suggestions for improving their code.
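As an illustration of the Dataset step described above (storing retrieved tweets in a CSV file), the following minimal sketch uses only Python's standard `csv` module. The column names in `FIELDNAMES` and the sample rows are assumptions made for the example; the paper does not specify the columns of its CSV file.

```python
import csv
import io

# Hypothetical column layout; the paper does not list its CSV columns.
FIELDNAMES = ["id", "created_at", "text"]


def write_tweets_csv(tweets, handle):
    """Write tweet dictionaries to an open text handle as CSV rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    for tweet in tweets:
        # Missing fields are written as empty strings.
        writer.writerow({name: tweet.get(name, "") for name in FIELDNAMES})


def read_tweets_csv(handle):
    """Read the CSV back into a list of dictionaries."""
    return list(csv.DictReader(handle))


# Made-up tweets; the second one contains a comma and quotes to show
# that the csv module escapes them correctly.
sample = [
    {"id": "1", "created_at": "2022-12-01", "text": "Great innings today, well played"},
    {"id": "2", "created_at": "2022-12-01", "text": 'He said "not again", sadly'},
]

buffer = io.StringIO(newline="")  # in-memory stand-in for a file on disk
write_tweets_csv(sample, buffer)
buffer.seek(0)
rows = read_tweets_csv(buffer)
print(len(rows), rows[1]["text"])
```

To write to disk instead, the same functions can be called with a handle from `open(path, "w", newline="", encoding="utf-8")`, where `path` is whatever file name the project uses.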
4.2 Tweepy:
The open-source Tweepy library provides convenient access to the Twitter API from Python. Tweepy contains a collection of classes and methods that represent Twitter's models and API endpoints. It also handles details such as data encoding and decoding. Figure 3 shows the setup of the Tweepy API.
Fig 3. Tweepy API connection
4.3 Textblob:
TextBlob is a Python library for Natural Language Processing (NLP). TextBlob builds on the Natural Language Toolkit (NLTK) to carry out its tasks. The NLTK library enables users to perform categorization, classification, and a variety of other tasks while providing quick access to a large number of lexical resources. TextBlob is a straightforward library that supports complex textual analysis and processing.
In lexicon-based techniques, a sentiment is identified by the semantic orientation and the strength of each word in the sentence. This calls for a pre-defined dictionary that divides words into negative and positive categories. A text message is typically represented as a bag of words. After each word is scored individually, the final sentiment is determined by a pooling operation, such as averaging all the word sentiments.
TextBlob returns a sentence's polarity and subjectivity. Polarity lies in the range [-1, 1], where -1 represents a negative sentiment and 1 represents a positive sentiment. Negation words reverse the polarity. Semantic labels in TextBlob support fine-grained analysis; emoticons and exclamation points are a few examples. Subjectivity lies in the range [0, 1] and measures the amount of personal opinion relative to factual information in the text.
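The pooling and polarity ideas above can be made concrete with a small sketch. It is not TextBlob's actual implementation: the word scores, the negation list, the simple sign flip, and the zero thresholds used to pick a category are all assumptions chosen for illustration. In TextBlob itself, the two values are read from `TextBlob(text).sentiment.polarity` and `TextBlob(text).sentiment.subjectivity`.

```python
# Hypothetical word scores in [-1, 1]; illustrative values only,
# not taken from TextBlob's lexicon.
WORD_SCORES = {"good": 0.7, "great": 0.8, "bad": -0.7, "terrible": -1.0}
# Hypothetical negation words that flip the sign of the next scored word.
NEGATIONS = {"not", "never", "no"}


def pooled_polarity(text):
    """Average the scores of known words, flipping the sign after a negation."""
    scores = []
    negate = False
    for raw in text.lower().split():
        word = raw.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        if word in WORD_SCORES:
            score = WORD_SCORES[word]
            scores.append(-score if negate else score)
        # A negation only affects the word that directly follows it.
        negate = False
    # Texts with no known words are treated as neutral (polarity 0.0).
    return sum(scores) / len(scores) if scores else 0.0


def label(polarity):
    """Map a polarity in [-1, 1] to one of three categories (assumed thresholds)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


for text in ["Great match!", "Not good.", "The match is today."]:
    print(text, "->", label(pooled_polarity(text)))
```

With thresholds at exactly zero, any tweet containing no dictionary words falls into the neutral category, so the share of neutral tweets depends heavily on how well the dictionary covers the vocabulary of the data.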
Higher subjectivity means that the text contains personal opinion rather than factual information. TextBlob has one more parameter, intensity, which it uses when calculating subjectivity. The intensity determines whether a word modifies the next word. In English, adverbs act as modifiers, as in "extremely good."
4.4 NLTK:
The Natural Language Toolkit (NLTK) is a Python programming environment for creating applications for statistical natural language processing (NLP). Steven Bird, Edward Loper, and Ewan Klein created the Natural Language Toolkit as an open-source library for the Python programming language, intended for development and education. It is appropriate for linguists without extensive programming experience, engineers and researchers who need to delve into computational linguistics, students, and educators, because it includes a hands-on guide that introduces topics in computational linguistics as well as Python programming fundamentals. NLTK's built-in machine learning procedures can be used to gain insights from linguistic data. Tasks such as tokenization, stemming, lemmatization, punctuation handling, character counts, and word counts can be accomplished with this library. The library analyzes the data and produces the necessary results.
4.5 Matplotlib:
Matplotlib is a cross-platform package for graphical data visualization and charting for Python and its numerical extension NumPy. This makes it a strong open-source substitute for MATLAB. Developers can also use Matplotlib's APIs (Application Programming Interfaces) to integrate plots into GUI applications. A Python Matplotlib script is structured so that, in most cases, a visual data plot can be created with just a few lines of code. The Matplotlib scripting layer overlays two APIs.
In the pyplot API, matplotlib.pyplot sits at the top of a hierarchy of Python code objects. The object-oriented API is a collection of objects that can be assembled with more flexibility than pyplot, and it gives direct access to Matplotlib's backend layers.
V. RESULT
The tweets are retrieved from Twitter using the API and analyzed, and the results are shown in a pie chart. Figure 4 shows the tweets retrieved from Twitter.
Fig 4. Tweets retrieved from Twitter
The pie chart in Figure 5 summarizes the findings of the analysis, which reflect different people's opinions on numerous issues. Tweets like these can be analyzed, and the results recorded, to evaluate a product or business. The findings are based on a variety of queries, including those related to movies, politics, fashion, and more. Because the results depend on the tweets retrieved, running the program at other times can give slightly different results.
Fig 5. Output of the tweets analysis
Figure 5 shows the analysis of tweets about Virat Kohli based on the latest 100 tweets; more tweets can be retrieved according to the user's needs. Three categories are defined: positive, negative and neutral tweets. The results in the pie chart are as follows:
Positive tweets percentage: 23.0 %
Negative tweets percentage: 69.0 %
Neutral tweets percentage: 8.0 %
The fraction of neutral tweets is notably high, as shown in the pie chart. It is also important to note that different experimental data can lead to different conclusions, because people's opinions can change in response to external factors.
VI.
CONCLUSION
Twitter sentiment analysis comes under the category of text and opinion mining. It focuses on examining the sentiments of the tweets and feeding the data to a machine learning model in order to train it and then test its precision, so that the model can be used going forward based on the results. It entails steps including data gathering, text pre-processing, sentiment categorization, sentiment detection, model training, and testing. The models used in this line of research have improved over the past ten years, attaining efficiencies of roughly 85%–90%. However, the dimension of data variety is still missing. In addition, the slang and abbreviations used in tweets cause numerous problems in application. The performance of many analyzers suffers as the number of classes rises. There is therefore considerable room for further advancement in sentiment analysis.
REFERENCES
[1] Pak, A., & Paroubek, P. (2010, May). Twitter as a corpus for sentiment analysis and opinion mining. In LREC (Vol. 10, No. 2010).
[2] TextBlob, 2017, https://textblob.readthedocs.io/en/dev/
[3] Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis lectures on human language technologies, 5(1), 1-167.
[4] Neethu MS and Rajashree R, "Sentiment Analysis in Twitter using Machine Learning Techniques", 4th ICCCNT 2013, at Tiruchengode, India. IEEE – 31661
[5] Kiritchenko, S., Zhu, X., & Mohammad, S. M. (2014). Sentiment analysis of short informal texts. Journal of Artificial Intelligence Research, 50, 723-762.
[6] Agarwal, A., Xie, B., Vovsha, I., Rambow, O., & Passonneau, R. (2011, June). Sentiment analysis of twitter data. In Proceedings of the workshop on languages in social media (pp. 30-38). Association for Computational Linguistics.
[7] Pang, B. and Lee, L. "A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts". 42nd Meeting of the Association for Computational Linguistics[C] (ACL-04). 2004, 271-278.
[8] Rosenthal, S., Farra, N., & Nakov, P. (2017). SemEval-2017 task 4: Sentiment analysis in Twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) (pp. 502-518).
[9] Nehal Mamgain, Ekta Mehta, Ankush Mittal and Gaurav Bhatt, "Sentiment Analysis of Top Colleges in India Using Twitter Data", (IEEE) ISBN -978-1-5090-0082-1, 2016.