A CLUSTERING ANALYSIS OF
TWEET LENGTH AND ITS
RELATION TO SENTIMENT
Research Project
Matthew Mayo
School of Computer Science
Columbus State University
Columbus, GA
CPSC 6185 Intelligent Agents
Dr. Rania Hodhod
Twitter
• Popular microblogging web service
• 140-character limit per message (tweet)
• Started in 2006, over 645 million users today*
• 58 million tweets per day*
• 9,000 tweets per minute*
* Source: www.statisticbrain.com/twitter-statistics
Sentiment Analysis
• Identifying, extracting & processing subjective
information from source material
• Subjective information includes attitudes,
emotions & opinions
• Appropriate for binary classification (positive vs.
negative, good vs. bad, etc.)
• Useful for movie reviews, political election
opinions, etc.
Project Aim
Interested in exploring the relationship between:
• Length of tweet (number of characters)
AND
• Sentiment score of tweet
Problem Description
The research project tasks:
1. Capture Twitter data
2. Build custom sentiment dictionary
3. Process tweets
4. Create dataset
5. Cluster tweet data
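
Tasks 2 to 4 can be pictured with a minimal sketch. It assumes a dictionary-based scorer that sums per-term scores; the slides do not show the project's actual dictionary or scoring rule, so the `SENTIMENT` entries and the `process_tweet` and `build_dataset` helpers below are hypothetical.

```python
import re

# Hypothetical sentiment dictionary (term -> score). The project's real
# dictionary and its score values are not given in the slides.
SENTIMENT = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def process_tweet(text, dictionary=SENTIMENT):
    """Return (length in characters, summed sentiment score) for one tweet."""
    # Lowercase and split into simple word-like terms.
    terms = re.findall(r"[a-z']+", text.lower())
    # Terms missing from the dictionary contribute 0 (treated as neutral).
    score = sum(dictionary.get(term, 0) for term in terms)
    return len(text), score


def build_dataset(tweets):
    """Turn raw tweet texts into (length, score) rows ready for clustering."""
    return [process_tweet(tweet) for tweet in tweets]
```

Each row pairs the two quantities named in the project aim: tweet length and sentiment score.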
Methodology
• Custom Python scripts to capture and process live
tweets over a 4-week schedule
• Use k-means clustering in Weka to look for
natural sentiment patterns
• Is there any correlation between the length of a tweet and its
sentiment (positive/negative/neutral)?
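
The project ran k-means in Weka. As a rough illustration of what the clustering step does, here is a plain-Python sketch of Lloyd's algorithm over (length, score) rows. It is a stand-in, not Weka's implementation; the `kmeans` function, its parameters, and the seeding strategy are assumptions made for this example.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Minimal Lloyd's algorithm on tuples of numbers.

    Returns (centroids, labels). Not Weka's SimpleKMeans; an illustrative
    stand-in with a hypothetical signature.
    """
    rng = random.Random(seed)
    # Seed the centroids with k points drawn from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels
```

Length and score sit on very different scales, so in this unscaled sketch length dominates the distance; rescaling the features before clustering may be worthwhile.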
Results
• Sentiment scores of shorter tweets appear more
tightly centered around their cluster’s centroid
• Sentiment scores of longer tweets are less tightly
centered on their cluster’s centroid
• This seems intuitive: more characters in a tweet mean
more terms, which increases the chance that terms are
assigned a score and so widens the range of scores a
tweet can take
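
One way to put a number on "tightly centered" is the mean absolute gap between each tweet's score and its centroid's score, computed separately for short and long tweets. The sketch below assumes (length, score) rows plus cluster labels and centroids from a k-means run; the `score_spread_by_length` helper and the 70-character threshold are hypothetical and are not taken from the project.

```python
def score_spread_by_length(points, labels, centroids, threshold=70):
    """Mean absolute gap between a tweet's score and its centroid's score.

    Reported separately for short (< threshold characters) and long tweets.
    The threshold is an illustrative assumption, not a project setting.
    """
    gaps = {"short": [], "long": []}
    for (length, score), label in zip(points, labels):
        bucket = "short" if length < threshold else "long"
        # Index 1 of a centroid is its sentiment-score coordinate.
        gaps[bucket].append(abs(score - centroids[label][1]))
    # A bucket with no tweets has no defined spread, so report None.
    return {b: (sum(v) / len(v) if v else None) for b, v in gaps.items()}
```

A larger value for the long bucket than for the short one would match the pattern described above.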