Hamdan Azhar
hamdan@prismoji.com // @hamdanazhar
// November 5, 2016
🐍s,🌹s, & major 🔑s
an introduction to emoji data science
🗃 📊🗃
why emoji data science?
http://theislamicmonthly.com/neither-here-nor-there-on-losing-my-snapchat-best-friend/
Introduction to emoji data science (Emojicon, 2016)
emojis data science
Overarching goals
■ Understanding what emojis mean
■ Using emojis to understand the topics we use them to discuss
■ Getting past the “so what” hurdle and defining good questions to ask
the birth of
My reaction to this article,
in emoji
So we decided to look at some actual data
Getting the data
■ Use the Twitter API to sample 100,000 tweets for five hashtags related to Britain’s EU Referendum
  ▪ Hashtags: #NotMyVote, #VoteRemain, #EURef, #Brexit, #VoteLeave
  ▪ Data pulled for June 24, the day after the referendum
  ▪ English-language tweets only
  ▪ After removing retweets, we’re left with 23,989 unique tweets, i.e. the “Brexit dataset”
  ▪ Of these, 1,505 tweets (6.3%) contain at least one emoji
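The filtering steps above can be sketched roughly as follows. The talk used R; this is a Python sketch, and the "RT @" retweet heuristic, the exact-duplicate removal, and the small set of Unicode ranges are all assumptions for illustration, not the method behind the slide's numbers.

```python
import re

# Rough emoji matcher (assumption): a few common Unicode blocks,
# not the full Unicode emoji dictionary the slides refer to.
EMOJI_RE = re.compile(
    "[\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "\U0001F300-\U0001FAFF"   # pictographs, emoticons, transport, etc.
    "\u2600-\u27BF]"          # miscellaneous symbols and dingbats
)


def is_retweet(text):
    """Assumed heuristic: classic retweets start with 'RT @'."""
    return text.startswith("RT @")


def emoji_share(tweets):
    """Return (unique tweets, tweets with an emoji, share with an emoji)."""
    # Drop retweets, then drop exact duplicates while keeping order.
    unique = list(dict.fromkeys(t for t in tweets if not is_retweet(t)))
    with_emoji = [t for t in unique if EMOJI_RE.search(t)]
    share = len(with_emoji) / len(unique) if unique else 0.0
    return len(unique), len(with_emoji), share


# Hypothetical sample tweets, for illustration only.
sample = [
    "RT @someone: Brexit \U0001F602",
    "Brexit \U0001F602 #EURef",
    "No emoji here #Brexit",
]
print(emoji_share(sample))

# Sanity check of the slide's own figures: 1,505 of 23,989 tweets.
print(round(100 * 1505 / 23989, 1))  # 6.3
```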
Analyzing the data
■ Use regular expressions in R, along with Unicode emoji dictionaries, to extract emojis from tweets
■ Compute emoji counts in the Brexit dataset
■ Compare with counts for all >10B emoji tweets on Twitter since 2013 (from emojitracker.com)
■ Extract hashtags from tweets and compute hashtag profiles for various emojis
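A minimal sketch of the extraction and counting steps, written in Python rather than the R used for the talk. The regular expression is an assumption that covers only a few common Unicode blocks (a real analysis would use a full Unicode emoji dictionary, as the slide says), and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed approximation of an emoji dictionary. Flags are matched first,
# as a pair of regional indicator symbols, so each flag counts once.
EMOJI_RE = re.compile(
    "[\U0001F1E6-\U0001F1FF]{2}"
    "|[\U0001F300-\U0001FAFF\u2600-\u27BF]"
)
HASHTAG_RE = re.compile(r"#\w+")


def extract_emojis(text):
    """Return every emoji occurrence in the text, in order."""
    return EMOJI_RE.findall(text)


def extract_hashtags(text):
    """Return lowercased hashtags so #Brexit and #brexit are merged."""
    return [h.lower() for h in HASHTAG_RE.findall(text)]


def emoji_counts(tweets):
    """Count emoji occurrences across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_emojis(text))
    return counts


# Hypothetical sample tweets, for illustration only.
sample = [
    "Out! \U0001F1EC\U0001F1E7 #VoteLeave",
    "\U0001F602\U0001F602 #Brexit #EURef",
]
print(emoji_counts(sample).most_common())
print([extract_hashtags(t) for t in sample])
```

Variation selectors (such as the one that follows ❤ in ❤️) are ignored by this pattern, so the heart is counted under its base character.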
Which emojis over-index most heavily for Brexit?
(above and beyond their usual popularity on Twitter)

emoji  emoji name               brexit rank  general rank  brexit index*  general index*  overindex**
😂     face with tears of joy   1            1             100            100
🇬🇧     flag of united kingdom   2            363           87             0.2             400x
👍     thumbs up sign           3            18            26             11              2.3x
👏     clapping hands sign      4            45            24             6               3.9x
❤️     heavy black heart        5            3             21             45
😭     loudly crying face       6            7             17             29
😔     pensive face             7            13            14             18
😩     weary face               8            11            13             22
😢     crying face              9            27            12             9               1.3x
🙈     see-no-evil monkey       10           24            12             9               1.3x

* Index is an estimate of how prevalent a given emoji is in Brexit tweets and general tweets, with the most common emoji (😂) being defined as 100
** Reflects how much more likely a given emoji is to be used in a Brexit tweet vs. generally on Twitter (general rank and index obtained from emojitracker.com). An emoji overindexes on Brexit if both brexit rank < general rank AND brexit index > general index.
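The index and overindex definitions in the footnotes can be sketched as below (Python, not the R used for the talk). The slide does not state how the overindex multiple is computed; this sketch assumes it is the ratio of the two indexes, which is roughly consistent with the table. The function names are hypothetical, and ratios computed from the rounded table values will not exactly reproduce the slide's multiples.

```python
def index_scores(counts):
    """Scale counts so the most common emoji is defined as 100."""
    top = max(counts.values())
    return {emoji: 100 * count / top for emoji, count in counts.items()}


def ranks(counts):
    """Rank emojis by count, 1 being the most common."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {emoji: position + 1 for position, emoji in enumerate(ordered)}


def overindex(brexit_rank, general_rank, brexit_index, general_index):
    """Return the assumed overindex ratio, or None if the emoji does not
    overindex (the slide requires both a better rank and a higher index)."""
    if brexit_rank < general_rank and brexit_index > general_index:
        return brexit_index / general_index
    return None


# Rounded values taken from the table above.
rows = [
    ("thumbs up sign", 3, 18, 26, 11),
    ("heavy black heart", 5, 3, 21, 45),
]
for name, b_rank, g_rank, b_index, g_index in rows:
    print(name, overindex(b_rank, g_rank, b_index, g_index))
```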
Finding the “hashtag signature” of a given emoji
■ We know the distribution of hashtags in our entire dataset
■ We can pick a given emoji and compute the distribution of hashtags for tweets that use that emoji
■ By comparing these two distributions, we can estimate which hashtags an emoji is most likely to be used with
[Chart: hashtag distribution with five slices of 15%, 17%, 20%, 29%, and 19%; slice labels not preserved]
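One way to compare the two distributions is sketched here in Python (the talk used R). The slides do not say how the comparison is made; this sketch assumes a simple ratio of the emoji-specific share to the overall share for each hashtag, and the function names are hypothetical.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#\w+")


def hashtag_distribution(tweets, tracked):
    """Share of each tracked hashtag among all tracked-hashtag uses."""
    counts = Counter()
    for text in tweets:
        # Count each hashtag at most once per tweet.
        tags = {h.lower() for h in HASHTAG_RE.findall(text)}
        counts.update(tags & tracked)
    total = sum(counts.values())
    return {tag: (counts[tag] / total if total else 0.0) for tag in tracked}


def hashtag_signature(tweets, emoji, tracked):
    """Ratio of a hashtag's share in tweets using the emoji to its overall
    share. Values above 1 mean the emoji leans toward that hashtag."""
    overall = hashtag_distribution(tweets, tracked)
    subset = hashtag_distribution([t for t in tweets if emoji in t], tracked)
    return {
        tag: (subset[tag] / overall[tag] if overall[tag] else None)
        for tag in tracked
    }


# Hypothetical sample tweets, for illustration only.
tracked = {"#voteleave", "#voteremain", "#brexit"}
sample = [
    "\U0001F1EC\U0001F1E7 #VoteLeave",
    "\U0001F1EC\U0001F1E7 #VoteLeave #Brexit",
    "\U0001F622 #VoteRemain",
    "#Brexit news",
]
print(hashtag_signature(sample, "\U0001F1EC\U0001F1E7", tracked))
```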
Hashtag signatures of the top emojis of Brexit
http://motherboard.vice.com/read/the-emojis-of-great-brexit
Taylor Swift is winning hearts (and minds)
Source: Analysis of 100,000 public tweets mentioning @taylorswift13 and @kanyewest from Aug. 1-4, 2016. (PRISMOJI)
[Chart legend: equal / higher association with @taylorswift13 / higher association with @kanyewest]
Hearts vs. Snakes:
The emoji battle underlying the epic Taylor Swift – Kanye West feud
Source: Analysis of 100,000 public tweets mentioning @taylorswift13 and @kanyewest from Aug. 1-4, 2016. (PRISMOJI)
#taylorswiftwhatup is the most common hashtag in tweets about both Taylor and Kanye
Source: Analysis of 100,000 public tweets mentioning @taylorswift13 and @kanyewest from Aug. 1-4, 2016. (PRISMOJI)
Our common emoji language of #fanlove
Source: Analysis of 250,000 public tweets mentioning @beyonce, @justinbieber, @djkhaled, @drake, and @rihanna from Aug. 1-4, 2016. (PRISMOJI)
Sometimes love hurts
Examples of [emoji not preserved] in tweets involving #fanlove
Source: Analysis of 250,000 public tweets mentioning @beyonce, @justinbieber, @djkhaled, @drake, and @rihanna from Aug. 1-4, 2016. (PRISMOJI)
http://motherboard.vice.com/read/a-data-scientists-emoji-guide-to-kanye-west-and-taylor-swift
Some more examples
#firstsevenjobs
Source: Analysis of 32,979 public tweets with the hashtags #firstsevenjobs and #first7jobs from Aug. 8, 2016. (PRISMOJI)
Understanding gendered emojis on Twitter
#wcw vs #mcm: All hearts are not created equal
[Chart legend: higher association with #mcm / higher association with #wcw]
Source: Analysis of 100,000 public tweets with the hashtags #wcw and #mcm from June 27-29, 2016. (PRISMOJI)
#Rio2016 Olympics
Source: Analysis of 449,680 public tweets mentioning #rio2016 from Aug. 6-22, 2016. (PRISMOJI)
[Chart legend: higher association with FIRST 3 DAYS / higher association with LAST 3 DAYS]
Third Presidential Debate
Source: Analysis of public tweets during third presidential debate on Oct. 20, 2016. (PRISMOJI)
Three takeaways I’d like you to leave with
■ Understanding emojis as data can yield interesting insights
■ More work is needed to learn more about what emojis mean, and what they reveal about our world
■ You can play around with emoji data too ☺
Thank you!
• Email: hamdan@prismoji.com
• Twitter: @hamdanazhar
• prismoji.com
• hamdanazhar.com
