Data Visualization at Twitter
Krist Wongsuphasawat / @kristw
Bangkok, Thailand
Computer Engineer
Chulalongkorn University
Programming + Soccer
M.S. in Computer Science, Univ. of Maryland
PhD in Computer Science, Univ. of Maryland
Information Visualization
IBM
Microsoft
Sr. Data Visualization Scientist, Twitter
Data at Twitter: “Tweets”
#events
TV shows, New Year, earthquake, Oscars, protest, Super Bowl, World Cup, election, breaking news, …
#curiosity
Sleep pattern, human behavior, language, …
What could we learn from the Tweets?
Goal: Tell stories about an event, pursue curiosity or inspiration (with deadline)
Challenge accepted
1. Get data
easy?
Having all Tweets 
How people think I feel. How I really feel.
Challenges
• Too much data
• Want only relevant Tweets
  • hashtag: #BRA
  • keywords: “goal”
• Need to aggregate & reduce size
• Long processing time (hours)
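The filter-then-aggregate idea in the challenges above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used at Twitter: the function names, the punctuation handling, and the sample Tweets are all made up, and the filter terms simply reuse the slide's examples (#BRA, “goal”).

```python
from collections import Counter

# Hypothetical filter terms, borrowed from the slide's examples.
HASHTAGS = {"#bra"}
KEYWORDS = {"goal"}


def matched_terms(text):
    """Return the target hashtags/keywords that appear in a Tweet's text."""
    # Lowercase and strip common punctuation so "GOAL!" still matches "goal".
    tokens = {tok.strip(".,!?:;\"'").lower() for tok in text.split()}
    return tokens & (HASHTAGS | KEYWORDS)


def reduce_tweets(texts):
    """Drop irrelevant Tweets and aggregate the rest into per-term counts."""
    counts = Counter()
    for text in texts:
        for term in matched_terms(text):
            counts[term] += 1
    return dict(counts)


# Made-up sample Tweets, for illustration only.
sample = ["GOAL! What a start #BRA", "Having lunch", "Come on #BRA"]
summary = reduce_tweets(sample)
print(summary)
```

The point of the sketch is the size reduction: many raw Tweets go in, and a small table of counts comes out.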
Workflow
Data storage: Hadoop Cluster; tool: Pig / Scalding (slow)
Data storage: Vertica; tool: SQL
=> Smaller dataset
Data storage: your laptop; tool: node.js / python / excel (fast)
=> Final dataset
2. Visualize
Visualize
• Peek into data
• Check data & test ideas
• Decide how to visualize
• Guided by data type
• Choose tools
• Start building
Tools: R, Tableau, d3, Yeoman
Data (+ media: photos, videos)
What? TEXT
Where? GEO
When? TIME
Visualize Data
What? TEXT
Where? GEO
When? TIME
Time: Tweets/second
Time: Tweets/second + Annotation
http://www.flickr.com/photos/twitteroffice/5681263084/
Geo: Heatmap
Legend: low density to high density
Geo: San Francisco
flickr.com/photos/twitteroffice/8798020541
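A density heatmap like this usually starts by binning coordinates into grid cells and counting points per cell. The sketch below shows one way to do that. It is an assumption-laden illustration rather than the method behind the images: the cell size, function names, and sample coordinates are invented.

```python
import math
from collections import Counter


def grid_density(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of cell_deg degrees."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


def normalize(counts):
    """Scale counts to 0..1 so they can drive a low-to-high color ramp."""
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


# Made-up sample coordinates, for illustration only.
sample_points = [
    (37.7749, -122.4194),
    (37.7751, -122.4190),
    (37.8044, -122.2712),
]
density = grid_density(sample_points)
intensity = normalize(density)
print(intensity)
```

A drawing layer would then map each cell's intensity to a color between the "low density" and "high density" ends of the legend.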
Geo: San Francisco
Rebuild the world based on Tweet volumes
twitter.github.io/interactive/andes/
Text: Some experiments during World Cup
Text: Word cloud of Tweets right after the 1st goal
It was an “own” goal.
www.wordle.net
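A word cloud is driven by word frequencies, so the data preparation step can be sketched as a simple count. This is a minimal example assuming a tiny hand-picked stopword list and a simple regex tokenizer. It is not how wordle.net processes text, and the sample Tweets are invented.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "is", "it", "was", "to", "of", "and", "in"}


def word_frequencies(texts, top_n=50):
    """Return the top_n (word, count) pairs across all Tweet texts."""
    counts = Counter()
    for text in texts:
        # Keep hashtags and mentions as single tokens, drop punctuation.
        words = re.findall(r"[#@]?\w+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up sample Tweets, for illustration only.
sample = ["Own goal!", "It was an own goal", "What a goal #WorldCup"]
top_words = word_frequencies(sample, top_n=2)
print(top_words)
```

The resulting pairs can be handed to any word cloud renderer, with the count controlling the font size.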
Text: WordTree [Wattenberg & Viégas 2008]
www.jasondavies.com/wordtree
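A word tree picks a root word and branches out over the phrases that follow it. The sketch below builds that nested structure as plain dictionaries. It is a simplified take on the idea, not the algorithm from the cited paper or the linked implementation, and the function name, the `depth` parameter, and the sample sentences are assumptions.

```python
def build_word_tree(sentences, root, depth=3):
    """Nest the words following `root` as {word: {"count": n, "next": {...}}}."""
    tree = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            if word != root:
                continue
            node = tree
            # Walk up to `depth` words after the root, adding one level each.
            for follower in words[i + 1 : i + 1 + depth]:
                entry = node.setdefault(follower, {"count": 0, "next": {}})
                entry["count"] += 1
                node = entry["next"]
    return tree


# Made-up sample sentences, for illustration only.
sample = ["goal by team_a", "goal by team_b", "goal for team_a"]
tree = build_word_tree(sample, root="goal")
print(tree)
```

Each node's count can set the branch's font size, so common continuations stand out.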
Time + Geo 
Japan Earthquake 2011 
blog.twitter.com/2011/global-pulse 
youtu.be/SybWjN9pKQk
Time + Geo: Tweet pattern [Rios & Lin 2012]
Legend: Night, Late night, Daytime
Geo + Text: Real-time Tweet map
Most frequent term
Gmail was down (Jan 24, 2014)
Nelson Mandela passed away (Dec 5, 2013)
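One way to get a "most frequent term" per map region is to bin Tweets into grid cells and keep a term counter per cell. The sketch below assumes that approach. The grid size, the whitespace tokenizer, and the sample records are invented for illustration, and the real map may work differently.

```python
import math
from collections import Counter, defaultdict


def top_term_per_cell(tweets, cell_deg=1.0):
    """Return {grid cell: most frequent term} for (lat, lon, text) records."""
    terms_by_cell = defaultdict(Counter)
    for lat, lon, text in tweets:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        terms_by_cell[cell].update(text.lower().split())
    return {
        cell: counter.most_common(1)[0][0]
        for cell, counter in terms_by_cell.items()
        if counter  # skip cells whose Tweets had no text
    }


# Made-up sample records (lat, lon, text), for illustration only.
sample = [
    (37.77, -122.42, "gmail down"),
    (37.80, -122.27, "gmail outage"),
    (51.51, -0.13, "mandela tribute"),
    (51.52, -0.10, "mandela"),
]
labels = top_term_per_cell(sample)
print(labels)
```

In a real-time setting the counters would be updated over a sliding time window, which this sketch leaves out.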
Time + Text 
UEFA Champions League 
Biggest tournament for European soccer clubs 
Many Tweets during the matches
Time + Text: UEFA Champions League
Team 1: Dortmund
Team 2: Bayern Munich
Count Tweets mentioning the teams every minute
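Counting Tweets that mention each team every minute can be sketched as follows. This is a minimal example under assumptions: the alias lists, the substring matching, and the sample Tweets with their timestamps are all made up, and a real pipeline would need better name matching.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical aliases per team; real matching would need a fuller list.
TEAMS = {"Dortmund": ["dortmund"], "Bayern Munich": ["bayern"]}


def mentions_per_minute(tweets):
    """Return {minute: {team: count}} for (timestamp, text) pairs."""
    series = defaultdict(lambda: {team: 0 for team in TEAMS})
    for timestamp, text in tweets:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        lowered = text.lower()
        for team, aliases in TEAMS.items():
            if any(alias in lowered for alias in aliases):
                series[minute][team] += 1
    return dict(series)


# Made-up sample Tweets with arbitrary timestamps, for illustration only.
sample = [
    (datetime(2014, 1, 1, 20, 0, 10), "Come on Dortmund!"),
    (datetime(2014, 1, 1, 20, 0, 50), "Bayern looking strong"),
    (datetime(2014, 1, 1, 20, 0, 55), "Dortmund vs Bayern, here we go"),
    (datetime(2014, 1, 1, 20, 1, 5), "bayern!!"),
]
per_minute = mentions_per_minute(sample)
print(per_minute)
```

Plotting the two counts against time gives one line or stream per team.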
Time + Text: UEFA Champions League
+ “goal” count
+ context
+ “offside”
+ players
Competition Tree
[Diagram: bracket in which A vs B and C vs D feed into A vs C, won by C]
Time + Text + Geo: State of the Union
1) Timeline + topic from Tweets
2) Context (speech)
3) Volume of Tweets by topics during selected part of the SOTU
4) Density map of Tweets about selected topic
twitter.github.io/interactive/sotu2014
Time + Text World Cup 2014
Time + Text + Geo World Cup 2014
Visualize Data
What? TEXT
Where? GEO
When? TIME
+ Non-Twitter data: CONTEXT
Time + Text New Year 2014
Time + Text + Geo (c) New Year 2014 
twitter.github.io/interactive/newyear2014/
1. Get data
2. Visualize
3. Evaluate
Iterate!
Evaluation 
• Self 
• Peer feedback 
• Non team members / Potential audience
Data at Twitter
• “Tweets”
• users (Who? …)
• followers graph
• logs
• etc.
• derived data: language, sentiment
1. Get data: big data => small data
2. Visualize: What? Where? When?
3. Evaluate: self, peer, external
(with deadline)
@kristw / https://interactive.twitter.com
+ visualizations by @philogb, @miguelrios & @trebor
Questions?
Thank you
