Crisis Computing
Finding relevant and credible information on social
media during disasters
Big Data Analytics Conference
Delhi, India, December 2014
January 2010
How/when did it start for me?
Humanitarian Computing
At least 775 publications:
● Crisis Analysis (55)
● Crisis Management (309)
● Situational Awareness (67)
● Social Media (231)
● Mobile Phones (74)
● Crowdsourcing (116)
● Software and Tools (97)
● Human-Computer Interaction (28)
● Natural Language Processing (33)
● Trust and Security (33)
● Geographical Analysis (53)
Source: http://humanitariancomp.referata.com/
Humanitarian Computing Topics
http://www.youtube.com/watch?v=0UFsJhYBxzY
Carlos Castillo – chato@acm.org
http://www.chato.cl/research/
An earthquake hits a Twitter user
• When an earthquake strikes, the first tweets are posted 20-30 seconds later
• Damaging seismic waves travel at 3-5 km/s, while network communications travel at near light speed over fiber/copper, plus latency
• Beyond ~100 km, seismic waves may be overtaken by tweets about them
http://xkcd.com/723/
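The ~100 km figure follows from the numbers on this slide. A minimal sketch of the arithmetic, assuming that network propagation time is negligible compared with the posting delay; the function name and the `network_latency_s` parameter are illustrative and not from the talk:

```python
def overtake_distance_km(wave_speed_km_s: float,
                         tweet_delay_s: float,
                         network_latency_s: float = 0.0) -> float:
    """Distance beyond which a tweet arrives before the seismic wave.

    The wave reaches distance d after d / wave_speed seconds. The tweet is
    assumed to arrive after tweet_delay + network_latency seconds regardless
    of distance, which is a simplification.
    """
    return wave_speed_km_s * (tweet_delay_s + network_latency_s)


# Ranges quoted on the slide: 3-5 km/s waves, first tweets after 20-30 s.
nearest = overtake_distance_km(3.0, 20.0)   # slowest wave, fastest tweet
farthest = overtake_distance_km(5.0, 30.0)  # fastest wave, slowest tweet
midpoint = overtake_distance_km(4.0, 25.0)  # middle of both ranges
```

With these inputs the crossover lies between 60 km and 150 km, and the middle of both ranges gives 100 km.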
Examples of crisis tweets
Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the
Unexpected Happens: Social Media Communications Across Crises.
To appear in CSCW 2015.
Examples of crisis tweets (cont.)
Fertile grounds for applied research
✔ Problems of global significance
✔ Solved with labor-intensive methods
✔ Better solution provides a public good
✔ Large and noisy data sets available
✔ Engage volunteer communities
• Relevance to practitioners?
Current collaborators
Patrick Meier – QCRI
Sarah Vieweg – QCRI
Muhammad Imran – QCRI
Irina Temnikova – QCRI
Alexandra Olteanu – EPFL
Aditi Gupta – IIIT Delhi
“P.K.” Kumaraguru – IIIT Delhi
Fernando Diaz – Microsoft
Outline
Crisis Maps
Extraction
Matching
Verification
Credibility
Crisis maps from social media
Carlos Castillo, Fernando Diaz, and Hemant Purohit:
Leveraging Social Media and Web of Data to Assist Crisis Response Coordination
Tutorial at SDM, Philadelphia, PA, USA. April 2014.
Hemant Purohit, Carlos Castillo, Patrick Meier and Amit Sheth:
Crisis Mapping, Citizen Sensing and Social Media Analytics
Tutorial at ICWSM, May 2013.
Patrick Meier, Social Innovation Director @ QCRI – http://irevolution.net/
“What can speed humanitarian response to tsunami-ravaged coasts? Expose human rights atrocities? Launch helicopters to rescue earthquake victims? Outwit corrupt regimes? A map.”
Crisis mapping goes mainstream (2011)
http://newsbeatsocial.com/watch/0_s6xxcr3p
Understanding Crisis Tweets
Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the
Unexpected Happens: Social Media Communications Across Crises.
To appear in CSCW 2015.
Types of Disaster
Our approach
1. Filtering
2. Classification
3. Extraction
Filtering
Is disaster-related? Yes / No
Contributes to situational awareness? Yes / No
Classification
Filtered tweets are classified along two dimensions:
● Information provided: Caution & Advice, Information Sources, Damage & Casualties, Donations, ...
● Information source: Gov, Eyewitness, Media, NGO, Outsider, ...
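The filtering and classification stages can be read as a cascade in which each tweet must pass both filtering questions before it is assigned a category. The sketch below shows only that control flow. The `toy_*` functions are hypothetical keyword rules standing in for trained classifiers, and none of the names come from the talk:

```python
from typing import Callable, Iterable, Iterator, Tuple


def filter_and_classify(
    tweets: Iterable[str],
    is_disaster_related: Callable[[str], bool],
    is_informative: Callable[[str], bool],
    classify: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield (tweet, category) for tweets that pass both filtering questions."""
    for text in tweets:
        if not is_disaster_related(text):
            continue  # first filter: drop tweets unrelated to the disaster
        if not is_informative(text):
            continue  # second filter: drop tweets that add no situational awareness
        yield text, classify(text)


# Toy keyword rules (hypothetical); a real system would use trained models.
def toy_is_disaster_related(text: str) -> bool:
    return any(w in text.lower() for w in ("tornado", "flood", "earthquake"))


def toy_is_informative(text: str) -> bool:
    return "pray" not in text.lower()


def toy_classify(text: str) -> str:
    lowered = text.lower()
    if "donate" in lowered:
        return "Donations"
    if "warning" in lowered or "watch" in lowered:
        return "Caution & Advice"
    return "Other"


sample = [
    "There is a tornado watch in effect tonight",
    "Praying for everyone after the earthquake",
    "Great coffee this morning",
    "Donate blood for flood victims",
]
results = list(filter_and_classify(sample, toy_is_disaster_related,
                                   toy_is_informative, toy_classify))
```

Passing the three decision functions as arguments keeps the cascade independent of how each decision is made, so keyword rules can later be swapped for learned classifiers without touching the loop.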
A large-scale study of crisis tweets
• Collect tweets from 26 disasters
• Classify according to:
  ● Informative / Not informative
  ● Information provided
  ● Information source
Advice on labeling
• Your instructions will never be correct the first time you try
– e.g. personal / eyewitness
– Instructions must be re-written reactively
– Perform small-scale labeling first
• Instructions must be concrete and brief
– If you can't do it, the task has to be divided
Information Provided in Crisis Tweets
N=26; Data available at http://crisislex.org/
What do people tweet about?
• Affected individuals
– 20% on average (min. 5%, max. 57%)
– most prevalent in human-induced, focalized & instantaneous events
• Sympathy and emotional support
– 20% on average (min. 3%, max. 52%)
– most prevalent in instantaneous events
• Other useful information
– 32% on average (min. 7%, max. 59%)
– least prevalent in diffused events
What do people tweet about? (cont.)
• Infrastructure and utilities
– 7% on average (min. 0%, max. 22%)
– most prevalent in diffused events, in particular floods
• Caution and advice
– 10% on average (min. 0%, max. 34%)
– least prevalent in instantaneous & human-induced events
• Donations and volunteering
– 10% on average (min. 0%, max. 44%)
– most prevalent in natural hazards
Distribution over information sources
Distribution over time
Extracting information and matching
emergency-related resources
Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier:
Extracting Information Nuggets from Disaster-Related Messages in Social Media
In ISCRAM. Baden-Baden, Germany, 2013. Best paper award.
Hemant Purohit, Amit Sheth, Carlos Castillo, Patrick Meier, Fernando Diaz:
Emergency-Relief Coordination on Social Media: Automatically Matching Resource Requests and Offers
First Monday 19 (1), January 2014
Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier:
Practical Extraction of Disaster-Relevant Information from Social Media
In SWDM. Rio de Janeiro, Brazil, 2013
Information Extraction
...
Classified tweets
@JimFreund: Apparently we have no choice. There is a tornado watch in effect tonight.
Extraction
• #hashtags, @user mentions, URLs, etc.
– Regular expressions
– Text library from Twitter
• Temporal expressions
– Part-of-speech tagger + heuristics
– Natty library
• Supervised learning
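For the first bullet, a minimal regular-expression sketch is shown below. The patterns are deliberately simplified assumptions (for example, an e-mail address would also yield a mention match), and they are not the rules used by Twitter's own text library:

```python
import re

# Simplified patterns (assumptions for illustration only)
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")


def extract_entities(text: str) -> dict:
    """Return the hashtags, user mentions and URLs found in a tweet."""
    urls = URL_RE.findall(text)
    # Remove URLs first so '#' or '@' characters inside them are not matched
    without_urls = URL_RE.sub(" ", text)
    return {
        "hashtags": HASHTAG_RE.findall(without_urls),
        "mentions": MENTION_RE.findall(without_urls),
        "urls": urls,
    }


tweet = ("RT @twc_hurricane: Wind gusts over 60 mph are being reported at "
         "Central Park and JFK airport in #NYC this hour. #Sandy")
entities = extract_entities(tweet)
```

Here `entities["hashtags"]` contains `#NYC` and `#Sandy`, and `entities["mentions"]` contains `@twc_hurricane`.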
Labels for extraction
• Type-dependent instruction
• Ask evaluators to copy-paste a word/phrase from each tweet
Learning: Conditional Random Fields
• Used extensively in NLP for part-of-speech tagging and information extraction
• Representation of observations is important (capitalization, position, etc.)
[Figure: graphical models of an HMM and a linear-chain CRF, with hidden and observed variables]
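To make "representation of observations" concrete, the sketch below builds a per-token feature dictionary of the kind CRF toolkits commonly take as input. The feature names and the choice of features are assumptions for illustration, not the feature set used in the work presented here:

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Describe token i by its surface form, its position and its neighbors."""
    token = tokens[i]
    features: Dict[str, object] = {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "is_all_caps": token.isupper(),
        "is_hashtag": token.startswith("#"),
        "is_mention": token.startswith("@"),
        "has_digit": any(ch.isdigit() for ch in token),
        "position": i,
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
    }
    # Neighboring tokens give the linear-chain model some local context
    features["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<START>"
    features["next_lower"] = (
        tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>"
    )
    return features


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dictionary per token, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


tokens = "Wind gusts over 60 mph at Central Park".split()
feats = sentence_features(tokens)
```

A sequence of such dictionaries, paired with one label per token, is what a linear-chain CRF would be trained on.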
Tool
• CMU ARK Twitter NLP
– Tokenization
– Feature extraction
– CRF learning
• Very easy to use: replace the part-of-speech tags in the training set with any other set of labels, and re-train
Output examples
RT @weatherchannel: .@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC
Wow what a mess #Sandy has made. Be sure to check on the elderly and homeless please! Thoughts and prayers to all affected
RT @twc_hurricane: Wind gusts over 60 mph are being reported at Central Park and JFK airport in #NYC this hour. #Sandy
RT @mitchellreports: Red Cross tells us grateful for Romney donation but prefer people send money or donate blood dont collect goods NOT best way to help #Sandy
Extractor evaluation
Setting                                     Recall   Precision
Train 2/3 Joplin, Test 1/3 Joplin           78%      90%
Train 2/3 Sandy, Test 1/3 Sandy             41%      79%
Train Joplin, Test Sandy                    11%      78%
Train Joplin + 10% Sandy, Test 90% Sandy    21%      81%
• Precision counts an extraction as correct if it has one or more words in common with what humans extracted
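The lenient matching rule can be sketched as follows. Only the "one word or more in common" criterion comes from the slide. How recall is computed here (matches divided by the number of tweets where humans extracted something) and the example pairs are assumptions for illustration:

```python
from typing import List, Optional, Tuple


def words(text: str) -> set:
    """Lower-cased whitespace-separated words of a phrase."""
    return set(text.lower().split())


def lenient_precision_recall(
    pairs: List[Tuple[Optional[str], Optional[str]]]
) -> Tuple[float, float]:
    """pairs holds (system_extraction, human_extraction) per tweet.

    None means nothing was extracted. An extraction is a match when it
    shares at least one word with the human extraction.
    """
    system_total = sum(1 for s, _ in pairs if s)
    human_total = sum(1 for _, h in pairs if h)
    matches = sum(1 for s, h in pairs if s and h and words(s) & words(h))
    precision = matches / system_total if system_total else 0.0
    recall = matches / human_total if human_total else 0.0
    return precision, recall


# Hypothetical system output vs. human labels
pairs = [
    ("7pm", "close by 7pm"),                         # match
    (None, "Central Park and JFK airport"),          # missed by the system
    ("donate blood", "send money or donate blood"),  # match
    ("#Sandy", "the elderly and homeless"),          # no word in common
]
precision, recall = lenient_precision_recall(pairs)
```

Because a single shared word is enough for a match, this criterion is more forgiving than exact-span matching, which is worth keeping in mind when reading the precision column.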
Donations matching
• Identify and match requests/offers for donations
– Money, clothing, food, shelter, volunteers, blood
Average precision = 0.21 (0.16 if only text similarity is used)
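For readers unfamiliar with the metric, here is a minimal sketch of average precision over one ranked list of candidate matches. It assumes the common definition that averages precision at each rank where a relevant item appears; the slide does not state the exact variant used, and the example ranking is hypothetical:

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


# Hypothetical ranking of candidate offers for one request:
# relevant offers appear at ranks 2 and 5.
ap = average_precision([False, True, False, False, True])
```

In this example the precision is 1/2 at rank 2 and 2/5 at rank 5, so the average precision is 0.45.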
Crowdsourced stream processing systems
Muhammad Imran, Ioanna Lykourentzou and Carlos Castillo:
Engineering Crowdsourced Stream Processing Systems
http://arxiv.org/abs/1310.5463
Design objectives and principles
                                                        Design principles
Design objective    Example metric                      Automatic components          Crowdsourced components
Low latency         End-to-end time                     Keep items moving             Trivial tasks
High throughput     Output items per unit of time       High-performance processing   Task automation
Load adaptability   Rate response function              Load shedding, load queueing  Task prioritization
Cost effectiveness  Cost vs. quality, throughput, etc.  N/A                           Task frugality
High quality        Application-dependent               Redundancy, aggregation and quality control
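As one illustration of the "redundancy, aggregation and quality control" principle, the sketch below collects several crowd labels per item and accepts the majority label only when enough workers agree. The function name and both thresholds are assumptions, not values from the paper:

```python
from collections import Counter
from typing import Iterable, Optional


def aggregate_labels(labels: Iterable[str],
                     min_votes: int = 3,
                     min_agreement: float = 0.6) -> Optional[str]:
    """Return the majority label, or None if the item needs more work."""
    votes = Counter(labels)
    total = sum(votes.values())
    if total < min_votes:
        return None  # not enough redundancy yet: collect more labels
    label, count = votes.most_common(1)[0]
    if count / total < min_agreement:
        return None  # workers disagree: send for review or relabeling
    return label


accepted = aggregate_labels(["Donations", "Donations", "Caution & Advice"])
too_few = aggregate_labels(["Donations", "Media"])
disputed = aggregate_labels(["Donations", "Media", "Caution & Advice"])
```

Returning `None` instead of a guess lets the surrounding system decide whether to request more labels, which ties quality control back to the cost-effectiveness objective in the table.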
Design patterns
● QA loop
● Task assignment
● Process/verify
● Supervised learning
● Crowdwork sub-task chaining
● Humans are not a bottleneck
● Humans review every output element
http://aidr.qcri.org/
Self-service for crisis-related classification
Unstructured text reports
Categorized information
Automatic classifier
Model Builder
Crowdsourced ground-truth
Library of training data
Credibility and verification
Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier:
TweetCred: A Real-time Web-based System for Credibility of Content on Twitter
In SocInfo 2014. Runner-up for best paper award.
Carlos Castillo, Marcelo Mendoza, Barbara Poblete:
Predicting Information Credibility in Time-Sensitive Social Media
In Internet Research, Vol. 23, Issue 5. October 2013.
A. Popoola, D. Krasnoshtan, A. Toth, V. Naroditskiy, C. Castillo, P. Meier and I. Rahwan:
Information Verification during Natural Disasters
Social Web and Disaster Management (SWDM) workshop, 2013.
http://www.youtube.com/watch?v=pAHoEO-K0Ek
Crowdsourced verification: Veri.ly
• Frame crowdwork correctly
• Not upvoting/downvoting a claim
• Instead, providing evidence for/against
@VeriDotLy — http://veri.ly/
Examples of evidence provided
Automatic credibility evaluation: TweetCred
• Real-time web-based service
• Used as a Chrome extension
• Annotates Twitter's timeline with credibility scores
http://twitdigest.iiitd.edu.in/TweetCred/
Next steps
• Credibility facets
– Factually written
– Detailed
– Author on the ground
– ...
• Respond to searches about an event
Closing remarks
Computationally feasible
Supported by data
Useful
Good projects in this space
Temptation! Danger!
Poorly planned projects :-(
AI-complete problems
Some venues
• SWDM – Workshop on Social Web for Disaster Management
– Deadline: January 24th
• ISCRAM – International Conference on Information Systems for Crisis Response and Management
+ the usual suspects, depending on your area ;-)
Possibility of large impact by using computer
science to support humanitarian work
=
Applied computing at its best
Thank you!
Carlos Castillo · chato@acm.org
http://www.chato.cl/research/
With thanks to Patrick Meier for several slides

More Related Content

PDF
Social Media News Mining and Automatic Content Analysis of News
PDF
SIAM SDM2014 tutorial - Social Media and Web of Data to Assist Crisis Respons...
PDF
How to Leverage Social Media Communities for Crisis Response Coordination
PDF
Automatically Rank Social Media Requests for Emergency Services using Service...
PDF
Social Media & Web Mining for Public Services of Smart Cities - SSA Talk
PDF
The Role of Social Media and Artificial Intelligence for Disaster Response
PPS
Overview of Social Media During Disaster
PDF
Public Health Crisis Analytics for Gender Violence
Social Media News Mining and Automatic Content Analysis of News
SIAM SDM2014 tutorial - Social Media and Web of Data to Assist Crisis Respons...
How to Leverage Social Media Communities for Crisis Response Coordination
Automatically Rank Social Media Requests for Emergency Services using Service...
Social Media & Web Mining for Public Services of Smart Cities - SSA Talk
The Role of Social Media and Artificial Intelligence for Disaster Response
Overview of Social Media During Disaster
Public Health Crisis Analytics for Gender Violence

What's hot (20)

PDF
NCSU invited talk: Leveraging Social Media for Tourism Marketplace Coordination
PDF
Applying citizen science model to disaster management
PDF
Real-Time Processing of Social Media Content for Social Good
ODP
Web 2.0 Technology Building Situational Awareness: Free and Open Source Too...
PPTX
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
DOCX
The case for integrating crisis response with social media
PPTX
Extracting Information Nuggets from Disaster-Related Messages in Social Media
PPTX
Processing Social Media Messages in Mass Emergency: A Survey
PPT
SOCIAL MEDIA: BEFORE, DURING AND AFTER A DISASTER
PDF
Role of social media in disaster management
DOCX
Capstone Lessons Learned
KEY
Emergency Risk Communication
PPTX
Twitris in Action - a review of its many applications
PDF
CDG14_BRIEF_ArchiveSocial_V
PDF
Department of Homeland Security Report- Lessons Learned Using Social Media Du...
PDF
Lessons learned from Social media intervention during hurricane Sandy
PPT
Social Media in Sri Lanka: Do Science and Reason Stand a Chance? - Nalaka Gun...
PDF
Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform...
PDF
Snowden-final-report-for-publication
PPTX
Com 427 final presentation
NCSU invited talk: Leveraging Social Media for Tourism Marketplace Coordination
Applying citizen science model to disaster management
Real-Time Processing of Social Media Content for Social Good
Web 2.0 Technology Building Situational Awareness: Free and Open Source Too...
Crisis Mapping, Citizen Sensing and Social Media Analytics: Leveraging Citize...
The case for integrating crisis response with social media
Extracting Information Nuggets from Disaster-Related Messages in Social Media
Processing Social Media Messages in Mass Emergency: A Survey
SOCIAL MEDIA: BEFORE, DURING AND AFTER A DISASTER
Role of social media in disaster management
Capstone Lessons Learned
Emergency Risk Communication
Twitris in Action - a review of its many applications
CDG14_BRIEF_ArchiveSocial_V
Department of Homeland Security Report- Lessons Learned Using Social Media Du...
Lessons learned from Social media intervention during hurricane Sandy
Social Media in Sri Lanka: Do Science and Reason Stand a Chance? - Nalaka Gun...
Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform...
Snowden-final-report-for-publication
Com 427 final presentation
Ad

Viewers also liked (20)

PDF
Keynote talk: Big Crisis Data, an Open Invitation
PDF
Fairness-Aware Data Mining
PDF
Discrimination Discovery
PPTX
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
PDF
Databeers: Big Crisis Data
PDF
Big Crisis Data for ISPC
PDF
Detecting Algorithmic Bias (keynote at DIR 2016)
PDF
Dr. Searcher and Mr. Browser: A unified hyperlink-click graph
PDF
Characterizing the Life Cycle of Online News Stories Using Social Media React...
PDF
The Effects of Time on Query Flow Graph-based Models for Query Suggestion
PDF
Information Verification During Natural Disasters
PDF
Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
PDF
Kdd12 tutorial-inf-part-i
PDF
Kdd12 tutorial-inf-part-ii
PDF
TweetCred: Real-Time Credibility Assessment of 
 Content on Twitter @ Socinfo...
PDF
Kdd12 tutorial-inf-part-iv
PDF
What to Expect When the Unexpected Happens: Social Media Communications Acros...
PDF
Emotions and dialogue in a peer-production community: the case of Wikipedia
PDF
Kdd12 tutorial-inf-part-iii
PDF
Social Media Mining and Retrieval
Keynote talk: Big Crisis Data, an Open Invitation
Fairness-Aware Data Mining
Discrimination Discovery
A Robust Framework for Classifying Evolving Document Streams in an Expert-Mac...
Databeers: Big Crisis Data
Big Crisis Data for ISPC
Detecting Algorithmic Bias (keynote at DIR 2016)
Dr. Searcher and Mr. Browser: A unified hyperlink-click graph
Characterizing the Life Cycle of Online News Stories Using Social Media React...
The Effects of Time on Query Flow Graph-based Models for Query Suggestion
Information Verification During Natural Disasters
Social Media News Communities: Gatekeeping, Coverage, and Statement Bias
Kdd12 tutorial-inf-part-i
Kdd12 tutorial-inf-part-ii
TweetCred: Real-Time Credibility Assessment of 
 Content on Twitter @ Socinfo...
Kdd12 tutorial-inf-part-iv
What to Expect When the Unexpected Happens: Social Media Communications Acros...
Emotions and dialogue in a peer-production community: the case of Wikipedia
Kdd12 tutorial-inf-part-iii
Social Media Mining and Retrieval
Ad

Similar to Crisis Computing (20)

PDF
Crisis Informatics (November 2013)
PPTX
Akram.pptx
PPTX
Transforming Social Big Data into Timely Decisions and Actions for Crisis Mi...
PDF
Expelling Information of Events from Critical Public Space using Social Senso...
PPTX
Evolution of the Humanitarian Data Ecosystem
PDF
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
PDF
Crisis Information Processing - with the power of A.I.
PPTX
Introduction to Machine Learning: An Application to Disaster Response
PDF
Classifying Crises-Information Relevancy with Semantics
PDF
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
PPTX
Examples of Real-World Big Data Application
PDF
Technology Trends in Situation Awareness
PPTX
From Research to Applications: What Can We Extract with Social Media Sensing?
PPTX
Emerging Trends in Crisis Informatics
PPTX
Crisis Event Extraction Service (CREES) – Automatic Detection and Classificat...
PDF
Leveraging technology in disaster management
PPTX
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...
PDF
Multimodal Combination.pdf
PDF
Big Data from Social Media and Crowdsourcing in Emergencies
Crisis Informatics (November 2013)
Akram.pptx
Transforming Social Big Data into Timely Decisions and Actions for Crisis Mi...
Expelling Information of Events from Critical Public Space using Social Senso...
Evolution of the Humanitarian Data Ecosystem
Coordinating Human and Machine Intelligence to Classify Microblog Communica0o...
Crisis Information Processing - with the power of A.I.
Introduction to Machine Learning: An Application to Disaster Response
Classifying Crises-Information Relevancy with Semantics
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Examples of Real-World Big Data Application
Technology Trends in Situation Awareness
From Research to Applications: What Can We Extract with Social Media Sensing?
Emerging Trends in Crisis Informatics
Crisis Event Extraction Service (CREES) – Automatic Detection and Classificat...
Leveraging technology in disaster management
ISCRAM 2013: Extracting Information Nuggets from Disaster-Related Messages i...
Multimodal Combination.pdf
Big Data from Social Media and Crowdsourcing in Emergencies

More from Carlos Castillo (ChaTo) (19)

PDF
Finding High Quality Content in Social Media
PDF
When no clicks are good news
PDF
Observational studies in social media
PDF
Natural experiments
PDF
Content-based link prediction
PDF
Link prediction
PDF
Recommender Systems
PDF
Graph Partitioning and Spectral Methods
PDF
Finding Dense Subgraphs
PDF
Graph Evolution Models
PDF
Link-Based Ranking
PDF
Text Indexing / Inverted Indices
PDF
Text Summarization
PDF
Hierarchical Clustering
PDF
K-Means Algorithm
PDF
Text similarity and the vector space model
PDF
Intro to Creative Commons (May 2015)
Finding High Quality Content in Social Media
When no clicks are good news
Observational studies in social media
Natural experiments
Content-based link prediction
Link prediction
Recommender Systems
Graph Partitioning and Spectral Methods
Finding Dense Subgraphs
Graph Evolution Models
Link-Based Ranking
Text Indexing / Inverted Indices
Text Summarization
Hierarchical Clustering
K-Means Algorithm
Text similarity and the vector space model
Intro to Creative Commons (May 2015)

Recently uploaded (20)

PDF
Mastering Social Media Marketing in 2025.pdf
PPTX
Types of Social Media Marketing for Business Success
PDF
The Edge You’ve Been Missing Get the Sociocosmos Edge
PDF
Live Echo Boost on TikTok_ Double Devices, Higher Ranks
PPT
memimpindegra1uejehejehdksnsjsbdkdndgggwksj
PDF
25K Btc Enabled Cash App Accounts – Safe, Fast, Verified.pdf
PPTX
Developing lesson plan gejegkavbw gagsgf
PDF
Climate Risk and Credit Allocation: How Banks Are Integrating Environmental R...
PDF
Instant Audience, Long-Term Impact Buy Real Telegram Members
PDF
The Fastest Way to Look Popular Buy Reactions Today
PDF
11111111111111111111111111111111111111111111111
DOCX
Buy Goethe A1 ,B2 ,C1 certificate online without writing
PDF
Why Digital Marketing Matters in Today’s World Ask ChatGPT
PPTX
How Social Media Influencers Repurpose Content (1).pptx
PPTX
Strategies for Social Media App Enhancement
PDF
StarNetCafeSB2012D3POYNagaworld2-Hotel-Casino-Phnom Entertainment
PDF
Transform Your Social Media, Grow Your Brand
PDF
Subscribe This Channel Subscribe Back You
PDF
Your Best Post Vanished. Blame the Attention Economy
PPTX
Result-Driven Social Media Marketing Services | Boost ROI
Mastering Social Media Marketing in 2025.pdf
Types of Social Media Marketing for Business Success
The Edge You’ve Been Missing Get the Sociocosmos Edge
Live Echo Boost on TikTok_ Double Devices, Higher Ranks
memimpindegra1uejehejehdksnsjsbdkdndgggwksj
25K Btc Enabled Cash App Accounts – Safe, Fast, Verified.pdf
Developing lesson plan gejegkavbw gagsgf
Climate Risk and Credit Allocation: How Banks Are Integrating Environmental R...
Instant Audience, Long-Term Impact Buy Real Telegram Members
The Fastest Way to Look Popular Buy Reactions Today
11111111111111111111111111111111111111111111111
Buy Goethe A1 ,B2 ,C1 certificate online without writing
Why Digital Marketing Matters in Today’s World Ask ChatGPT
How Social Media Influencers Repurpose Content (1).pptx
Strategies for Social Media App Enhancement
StarNetCafeSB2012D3POYNagaworld2-Hotel-Casino-Phnom Entertainment
Transform Your Social Media, Grow Your Brand
Subscribe This Channel Subscribe Back You
Your Best Post Vanished. Blame the Attention Economy
Result-Driven Social Media Marketing Services | Boost ROI

Crisis Computing

  • 1. Crisis Computing Finding relevant and credible information on social media during disasters Big Data Analytics Conference Delhi, India, December 2014
  • 2. January 2010 How/when did it start for me?
  • 3. Humanitarian Computing At least 775publications: ● Crisis Analysis (55) ● Crisis Management (309) ● Situational Awareness (67) ● Social Media (231) ● Mobile Phones (74) ● Crowdsourcing (116) ● Software and Tools (97) ● Human-Computer Interaction (28)  ● Natural Language Processing (33)  ● Trust and Security (33) ● Geographical Analysis (53) Source: http://humanitariancomp.referata.com/
  • 8. 8 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ An earthquake hits a Twitter user • When an earthquake strikes, the first tweets are posted 20-30 seconds later • Damaging seismic waves travel at 3-5 km/s, while network communications are light speed on fiber/copper + latency • After ~100km seismic waves may be overtaken by tweets about them http://xkcd.com/723/
  • 10. Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the Unexpected Happens: Social Media Communications Across Crises. To appear in CSCW 2015. Examples of crisis tweets (cont.)
  • 11. 11 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Fertile grounds for applied research ✔ Problems of global significance ✔ Solved with labor-intensive methods ✔ Better solution provides a public good ✔ Large and noisy data sets available ✔ Engage volunteer communities
  • 12. 12 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Fertile grounds for applied research ✔ Problems of global significance ✔ Solved with labor-intensive methods ✔ Better solution provides a public good ✔ Large and noisy data sets available ✔ Engage volunteer communities • Relevance to practitioners?
  • 13. 13 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Current collaborators Patrick Meier – QCRI Sarah Vieweg – QCRI Muhammad Imran – QCRI Irina Temnikova – QCRI Alexandra Olteanu – EPFL Aditi Gupta – IIIT Delhi “P.K.” Kumaraguru – IIIT Delhi Fernando Diaz – Microsoft
  • 14. 14 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Outline Crisis Maps Extraction Matching Verification Credibility
  • 15. Crisis maps from social media Carlos Castillo, Fernando Diaz, and Hemant Purohit: Leveraging Social Media and Web of Data to Assist Crisis Response Coordination Tutorial at SDM, Philadelphia, PA, USA. April 2014. Hemant Purohit, Carlos Castillo, Patrick Meier and Amit Sheth: Crisis Mapping, Citizen Sensing and Social Media Analytics Tutorial at ICWSM, May 2013.
  • 20. Patrick Meier, Social Innovation Director @ QCRI – http://irevolution.net/ “What can speed humanitarian response to tsunami-ravaged coasts? Expose human rights atrocities? Launch helicopters to rescue earthquake victims? Outwit corrupt regimes? A map.”
  • 21. 21 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Crisis mapping goes mainstream (2011)
  • 28. Understanding Crisis Tweets Alexandra Olteanu, Sarah Vieweg and Carlos Castillo: What to Expect When the Unexpected Happens: Social Media Communications Across Crises. To appear in CSCW 2015.
  • 29. 29 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Types of Disaster
  • 30. 30 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ 3. Extraction Our approach 2. Classification 1. Filtering
  • 31. 31 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Filtering Is disaster- related? Contributes to situational awareness? Yes Yes No No
  • 32. 32 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Classification Caution & Advice Information Sources Damage & Casualties Donations Gov Eyewitness Media NGO Outsider ... ... Filtered tweets
  • 33. 33 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ A large-scale study of crisis tweets • Collect tweets from 26 disasters • Classify according to: ● Informative / Not informative ● Information provided ● Information source
  • 34. 34 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Advice on labeling • Your instructions will never be correct the first time you try – e.g. personal / eyewitness – Instructions must be re-written reactively – Perform small-scale labeling first • Instructions must be concrete and brief – If you can't do it, the task has to be divided
  • 35. 35 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Information Provided in Crisis Tweets N=26; Data available at http://crisislex.org/
  • 36. 36 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ What do people tweet about? • Affected individuals – 20% on average (min. 5%, max. 57%) – most prevalent in human-induced, focalized & instantaneous events • Sympathy and emotional support – 20% on average (min. 3%, max. 52%) – most prevalent in instantaneous events • Other useful information – 32% on average (min. 7%, max. 59%) – least prevalent in diffused events
  • 37. 37 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ What do people tweet about? (cont.) • Infrastructure and utilities – 7% on average (min. 0%, max. 22%) – most prevalent in diffused events, in particular floods • Caution and advice – 10% on average (min. 0%, max. 34%) – least prevalent in instantaneous & human-induced events • Donations and volunteering – 10% on average (min. 0%, max. 44%) – most prevalent in natural hazards
  • 40. Extracting information and matching emergency-related resources Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier: Extracting Information Nuggets from Disaster-Related Messages in Social Media In ISCRAM. Baden-Baden, Germany, 2013. Best paper award. Hemant Purohit, Amit Sheth, Carlos Castillo, Patrick Meier, Fernando Diaz: Emergency-Relief Coord. on Social Media: Auto. Matching Resource Requests and Offers First Monday 19 (1), January 2014 Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz and Patrick Meier: Practical Extraction of Disaster-Relevant Information from Social Media In SWDM. Rio de Janeiro, Brazil, 2013
  • 41. 41 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Information Extraction ... Classified tweets @JimFreund: Apparently we have no choice. There is a tornado watch in effect tonight.
  • 42. 42 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Extraction • #hashtags, @user mentions, URLs, etc. – Regular expressions – Text library from Twitter • Temporal expressions – Part-of-speech tagger + heuristics – Natty library • Supervised learning
  • 43. 43 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Labels for extraction • Type-dependent instruction • Ask evaluators to copy-paste a word/phrase from each tweet
  • 44. 44 Carlos Castillo – chato@acm.org http://www.chato.cl/research/ Learning: Conditional Random Fields • Used extensively in NLP for part-of-speech tagging and information extraction • Representation of observations is important (capitalization, position, etc.) HMM Linear-chain CRF hidden observed
  • 45. Tool • CMU ARK Twitter NLP – Tokenization – Feature extraction – CRF learning • Very easy to use: simply replace the labels in the training set (part-of-speech tags) with any other labels, and re-train
  • 46. Output examples
    RT @weatherchannel: .@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC
    Wow what a mess #Sandy has made. Be sure to check on the elderly and homeless please! Thoughts and prayers to all affected
    RT @twc_hurricane: Wind gusts over 60 mph are being reported at Central Park and JFK airport in #NYC this hour. #Sandy
    RT @mitchellreports: Red Cross tells us grateful for Romney donation but prefer people send money or donate blood dont collect goods NOT best way to help #Sandy
  • 47. Extractor evaluation
    Setting | Recall | Precision
    Train 2/3 Joplin, Test 1/3 Joplin | 78% | 90%
    Train 2/3 Sandy, Test 1/3 Sandy | 41% | 79%
    Train Joplin, Test Sandy | 11% | 78%
    Train Joplin + 10% Sandy, Test 90% Sandy | 21% | 81%
    • Precision: an extraction counts as correct if it has one word or more in common with what humans extracted
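The lenient precision criterion on this slide (one word or more in common with the human extraction) can be sketched as follows. The tokenization, the function names and the example pairs are assumptions made for illustration; the slide does not specify how words were split or normalized.

```python
import re


def words(text):
    """Lowercased set of word tokens in a string."""
    return set(re.findall(r"\w+", text.lower()))


def is_hit(extracted, human):
    """True if the automatic extraction shares at least one word with the human one."""
    return bool(words(extracted) & words(human))


def lenient_precision(pairs):
    """Fraction of (extracted, human) pairs that count as hits."""
    if not pairs:
        return 0.0
    hits = sum(is_hit(extracted, human) for extracted, human in pairs)
    return hits / len(pairs)


# Hypothetical example data, not taken from the evaluation.
example_pairs = [
    ("closing of NYC bridges", "NYC bridges"),
    ("60 mph", "Wind gusts over 60 mph"),
    ("Central Park", "JFK airport"),
]
example_precision = lenient_precision(example_pairs)
```

Because a single shared word is enough, this criterion is more forgiving than exact-match scoring.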
  • 48. Donations matching • Identify and match requests/offers for donations – Money, clothing, food, shelter, volunteers, blood Average precision = 0.21 (0.16 if only text similarity is used)
  • 49. Crowdsourced stream processing systems
    Muhammad Imran, Ioanna Lykourentzou and Carlos Castillo: Engineering Crowdsourced Stream Processing Systems. http://arxiv.org/abs/1310.5463
  • 51. Design objectives and principles
    Design objective | Example metric | Design principle: automatic components | Design principle: crowdsourced components
    Low latency | End-to-end time | Keep items moving | Trivial tasks
    High throughput | Output items per unit of time | High-performance processing | Task automation
    Load adaptability | Rate response function | Load shedding, load queueing | Task prioritization
    Cost effectiveness | Cost vs. quality, throughput, etc. | N/A | Task frugality
    High quality | Application-dependent | Redundancy, aggregation and quality control
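As an illustration of the load adaptability principles for automatic components, the sketch below combines load queueing (buffer items while there is room) with load shedding (drop items once the buffer is full). The class SheddingQueue and its drop-newest policy are assumptions for this example, not a description of any system in the talk.

```python
from collections import deque


class SheddingQueue:
    """Bounded FIFO buffer that drops new items when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0  # number of items shed so far

    def offer(self, item):
        """Queue the item if there is room; otherwise shed it. Returns True if queued."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def poll(self):
        """Return the oldest queued item, or None if the queue is empty."""
        return self.items.popleft() if self.items else None
```

Other shedding policies, such as dropping the oldest item or sampling at random, are equally plausible; which one fits depends on the application.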
  • 52. Design patterns ● QA loop ● Task assignment ● Process/verify ● Supervised learning ● Crowdwork sub-task chaining ● Humans are not a bottleneck ● Humans review every output element
  • 53. http://aidr.qcri.org/
  • 54. Self-service for crisis-related classification [Diagram labels: Unstructured text reports; Categorized information; Automatic classifier; Model Builder; Crowdsourced ground-truth; Library of training data]
  • 57. Credibility and verification
    Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo and Patrick Meier: TweetCred: A Real-time Web-based System for Credibility of Content on Twitter. In SocInfo 2014. Runner-up for best paper award.
    Carlos Castillo, Marcelo Mendoza, Barbara Poblete: Predicting Information Credibility in Time-Sensitive Social Media. In Internet Research, Vol. 23, Issue 5. October 2013.
    A. Popoola, D. Krasnoshtan, A. Toth, V. Naroditskiy, C. Castillo, P. Meier and I. Rahwan: Information Verification during Natural Disasters. Social Web and Disaster Management (SWDM) workshop, 2013.
  • 62. Crowdsourced verification: Veri.ly • Frame crowdwork correctly • Not upvoting/downvoting a claim • Instead, providing evidence for/against • @VeriDotLy (http://veri.ly/)
  • 65. Examples of evidence provided
  • 66. Automatic credibility evaluation: TweetCred • Real-time web-based service • Used as a Chrome extension • Annotates Twitter's timeline with credibility scores
  • 67. http://twitdigest.iiitd.edu.in/TweetCred/
  • 68. Next steps • Credibility facets – Factually written – Detailed – Author on the ground – ... • Respond to searches about an event
  • 71. [Diagram labels: Computationally feasible; Supported by data; Useful; Good projects in this space]
  • 72. [Diagram labels: Computationally feasible; Supported by data; Useful; Good projects in this space; Temptation!; Danger!; Poorly planned projects :-(; AI-complete problems]
  • 73. Some venues • SWDM – Workshop on Social Web for Disaster Management – Deadline: January 24th • ISCRAM – International Conference on Information Systems for Crisis Response and Management + the usual suspects, depending on your area ;-)
  • 74. Possibility of large impact by using computer science to support humanitarian work = Applied computing at its best
  • 75. Thank you! Carlos Castillo · chato@acm.org http://www.chato.cl/research/ With thanks to Patrick Meier for several slides