Epidemiological Modeling of News and
Rumors on Twitter
Fang Jin, Edward Dougherty, Parang Saraf, Peng Mi,
Yang Cao, Naren Ramakrishnan
Virginia Tech
Aug 11, 2013
Outline
o Motivation
o Approach
o Implementation
o Results and Analysis
o Conclusions & Limitations
Motivation
o Can Twitter data (news and rumor) be represented by epidemic models?
o Can we gain insight into the acceptance, comprehension, and spread of information?
   –  How effectively does information spread via Twitter?
   –  What is the rate of information propagation?
o Can we observe any differences between news spreading and rumor spreading?
Twitter vs. disease
o Idea spreading is an intentional act
o It is advantageous to acquire new ideas
o Idea spreading on Twitter has no (intrinsic) spatial concept
o Ideas: no immune system, so no "R" (recovered) compartment
Idea-spreading models: SIS and SEIZ
o Both are infectious
o Both may take time to be accepted
o Both have a transmission route
...
Epidemic Model
Compartment   Disease       Twitter
S             Susceptible   Twitter accounts
E             Exposed       Exposed to the news / rumor but do not yet believe it
I             Infected      Believe the news / rumor and post a tweet
Z             Skeptics      Skeptics; do not tweet
SIS Model Description
Disease applications:
–  Influenza
–  Common cold
Twitter application reasoning:
–  An individual either believes a rumor (I),
–  or is susceptible to believing the rumor (S)
http://www.me.ucsb.edu/~moehlis/APC514/tutorials/tutorial_seasonal/node2.html
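For reference, the textbook SIS system can be written as below. The slide itself gives no equations, so the symbols are assumptions: $\beta$ is the contact rate at which susceptible accounts come to believe the story, $\alpha$ is the rate at which believers return to the susceptible pool, and $N = S + I$ is the constant population.

```latex
\[
\frac{dS}{dt} = -\beta \frac{S I}{N} + \alpha I,
\qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \alpha I,
\qquad
N = S + I .
\]
```

The two right-hand sides sum to zero, so $N$ stays constant.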
SEIZ Model Description
Compartments: S (susceptible), E (exposed), I (infected), Z (skeptic)
β: S-I contact rate
b: S-Z contact rate
ρ: E-I contact rate
p: probability of (S → I) given contact with adopters
(1-p): probability of (S → E) given contact with adopters
l: probability of (S → Z) given contact with skeptics
(1-l): probability of (S → E) given contact with skeptics
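The transitions above correspond to the usual SEIZ system of ordinary differential equations, sketched here for reference. This is the standard form from the SEIZ literature and is not written out on the slide. The incubation rate $\epsilon$ (E → I without contact) is the one listed on the Rumor Identification slide, and normalizing contact terms by the total population $N = S + E + I + Z$ is an assumption.

```latex
\[
\begin{aligned}
\frac{dS}{dt} &= -\beta S \frac{I}{N} - b S \frac{Z}{N} \\
\frac{dE}{dt} &= (1-p)\,\beta S \frac{I}{N} + (1-l)\, b S \frac{Z}{N} - \rho E \frac{I}{N} - \epsilon E \\
\frac{dI}{dt} &= p\,\beta S \frac{I}{N} + \rho E \frac{I}{N} + \epsilon E \\
\frac{dZ}{dt} &= l\, b S \frac{Z}{N}
\end{aligned}
\]
```

The four right-hand sides sum to zero, which matches the "no vital dynamics" assumption: accounts only move between compartments.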
Total: 175M
Active: 39M
Following none: 56M
No followers: 90M
Fake: 0.5M
Challenges
–  Time zone differences
–  Users "unplugging": they may be offline
–  We have very little information: no rates, no initial compartment sizes
–  Population == number of Twitter accounts
http://techcrunch.com/2012/07/30/analyst-twitter-passed-500m-users-in-june-2012-140m-of-them-in-us-jakarta-biggest-tweeting-city/
Approach
Assumptions:
–  No vital dynamics (no accounts enter or leave, so N is constant)
–  N, S(t0), E(t0), I(t0), Z(t0) are unknown
Implementation:
–  Nonlinear least-squares fit, using the lsqnonlin function
–  For each candidate set of parameter values, solve the ordinary differential equation (ODE) system
–  Minimize the error |I(t) - tweets(t)|
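The fitting loop described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: `scipy.optimize.least_squares` stands in for MATLAB's `lsqnonlin`, the function names (`seiz_rhs`, `simulate_I`, `fit_seiz`) are hypothetical, and the ODE right-hand side is the standard SEIZ form with incubation rate `eps`. Because the initial compartment sizes are unknown, they are treated as extra parameters to fit.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares


def seiz_rhs(t, y, beta, b, rho, eps, p, l):
    """Right-hand side of the SEIZ system; N is constant (no vital dynamics)."""
    S, E, I, Z = y
    N = S + E + I + Z
    si = beta * S * I / N   # susceptible accounts meeting adopters
    sz = b * S * Z / N      # susceptible accounts meeting skeptics
    ei = rho * E * I / N    # exposed accounts meeting adopters
    dS = -si - sz
    dE = (1 - p) * si + (1 - l) * sz - ei - eps * E
    dI = p * si + ei + eps * E
    dZ = l * sz
    return [dS, dE, dI, dZ]


def simulate_I(theta, t):
    """Solve the ODE system for one parameter vector and return I(t) on grid t."""
    beta, b, rho, eps, p, l, S0, E0, I0, Z0 = theta
    sol = solve_ivp(seiz_rhs, (t[0], t[-1]), [S0, E0, I0, Z0], t_eval=t,
                    args=(beta, b, rho, eps, p, l), rtol=1e-8, atol=1e-8)
    return sol.y[2]


def fit_seiz(t, tweets, theta0, lower, upper, **kwargs):
    """Bounded nonlinear least-squares fit of I(t) to observed tweet counts."""
    def residuals(theta):
        return simulate_I(theta, t) - tweets
    return least_squares(residuals, theta0, bounds=(lower, upper), **kwargs)
```

The initial guess `theta0` and the bounds `lower` and `upper` are supplied by the caller. With ten free parameters the fit can have many local minima, so in practice several starting points may be worth trying.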
Rumor Identification
By SEIZ model parameters:
bl: effective rate of S → Z
βp: effective rate of S → I
b(1-l): effective rate of S → E via contact with Z
β(1-p): effective rate of S → E via contact with I
ε: E → I incubation rate
ρ: E-I contact rate
RSI: a kind of flux ratio, the ratio of effects entering E to those leaving E.
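One way to turn the flux-ratio description into a number is sketched below. The slide does not spell out the formula, so the expression here is an assumption built from the rates listed above: the two effective rates entering E, β(1-p) and b(1-l), divided by the two rates leaving E, ρ and ε. The function name `rsi` is hypothetical.

```python
def rsi(beta, b, rho, eps, p, l):
    """Flux ratio for the exposed compartment: rates into E over rates out of E.

    Assumed form: ((1 - p) * beta + (1 - l) * b) / (rho + eps).
    """
    inflow = (1 - p) * beta + (1 - l) * b   # S -> E via adopters and via skeptics
    outflow = rho + eps                     # E -> I via contact and via incubation
    if outflow <= 0:
        raise ValueError("rho + eps must be positive")
    return inflow / outflow
```

For example, `rsi(1.0, 1.0, 0.5, 0.5, 0.5, 0.5)` gives 1.0, meaning the flow into E balances the flow out of it.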
Datasets
o Obama injured. 04-23-2013
o Doomsday rumor. 12-21-2012
o Fidel Castro's coming death. 10-15-2012
o Riots and shooting in Mexico. 09-05-2012
o Boston Marathon Explosion. 04-15-2013
o Pope Resignation. 02-11-2013
o Venezuela's refinery explosion. 08-25-2012
o Michelle Obama at the 2013 Oscars. 02-24-2013
Boston Marathon Bombing
Figure panels: SIS Model, SEIZ Model
SEIZ models Twitter data more accurately than the SIS model, especially at the initial points.
Error = norm( I - tweets ) / norm( tweets )
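The error measure above is a relative norm of the residual. A minimal sketch follows, assuming the Euclidean (2-) norm and NumPy arrays sampled on the same time grid. The function name `relative_fit_error` is hypothetical.

```python
import numpy as np


def relative_fit_error(I, tweets):
    """Relative fitting error: ||I - tweets|| / ||tweets|| (Euclidean norm)."""
    I = np.asarray(I, dtype=float)
    tweets = np.asarray(tweets, dtype=float)
    denom = np.linalg.norm(tweets)
    if denom == 0:
        raise ValueError("tweets must not be all zeros")
    return np.linalg.norm(I - tweets) / denom
```

A perfect fit gives 0. A model that predicts zero everywhere gives 1, so the values reported for the eight stories can be read as a fraction of the data's own magnitude.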
Pope Resignation
Figure panels: SIS Model, SEIZ Model
SEIZ models Twitter data more accurately than the SIS model, especially at the initial points.
Doomsday
Figure panels: SIS Model, SEIZ Model
SIS vs. SEIZ
What can we deduce?
o SEIZ models Twitter data more accurately than the SIS model
o SEIZ models Twitter data (via the I(t) function) well
Fitting error of SIS and SEIZ models:
        Boston   Pope    Amuay   Michelle   Obama   Doomsday   Castro   Riot    Average
SIS     0.058    0.041   0.058   0.088      0.102   0.028      0.082    0.088   0.0680
SEIZ    0.010    0.004   0.027   0.061      0.101   0.029      0.073    0.093   0.0499
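The per-story comparison can be checked with a few lines of Python. The sketch below only re-uses the error values reported on the slide; the variable names are hypothetical.

```python
stories = ["Boston", "Pope", "Amuay", "Michelle", "Obama", "Doomsday", "Castro", "Riot"]
sis_err = [0.058, 0.041, 0.058, 0.088, 0.102, 0.028, 0.082, 0.088]
seiz_err = [0.010, 0.004, 0.027, 0.061, 0.101, 0.029, 0.073, 0.093]

# Mean fitting error per model, recomputed from the rounded per-story values.
mean_sis = sum(sis_err) / len(sis_err)
mean_seiz = sum(seiz_err) / len(seiz_err)

# Stories where SEIZ has the strictly lower fitting error.
seiz_better = [s for s, a, z in zip(stories, sis_err, seiz_err) if z < a]
```

Recomputed from the rounded entries, the means come out near 0.0681 and 0.0498, within rounding of the reported averages. SEIZ has the lower error for six of the eight stories; for Doomsday and Riot the SIS error is slightly lower.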
  
Rumor detection via SEIZ model
SEIZ model parameter result: RSI value for eight stories (bar chart)
Story      RSI
Boston     28.31
Pope       24.66
Amuay      3.58
Michelle   0.34
Obama      0.25
Doomsday   0.2
Castro     0.18
Riot       0.02
Conclusion
o Twitter stories can be modeled by epidemiological models.
   –  SEIZ models Twitter data (via the I(t) function) well
   –  SEIZ models Twitter data more accurately than the SIS model, especially at the initial points
o The SEIZ fit generates a wealth of valuable parameters
o These parameters can be incorporated into a strategy to support the identification of Twitter topics as rumor vs. news.
Limitations
o Tweets could be suppressing a rumor or news
   –  A tweet could contain skeptical information
o Our study does not incorporate follower information
o It may be possible to incorporate some level of population information
o More accurate models could be built on more reasonable assumptions.
