A Generative/Discriminative Approach to De-construct Cascading Events
Sameera Horawalavithana, John Skvoretz, and Adriana Iamnitchi
University of South Florida
Machine Learning in Network Science (MLNS ’19)
Diffusion in Social Networks
• News, opinions, rumors, fads, urban legends, …
• Virus, disease propagation
• Change in social priorities: smoking, recycling
• Saturation news coverage: topic diffusion among bloggers
• Internet-energized political campaigns
• …
“Viral Marketing”: Patterns of Influence in a Recommendation Network
Information Cascade
• A diffusion process initiates activations through an underlying network
• Activations leave a trace: a cascade
• E.g., re-tweet chain, online conversation
Cascading Events
• who did what to whom, when, where
• E.g., @Sameera mentions @Anda in the Retweet “RT: TheHackerNews: Biggest…” at 2019:03:27 08:23
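One way to hold a "who did what to whom, when, where" event is a small record type. The following is a minimal sketch, not the authors' data model: the field names (`actor`, `action`, `target`, `timestamp`, `platform`) and the `parse_slide_timestamp` helper are hypothetical, and the platform value is an assumption inferred from the word "Retweet".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CascadeEvent:
    actor: str           # who
    action: str          # did what
    target: str          # to whom
    timestamp: datetime  # when
    platform: str        # where


def parse_slide_timestamp(text: str) -> datetime:
    # The slide writes the date with colons: "2019:03:27 08:23".
    return datetime.strptime(text, "%Y:%m:%d %H:%M")


# The example event from the slide; "Twitter" is an assumed platform label.
event = CascadeEvent(
    actor="@Sameera",
    action="mention",
    target="@Anda",
    timestamp=parse_slide_timestamp("2019:03:27 08:23"),
    platform="Twitter",
)
```

A pool of cascades can then be treated as collections of such events grouped by their seed post or tweet.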
Collective Behavior in Cascading Events
• “Information does not spread in isolation, independent of all other information currently diffusing in the network” (Myers et al.)
• Competing or cooperative cascades
• E.g., Reddit conversations on the Bitcoin scaling debate
Introducing Bitcoin Cash (BCH), which splits Bitcoin’s (BTC) original blockchain via a hard fork, a consequence of the popular Bitcoin scaling debate
“We need off-chain scaling and on-chain scaling. Our stupid politics is hurting Bitcoin because we're separating the two necessary parts of the overall solution.”
Discussion on the advantages of Bitcoin Cash (BCH) on scalability: “Bitcoin Cash (BCH) totally fixes the quadratic scaling of sighash operations bug”
“… I miss those days man and got tired of these scaling topics. …”
“Don't fall for the big block argument while we are so close to have the scaling solution.”
“… but he made a mistake in the scaling debate”
Probabilistic Cascade Generation
• Branching Model
  • E.g., an infected person meets k (new) people while she is contagious, and transmits the disease to each person she meets independently with probability p
• Used to accurately model cascade subtree structure (Cheng et al.)
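The branching model described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `simulate_branching`, the `{node: parent}` representation, and the `max_depth` cap (added so the simulation always terminates) are assumptions, not details from the slides.

```python
import random


def simulate_branching(p, k, max_depth, rng):
    """Return a cascade tree as {node_id: parent_id}; the seed has parent None.

    Each active node meets k new people and activates each one
    independently with probability p.
    """
    parents = {0: None}
    frontier = [0]
    next_id = 1
    for _depth in range(max_depth):
        new_frontier = []
        for node in frontier:
            for _meeting in range(k):
                if rng.random() < p:
                    parents[next_id] = node
                    new_frontier.append(next_id)
                    next_id += 1
        if not new_frontier:  # the cascade died out
            break
        frontier = new_frontier
    return parents


cascade = simulate_branching(p=0.3, k=4, max_depth=5, rng=random.Random(7))
```

With p = 0 only the seed remains, and with p = 1 every meeting produces a new node, so the tree is a full k-ary tree down to `max_depth`.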
Probabilistic Generation of a Cascade Pool
• A cascade pool consists of a set of cascades
• Each cascade originates from one of a given set of initial seeds (e.g., a tweet, or a post on Reddit)
• Apply the branching model to each seed separately
• Based on empirically benchmarked distributions
  • E.g., the conditional degree distribution by level
• The branching model generates only the cascade structure
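As a rough illustration of pool generation, the sketch below grows one tree per seed and draws each node's number of children from a degree distribution conditioned on the node's level. The probabilities in `LEVEL_DEGREE_DIST` are made-up placeholders (the slides refer to empirically benchmarked distributions but do not give values), and all function names are hypothetical.

```python
import random

# Hypothetical out-degree distributions conditioned on tree level:
# LEVEL_DEGREE_DIST[level] maps a number of children to its probability.
LEVEL_DEGREE_DIST = {
    0: {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2},
    1: {0: 0.5, 1: 0.3, 2: 0.2},
    2: {0: 0.8, 1: 0.2},
}


def sample_degree(level, rng):
    dist = LEVEL_DEGREE_DIST.get(level)
    if dist is None:  # deeper than any benchmarked level: stop growing
        return 0
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


def generate_cascade(seed_id, rng):
    """Return the (parent, child) edges of a tree rooted at seed_id."""
    edges = []
    frontier = [(seed_id, 0)]
    counter = 0
    while frontier:
        node, level = frontier.pop()
        for _ in range(sample_degree(level, rng)):
            counter += 1
            child = f"{seed_id}/{counter}"
            edges.append((node, child))
            frontier.append((child, level + 1))
    return edges


def generate_pool(seeds, rng):
    # The branching model is applied to each seed separately.
    return {seed: generate_cascade(seed, rng) for seed in seeds}


pool = generate_pool(["post_a", "post_b", "post_c"], random.Random(11))
```

The output contains structure only; users and timestamps still have to be attached to each node.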
Cascading Events: Inferring Users
• How do we map users to the generated cascade trees?
• Overlay the cascade structure on top of an underlying diffusion network
• Given a user in the underlying network, we take a random weighted sample of the user’s incoming edges
• Underlying networks:
  • Reddit: shared-subreddit network weighted by direct interactions
  • Twitter: follower network where edges are weighted by the number of previous direct interactions
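[Figure: a weighted network of users ua, ub, uc, ud, ue, uf (edge weights 0.45, 0.35, 0.2, 1, 0.5, 0.5, 1) overlaid on a cascade tree with nodes post, comment 1, comment 2, comment 3]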
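The weighted sampling step can be sketched as follows. The user names and weights echo the figure, but the wiring of `IN_EDGES` (which user points to which) is an assumption, as are the function names; the slides do not specify what happens when a user has no incoming edges, so this sketch simply returns `None` in that case.

```python
import random

# Hypothetical incoming edges: IN_EDGES[user] maps each in-neighbor to an edge weight.
IN_EDGES = {
    "ua": {"ub": 0.45, "uc": 0.35, "ud": 0.2},
    "ub": {"ue": 1.0},
    "uc": {"ue": 0.5, "uf": 0.5},
}


def sample_responder(parent_user, rng):
    """Pick one in-neighbor of parent_user, proportionally to edge weight."""
    neighbors = IN_EDGES.get(parent_user)
    if not neighbors:
        return None  # no known in-neighbors (assumed fallback)
    return rng.choices(list(neighbors), weights=list(neighbors.values()), k=1)[0]


def assign_users(tree_edges, root, seed_user, rng):
    """tree_edges: (parent_node, child_node) pairs listed parents-first."""
    users = {root: seed_user}
    for parent, child in tree_edges:
        users[child] = sample_responder(users[parent], rng)
    return users


tree = [("post", "comment 1"), ("post", "comment 2"), ("comment 1", "comment 3")]
users = assign_users(tree, "post", "ua", random.Random(3))
```

Each comment is attributed to a user drawn from the in-neighbors of the user who wrote its parent, so heavier edges are chosen more often.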
Cascading Events: Inferring Time
• How do we estimate the time of adoption (the timestamp of the comment or retweet)?
• Given the size of the cascade, we randomly sample a sequence of long propagation delays from a previous cascade of a similar size
[Figure: delays copied from a previously seen cascade to a simulated cascade]
Previously seen cascade          Simulated cascade
post       2017/08/01 13:00      post       2018/02/01 23:00
comment 1  + 00:35               comment 1  + 00:35
comment 2  + 00:40               comment 2  + 00:40
comment 3  + 00:55               comment 3  + 00:55
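A minimal sketch of the delay-sampling step, using the timestamps from the figure. Several details are assumptions: delays are read as offsets in minutes from the seed post, `DELAYS_BY_SIZE` is a hypothetical history keyed by cascade size, "similar size" is taken to mean the nearest recorded size, and a size mismatch is handled by truncating or repeating the last delay.

```python
import random
from datetime import datetime, timedelta

# Hypothetical history: cascade size -> observed delay sequences
# (minutes after the seed post).
DELAYS_BY_SIZE = {
    3: [[35, 40, 55]],
    5: [[2, 10, 30, 90, 400]],
}


def sample_delays(size, rng):
    # "Similar size" is read here as the nearest recorded size (ties go to the smaller).
    nearest = min(DELAYS_BY_SIZE, key=lambda s: (abs(s - size), s))
    delays = list(rng.choice(DELAYS_BY_SIZE[nearest]))
    # Size mismatch handling is an assumption: truncate, or repeat the last delay.
    return delays[:size] + [delays[-1]] * max(0, size - len(delays))


def attach_timestamps(seed_time, size, rng):
    return [seed_time + timedelta(minutes=m) for m in sample_delays(size, rng)]


seed_time = datetime(2018, 2, 1, 23, 0)
times = attach_timestamps(seed_time, 3, random.Random(0))
```

Here `times` holds the simulated timestamps for comments 1 to 3, reusing the delays of the previously seen cascade.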
Problem
The probabilistic cascade generation treats structure, authorship, and timing as mutually independent during the evolution of a cascade.
It is largely constrained by prior knowledge and does not provide any predictive power for attributing user and timing information.
How to capture the dependence between user, timing, and structure in a cascade?
Generative/Discriminative Approach
How to capture the dependence between user, timing, and structure in a cascade?
The task is to accurately predict a pool of cascades with user and timing information
• Generate N pools of cascades using probabilistic models
• Reconstruct a pool of cascades using a genetic algorithm
• We develop a fitness function based on the output of two trained LSTM models
• The models assess how realistic the generated cascade is with respect to two spatio-temporal properties (i.e., branching and propagation delay)
Discriminative Models
Two prediction objectives:
• Predict the branch and leaf nodes
  • A critical factor in accurately predicting the cascade structure
• Predict the early and late adopters
  • A critical factor in identifying the temporal position of a node in the cascade
• E.g., assume node 1 appeared at timestep x, and nodes 2 and 3 appeared at timesteps x+1 and x+2, respectively:
  • node 2 is a branch, and an earlier adopter (propagation delay = 1) than node 3 (propagation delay = 2)
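The two label vectors in the example can be derived from a cascade roughly as follows. This sketch makes assumptions the slides leave open: node 3 is taken to be a reply to node 2 (which is what makes node 2 a branch), propagation delay is measured from the root's timestep, and "early" means a delay at or below the median delay; the actual cutoff used for the LSTM targets is not stated.

```python
from statistics import median


def branch_labels(parents):
    """parents: {node: parent or None}. Return {node: 1 if branch, 0 if leaf}."""
    has_child = {p for p in parents.values() if p is not None}
    return {node: int(node in has_child) for node in parents}


def propagation_delays(timesteps, root):
    """Delay of every non-root node, measured from the root's timestep."""
    return {n: t - timesteps[root] for n, t in timesteps.items() if n != root}


def adopter_labels(delays):
    """Return {node: 1 if early adopter, 0 if late}; median cutoff is assumed."""
    cutoff = median(delays.values())
    return {node: int(d <= cutoff) for node, d in delays.items()}


x = 10  # arbitrary starting timestep
parents = {1: None, 2: 1, 3: 2}       # assumed tree: 1 -> 2 -> 3
timesteps = {1: x, 2: x + 1, 3: x + 2}

branches = branch_labels(parents)
delays = propagation_delays(timesteps, root=1)
adopters = adopter_labels(delays)
```

Under these assumptions node 2 is labeled a branch with delay 1 and node 3 a leaf with delay 2, matching the example.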
Cascade Representation and Features
Re-construction of a Cascade Pool
• How realistic is the generated cascade with the user and timing information?
• Terminology:
  • “Gene”: an individual cascade
  • “Individual”: a pool of cascades
  • “Population”: a set of cascade pools
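With this terminology, the reconstruction step can be sketched as a small genetic algorithm over pools. The slides state only that a genetic algorithm with an LSTM-based fitness is used; the uniform crossover, truncation selection with elitism, and the toy fitness below (the mean of made-up per-cascade scores) are all assumptions chosen to keep the sketch short.

```python
import random


def crossover(pool_a, pool_b, rng):
    """Build a child pool by taking each gene (cascade) from one parent at random.

    Pools are aligned lists: position i holds the cascade for seed i.
    """
    return [a if rng.random() < 0.5 else b for a, b in zip(pool_a, pool_b)]


def evolve(population, fitness, generations, rng):
    """population: list of pools (needs at least two); fitness: pool -> float."""
    population = list(population)
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, size // 2)]  # keep the fitter half (elitism)
        children = []
        while len(parents) + len(children) < size:
            mother, father = rng.sample(parents, 2)
            children.append(crossover(mother, father, rng))
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(pool):
    # Stand-in for the LSTM-based score: each gene is already a number here.
    return sum(pool) / len(pool)


initial = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.3], [0.2, 0.2, 0.7], [0.1, 0.1, 0.1]]
best = evolve(initial, toy_fitness, generations=10, rng=random.Random(5))
```

Because the fittest pools are carried over unchanged, the returned pool is never worse than the best pool in the initial population.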
Fitness Score
How realistic is the generated cascade with the attached user and timing information?
[Figure: the generated cascade (seed, Nx, Ny, Nz, with timestamps T10, T1, T2) is ordered by time (Ny T1, Nz T2, Nx T10) and passed to the Branch LSTM and the Delay LSTM; the resulting binary branch vector and delay vector feed the fitness score]
Genetic Test
• The objective is to come up with an individual (a pool of cascades) that outperforms any existing pool in the initial search space.
• We feed each generated cascade into two machine-learning models:
  • Predict a binary vector that represents branch/leaf nodes using the branch discriminator model
  • Predict a binary vector that represents the early/late adopters using the delay discriminator model
• We consider these two binary vectors as the inferred ground truth to assess the generated cascade.
• Report the AUC comparing the inferred ground truth with the simulated cascade output.
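The AUC comparison can be sketched without any library. The `auc` function below is the standard rank-based (Mann-Whitney) estimate with ties counted as one half; treating the LSTM outputs as labels and the simulated vectors as scores follows the bullets above, but averaging the two AUCs into a single `cascade_fitness` value is an assumption, and the example vectors are illustrative rather than taken from the slides.

```python
def auc(y_true, y_score):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cascade_fitness(sim_branch, lstm_branch, sim_delay, lstm_delay):
    # LSTM predictions act as the inferred ground truth; the simulated
    # cascade's own vectors are scored against them.
    # Averaging the two AUCs is an assumed way to combine them.
    return 0.5 * (auc(lstm_branch, sim_branch) + auc(lstm_delay, sim_delay))


# Illustrative vectors (1 = branch / early adopter, 0 = leaf / late adopter).
lstm_branch, sim_branch = [1, 0, 0, 1], [1, 0, 1, 1]
lstm_delay, sim_delay = [0, 1, 1, 0], [0, 1, 1, 0]

score = cascade_fitness(sim_branch, lstm_branch, sim_delay, lstm_delay)
```

A pool-level fitness could then aggregate these per-cascade scores, for example by averaging over the pool.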
Performance (prediction)
• Two classification tasks
  • Predict branch or leaf node
  • Predict early or late adopter
[Timeline: training data from 2015 January to 2016 December; test data from 2016 December to 2017 August]
Performance (simulation)
• Simulation task
  • Given a set of initial seeds (e.g., a tweet, or a post on Reddit), predict the full cascade structure with user and timing information
  • E.g., who did what to whom, when, where
[Timeline: training data from 2015 January to 2017 July; test data (simulation) from 2017 July to 2017 August]
Acknowledgement
Funded by the DARPA SocialSim Program and the Air Force Research Laboratory
Data: Leidos, Netanomics
Evaluation: Pacific Northwest National Laboratory
Thank you.
Email: sameera1@mail.usf.edu
https://samtube405.github.io/_profile

[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascading Events