Survival Factorization on Diffusion Networks
Nicola Barbieri, Giuseppe Manco and Ettore Ritacco
Tumblr, 35 E 21st St, 10010, New York, USA - nicola@tumblr.com
ICAR - CNR, via Bucci 7/11C, 87036 Arcavacata di Rende (CS), ITALY - giuseppe.manco@icar.cnr.it, ettore.ritacco@icar.cnr.it
Context
• Users can create contents
• Contents can be shared within a diffusion network
• The diffusion takes place within cascades
• Trees of timed word-of-mouth chains
Relevant Questions
• What makes a content popular?
• Which creators are able to trigger a cascade?
• Who will share a content?
• When will someone share a content?
• Who is expert in a topic characterizing a set of contents?
• Who is interested in a topic?
• Which are the most popular topics?
• …
Focus of this Talk
• Who will share a content?
• When will someone share a content?
• Who is expert in a topic characterizing a set of contents?
• Who is interested in a topic?
Idea
• Information spreading within a network = disease contagion
• A user shares a content = an individual is infected
• Active on a given cascade
• A user does not share it = the individual resists the contagion
• Inactive on a given cascade
• Information Diffusion in terms of Survival Analysis
• The observations in a time horizon where a user can either resist or be infected
• Warning: the network is implicit!
• We can only observe the content adoptions, i.e. the contagion
Key elements (1)
M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the temporal dynamics of diffusion networks. In ICML 2011.
[Figure: timeline showing old content, recent content, and the observation time]
• Contagion is time-dependent
• The probability of contagion depends on the time when the target gets in touch with the content
Key elements (2)
M. Gomez-Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the temporal dynamics of diffusion networks. In ICML 2011.
[Figure: timeline showing an old carrier, a recent carrier, and the observation time]
• Contagion is carrier-dependent
• Some carriers are more infectious than others
• Influence exerted in the diffusion process
Formally
• $\mathbf{t}^c = (t_1(c), \ldots, t_N(c))$ is a cascade with content $c$
• $N$ is the number of users
• $t_u(c) \in [0, T^c] \cup \{\infty\}$ is the timestamp when the node $u$ becomes active on the cascade $\mathbf{t}^c$
• $T^c$ is the time horizon
• The probability of user $u$ being infected by user $v$ at time $t_u(c)$ is given by:
$f(t_u(c) \mid t_v(c), \lambda_{u,v}) \propto e^{-\lambda_{u,v}(t_u(c) - t_v(c))}$
The Infection model
• $v$ is the influencer
• $\lambda_{u,v}$ represents the influence exerted by $v$ on $u$
• The transmission rate
• $S(t_u(c) \mid t_v(c), \lambda_{u,v}) = e^{-\lambda_{u,v}(t_u(c) - t_v(c))}$ is the survival function,
• the probability of resisting the contagion $p(T \geq t_u(c) \mid t_v(c))$
• $t_u(c) - t_v(c)$ represents the exposure time
• The longer the delay, the lower the probability of infection
$f(t_u(c) \mid t_v(c), \lambda_{u,v}) = \lambda_{u,v} \cdot e^{-\lambda_{u,v}(t_u(c) - t_v(c))}$
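The density and survival function of the exponential infection model can be sketched in a few lines of Python. This is only an illustration: the function names `survival` and `density` and the numeric values are assumptions, not taken from the slides.

```python
import math

def survival(t_u: float, t_v: float, lam: float) -> float:
    """S(t_u | t_v, lam): probability that u resists v's contagion up to t_u."""
    return math.exp(-lam * (t_u - t_v))

def density(t_u: float, t_v: float, lam: float) -> float:
    """f(t_u | t_v, lam) = lam * S(t_u | t_v, lam)."""
    return lam * survival(t_u, t_v, lam)

# Hypothetical values: v activates at time 1.0 with transmission rate 0.5.
t_v, lam = 1.0, 0.5
for t_u in (1.5, 3.0, 6.0):
    # Both quantities shrink as the exposure time t_u - t_v grows.
    print(t_u, density(t_u, t_v, lam), survival(t_u, t_v, lam))
```

The loop shows the behaviour stated on the slide: the longer the delay, the lower the infection density.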
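Building the Survival Model – Step 1
• A cascade is composed of users that activate and users that resist the contagion
$p(\mathbf{t}^c \mid \mathbf{Y}, \mathbf{\Lambda}) = \prod_{u,v\ \text{active}} f(t_u(c) \mid t_v(c), \lambda_{u,v})^{y^c_{u,v}} \, S(t_u(c) - t_v(c))^{1 - y^c_{u,v}} \cdot \prod_{v\ \text{active}} \prod_{u\ \text{inactive}} S(T^c - t_v(c) \mid \lambda_{u,v})$
• $y^c_{u,v}$ is the latent infection indicator
• The first product covers the infected users
• The second product covers the immune users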
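A minimal sketch of the Step 1 likelihood in log form, assuming the exponential model and a known infector for each infected user. The function name, the argument layout, and the restriction to pairs with $t_v(c) < t_u(c)$ are assumptions made for illustration; the slides do not spell them out.

```python
import math

def cascade_log_likelihood(times, horizon, lam, infector):
    """Log-likelihood of one cascade under the exponential model.

    times[u]    -- activation time of u, or math.inf if u never activates
    horizon     -- time horizon T^c
    lam[u][v]   -- transmission rate from v to u
    infector[u] -- the user v with y^c_{u,v} = 1 (None for the initiator)
    """
    active = [u for u, t in enumerate(times) if t <= horizon]
    inactive = [u for u, t in enumerate(times) if t > horizon]
    ll = 0.0
    # Infected users: density for the actual infector, survival for the others.
    for u in active:
        for v in active:
            if times[v] >= times[u]:
                continue  # assumption: only earlier activations can infect u
            delta = times[u] - times[v]
            if infector[u] == v:
                ll += math.log(lam[u][v]) - lam[u][v] * delta
            else:
                ll += -lam[u][v] * delta
    # Immune users: they survive every active user up to the horizon.
    for v in active:
        for u in inactive:
            ll += -lam[u][v] * (horizon - times[v])
    return ll

# Hypothetical cascade: user 0 starts it, user 1 follows, user 2 never shares.
rates = [[0.5] * 3 for _ in range(3)]
print(cascade_log_likelihood([0.0, 1.0, math.inf], 5.0, rates, [None, 0, None]))
```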
Building the Survival Model – Issues
• The nature of the transmission rate $\lambda$ determines the adaptiveness of the model to the personalization of the contagion
• A very fine-grained approach:
• a single value $\lambda_{u,v}$ for each pair of users within each cascade
• This approach is intractable in real scenarios
• The matrix $\mathbf{\Lambda}$, containing all the $\lambda_{u,v}$, has size $N^2$
Key elements (3)
[Figure: timeline showing an old carrier, a recent carrier, and the observation time, with two topics per transmission]
• Contagion is topic-dependent
• Susceptibility and influence are relative to the content
• Content is characterized by topics
The Infection model (2)
• $\lambda^k_{u,v}$ represents the influence exerted by $v$ on $u$ about topic $k$
• The topical transmission rate
• Topic dependency:
$f(t_u(c) \mid t_v(c), \lambda^k_{u,v}) \propto e^{-\lambda^k_{u,v}(t_u(c) - t_v(c))}$
$\lambda^k_{u,v} = A_{v,k} \cdot S_{u,k}$
• $A_{v,k}$: influence of $v$ on topic $k$
• $S_{u,k}$: susceptibility of $u$ on topic $k$
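The factorization $\lambda^k_{u,v} = A_{v,k} \cdot S_{u,k}$ can be illustrated as follows. The helper name `topical_rate`, the toy matrices, and the sizes `N` and `K` are hypothetical; the parameter counts are plain arithmetic on those assumed sizes.

```python
def topical_rate(A, S, u, v, k):
    """lambda^k_{u,v}: influence of v on topic k times susceptibility of u on k."""
    return A[v][k] * S[u][k]

# Toy matrices with 2 users and 2 topics (rows are users, columns are topics).
A = [[0.9, 0.1], [0.2, 0.8]]  # influence
S = [[0.5, 0.5], [1.0, 0.0]]  # susceptibility
print(topical_rate(A, S, u=1, v=0, k=0))

# Hypothetical sizes: N users, K topics.
N, K = 10_000, 20
full_parameters = N * N * K        # one rate per (u, v, k) triple
factorized_parameters = 2 * N * K  # A and S, each of size N x K
print(full_parameters, factorized_parameters)
```

Under these assumed sizes the factorized model needs 400,000 values instead of 2 billion.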
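Building the Survival Model – Step 3
• The likelihood of a cascade is topic dependent
$p(\mathbf{t}^c \mid \mathbf{Z}, \mathbf{Y}, \mathbf{\Lambda}) = \prod_{u,v\ \text{active}} \prod_{k} \left[ f(t_u(c) \mid t_v(c), \lambda^k_{u,v})^{y^c_{u,v}} \, S(t_u(c) - t_v(c))^{1 - y^c_{u,v}} \right]^{z_{c,k}} \cdot \prod_{v\ \text{active}} \prod_{u\ \text{inactive}} \prod_{k} S(T^c - t_v(c) \mid \lambda^k_{u,v})$
• The products over $k$ range over the topics
• $z_{c,k}$ is the latent topic indicator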
The complete model
• Content can be modeled jointly
• E.g., textual content modeled by a mixture of Poisson distributions expressing topic dependency
• For each cascade $c \in \{1, \ldots, M\}$
o Sample the topical diffusion pattern, $z_c \sim \mathrm{Multinomial}(\Theta)$
o For each word $w$ in $c$
§ Sample the occurrences of $w$ in $c$, $n_{w,c} \sim \mathrm{Poisson}(\Phi)$
o For each user $u$ in $c$
§ Sample the user who generated the contagion, $y^c_u \sim \mathrm{Multinomial}(\Xi)$
§ Sample her activation time, $t_u(c) \sim \mathrm{Weibull}(z_c, y^c_u, A, S)$
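The generative process above can be simulated roughly as follows. Several choices here are assumptions made only to get a runnable sketch: user 0 starts the cascade at time 0, $\Xi$ is taken to be uniform over the users already active, `phi[k][w]` is a per-topic Poisson rate for word `w`, the Weibull scale is `1 / (A[v][k] * S[u][k])`, and all entries of `A` and `S` are strictly positive.

```python
import math
import random

def poisson(rate, rng):
    """Knuth's method for sampling a Poisson variate (fine for small rates)."""
    threshold, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample_cascade(theta, phi, A, S, horizon, shape=1.0, seed=0):
    rng = random.Random(seed)
    n_topics, n_users = len(theta), len(A)
    # z_c ~ Multinomial(Theta): topic of the cascade.
    k = rng.choices(range(n_topics), weights=theta)[0]
    # n_{w,c} ~ Poisson(Phi): word counts under topic k.
    words = [poisson(rate, rng) for rate in phi[k]]
    # Assumption: user 0 starts the cascade at time 0.
    times = [0.0] + [math.inf] * (n_users - 1)
    infector = [None] * n_users
    for u in range(1, n_users):
        active = [v for v in range(n_users) if times[v] < math.inf]
        # y^c_u ~ Multinomial(Xi): Xi assumed uniform over active users.
        v = rng.choice(active)
        rate = A[v][k] * S[u][k]
        # Weibull delay with scale 1/rate; shape=1 gives the exponential model.
        t = times[v] + rng.weibullvariate(1.0 / rate, shape)
        if t <= horizon:  # later activations are unobserved: u stays inactive
            times[u], infector[u] = t, v
    return k, words, times, infector

theta = [0.5, 0.5]
phi = [[1.0, 0.1], [0.1, 1.0]]
A = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5]]
S = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
print(sample_cascade(theta, phi, A, S, horizon=10.0))
```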
Model Learning
• EM approach
• E-step:
• Update latent variables
• M-step:
• Given the status of the latent variables $\mathbf{Z}$ and $\mathbf{Y}$, update parameters
• Linear complexity!
• The update equations in the EM algorithm can be optimized by exploiting the factorization of $\lambda^k_{u,v}$
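One way to see where a linear-time saving can come from (an illustration of the idea, not the authors' actual update equations): with $\lambda^k_{u,v} = A_{v,k} \cdot S_{u,k}$, the exponent of the survival term for immune users is a double sum over (inactive $u$, active $v$) pairs that separates into two single sums. The function names and toy data below are hypothetical.

```python
import math

def immune_term_naive(A, S, k, times, horizon, active, inactive):
    """Sum over every (inactive u, active v) pair: O(|active| * |inactive|)."""
    return sum(A[v][k] * S[u][k] * (horizon - times[v])
               for v in active for u in inactive)

def immune_term_factorized(A, S, k, times, horizon, active, inactive):
    """Same value from two independent sums: O(|active| + |inactive|)."""
    influence = sum(A[v][k] * (horizon - times[v]) for v in active)
    susceptibility = sum(S[u][k] for u in inactive)
    return influence * susceptibility

# Toy data: one topic, users 0 and 1 active, user 2 inactive.
A = [[0.9], [0.2], [0.4]]
S = [[0.5], [1.0], [0.3]]
times = [0.0, 2.0, math.inf]
args = (A, S, 0, times, 5.0, [0, 1], [2])
print(immune_term_naive(*args), immune_term_factorized(*args))
```

Both functions return the same quantity; only the second avoids enumerating user pairs.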
Exploiting the model
• We started with four questions:
• Q: Who will share a content?
• A: Users infected within a given time horizon
• Q: When will someone share a content?
• A: A sample from $p(\mathbf{t}^c \mid \mathbf{Z}, \mathbf{Y}, \mathbf{\Lambda})$
• Q: Who is expert in a topic characterizing a set of contents?
• A: Influential users, see $A_{v,k}$
• Q: Who is interested in a topic?
• A: Susceptible users, see $S_{u,k}$
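A short sketch of how learned matrices could be queried to answer these questions. It assumes the exponential model with a single exposure from one carrier $v$, and the names `share_probability` and `top_users` as well as the toy matrices are hypothetical.

```python
import math

def share_probability(A, S, u, v, k, t_v, horizon):
    """P(u is infected by v before the horizon) = 1 - S(horizon - t_v)."""
    rate = A[v][k] * S[u][k]
    return 1.0 - math.exp(-rate * (horizon - t_v))

def top_users(matrix, k, n=3):
    """Indices of the n users with the largest weight on topic k."""
    order = sorted(range(len(matrix)), key=lambda i: matrix[i][k], reverse=True)
    return order[:n]

# Toy matrices with 3 users and 2 topics.
A = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # influence
S = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # susceptibility

print(top_users(A, k=0, n=2))  # most influential users on topic 0
print(share_probability(A, S, u=1, v=0, k=0, t_v=0.0, horizon=2.0))
```

Passing `S` instead of `A` to `top_users` ranks the most susceptible users on a topic.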
Comparison with the Literature
Evaluation
• Activation prediction:
• Two samples of Twitter (filtered/noisy draws)
• Testing protocol:
• Given an incomplete cascade (50%, 80%), fill the missing activations
• Predict activation times
• Influencers and topics:
• MemeTracker dataset
• Testing protocol:
• A semantic (manual) analysis on the top topics and most influential users
Twitter
• ROC curves on predicting user’s retweet time on Twitter-Large (noisy sample, first row) and Twitter-Small (filtered sample, second row)
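The slides do not say how the ROC curves were computed. As a generic reference, the area under a ROC curve can be obtained from predicted scores by pairwise comparison. In the sketch below, a label of 1 is assumed to mean that the user activated in the held-out part of a cascade, and the scores are made up.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Made-up scores for four users, two of whom actually activated.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```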
MemeTracker
Conclusions
• Robust, efficient and accurate modeling of information cascades
• Factorizing the infection rate uncovers highly relevant information concerning the underlying diffusion process
• Works with the general Weibull distribution, not just the exponential
• Future work
• Bayesian learning: the underlying probability distributions allow conjugate priors
• Exploit multiple mutual elicitation processes (e.g. Hawkes processes) in the same model
• Deep architectures for combining heterogeneous content
• Content dynamics within a cascade
