Least Cost Influence in
Multiplex Social Networks
MODEL REPRESENTATION AND ANALYSIS
Presented by:
Ayushi Jain Rahul Bobhate
Natasha Mandal Ankur Sachdeva
Dung T. Nguyen, Huiyuan Zhang, Soham Das, My T. Thai, Thang N. Dinh
Structure
• Define a few terms
• Motivation
• Related work
• Challenges and proposed solution
• Math notations and problem definition
• Lossless coupling
• Lossy coupling
• Influence relay
• Experiments
• Conclusion
What are Multiplex networks?
• Networks in which the same nodes can be connected by several kinds of edges, for example users who are linked on more than one social media platform
• Example: a set of users who interact on Facebook, Twitter, and Foursquare
What is the least cost influence (LCI) problem?
• Find the minimum number of seed users who can eventually influence a large number of users
• Example: how to find the set of influencers with the least advertising cost who can influence a massive number of users
Or
how to find the minimum number of inducements required for product adoption to reach a certain proportion of the population
Motivation
• Over the past decade, the popularity of online social networks (OSNs) has created a major communication medium for information sharing
• OSNs resemble real social networks: word-of-mouth and peer-pressure effects
How much time does an individual spend on social media, on average?
• 1.72 hours per day
• 28% of online activity
Some more statistics!
[Figure: number of Facebook users]
[Figure: number of users (in millions)]
Why is it important to study information diffusion in these networks?
• Considerable number of overlapping users
• Users can relay the information from one network to another
• Example:
Jack
If we consider information propagation in only one network, we will fail to identify the most influential users
Related work
Single network
• Kempe et al.
• Find a set of k users who maximize influence
• Stochastic process: Independent Cascade (IC) model
• Probability of influencing a friend ∝ strength of friendship
• NP-hard; greedy algorithm with approximation ratio (1 − 1/e)
• Linear Threshold (LT) model
• A user adopts a new product when the total influence of friends exceeds a threshold
• Dinh et al.
• Suggested an algorithm for a special case of LT
• Influence between users is uniform, and a user is influenced if a certain fraction ρ of his friends are active
Multiplex Networks
• Yagan et al.
• Studied connection between online and offline networks
• Investigated outbreak of information using SIR model on random networks
• Liu et al.
• Analyzed networks formed by online interaction and offline events
Drawbacks:
• Studied flow of information and network clustering but not LCI
• Did not study specific optimization problem of viral marketing
• Shen et al.
• Studied information propagation in multiplex OSN
• Combined all networks into one network by representing an overlapping user as a
super node
• Cannot preserve individual networks’ properties
Challenges
• How to evaluate the influence of overlapping users in multiplex networks?
• In which network is a user easier to influence?
• Which network propagates the influence better?
Proposed solution
• In this paper, we study the LCI problem: find a set of users of minimum cardinality that influences a certain fraction of users in multiplex networks
• Present a model with various coupling schemes that reduce the problem in multiplex networks to an equivalent problem on a single network. The coupling schemes can be applied to the most popular diffusion models, including the Linear Threshold model, the Stochastic Threshold model, and the Independent Cascade model
• Introduce a new metric called influence relay to analyze the influence diffusion process in both a single network and multiplex networks
Graph Notations
• Gi – Weighted directed graph consisting of (Vi, Ei, θi, Wi).
• Vi – Set of vertices in graph Gi, represents users in the network.
• Ei – Set of edges in graph Gi, which represent the connection between the
users.
• Wi – Set of weights of the edges which belong to Ei, which represents
the strength of influence or the strength of connection.
• $N_u^{i-}$, $N_u^{i+}$ – Sets of incoming and outgoing neighbors of u in Gi.
• θi(u) – Threshold indicating the persistence of opinions of u.
Least Cost Influence (LCI) Problem definition
• Given:
• A system of k networks G1..k
• A set of users U
• A number of hops d
• A fraction β with 0 < β < 1
• To find:
• A seed set S ⊂ U of minimum cardinality such that
• At least a β fraction of the users in U are active
• After d hops
Linear Threshold model
• Influence and information diffusion model for single network
• Could be extended to handle multiple networks
• In the LT model:
• Every user is either active or inactive
• A user u is active if u accepts the information OR
• The total influence of u's active neighbors reaches u's threshold in at least one network.
• At each time hop, newly activated users go on to activate further users.
• d is the number of hops over which the information is propagated.
• The set of users active after d hops caused by seed set S is denoted by Ad(G1...k, S)
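The diffusion rule and the LCI objective above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each network is represented as a pair of dictionaries (edge weights and thresholds), activation uses "total influence ≥ threshold in at least one network", and the function names `active_after` and `greedy_seed_set` as well as the toy networks are hypothetical. The greedy loop is a naive heuristic for illustration, not necessarily the algorithm used by the authors.

```python
from typing import Dict, List, Set, Tuple

# One network: in-edge weights w[(v, u)] and thresholds theta[u] (assumed layout).
Network = Tuple[Dict[Tuple[str, str], float], Dict[str, float]]

def active_after(networks: List[Network], seeds: Set[str], d: int) -> Set[str]:
    """Return the users active after d hops of multiplex LT diffusion."""
    active = set(seeds)
    for _ in range(d):
        newly: Set[str] = set()
        for weights, theta in networks:
            # Sum, per inactive user, the influence arriving from active in-neighbors.
            incoming: Dict[str, float] = {}
            for (v, u), w in weights.items():
                if v in active and u not in active:
                    incoming[u] = incoming.get(u, 0.0) + w
            # A user is activated if the threshold is met in at least one network.
            newly |= {u for u, total in incoming.items() if total >= theta[u]}
        if not newly:
            break
        active |= newly
    return active

def greedy_seed_set(networks: List[Network], users: Set[str], beta: float, d: int) -> Set[str]:
    """Naive greedy heuristic: add the user with the largest spread until beta is reached."""
    seeds: Set[str] = set()
    target = beta * len(users)
    while len(active_after(networks, seeds, d)) < target:
        best = max(sorted(users - seeds),
                   key=lambda u: len(active_after(networks, seeds | {u}, d)))
        seeds.add(best)
    return seeds

# Toy system of two networks (hypothetical data).
net1 = ({("a", "b"): 0.6, ("c", "d"): 0.2}, {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.9})
net2 = ({("b", "c"): 0.4}, {"b": 0.5, "c": 0.3})
nets = [net1, net2]

print(sorted(active_after(nets, {"a"}, 2)))                # ['a', 'b', 'c']
print(greedy_seed_set(nets, {"a", "b", "c", "d"}, 0.75, 2))  # {'a'}
```

In the toy data, user b relays the information from network 1 to network 2, which is why a single seed reaches three of the four users within two hops.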
Coupling Schemes
• Lossless coupling scheme:
• Scheme to combine multiple networks into single network.
• No loss of data while combining networks. (Obviously!)
• Advantages:
• Use existing algorithms
• Same quality of solution
Challenges
• Heterogeneity of user participation:
• User might have joined a single network
• Other user might have joined multiple networks
• Recognition of users is difficult
• Inter-network Influence propagation
• User transmits the information in multiple networks
• Represent transmission of influence between networks in a single network.
• Preserving properties of individual networks
• Coupled network should preserve diffusion properties of individual networks.
• Should be able to establish relationship between solution for coupled network and
individual network
Coupling scheme for LT-model
• Solution to 1st challenge
• Introduce dummy nodes.
• They represent a user u in the network Gi, in which the user is not registered.
• Solution to 2nd challenge
• Introduce gateway vertices.
• Introduce Synchronization edges.
• Instead of an edge between two vertices, there exist
• An edge from a user to a gateway vertex
• And an edge from the gateway vertex to a user
• Solution to 3rd challenge
• Don’t need to do anything else.
Lemmas
• Lemma 1: Suppose that the propagation process in the coupled network G starts from a seed set that contains only gateway vertices, $S = \{s_1^0, \ldots, s_p^0\}$. Then representative vertices are activated only at even propagation hops.
• Lemma 2: Suppose that the propagation process on G1...k and G starts from the same seed set S. Then the following conditions are equivalent:
• User $u$ is active after $d$ propagation hops in G1...k.
• There exists $i$ such that $u^i$ is active after $2d - 1$ propagation hops in G.
• Vertex $u^0$ is active after $2d$ propagation hops in G.
Theorems
• Theorem 1: Given a system of k networks G1...k with the user set U, the coupled network G produced by the lossless coupling scheme, and a seed set $S = \{s_1, s_2, \ldots, s_p\}$, if $A^d(G^{1 \ldots k}, S) = \{a_1, a_2, \ldots, a_q\}$ is the set of active users caused by S after d propagation hops in the multiplex networks, then $A^{2d}(G, S) = \{a_1^0, a_1^1, \ldots, a_1^k, \ldots, a_q^0, a_q^1, \ldots, a_q^k\}$ is the set of active vertices caused by S after 2d propagation hops in the coupled network.
• Theorem 2: When the lossless scheme is used, the set $S = \{s_1, s_2, \ldots, s_p\}$ influences a β fraction of users in G1...k after d propagation hops if and only if $S = \{s_1^0, s_2^0, \ldots, s_p^0\}$ influences a β fraction of vertices in the coupled network G after 2d propagation hops.
Extension to other diffusion models
• Lossless coupling scheme can be used for other diffusion models.
• Stochastic Threshold model
• Independent Cascade model
• Similarity between LT model and other approaches
• Same approach of using
• Gateway vertices
• Representative vertices
• Synchronization edges
Lossy Coupling
MOTIVATION
• The coupled network produced by lossless coupling contains a large number of extra vertices and edges.
• Ideally, a compact coupled network would contain only users as vertices.
• Such a compact coupled network will inevitably lose some information.
Lossy Coupling
GOALS
• The goal is to design a scheme which will minimize this loss of information.
• The solution for finding the Least Cost Influence in the compact coupled
network should be very close to the solution in the original multiplex network.
Lossy Coupling
OBSERVATION 1
• A user $u$ will be activated if there exists $i$ such that $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) \ge \theta^i(u)$, where $A$ is the set of active users.
• We can relax the condition for activating $u$ with positive parameters $\alpha^i(u)$ as follows:
$$\sum_{i=1}^{k} \Big( \alpha^i(u) \sum_{v \in N_u^{i-} \cap A} w^i(v, u) \Big) \ge \sum_{i=1}^{k} \alpha^i(u)\, \theta^i(u)$$
Lossy Coupling
PROPOSITION 1
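• For a system of networks $G^{1 \ldots k}$, if
$$\sum_{i=1}^{k} \Big( \alpha^i(u) \sum_{v \in N_u^{i-} \cap A} w^i(v, u) \Big) \ge \sum_{i=1}^{k} \alpha^i(u)\, \theta^i(u)$$
is satisfied, then user $u$ is activated.
• Proof idea: because every $\alpha^i(u) > 0$, the inequality can hold only if $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) \ge \theta^i(u)$ for at least one network $i$, which is exactly the activation condition in that single network.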
Lossy Coupling
• $\alpha^i(u)$ can account for the extra influence that may be required to activate $u$
• $\alpha^i(u)$ can be made proportional to $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) - \theta^i(u)$. In this way, when $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) > \theta^i(u)$ we choose $\alpha^i(u) \gg \alpha^j(u)$ for all $j \ne i$.
• In practice, we do not know in advance in which network $u$ will be activated. Hence, we have to use heuristics.
Lossy Coupling
OBSERVATION 2
• When 𝑢 participates in multiple networks, it may be easier to influence 𝑢 in
some networks, than in others.
• For example if a node 𝑢 is in two networks:
Network 1: 𝜃1(𝑢) = 0.1, 𝑢 has 8 in-neighbors and each in-neighbor 𝑣 influences 𝑢
with 𝑤1(𝑣, 𝑢) = 0.1, it takes 1 neighbor to activate 𝑢.
Network 2: 𝜃2(𝑢) = 0.7, 𝑢 has 8 in-neighbors and each in-neighbor 𝑣 influences 𝑢
with 𝑤2(𝑣, 𝑢) = 0.1, it takes 7 neighbors to activate 𝑢.
Lossy Coupling
EASINESS
• Intuitively, $u$ is easier to influence in Network 1.
• Formally, the easiness of $u$ in network $i$ is $\epsilon^i(u) = \dfrac{\sum_{v \in N_u^{i-}} w^i(v, u)}{\theta^i(u)}$
• We can use the easiness $\epsilon^i(u)$ as $\alpha^i(u)$ in the relaxed inequality stated in OBSERVATION 1.
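As a quick check of the easiness definition, the two networks from OBSERVATION 2 can be plugged in directly. The helper name `easiness` and the list-based representation of incoming weights are assumptions made for this sketch.

```python
from typing import List

def easiness(in_weights: List[float], threshold: float) -> float:
    """Easiness of a user in one network: total incoming weight divided by threshold."""
    return sum(in_weights) / threshold

# Values from OBSERVATION 2: 8 in-neighbors with weight 0.1 in both networks.
e1 = easiness([0.1] * 8, 0.1)  # Network 1, threshold 0.1 -> about 8
e2 = easiness([0.1] * 8, 0.7)  # Network 2, threshold 0.7 -> about 1.14

print(e1, e2)
```

The larger value for Network 1 matches the intuition that $u$ is easier to influence there.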
Lossy Coupling
• The vertex set is the set of users $V = \{u_1, \ldots, u_n\}$
• The threshold of vertex $u$ is $\theta(u) = \sum_{i=1}^{k} \epsilon^i(u)\, \theta^i(u)$
• The weight of edge $(v, u)$ is $w(v, u) = \sum_{i=1}^{k} \epsilon^i(u)\, w^i(v, u)$, where $w^i(v, u) = 0$ if there is no edge from $v$ to $u$ in network $i$
Lossy Coupling
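For the blue node,
$$\theta(u) = \sum_{i=1}^{k} \epsilon^i(u)\, \theta^i(u) = \frac{0.2 + 0.1}{0.2} \cdot 0.2 + \frac{0.5}{0.5} \cdot 0.5 = 0.8$$
For the edge between the red node and the blue node,
$$w(v, u) = \sum_{i=1}^{k} \epsilon^i(u)\, w^i(v, u) = \frac{0.2 + 0.1}{0.2} \cdot 0.2 + 0 = 0.3$$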
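The lossy coupling with easiness as the parameter can be sketched as follows. The figure behind the worked example is not available here, so the toy data is an assumption reconstructed from the arithmetic: in network 1 the blue node has threshold 0.2 and incoming weights 0.2 (from the red node) and 0.1 (from a second neighbor, called "green" here), and in network 2 it has threshold 0.5 and one incoming edge of weight 0.5 (from a neighbor called "yellow"). The function name `lossy_couple` is hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Network = Tuple[Dict[Edge, float], Dict[str, float]]  # (weights, thresholds)

def lossy_couple(networks: List[Network]) -> Tuple[Dict[Edge, float], Dict[str, float]]:
    """Merge networks into one, weighting each network's terms by the easiness of u."""
    coupled_w: Dict[Edge, float] = {}
    coupled_theta: Dict[str, float] = {}
    for weights, theta in networks:
        # Total incoming weight of each user with a threshold in this network.
        in_total = {u: 0.0 for u in theta}
        for (v, u), w in weights.items():
            in_total[u] += w
        for u, th in theta.items():
            eps = in_total[u] / th  # easiness of u in this network
            coupled_theta[u] = coupled_theta.get(u, 0.0) + eps * th
        for (v, u), w in weights.items():
            eps = in_total[u] / theta[u]
            coupled_w[(v, u)] = coupled_w.get((v, u), 0.0) + eps * w
    return coupled_w, coupled_theta

# Assumed reconstruction of the worked example (only the blue node has thresholds).
network_1 = ({("red", "blue"): 0.2, ("green", "blue"): 0.1}, {"blue": 0.2})
network_2 = ({("yellow", "blue"): 0.5}, {"blue": 0.5})

w, theta = lossy_couple([network_1, network_2])
print(round(theta["blue"], 6))          # 0.8
print(round(w[("red", "blue")], 6))     # 0.3
```

Note that with easiness as the parameter, each term $\epsilon^i(u)\,\theta^i(u)$ equals the total incoming weight of $u$ in network $i$, which is why the coupled threshold is 0.3 + 0.5.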
Lossy Coupling
INVOLVEMENT
• If a user is surrounded by a group of friends who have a high influence on each other, the user tends to get influenced.
• We estimate the involvement of a node $v$ in a network $G^i$ by measuring how strongly the 1-hop neighborhood of $v$ is connected and to what extent influence can propagate from one node to another within that neighborhood.
Lossy Coupling
• Formally, the involvement of a node $v$ in a network $G^i$ is defined as $\sigma_v^i = \sum_{x, y \in N_v^i \cup \{v\}} \dfrac{w^i(x, y)}{\theta^i(y)}$, where $N_v^i = N_v^{i+} \cup N_v^{i-}$
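A minimal sketch of the involvement formula, assuming one network is stored as a dictionary of edge weights plus a dictionary of thresholds. The function name `involvement` and the toy graph are hypothetical.

```python
from typing import Dict, Tuple

def involvement(v: str, weights: Dict[Tuple[str, str], float], theta: Dict[str, float]) -> float:
    """Sum w(x, y) / theta(y) over edges inside the 1-hop neighborhood of v (v included)."""
    hood = {v}
    for (x, y) in weights:
        if x == v:
            hood.add(y)  # outgoing neighbor of v
        if y == v:
            hood.add(x)  # incoming neighbor of v
    return sum(w / theta[y] for (x, y), w in weights.items() if x in hood and y in hood)

# Toy network: c is two hops from v, so the edge (b, c) is not counted.
toy_weights = {("a", "v"): 0.5, ("v", "b"): 0.25, ("a", "b"): 0.5, ("b", "c"): 0.9}
toy_theta = {"v": 0.5, "a": 0.5, "b": 0.5, "c": 0.5}

print(involvement("v", toy_weights, toy_theta))  # 2.5
```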
AVERAGE
• All parameters have the same value, i.e. $\alpha^i(u) = 1$
Lossy Coupling
THEOREM 3
• When a lossy coupling scheme is used, if the set of users 𝑆 activates 𝛽 fraction
of users in 𝐺 (lossy coupled network), then it activates at least 𝛽 fraction of
users in 𝐺1..𝑘 (original system).
• The proof is based on the fact that the active state of a user in 𝐺 implies an
active state of users in 𝐺1..𝑘
.
Influence Relay
MOTIVATION
• When information is diffused in multiplex networks, it may flow within a single network or travel across several networks.
• What is the contribution of each component network in the influence process?
• How much information flows within a network or between networks?
• Quantifying these values will help us understand the diffusion process in
multiplex networks.
Influence Relay
DEFINITION
• The authors proposed 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 as a metric to quantify the role of
users in propagating information.
• The 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of vertices is recursively defined depending on order of
activation.
• $S$ = seed set, $G$ = coupled network, $d$ = number of hops after which the activation process stops, $h(u)$ = hop at which $u$ is activated.
• All inactive vertices, i.e. those in $V \setminus A^d(G, S)$, have an influence relay of 0.
Influence Relay
• For each activated vertex $u \in A^d(G, S)$, the influence relay of $u$, denoted by $IR(u)$, is a linear combination of the influence relay of its outgoing neighbors that are activated after $u$.
• Formally, the $IR(u)$ of a vertex $u \in A^d(G, S)$ is defined as:
$$IR(u) = 1 + \sum_{v \in N_u^{+},\; h(v) > h(u)} \frac{w(u, v)\, IR(v)}{\sum_{z \in N_v^{-},\; h(z) < h(v)} w(z, v)}$$
Influence Relay
• The 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 captures the amount of influence a vertex relays to other
vertices after adopting the information.
• Thus, the 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of a vertex 𝑢 depends largely on the
𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of vertices that 𝑢 helps to activate and the weight of edges
between 𝑢 and them.
• The vertex $u$ is responsible for a fraction $\dfrac{w(u, v)}{\sum_{z \in N_v^{-},\; h(z) < h(v)} w(z, v)}$ of $v$'s influence relay.
• We add 1 to 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of 𝑢 since 𝑢 also contributes itself to the set of
activated vertices.
Influence Relay
COMPUTING INFLUENCE RELAY
• We compute 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of vertices in reverse order of the diffusion
process.
• We construct the influence graph 𝐼𝐺𝑆 = (𝑉𝑆, 𝐸𝑆) from the seed set 𝑆 to
represent the diffusion process and to calculate the 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of all
nodes in 𝑉𝑆.
• The vertex set 𝑉𝑆 of 𝑛 𝑆 nodes is 𝐴 𝑑(𝐺, 𝑆).
• There is an edge from 𝑢 to 𝑣 in 𝐸𝑆 if 𝑢 has passed information to 𝑣 i.e. 𝑢, 𝑣 ∈
𝐴 𝑑
(𝐺, 𝑆) and ℎ 𝑣 > ℎ(𝑢).
• 𝐼𝐺𝑆 is a directed acyclic graph and the reverse topological ordering of 𝐼𝐺𝑆 takes
linear time. The main loop runs for all the edges in 𝐼𝐺𝑆 so 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of
all vertices can be computed in linear time.
Input: A network G, a seed set S, and the number of hops d.
Output: The influence relay IR of all vertices.
IG_S ← the influence graph caused by S on G
for each u ∈ A^d(G, S) do
    IR(u) ← 0
end for
Compute the topological ordering u_1, u_2, …, u_{n_S} of the vertices in V_S
for i = n_S down to 1 do
    IR(u_i) ← IR(u_i) + 1
    total ← 0
    for each in-neighbor v of u_i in IG_S do
        total ← total + w(v, u_i)
    end for
    for each in-neighbor v of u_i in IG_S do
        IR(v) ← IR(v) + w(v, u_i) · IR(u_i) / total
    end for
end for
return IR
Influence Relay
THEOREM 4
• One of the important properties of 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 is that it preserves the
number of activated vertices.
• The total 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of seeding vertices is equal to the total number of
activated vertices.
$$\sum_{u \in S} IR(u) = |A^d(G, S)|$$
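The algorithm and the conservation property of THEOREM 4 can be tried out on a small example. The sketch below assumes a single (coupled) network stored as a dictionary of edge weights and a dictionary of thresholds, with activation when the incoming influence is at least the threshold. The function names `activation_hops` and `influence_relay` and the toy graph are hypothetical; processing vertices in decreasing hop order is used here as one valid reverse topological order, since influence-graph edges only go from earlier to later hops.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

def activation_hops(weights: Dict[Edge, float], theta: Dict[str, float],
                    seeds: Set[str], d: int) -> Dict[str, int]:
    """LT diffusion on one network; returns the hop h(u) of every activated vertex."""
    hop = {s: 0 for s in seeds}
    for step in range(1, d + 1):
        incoming: Dict[str, float] = {}
        for (v, u), w in weights.items():
            if v in hop and u not in hop:
                incoming[u] = incoming.get(u, 0.0) + w
        newly = [u for u, total in incoming.items() if total >= theta[u]]
        if not newly:
            break
        for u in newly:
            hop[u] = step
    return hop

def influence_relay(weights: Dict[Edge, float], hop: Dict[str, int]) -> Dict[str, float]:
    """Compute IR for activated vertices, from the last activated back to the seeds."""
    # Influence graph: edges between activated vertices that go forward in time.
    preds: Dict[str, List[Tuple[str, float]]] = {u: [] for u in hop}
    for (z, v), w in weights.items():
        if z in hop and v in hop and hop[z] < hop[v]:
            preds[v].append((z, w))
    ir = {u: 0.0 for u in hop}
    for v in sorted(hop, key=lambda u: hop[u], reverse=True):
        ir[v] += 1.0  # v counts itself as one activated vertex
        total = sum(w for _, w in preds[v])
        for z, w in preds[v]:
            ir[z] += w * ir[v] / total  # share of v's relay credited to z
    return ir

# Toy coupled network (hypothetical data).
g_weights = {("s1", "a"): 0.5, ("s2", "a"): 0.25, ("a", "b"): 1.0, ("s2", "c"): 1.0}
g_theta = {"a": 0.75, "b": 1.0, "c": 0.5}
seed_set = {"s1", "s2"}

hops = activation_hops(g_weights, g_theta, seed_set, 3)
ir = influence_relay(g_weights, hops)
print(sum(ir[s] for s in seed_set), len(hops))  # both are 5, up to rounding
```

In this example s1 and s2 share the credit for activating a (and, through a, for b) in proportion to their edge weights, while s2 alone gets the credit for c.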
Influence Relay
INFLUENCE CONTRIBUTION
• To obtain the contribution of a network to the diffusion process, we sum up
𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of all seed vertices in that network.
INTERNAL AND EXTERNAL INFLUENCE
• This can be used to quantify the amount of information flowing within and
between networks.
Influence Relay
• When the information is propagated within a component network called the
“target” network there are two kinds of influence paths:
• 𝐼𝑛𝑡𝑒𝑟𝑛𝑎𝑙 𝑝𝑎𝑡ℎ𝑠 include edges only in the target network.
• 𝐸𝑥𝑡𝑒𝑟𝑛𝑎𝑙 𝑝𝑎𝑡ℎ𝑠 include some edges of other networks. They are formed
when some of the vertices are activated outside the target network.
• We adapt influence relay to measure the internal influence (which passes through internal paths) and the external influence (which passes through external paths) of the seed set in the target network as follows:
Influence Relay
• Each vertex $u$ has an internal influence $IR_{in}(u)$ and an external influence $IR_{ex}(u)$.
• Both values are calculated backwards from the activated vertices under $u$'s influence.
• Only a vertex $u$ that is activated in the target network adds 1 influence unit to $IR_{in}(u)$, since we only consider the influence propagation in the target network.
• If a vertex is activated outside the target network, all of its internal influence is converted to external influence.
EXPERIMENTS
Data Sets
• Types of data sets: real networks and synthesized networks
Real Networks
• Experiments are performed on 2 data sets:
• Foursquare (FSQ) and Twitter networks
• Co-author networks in the areas of Condensed Matter (CM), High-Energy Theory (Het), and Network Science (NetS)
• The number of overlapping users in the first data set, FSQ-Twitter, is 4100.
• For the second data set, the numbers of overlapping users of the network pairs CM-Het, CM-NetS, and Het-NetS are 2860, 517, and 90, respectively.
Real Networks
• Weights of edges are randomly assigned from 0 to 1.
• The edge weights are then normalized so that the total weight of the incoming edges of each node is 1.
• The threshold of each node is a random value from 0 to 1.
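The parameter assignment described above could look like the following sketch. The function name `assign_parameters`, the fixed random seed, and the toy edge list are assumptions; the slides do not say which random number generator or distribution details were used beyond "random from 0 to 1", so a uniform draw is assumed.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

def assign_parameters(edges: List[Edge], nodes: List[str],
                      seed: int = 0) -> Tuple[Dict[Edge, float], Dict[str, float]]:
    """Draw random edge weights, normalize incoming weights to 1, draw random thresholds."""
    rng = random.Random(seed)
    raw = {e: rng.random() for e in edges}  # uniform in [0, 1), an assumption
    in_total: Dict[str, float] = {}
    for (v, u), w in raw.items():
        in_total[u] = in_total.get(u, 0.0) + w
    # Divide each weight by the total incoming weight of its target node.
    weights = {(v, u): w / in_total[u] for (v, u), w in raw.items()}
    theta = {u: rng.random() for u in nodes}
    return weights, theta

toy_edges = [("a", "c"), ("b", "c"), ("a", "b")]
norm_weights, thresholds = assign_parameters(toy_edges, ["a", "b", "c"])
print(norm_weights[("a", "c")] + norm_weights[("b", "c")])  # close to 1
```

A production version would also need to guard against a node whose raw incoming weights are all exactly 0, which this sketch ignores.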
Synthesized Networks
• Synthesized networks generated by the Erdős–Rényi random network model are used for testing networks with controlled parameters.
• Two networks with 10000 nodes are formed by randomly connecting each pair
of nodes with probabilities 0.0008 and 0.006.
• The average degrees of the two networks are 8 and 60.
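The stated average degrees follow from the connection probabilities: in an Erdős–Rényi graph with $n$ nodes and connection probability $p$, each node has $(n - 1)p$ neighbors in expectation. A small check, with the helper name `expected_average_degree` chosen for this sketch:

```python
def expected_average_degree(n: int, p: float) -> float:
    """Expected degree of a node in an Erdős–Rényi G(n, p) graph."""
    return (n - 1) * p

sparse = expected_average_degree(10000, 0.0008)  # about 8
dense = expected_average_degree(10000, 0.006)    # about 60
print(round(sparse), round(dense))  # 8 60
```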
Comparison of coupling schemes
Solution Quality
• In both networks the seed size is smallest when the lossless coupling scheme
is used.
• The seed sizes are only a bit larger using the lossy coupling schemes.
Comparison of coupling schemes
• The small seed size is obtained through two different means:
• Increasing the fraction of overlapping users.
• Increasing the number of propagation hops.
Comparison of coupling schemes
Running Time
• The greedy algorithm runs much faster in the lossy coupled networks than in the
lossless coupled networks.
• Using the lossy coupled networks reduces the running times by a factor of 2 in FSQ-Twitter and a factor of 4 in the co-author networks in comparison to using the lossless coupled networks.
• The major disadvantages of the lossless coupling scheme are the doubled number
of hops and the number of extra nodes and edges.
Advantages of using coupled networks
Influencing a fraction β of the nodes in all networks:
• The results using our lossless coupling method outperform the results when
we run the greedy algorithm on each network separately and take the union
of the produced seed sets.
• When the networks are treated separately, the seed set is 30% larger in the co-author networks and 47% larger in FSQ-Twitter than the seed set found with the lossless coupling method.
Influencing a fraction β of the nodes in a particular network:
• The seed size decreases by up to 9%, 25%, 17%, and 26% in CM, Het, FSQ, and Twitter, respectively, when we consider these networks in connection with other networks.
• The external influence is substantial and accounts for a large portion of the total in many cases. For instance, when the influenced fraction β = 0.2, the external influence accounts for 27.3%, 52.7%, and 30.0% of the total influence in CM, Het, and NetS, respectively.
Analysis of seed sets
• A significant fraction of the seed set consists of overlapping nodes, although only 5%-7% of the users of any network are overlapping users.
• For β = 0.4, the fraction of overlapping seed vertices is around 24.9% and 25% in the co-author and FSQ-Twitter networks, respectively.
• When β is small, the influence contribution of overlapping users is high (approx. 50% when β = 0.2). However, when β is large, overlapping users have already been selected, so they are no longer favored.
Mutual Impact of networks
• When k increases from 2 to 5, the seed size decreases several-fold. This implies that the introduction of a new OSN increases the diffusion of information significantly.
• The number of influenced vertices rises by 46% with the support of 3 new networks when k is changed from 2 to 5.
• The fraction of external influence also increases dramatically, from 39% when k = 2 to 67% when k = 5.
• All these results suggest that the existing networks may benefit from a newly introduced competitor.
Conclusion and future Work
• To tackle the LCI problem, novel coupling schemes are introduced to reduce
the problem to a version on a single network.
• A new metric is designed to quantify the flow of influence inside and between
networks based on the coupled network.
• Exhaustive experiments provide new insights into information diffusion in multiplex networks.
• In the future, the LCI problem can be investigated in multiplex networks with heterogeneous diffusion models, in which each network may have its own diffusion model.
Thank you!!
Thank You!!

More Related Content

PDF
Trust management in adhoc networks a social network based approach
PDF
11.trust management in adhoc networks a social network based approach
PPTX
4. social network analysis
PPTX
Community Detection
PDF
Exploratory social network analysis with pajek
PDF
network mining and representation learning
PPTX
Jeffrey xu yu large graph processing
PPTX
17 Statistical Models for Networks
Trust management in adhoc networks a social network based approach
11.trust management in adhoc networks a social network based approach
4. social network analysis
Community Detection
Exploratory social network analysis with pajek
network mining and representation learning
Jeffrey xu yu large graph processing
17 Statistical Models for Networks

What's hot (20)

PPTX
Community detection
PPTX
06 Community Detection
PPTX
Recomendation system: Community Detection Based Recomendation System using Hy...
PPT
Clique-based Network Clustering
PPTX
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
PPT
Download
PDF
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
PDF
Speeding up Distributed Big Data Recommendation in Spark
PPTX
Group and Community Detection in Social Networks
PDF
Collaborative Filtering with Spark
PDF
Introduction to Topological Data Analysis
PDF
Community Detection with Networkx
PDF
Topological Data Analysis of Complex Spatial Systems
PDF
08 Exponential Random Graph Models (ERGM)
PPTX
04 Diffusion and Peer Influence
PPTX
05 Whole Network Descriptive Stats
PDF
Community Detection in Social Media
PPT
Comtella: Adaptive Rewards Mechanism to Incentivize Participation in Online C...
PDF
Snowbird comp-top-may2017
PDF
00 Automatic Mental Health Classification in Online Settings and Language Emb...
Community detection
06 Community Detection
Recomendation system: Community Detection Based Recomendation System using Hy...
Clique-based Network Clustering
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Download
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
Speeding up Distributed Big Data Recommendation in Spark
Group and Community Detection in Social Networks
Collaborative Filtering with Spark
Introduction to Topological Data Analysis
Community Detection with Networkx
Topological Data Analysis of Complex Spatial Systems
08 Exponential Random Graph Models (ERGM)
04 Diffusion and Peer Influence
05 Whole Network Descriptive Stats
Community Detection in Social Media
Comtella: Adaptive Rewards Mechanism to Incentivize Participation in Online C...
Snowbird comp-top-may2017
00 Automatic Mental Health Classification in Online Settings and Language Emb...
Ad

Similar to Least Cost Influence in Multiplex Social Networks (20)

PDF
Least Cost Influence by Mapping Online Social Networks
PDF
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
PPT
Spread influence on social networks
PDF
Kdd12 tutorial-inf-part-iii
PDF
Homophily and influence in social networks
PDF
Twittering Dissent
PDF
Kdd12 tutorial-inf-part-iv
PPTX
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
PDF
An improvised model for identifying influential nodes in multi parameter soci...
PDF
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
PPTX
Maximizing the Spread of Influence through a Social Network (1).pptx
PPTX
2012 kdd-com soc:adaptive transfer of user behaviors over composite social ne...
PDF
SDM 2019 Keynote
PPT
Least cost rumor blocking in social networks
PPTX
Big-O(Q) Social Network Analytics
PPTX
Big-O(Q) VLDB 2015 Keynote: Social Network Analytics: Beyond the Obvious
PDF
Influence-based Network-oblivious - ICDM 2013
PDF
Maximizing the Diversity of Exposure in a Social Network
PDF
Consensus on Multiplex Network To Calculate User Influence in Social Networks
PDF
Content-based link prediction
Least Cost Influence by Mapping Online Social Networks
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Spread influence on social networks
Kdd12 tutorial-inf-part-iii
Homophily and influence in social networks
Twittering Dissent
Kdd12 tutorial-inf-part-iv
From Competition to Complementarity: Comparative Influence Diffusion and Maxi...
An improvised model for identifying influential nodes in multi parameter soci...
Who will follow whom? Exploiting Semantics for Link Prediction in Attention-I...
Maximizing the Spread of Influence through a Social Network (1).pptx
2012 kdd-com soc:adaptive transfer of user behaviors over composite social ne...
SDM 2019 Keynote
Least cost rumor blocking in social networks
Big-O(Q) Social Network Analytics
Big-O(Q) VLDB 2015 Keynote: Social Network Analytics: Beyond the Obvious
Influence-based Network-oblivious - ICDM 2013
Maximizing the Diversity of Exposure in a Social Network
Consensus on Multiplex Network To Calculate User Influence in Social Networks
Content-based link prediction
Ad

Recently uploaded (20)

PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
Big Data Technologies - Introduction.pptx
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
KodekX | Application Modernization Development
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Empathic Computing: Creating Shared Understanding
PPTX
Spectroscopy.pptx food analysis technology
PDF
Spectral efficient network and resource selection model in 5G networks
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Encapsulation theory and applications.pdf
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Big Data Technologies - Introduction.pptx
Chapter 3 Spatial Domain Image Processing.pdf
Reach Out and Touch Someone: Haptics and Empathic Computing
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
KodekX | Application Modernization Development
Dropbox Q2 2025 Financial Results & Investor Presentation
Empathic Computing: Creating Shared Understanding
Spectroscopy.pptx food analysis technology
Spectral efficient network and resource selection model in 5G networks
Programs and apps: productivity, graphics, security and other tools
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Network Security Unit 5.pdf for BCA BBA.
The Rise and Fall of 3GPP – Time for a Sabbatical?
Review of recent advances in non-invasive hemoglobin estimation
sap open course for s4hana steps from ECC to s4
Encapsulation theory and applications.pdf
20250228 LYD VKU AI Blended-Learning.pptx
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx

Least Cost Influence in Multiplex Social Networks

  • 1. Least Cost Influence in Multiplex Social Networks MODEL REPRESENTATION AND ANALYSIS Presented by: Ayushi Jain Rahul Bobhate Natasha Mandal Ankur Sachdeva Dung T. Nguyen, Huiyuan Zhang, Soham Das, My T. Thai, Thang N. Dinh
  • 2. Structure • Define a few terms • Motivation • Related work • Challenges and proposed solution • Math notations and problem definition • Lossless coupling • Lossy coupling • Influence relay • Experiments • Conclusion
  • 3. What are Multiplex networks? • Networks extended to multiple edges between nodes like in more than one social media platforms • Example: A set of users who interact of Facebook, Twitter & Foursquare
  • 4. What is least cost influence (LCI) problem? • A minimum number of seed users who can eventually influence a large number of users • Example: How to find the least advertising cost set of influencers who can influence a massive number of users Or How to find the minimum number of inducements required for the product adoption to reach a certain proportion of the population
  • 5. motivation • In the recent decade, the popularity of OSNs has created a major communication medium which allows for information sharing • Similar to real social networks: word-of-mouth & peer-pressure effect Do you know how much time does an individual spend(on average) on social media? 1.72 hours per day 28% of online activity
  • 6. some more statistics! Number of Facebook Users
  • 8. Number of users(in millions)
  • 9. Why is it important to study information diffusion in these networks?? • Considerable number of overlapping users • Users can relay the information from one network to another • Example: Jack
  • 10. If we only consider the information propagation in one network, we’ll fail to identify the most influent users
  • 11. Single network • Kempe et al. • Find a set of k users who can maximize influence • Stochastic process- Independent Cascade Model (IC) • Probability of influencing friends α Strength of Friendship • NP Hard- greedy algorithm with approximation ratio (1-1/e) • Linear Threshold Model (LT) • User adopts a new product when total influence of friends exceeds a threshold • Dinh et al. • Suggested algorithm for a special case of LT • Influence between users is uniform and user is influenced if a certain fraction ρ of his friends are active Related work
  • 12. Multiplex Networks • Yagan et al. • Studied connection between online and offline networks • Investigated outbreak of information using SIR model on random networks • Liu at al. • Analyzed networks formed by online interaction and offline events Drawbacks: • Studied flow of information and network clustering but not LCI • Did not study specific optimization problem of viral marketing • Shen et al. • Studied information propagation in multiplex OSN • Combined all networks into one network by representing an overlapping user as a super node • Cannot preserve individual networks’ properties
  • 13. challenges How to evaluate influence of overlapping users in multiplex networks? In which network, a user is easier to be influenced? Which network propagates the influence better?
  • 14. • In this paper, we study LCI for a set of users with minimum cardinality to influence a certain fraction of users in multiplex networks • Represent a model for various coupling schemes to reduce the problem in multiplex networks to an equivalent problem on a single network. Coupling schemes can be applied for most popular diffusion models including: Linear Threshold model, Stochastic Threshold model, and Independent Cascading model • Introduce a new metric called influence relay to analyze the influence diffusion process in both- a single network and multiplex networks Proposed solution
  • 15. Graph Notations • Gi – Weighted directed graph consisting of (Vi, Ei, θi, Wi). • Vi – Set of vertices in graph Gi, represents users in the network. • Ei – Set of edges in graph Gi, which represent the connection between the users. • Wi – Set of weights of the edges which belong to Ei, which represents the strength of influence or the strength of connection. • Nu i- , Nu i+ – Set of incoming and outgoing neighbors of u. • θi(u) – Threshold indicating the persistence of opinions of u.
  • 16. Least Cost Influence (LCI) Problem definition • Given: • A system of k networks $G^{1..k}$ • A set of users U • A number of propagation hops d • A fraction 0 < β < 1 • To find: • A seed set S ⊂ U of minimum cardinality such that • at least a β fraction of the users in U are active • after d hops
  • 19. Linear Threshold model • Influence and information diffusion model for a single network • Can be extended to handle multiple networks • In the LT model: • Every user is either active or inactive • A user u is active if he/she has accepted the information • An inactive user becomes active when the total influence of their active neighbors reaches their threshold. • At each time hop, newly activated users go on to activate further users. • d is the number of hops over which the information is propagated. • The set of users active after d hops caused by seed set S is denoted by $A_d(G^{1..k}, S)$
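The LT process and the LCI objective described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the data layout (`weights`, `thresholds`), the function names, the toy networks, and the choice of "influence at least equal to the threshold" as the activation test are all assumptions made for the example. The `greedy_seeds` helper is a generic greedy heuristic and is not guaranteed to return a minimum seed set.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]                                  # (source, target)
Network = Tuple[Dict[Edge, float], Dict[str, float]]    # (weights, thresholds)


def active_after(networks: List[Network], seeds: Set[str], d: int) -> Set[str]:
    """Users active after d synchronous LT hops over all networks."""
    active = set(seeds)
    for _ in range(d):
        newly = set()
        for weights, thresholds in networks:
            incoming: Dict[str, float] = {}
            for (v, u), w in weights.items():
                if v in active and u not in active:
                    incoming[u] = incoming.get(u, 0.0) + w
            for u, total in incoming.items():
                # Assumption: u activates when influence reaches its threshold
                # in at least one network it belongs to.
                if u in thresholds and total >= thresholds[u]:
                    newly.add(u)
        if not newly:
            break
        active |= newly
    return active


def greedy_seeds(networks: List[Network], users: Set[str], d: int, beta: float) -> List[str]:
    """Generic greedy heuristic: add the user with the largest spread until
    at least a beta fraction of users is active after d hops."""
    seeds: List[str] = []
    while len(active_after(networks, set(seeds), d) & users) < beta * len(users):
        candidates = sorted(users - set(seeds))
        best = max(candidates, key=lambda u: len(active_after(networks, set(seeds) | {u}, d)))
        seeds.append(best)
    return seeds


# Hypothetical toy system: user "c" overlaps both networks and relays influence.
net1: Network = ({("a", "b"): 0.6, ("b", "c"): 0.5}, {"a": 0.5, "b": 0.5, "c": 0.5})
net2: Network = ({("c", "d"): 0.4}, {"c": 0.9, "d": 0.3})
networks = [net1, net2]
users = {"a", "b", "c", "d"}

print(sorted(active_after(networks, {"a"}, 3)))
print(greedy_seeds(networks, users, 3, 0.75))
```

In the toy system, user `d` is only reachable because `c` is activated in the first network and then relays the information into the second, which is the effect a single-network analysis would miss.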
  • 20. Coupling Schemes • Lossless coupling scheme: • Scheme to combine multiple networks into single network. • No loss of data while combining networks. (Obviously!) • Advantages: • Use existing algorithms • Same quality of solution
  • 21. Challenges • Heterogeneity of user participation: • One user might have joined a single network • Another user might have joined multiple networks • Recognizing the same user across networks is difficult • Inter-network influence propagation • A user transmits the information in multiple networks • The transmission of influence between networks must be represented in a single network. • Preserving properties of individual networks • The coupled network should preserve the diffusion properties of the individual networks. • It should be possible to relate the solution for the coupled network to the solution for the individual networks
  • 23. Coupling scheme for LT-model • Solution to 1st challenge • Introduce dummy nodes. • They represent a user u in a network Gi in which the user is not registered. • Solution to 2nd challenge • Introduce gateway vertices. • Introduce synchronization edges. • Instead of an edge between two vertices, there exist • An edge from a user to a gateway vertex • And an edge from the gateway vertex to a user • Solution to 3rd challenge • Don’t need to do anything else.
  • 27. Lemmas • Lemma 1: Suppose that the propagation process in the coupled network $G$ starts from a seed set that contains only gateway vertices, $S = \{s_1^0, \dots, s_p^0\}$; then representative vertices are activated only at even propagation hops. • Lemma 2: Suppose that the propagation process on $G^{1..k}$ and $G$ starts from the same seed set $S$; then the following conditions are equivalent: • User $u$ is active after $d$ propagation hops in $G^{1..k}$. • There exists $i$ such that $u^i$ is active after $2d - 1$ propagation hops in $G$. • Vertex $u^0$ is active after $2d$ propagation hops in $G$.
  • 28. Theorems • Theorem 1: Given a system of $k$ networks $G^{1..k}$ with the user set $U$, the coupled network $G$ produced by the lossless coupling scheme, and a seed set $S = \{s_1, s_2, \dots, s_p\}$, if $A_d(G^{1..k}, S) = \{a_1, a_2, \dots, a_q\}$ is the set of active users caused by $S$ after $d$ propagation hops in the multiplex networks, then $A_{2d}(G, S) = \{a_1^0, a_1^1, \dots, a_1^k, \dots, a_q^0, a_q^1, \dots, a_q^k\}$ is the set of active vertices caused by $S$ after $2d$ propagation hops in the coupled network. • Theorem 2: When the lossless scheme is used, the set $S = \{s_1, s_2, \dots, s_p\}$ influences a $\beta$ fraction of users in $G^{1..k}$ after $d$ propagation hops if and only if $S = \{s_1^0, s_2^0, \dots, s_p^0\}$ influences a $\beta$ fraction of vertices in the coupled network $G$ after $2d$ propagation hops.
  • 29. Extension to other diffusion models • The lossless coupling scheme can be used for other diffusion models: • Stochastic Threshold model • Independent Cascade model • The construction is the same as for the LT model, using • Gateway vertices • Representative vertices • Synchronization edges
  • 30. Lossy Coupling MOTIVATION • The lossless coupled network shown earlier contains a large number of extra vertices and edges. • Ideally, we want a compact coupled network that contains only users as vertices. • Such a compact coupled network will inevitably lose some information.
  • 31. Lossy Coupling GOALS • The goal is to design a scheme which will minimize this loss of information. • The solution for finding the Least Cost Influence in the compact coupled network should be very close to the solution in the original multiplex network.
  • 32. Lossy Coupling OBSERVATION 1 • A user $u$ will be activated if there exists $i$ such that $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) \ge \theta^i(u)$, where $A$ is the set of active users. • We can relax the condition to activate $u$ with positive parameters $\alpha^i(u)$ as follows: $\sum_{i=1}^{k} \alpha^i(u) \sum_{v \in N_u^{i-} \cap A} w^i(v, u) \ge \sum_{i=1}^{k} \alpha^i(u)\,\theta^i(u)$
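  • 33. Lossy Coupling PROPOSITION 1 • For a system of networks $G^{1..k}$, if $\sum_{i=1}^{k} \alpha^i(u) \sum_{v \in N_u^{i-} \cap A} w^i(v, u) \ge \sum_{i=1}^{k} \alpha^i(u)\,\theta^i(u)$ is satisfied, then user $u$ is activated. • Reason: if the activation condition failed in every single network $i$, i.e. $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) < \theta^i(u)$ for all $i$, then multiplying each inequality by $\alpha^i(u) > 0$ and summing over $i$ would contradict the inequality above. Hence the condition holds in at least one network, and $u$ is activated there.
  • 34. Lossy Coupling • $\alpha^i(u)$ can account for the extra influence that may be required to activate $u$ • $\alpha^i(u)$ can be made proportional to $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) - \theta^i(u)$. In this way, when $\sum_{v \in N_u^{i-} \cap A} w^i(v, u) > \theta^i(u)$, we choose $\alpha^i(u) \gg \alpha^j(u)$ for all $j \ne i$. • In real life, we don’t know in which network $u$ will be activated. Hence, we have to use heuristics.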
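To make the relaxed condition and the role of the parameters concrete, here is a small Python sketch. The function names and the numbers are hypothetical, chosen only for illustration. It shows that the relaxed condition is sufficient but not necessary: with uniform parameters it can miss an activation that a single network already guarantees, while a larger parameter on the activating network recovers it.

```python
from typing import List


def single_network_condition(influence: List[float], theta: List[float]) -> bool:
    """True if, in at least one network, the active in-neighbors' total
    influence on u reaches u's threshold in that network."""
    return any(inf >= th for inf, th in zip(influence, theta))


def relaxed_condition(alpha: List[float], influence: List[float], theta: List[float]) -> bool:
    """Weighted (relaxed) condition with positive parameters alpha."""
    lhs = sum(a * inf for a, inf in zip(alpha, influence))
    rhs = sum(a * th for a, th in zip(alpha, theta))
    return lhs >= rhs


# Hypothetical user u in two networks: influence received and threshold per network.
influence = [0.3, 0.1]
theta = [0.2, 0.5]

uniform = relaxed_condition([1.0, 1.0], influence, theta)   # compares 0.4 with 0.7
skewed = relaxed_condition([10.0, 1.0], influence, theta)   # compares 3.1 with 2.5

print(single_network_condition(influence, theta), uniform, skewed)
```

Here `u` is activated in the first network, the uniform parameters fail to detect it, and weighting the first network heavily makes the relaxed condition hold.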
  • 35. Lossy Coupling OBSERVATION 2 • When $u$ participates in multiple networks, it may be easier to influence $u$ in some networks than in others. • For example, suppose a node $u$ is in two networks: • Network 1: $\theta^1(u) = 0.1$, $u$ has 8 in-neighbors and each in-neighbor $v$ influences $u$ with $w^1(v, u) = 0.1$, so it takes 1 neighbor to activate $u$. • Network 2: $\theta^2(u) = 0.7$, $u$ has 8 in-neighbors and each in-neighbor $v$ influences $u$ with $w^2(v, u) = 0.1$, so it takes 7 neighbors to activate $u$.
  • 36. Lossy Coupling EASINESS • Intuitively, $u$ is easier to influence in Network 1. • Formally, the easiness is $\epsilon^i(u) = \frac{\sum_{v \in N_u^{i-}} w^i(v, u)}{\theta^i(u)}$ • We can use the easiness $\epsilon^i(u)$ as $\alpha^i(u)$ in the relaxed condition stated in OBSERVATION 1.
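As a quick check of the easiness definition on the two-network example from Observation 2, the following Python sketch computes the ratio of total incoming weight to threshold. The helper name `easiness` and its argument layout are assumptions made for illustration.

```python
from typing import List


def easiness(in_weights: List[float], threshold: float) -> float:
    """Total incoming influence on a user divided by the user's threshold."""
    return sum(in_weights) / threshold


# Observation 2: eight in-neighbors of weight 0.1 in both networks,
# thresholds 0.1 (Network 1) and 0.7 (Network 2).
eps_1 = easiness([0.1] * 8, 0.1)
eps_2 = easiness([0.1] * 8, 0.7)

print(f"easiness in Network 1: {eps_1:.3f}")
print(f"easiness in Network 2: {eps_2:.3f}")
```

The value for Network 1 is about 8 and the value for Network 2 is about 1.14, which matches the intuition that the user is much easier to influence in Network 1.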
  • 37. Lossy Coupling • The vertex set is the set of users $V = \{u_1, \dots, u_n\}$ • The threshold of vertex $u$ is $\theta(u) = \sum_{i=1}^{k} \epsilon^i(u)\,\theta^i(u)$ • The weight of edge $(v, u)$ is $w(v, u) = \sum_{i=1}^{k} \epsilon^i(u)\,w^i(v, u)$, where $w^i(v, u) = 0$ if there is no edge from $v$ to $u$ in network $i$
  • 38. Lossy Coupling • For the blue node, $\theta(u) = \sum_{i=1}^{k} \epsilon^i(u)\,\theta^i(u) = \frac{0.2 + 0.1}{0.2} \cdot 0.2 + \frac{0.5}{0.5} \cdot 0.5 = 0.8$ • For the edge between the red node and the blue node, $w(v, u) = \sum_{i=1}^{k} \epsilon^i(u)\,w^i(v, u) = \frac{0.2 + 0.1}{0.2} \cdot 0.2 + 0 = 0.3$
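The lossy coupling rule and the worked example for the blue node can be reproduced with a short Python sketch. It is a minimal illustration under stated assumptions: the dictionary layout and function names are invented for the example, and the node names "green" and "yellow" are hypothetical stand-ins for the blue node's other in-neighbors, since the transcript only gives the weights (0.2 and 0.1 in the first network, 0.5 in the second) and thresholds (0.2 and 0.5).

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]                                  # (source, target)
Network = Tuple[Dict[Edge, float], Dict[str, float]]    # (weights, thresholds)


def easiness(weights: Dict[Edge, float], thresholds: Dict[str, float], u: str) -> float:
    """Total incoming weight of u in one network divided by u's threshold there."""
    total_in = sum(w for (_, dst), w in weights.items() if dst == u)
    return total_in / thresholds[u]


def lossy_couple(networks: List[Network]):
    """Combine networks into one: thresholds and edge weights are summed
    across networks, each scaled by the target's easiness in that network.
    Every edge target must have a threshold in its network."""
    theta: Dict[str, float] = {}
    weight: Dict[Edge, float] = {}
    for weights, thresholds in networks:
        eps = {u: easiness(weights, thresholds, u) for u in thresholds}
        for u, th in thresholds.items():
            theta[u] = theta.get(u, 0.0) + eps[u] * th
        for (v, u), w in weights.items():
            weight[(v, u)] = weight.get((v, u), 0.0) + eps[u] * w
    return theta, weight


# Only the blue node's parameters are modeled; other thresholds are omitted
# because the worked example does not use them.
net1: Network = ({("red", "blue"): 0.2, ("green", "blue"): 0.1}, {"blue": 0.2})
net2: Network = ({("yellow", "blue"): 0.5}, {"blue": 0.5})

theta, weight = lossy_couple([net1, net2])
print(round(theta["blue"], 6), round(weight[("red", "blue")], 6))
```

Because the red-to-blue edge exists only in the first network, its coupled weight receives no contribution from the second network, which is the "+ 0" term in the worked example.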
  • 39. Lossy Coupling INVOLVEMENT • If a user is surrounded by a group of friends who have a high influence on each other, the user tends to get influenced. • We estimate the involvement of a node $v$ in a network $G^i$ by measuring how strongly the 1-hop neighborhood of $v$ is connected and to what extent influence can propagate from one node to another within that neighborhood.
  • 40. Lossy Coupling • Formally, the involvement of a node $v$ in a network $G^i$ is defined as $\sigma_v^i = \sum_{x, y \in N_v^i \cup \{v\}} \frac{w^i(x, y)}{\theta^i(y)}$, where $N_v^i = N_v^{i+} \cup N_v^{i-}$ • AVERAGE • All parameters have the same value, i.e. $\alpha^i(u) = 1$
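The involvement measure can be illustrated with a few lines of Python. This sketch assumes one particular reading of the formula: the sum runs over every edge whose two endpoints both lie in the closed 1-hop neighborhood of the node, and each edge contributes its weight divided by the threshold of its target. The function name, data layout, and toy network are hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def involvement(weights: Dict[Edge, float], thresholds: Dict[str, float], v: str) -> float:
    """Sum of w(x, y) / theta(y) over edges inside v's closed 1-hop neighborhood."""
    hood: Set[str] = {v}
    for (x, y) in weights:
        if x == v:
            hood.add(y)      # outgoing neighbor of v
        if y == v:
            hood.add(x)      # incoming neighbor of v
    return sum(
        w / thresholds[y]
        for (x, y), w in weights.items()
        if x in hood and y in hood
    )


# Hypothetical network: "c" is two hops away from "v", so the edge c -> a is ignored.
weights = {("a", "v"): 0.5, ("v", "b"): 0.4, ("a", "b"): 0.2, ("c", "a"): 0.3}
thresholds = {"a": 0.6, "b": 0.4, "c": 0.5, "v": 0.5}

sigma_v = involvement(weights, thresholds, "v")
print(round(sigma_v, 6))
```

For the toy network the three edges inside the neighborhood contribute 1, 1, and 0.5, so the involvement of `v` is 2.5.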
  • 41. Lossy Coupling THEOREM 3 • When a lossy coupling scheme is used, if the set of users $S$ activates a $\beta$ fraction of the users in $G$ (the lossy coupled network), then it activates at least a $\beta$ fraction of the users in $G^{1..k}$ (the original system). • The proof is based on the fact that the active state of a user in $G$ implies the active state of that user in $G^{1..k}$.
  • 43. Influence Relay MOTIVATION • When information is diffused in multiplex networks, it may flow within a single network or may travel through multiplex networks. • What is the contribution of each component network in the influence process? • How much information flows within a network or between networks? • Quantifying these values will help us understand the diffusion process in multiplex networks.
  • 44. Influence Relay DEFINITION • The authors proposed influence relay as a metric to quantify the role of users in propagating information. • The influence relay of vertices is defined recursively, depending on the order of activation. • $S$ = seed set, $G$ = coupled network, $d$ = number of hops after which the activation process stops, $h(u)$ = hop at which $u$ is activated. • All inactive vertices, i.e. those in $V \setminus A_d(G, S)$, have an influence relay of 0.
  • 45. Influence Relay • For each activated vertex $u \in A_d(G, S)$, the influence relay of $u$, denoted by $IR(u)$, is a linear combination of the influence relay of its outgoing neighbors that are activated after $u$. • Formally, $IR(u)$ for a vertex $u \in A_d(G, S)$ is defined as: $IR(u) = 1 + \sum_{v \in N_u^{+},\ h(v) > h(u)} \frac{w(u, v)\, IR(v)}{\sum_{z \in N_v^{-},\ h(z) < h(v)} w(z, v)}$
  • 46. Influence Relay • The influence relay captures the amount of influence a vertex relays to other vertices after adopting the information. • Thus, the influence relay of a vertex $u$ depends largely on the influence relay of the vertices that $u$ helps to activate and on the weights of the edges between $u$ and them. • The vertex $u$ is responsible for a fraction $\frac{w(u, v)}{\sum_{z \in N_v^{-},\ h(z) < h(v)} w(z, v)}$ of $v$’s influence relay. • We add 1 to the influence relay of $u$ since $u$ itself also counts toward the set of activated vertices.
  • 47. Influence Relay COMPUTING INFLUENCE RELAY • We compute the influence relay of vertices in reverse order of the diffusion process. • We construct the influence graph $IG_S = (V_S, E_S)$ from the seed set $S$ to represent the diffusion process and to calculate the influence relay of all nodes in $V_S$. • The vertex set $V_S$ of $n_S$ nodes is $A_d(G, S)$. • There is an edge from $u$ to $v$ in $E_S$ if $u$ has passed information to $v$, i.e. $u, v \in A_d(G, S)$ and $h(v) > h(u)$. • $IG_S$ is a directed acyclic graph, and a reverse topological ordering of $IG_S$ can be computed in linear time. The main loop runs over all the edges in $IG_S$, so the influence relay of all vertices can be computed in linear time.
  • 48. Input: A network $G$, a seed set $S$ and the number of hops $d$.
      Output: The influence relay $IR$ of all vertices.
      $IG_S$ ← the influence graph caused by $S$ on $G$
      for each $u \in A_d(G, S)$ do
          $IR(u)$ ← 0
      end for
      Compute the topological ordering $u_1, u_2, \dots, u_{n_S}$ of the vertices in $V_S$
      for $i = n_S$ down to 1 do
          $IR(u_i)$ ← $IR(u_i) + 1$
          total ← 0
          for each $v \in N_{u_i}^{-}$ do
              total ← total + $w(v, u_i)$
          end for
          for each $v \in N_{u_i}^{-}$ do
              $IR(v)$ ← $IR(v) + \frac{w(v, u_i)\, IR(u_i)}{\text{total}}$
          end for
      end for
      Return $IR$
  • 49. Influence Relay THEOREM 4 • One of the important properties of influence relay is that it preserves the number of activated vertices. • The total influence relay of the seed vertices is equal to the total number of activated vertices: $\sum_{u \in S} IR(u) = |A_d(G, S)|$
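The backward computation of influence relay and the conservation property of Theorem 4 can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the authors' code: the function names, the dictionary layout, the toy network, and the "influence at least equal to the threshold" activation rule are choices made for the example. Processing vertices by decreasing activation hop is used as the reverse topological order, which is valid because every edge of the influence graph goes from an earlier hop to a later one.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def activation_hops(weights: Dict[Edge, float], thresholds: Dict[str, float],
                    seeds: Set[str], d: int) -> Dict[str, int]:
    """LT diffusion on one (coupled) network; returns the hop h(u) of each active u."""
    hop = {s: 0 for s in seeds}
    for t in range(1, d + 1):
        incoming: Dict[str, float] = {}
        for (v, u), w in weights.items():
            if v in hop and u not in hop:
                incoming[u] = incoming.get(u, 0.0) + w
        newly = [u for u, total in incoming.items() if total >= thresholds[u]]
        if not newly:
            break
        for u in newly:
            hop[u] = t
    return hop


def influence_relay(weights: Dict[Edge, float], hop: Dict[str, int]) -> Dict[str, float]:
    """Backward pass over the influence graph induced by the activation hops."""
    ir = {u: 0.0 for u in hop}
    for u in sorted(hop, key=lambda x: hop[x], reverse=True):
        ir[u] += 1.0  # u itself counts as one activated vertex
        # In-neighbors of u in the influence graph: active and activated earlier.
        preds = [(v, w) for (v, dst), w in weights.items()
                 if dst == u and v in hop and hop[v] < hop[u]]
        total = sum(w for _, w in preds)
        for v, w in preds:
            ir[v] += w * ir[u] / total
    return ir


# Hypothetical coupled network with two seeds.
weights = {("s1", "a"): 0.6, ("s2", "a"): 0.4, ("a", "b"): 1.0, ("s2", "c"): 1.0}
thresholds = {"s1": 1.0, "s2": 1.0, "a": 0.8, "b": 0.5, "c": 0.5}
seeds = {"s1", "s2"}

hop = activation_hops(weights, thresholds, seeds, d=3)
ir = influence_relay(weights, hop)
print({u: round(x, 6) for u, x in sorted(ir.items())})
print(round(sum(ir[s] for s in seeds), 6), len(hop))
```

In this toy run all five vertices become active, and the influence relay of the two seeds sums to 5, in line with the conservation property stated in Theorem 4.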
  • 50. Influence Relay INFLUENCE CONTRIBUTION • To obtain the contribution of a network to the diffusion process, we sum up 𝑖𝑛𝑓𝑙𝑢𝑒𝑛𝑐𝑒 𝑟𝑒𝑙𝑎𝑦 of all seed vertices in that network. INTERNAL AND EXTERNAL INFLUENCE • This can be used to quantify the amount of information flowing within and between networks.
  • 51. Influence Relay • When the information is propagated within a component network, called the “target” network, there are two kinds of influence paths: • Internal paths include edges only in the target network. • External paths include some edges of other networks. They are formed when some of the vertices are activated outside the target network. • We adapt influence relay to measure the internal influence (which passes through internal paths) and the external influence (which passes through external paths) of the seed set in the target network as follows:
  • 52. Influence Relay • Each vertex $u$ has an internal influence $IR_{in}(u)$ and an external influence $IR_{ex}(u)$. • Both values are calculated backwards from the activated vertices under $u$’s influence. • Only an activated vertex $u$ in the target network receives 1 more influence unit in $IR_{in}(u)$, since we only consider the influence propagation in the target network. • If a vertex is activated outside the target network, all of its internal influence is converted to external influence.
  • 55. Data Sets • Types of datasets: • Real networks • Synthesized networks
  • 56. Real Networks • Experiments were performed on 2 data sets: • Foursquare (FSQ) and Twitter networks • Co-author networks in the areas of Condensed Matter (CM), High-Energy Theory (Het), and Network Science (NetS) • The number of overlapping users in the first dataset, FSQ-Twitter, is 4100. • For the second dataset, the numbers of overlapping users of the network pairs CM-Het, CM-NetS, and Het-NetS are 2860, 517, and 90, respectively.
  • 57. Real Networks • Edge weights are randomly assigned from 0 to 1. • The edge weights are then normalized so that the total weight of the incoming edges of each node is 1. • The threshold of each node is a random value from 0 to 1.
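The parameter assignment described above (random weights, normalization of incoming weights, random thresholds) might look like the following Python sketch. The function name, the tiny edge list, and the use of a uniform distribution are assumptions; the slides do not say which distribution or random generator was used.

```python
import random
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (source, target)


def assign_parameters(edges: List[Edge], nodes: Iterable[str], rng: random.Random):
    """Random edge weights, normalized so incoming weights of each node sum to 1,
    plus a random threshold per node."""
    # 1 - random() lies in (0, 1], which avoids a zero total during normalization.
    raw = {e: 1.0 - rng.random() for e in edges}
    in_total: Dict[str, float] = {}
    for (_, u), w in raw.items():
        in_total[u] = in_total.get(u, 0.0) + w
    weights = {(v, u): w / in_total[u] for (v, u), w in raw.items()}
    thresholds = {u: rng.random() for u in nodes}
    return weights, thresholds


# Hypothetical three-node network with a fixed seed for repeatability.
edges = [("a", "c"), ("b", "c"), ("a", "b")]
nodes = ["a", "b", "c"]
weights, thresholds = assign_parameters(edges, nodes, random.Random(0))

incoming_c = weights[("a", "c")] + weights[("b", "c")]
print(round(incoming_c, 6), round(weights[("a", "b")], 6))
```

After normalization a node with a single in-neighbor gets an incoming weight of exactly 1, so whether that neighbor alone can activate it depends only on the node's random threshold.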
  • 58. Synthesized Networks • Synthesized networks generated by the Erdős–Rényi random network model are used for testing networks with controlled parameters. • Two networks with 10000 nodes are formed by randomly connecting each pair of nodes with probabilities 0.0008 and 0.006. • The average degrees of the two networks are therefore about 8 and 60 (number of nodes × connection probability).
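The relationship between the connection probability and the average degree can be checked with a short Python sketch. It assumes undirected graphs and uses hypothetical helper names; the small 400-node graph is only there to keep the example fast and is not one of the networks from the experiments.

```python
import random
from itertools import combinations
from typing import List, Tuple


def expected_degree(n: int, p: float) -> float:
    """Expected degree of a node in an undirected G(n, p) graph."""
    return (n - 1) * p


def erdos_renyi(n: int, p: float, rng: random.Random) -> List[Tuple[int, int]]:
    """Connect each unordered pair of nodes independently with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def average_degree(n: int, edges: List[Tuple[int, int]]) -> float:
    return 2 * len(edges) / n


# The two settings from the slide: both values are close to 8 and 60.
print(expected_degree(10000, 0.0008), expected_degree(10000, 0.006))

# A much smaller sample graph with a similar expected degree.
small = erdos_renyi(400, 0.02, random.Random(1))
print(round(average_degree(400, small), 2))
```

The sampled average degree fluctuates around the expected value, which is why the slide reports the degrees as 8 and 60 rather than exact figures.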
  • 59. Comparison of coupling schemes Solution Quality • In both networks the seed size is smallest when the lossless coupling scheme is used. • The seed sizes are only a bit larger using the lossy coupling schemes.
  • 60. Comparison of coupling schemes • The small seed size is obtained through two different means: • Increasing the fraction of overlapping users. • Increasing the number of propagation hops.
  • 61. Comparison of coupling schemes Running Time • The greedy algorithm runs much faster in the lossy coupled networks than in the lossless coupled networks. • Using the lossy coupled networks reduces the running times by a factor of 2 in FSQ-Twitter and a factor of 4 in the co-author networks in comparison to using the lossless coupled networks. • The major disadvantages of the lossless coupling scheme are the doubled number of hops and the number of extra nodes and edges.
  • 62. Advantages of using coupled networks Influencing a fraction β of the nodes in all networks: • The results using our lossless coupling method outperform the results obtained when we run the greedy algorithm on each network separately and take the union of the produced seed sets. • In the co-author networks, the seed set obtained from the separate runs is 30% larger, and in FSQ-Twitter it is 47% larger, than the seed set found using the lossless coupling method.
  • 63. Influencing a fraction β of the nodes in a particular network: • The seed size decreases by up to 9%, 25%, 17%, and 26% in CM, Het, FSQ, and Twitter, respectively, when we consider these networks in connection with other networks. • The external influence is substantial and accounts for large portions in many cases. For instance, when the influenced fraction β = 0.2, the external influence accounts for 27.3%, 52.7%, and 30.0% of the total influence in CM, Het, and NetS, respectively.
  • 64. Analysis of seed sets • A significant fraction of the seed set consists of overlapping nodes, although only 5%-7% of the users of any network are overlapping users. • For β = 0.4, the fraction of overlapping seed vertices is around 24.9% and 25% in the co-author and FSQ-Twitter networks, respectively. • When β is small, the influence contribution of overlapping users is high (approx. 50% when β = 0.2). However, when β is large, the overlapping users have already been selected, so they are no longer favored.
  • 65. Mutual Impact of networks • When k increases from 2 to 5, the seed size decreases several times. This implies that the introduction of a new OSN increases the diffusion of information significantly. • The number of influenced vertices rises by 46% with the support of 3 new networks when k is changed from 2 to 5. • The fraction of external influence also increases dramatically, from 39% when k = 2 to 67% when k = 5. • All these results suggest that the existing networks may benefit from a newly introduced competitor.
  • 66. Conclusion and future work • To tackle the LCI problem, novel coupling schemes are introduced to reduce the problem to a version on a single network. • A new metric is designed to quantify the flow of influence inside and between networks based on the coupled network. • Extensive experiments provide new insights into information diffusion in multiplex networks. • In the future, the LCI problem can be investigated in multiplex networks with heterogeneous diffusion models, in which each network may have its own diffusion model.