Absorbing Random Walk Centrality

Theory and Algorithms

Harry Mavroforakis, Boston University
Michael Mathioudakis, Aalto University
Aristides Gionis, Aalto University

Helsinki, September 15th, 2015

submit query to Twitter, e.g. ‘#ferguson’
want to see messages from few, central users
many and different groups of people might be posting about it
one approach... represent activity with graph

users in the results (query nodes)
other users
connections

select k central query nodes
what is ‘central’?
many measures have been studied
here, we use random walks for robustness
absorbing random walk centrality

absorbing random walk centrality

model
start: from query node q w.p. s(q)
re-start: with probability α at each step
transitions: follow edges at random from one node to another
absorption: we can designate a set of absorbing nodes; no escape from absorbing nodes
centrality of nodes S: expected time (number of steps) until absorbed by S
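The model above can be turned into a small computation. The following is a minimal sketch, not the authors' implementation: it assumes a dense adjacency matrix, an unweighted graph in which every node has at least one edge, and the standard absorbing-Markov-chain result that the expected steps to absorption solve (I - P_TT) t = 1 over the non-absorbing nodes. The function name `absorption_time` and its parameters are hypothetical.

```python
import numpy as np

def absorption_time(adj, s, alpha, absorbing):
    """Expected number of steps until a walk is absorbed by `absorbing`.

    adj: dense adjacency matrix; s: start/restart distribution over nodes
    (non-zero only on query nodes); alpha: restart probability.
    """
    adj = np.asarray(adj, dtype=float)
    s = np.asarray(s, dtype=float)
    n = adj.shape[0]
    # Edge-following step: move to a uniformly random neighbour
    # (assumption: every node has at least one edge).
    walk = adj / adj.sum(axis=1, keepdims=True)
    # One step: restart at a node drawn from s w.p. alpha, else follow an edge.
    P = alpha * np.tile(s, (n, 1)) + (1.0 - alpha) * walk
    absorbing = set(absorbing)
    transient = [u for u in range(n) if u not in absorbing]
    # Restrict to non-absorbing nodes and solve (I - P_TT) t = 1.
    P_tt = P[np.ix_(transient, transient)]
    t = np.zeros(n)  # a walk that starts on an absorbing node needs 0 steps
    t[transient] = np.linalg.solve(np.eye(len(transient)) - P_tt,
                                   np.ones(len(transient)))
    # Average over the start distribution.
    return float(s @ t)

# Path graph 0 - 1 - 2; walks start at node 0; no restarts.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
start = [1.0, 0.0, 0.0]
print(absorption_time(path, start, 0.0, {2}))     # analytically 4 steps
print(absorption_time(path, start, 0.0, {1, 2}))  # analytically 1 step
```

Solving a linear system instead of forming the inverse explicitly is a design choice made here for numerical convenience; it yields the same expected times.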

problem
input: graph, query nodes, α & candidate nodes (e.g. query nodes or all nodes)
select: k candidate nodes with minimum absorption time

select nodes that are central w.r.t. the query nodes: k = 1

select nodes that are central w.r.t. the query nodes: k = 3

select nodes that are central w.r.t. the query nodes: k ≥ |Q|

outline
•  complexity
•  greedy algorithm
– naive greedy
– speed-up
•  heuristics & baselines
•  empirical evaluation

complexity
the problem is NP-hard
reduction from vertex cover

approximation
centrality measure
monotonicity: absorption time decreases with more absorbing nodes
supermodularity: diminishing returns

approximation
centrality gain
mc: min centrality for k=1
gain = mc - centrality, k>1
non-negative, non-decreasing, submodular
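Written out as formulas, the properties named on these two slides might look as follows. This is a sketch in assumed notation: ac_Q(S) stands for the absorption time (centrality) of a set S of absorbing nodes, V for the node set, and gain(S) for the centrality gain; the slides themselves state the properties only in words.

```latex
% Monotonicity: more absorbing nodes never increase the absorption time.
X \subseteq Y \subseteq V \;\Longrightarrow\; ac_Q(Y) \le ac_Q(X)

% Supermodularity (diminishing returns), for X \subseteq Y and u \in V \setminus Y:
ac_Q(X) - ac_Q(X \cup \{u\}) \;\ge\; ac_Q(Y) - ac_Q(Y \cup \{u\})

% Centrality gain, with mc the minimum centrality of a single node (k = 1):
mc = \min_{u} ac_Q(\{u\}), \qquad gain(S) = mc - ac_Q(S)
```

Because gain is a constant minus a non-increasing, supermodular function, it is non-decreasing and submodular, which is the form needed for the greedy guarantee.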

S
S ∪ {u}
S ∪ {u,v}

greedy algorithm
(1-1/e)-approximation guarantee for gain

greedy
S = empty
for i = 1..k
    for u in V - S
        calculate centrality of S ∪ {u} (*)
    update S := S ∪ {best u}
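The pseudocode above can be sketched in runnable form. This is an illustrative naive greedy under stated assumptions, not the authors' code: the helper `absorption_time` (hypothetical name) recomputes the centrality from scratch by solving a linear system for every candidate, the graph is a dense adjacency matrix in which every node has at least one edge, and ties are broken by candidate order.

```python
import numpy as np

def absorption_time(adj, s, alpha, absorbing):
    """Expected steps until absorption, averaged over start distribution s."""
    adj = np.asarray(adj, dtype=float)
    s = np.asarray(s, dtype=float)
    n = adj.shape[0]
    walk = adj / adj.sum(axis=1, keepdims=True)       # follow a random edge
    P = alpha * np.tile(s, (n, 1)) + (1.0 - alpha) * walk  # restart w.p. alpha
    absorbing = set(absorbing)
    transient = [u for u in range(n) if u not in absorbing]
    P_tt = P[np.ix_(transient, transient)]
    t = np.zeros(n)
    t[transient] = np.linalg.solve(np.eye(len(transient)) - P_tt,
                                   np.ones(len(transient)))
    return float(s @ t)

def greedy_select(adj, s, alpha, k, candidates):
    """Pick k candidates one at a time, each minimizing the absorption time."""
    chosen = []
    for _ in range(k):
        remaining = [u for u in candidates if u not in chosen]
        # Line (*): score the current set extended by each remaining candidate.
        scores = {u: absorption_time(adj, s, alpha, chosen + [u])
                  for u in remaining}
        chosen.append(min(scores, key=scores.get))
    return chosen

# Path graph 0 - 1 - 2 - 3 - 4, every node is a query node, no restarts.
n = 5
path = np.zeros((n, n))
for i in range(n - 1):
    path[i, i + 1] = path[i + 1, i] = 1.0
uniform = np.full(n, 1.0 / n)
print(greedy_select(path, uniform, 0.0, 1, range(n)))  # the middle node, [2]
```

Each call to `absorption_time` here costs a full linear solve, which is the bottleneck the next slide addresses.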

bottleneck is in line (*)
one matrix inversion (super-quadratic) for step (*)
use Sherman-Morrison to perform (*) in O(n^2), with one (1) inversion for first node
however... still O(kn^3)
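A rank-one update of this kind can be sketched as follows. The choice of vectors is an assumption made for illustration: making node u absorbing is modelled as zeroing row u of the transition matrix P, so that I - P changes by a b^T with a = e_u and b = the old row u of P. The function name `add_absorbing_node` is hypothetical.

```python
import numpy as np

def add_absorbing_node(F, P, u):
    """Update F = inv(I - P) after row u of P is zeroed, in O(n^2).

    Sherman-Morrison with a = e_u and b = P[u]:
    inv(A + a b^T) = F - (F a)(b^T F) / (1 + b^T F a), where A = I - P.
    """
    a = np.zeros(P.shape[0])
    a[u] = 1.0
    b = P[u].copy()          # outgoing probabilities of u before the change
    Fa = F @ a               # matrix-vector product, O(n^2)
    bF = b @ F               # vector-matrix product, O(n^2)
    return F - np.outer(Fa, bF) / (1.0 + b @ Fa)

# Small connected graph; node 0 is already absorbing (its row is zero).
adj = np.array([[0, 1, 0, 0, 1],
                [1, 0, 1, 0, 0],
                [0, 1, 0, 1, 1],
                [0, 0, 1, 0, 1],
                [1, 0, 1, 1, 0]], dtype=float)
P = adj / adj.sum(axis=1, keepdims=True)
P[0] = 0.0
F = np.linalg.inv(np.eye(5) - P)        # the single full inversion

F_new = add_absorbing_node(F, P, 3)     # node 3 becomes absorbing
P_new = P.copy()
P_new[3] = 0.0
print(np.allclose(F_new, np.linalg.inv(np.eye(5) - P_new)))
```

The final line compares the updated matrix with a direct inversion; the two should agree up to floating-point error.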

heuristics & baselines
•  personalized PageRank with same α
•  spectralQ & spectralD
– spectral embedding of nodes
– k-means on embedding of query nodes
– select candidates close to centers
– spectralQ selects more nodes from larger clusters
•  spectralC
– similar to spectralQ but clustering on candidates
•  degree & distance centrality
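As an illustration of the first baseline, personalized PageRank with restart probability α can be computed by solving one linear system and then keeping the k highest-scoring candidates. This is a sketch under assumptions: that the baseline ranks candidates by their PageRank score, that the graph is a dense adjacency matrix in which every node has at least one edge, and that the function names are hypothetical.

```python
import numpy as np

def personalized_pagerank(adj, s, alpha):
    """Stationary scores of a walk that restarts from s with probability alpha."""
    adj = np.asarray(adj, dtype=float)
    s = np.asarray(s, dtype=float)
    n = adj.shape[0]
    walk = adj / adj.sum(axis=1, keepdims=True)
    # Fixed point: pi = alpha * s + (1 - alpha) * walk^T pi
    return np.linalg.solve(np.eye(n) - (1.0 - alpha) * walk.T, alpha * s)

def top_k_by_pagerank(adj, s, alpha, k, candidates):
    """Return the k candidates with the highest personalized PageRank."""
    pi = personalized_pagerank(adj, s, alpha)
    return sorted(candidates, key=lambda u: -pi[u])[:k]

# Star graph: node 0 is the hub, nodes 1..4 are leaves and query nodes.
star = np.zeros((5, 5))
star[0, 1:] = star[1:, 0] = 1.0
s = np.array([0.0, 0.25, 0.25, 0.25, 0.25])
print(top_k_by_pagerank(star, s, 0.15, 1, range(5)))  # the hub, [0]
```

Unlike the greedy algorithm, this needs a single solve regardless of k, which is why it is attractive on larger graphs.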

evaluation

TABLE I: Dataset statistics

small
Dataset         |V|       |E|
karate           34        78
dolphins         62       159
lesmis           77       254
adjnoun         112       425
football        115       613

large
Dataset         |V|       |E|
kddCoauthors  2 891     2 891
livejournal   3 645     4 141
ca-GrQc       5 242    14 496
ca-HepTh      9 877    25 998
roadnet      10 199    13 932
oregon-1     11 174    23 409

Degree and distance centrality. Finally, we consider the
standard degree and distance centrality measures.
Degree returns the k highest-degree nodes. Note that this
baseline is oblivious to query nodes Q.

cannot run greedy on these

input
graphs from previous datasets
query nodes: planted spheres
k spheres (k = 1)
radius ρ (ρ = 2 or 3)
s special nodes inside spheres (s = 10 or 20)
α = 0.15
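One way to generate query nodes of this kind is sketched below. The details are assumptions, since the slide gives only the parameters: sphere centers are drawn uniformly at random, a sphere is the set of nodes within ρ hops of its center, and s query nodes are sampled per sphere (fewer if the sphere is smaller). The function names and the adjacency-list input are hypothetical.

```python
import random
from collections import deque

def ball(neighbors, center, radius):
    """Nodes within `radius` hops of `center`, found by breadth-first search."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the sphere boundary
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def planted_sphere_queries(neighbors, k, radius, s, seed=0):
    """Pick k random centers and sample up to s query nodes inside each sphere."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(neighbors), k)
    queries = set()
    for c in centers:
        sphere = sorted(ball(neighbors, c, radius))
        queries.update(rng.sample(sphere, min(s, len(sphere))))
    return centers, queries

# Path graph on 10 nodes as an adjacency list.
g = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
centers, queries = planted_sphere_queries(g, k=1, radius=2, s=3, seed=1)
print(centers, sorted(queries))
```

Fixing the seed keeps an experiment repeatable; which nodes are drawn depends on the seed and on the Python version's random module.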

small graphs: dolphins

small graphs: adjnoun

small graphs: karate

large graphs: oregon

large graphs: livejournal

large graphs: roadnet

conclusions
complex problem, greedy algorithm is expensive
personalized PageRank is good alternative

future work
expansion strategies
comparison with more alternatives
parallel implementation

where the inequality comes from the fact that a path in $G_X$ passing from $Z$ and being absorbed by $X$ corresponds to a shorter path in $G_Y$ being absorbed by $Y$.

B. Proposition 5

Proposition. Let $C_{i-1}$ be a set of $i-1$ absorbing nodes, $P_{i-1}$ the corresponding transition matrix, and $F_{i-1} = (I - P_{i-1})^{-1}$. Let $C_i = C_{i-1} \cup \{u\}$. Given $F_{i-1}$, the centrality score $ac_Q(C_i)$ can be computed in time $O(n^2)$.

The proof makes use of the following lemma.

Lemma 1 (Sherman-Morrison Formula [7]). Let $M$ be a square $n \times n$ invertible matrix and $M^{-1}$ its inverse. Moreover, let $a$ and $b$ be any two column vectors of size $n$. Then, the following equation holds:

$$(M + ab^T)^{-1} = M^{-1} - \frac{M^{-1} a b^T M^{-1}}{1 + b^T M^{-1} a}.$$

Proof (Proposition 5): Without loss of generality, let the set of absorbing nodes be $C_{i-1} = \{1, 2, \ldots, i-1\}$. For simplicity, assume no self-loops for non-absorbing nodes, i.e., $P_{u,u} = 0$ for $u \in V \setminus C_{i-1}$. As in Section VI, the expected number of steps before absorption is given by the formulas

$$ac_Q(C_{i-1}) = s_Q^T F_{i-1} \mathbf{1},$$

with $F_{i-1} = A_{i-1}^{-1}$ and $A_{i-1} = I - P_{i-1}$.

We proceed to show how to increase the set of absorbing nodes by one and calculate the new absorption time by updating $F_{i-1}$ in $O(n^2)$. Without loss of generality, suppose we add node $i$ to the absorbing nodes $C_{i-1}$, so that

$$C_i = C_{i-1} \cup \{i\} = \{1, 2, \ldots, i-1, i\}.$$

Let $P_i$ be the transition matrix over $G$ with absorbing nodes [...]

By a direct application of Lemma 1, it is easy to see that we can compute $F_i$ from $F_{i-1}$ with the following formula, at a cost of $O(n^2)$ operations.

$$F_i = F_{i-1} - \frac{(F_{i-1} a)(b^T F_{i-1})}{1 + b^T (F_{i-1} a)}$$

We have thus shown that, given $F_{i-1}$, we can compute $F_i$, and therefore $ac_Q(C_i)$ as well, in $O(n^2)$.

C. Proposition 6

Proposition. Let $C$ be a set of absorbing nodes, $P$ the corresponding transition matrix, and $F = (I - P)^{-1}$. Let $C' = C \setminus \{v\} \cup \{u\}$, for $u, v \in C$. Given $F$, the centrality score $ac_Q(C')$ can be computed in time $O(n^2)$.

Proof: The proof is similar to the proof of Proposition 5. Again we assume no self-loops for non-absorbing nodes, i.e., $P_{uu} = 0$ for $u \in V$ [...]

[The rest of the proof is cut off in the source. The surviving fragments introduce the two sets of absorbing nodes $C$ and $C'$, a transition matrix $P'$, and the matrices $F = A^{-1}$ and $F'$, and note that $A' - A$ is a matrix whose rows are zero except for the rows $p_{i,1} \ldots$ and $p_{i+1,0} \ldots$, where $p_{i,j}$ denotes the [...] node $j$ in a transition matrix [...]]
