Quantum approach to information retrieval:
Adiabatic quantum PageRank algorithm
arXiv:1109.6546

with Silvano Garnerone and Paolo Zanardi

First NASA Quantum Future Technologies Conference

The WWW
is a big place
…
and is hard
to search

www.worldwidewebsize.com

"The certitude that some shelf in
some hexagon held precious
books and that these precious
books were inaccessible seemed
almost intolerable"
J.L. Borges in The library of Babel

Google to the rescue: Brin & Page, 1998

Google scholar: >8500 citations

Google calculates an eigenvector

$G\pi = \pi$

$\pi$ is the stationary state of a surfer hopping randomly on the web graph: the PageRank vector

$\pi_i$ = rank of the $i$-th page: the relative time spent there by the random surfer

- G is a big matrix: dimension = number of webpages $n$. Updated about once a month

- G is computed from the directed adjacency matrix of the webgraph:

$G = \alpha S + (1-\alpha)\frac{1}{n}E$

$S$: hyperlink matrix of the webgraph, normalized columns; reflects the directed connectivity structure of the webgraph


+ random hopping to avoid traps from nodes with no outgoing links:

$G = \alpha S + (1-\alpha)\frac{1}{n}E$

$S$: hyperlink matrix, normalized columns
$E$: matrix of all 1's
$\alpha$: "teleport" parameter: 0.85


- G is a "primitive" matrix ($G_{ij} \ge 0$, $\exists k > 0$ s.t. $(G^k)_{ij} > 0$, $\forall i, j$):
Perron-Frobenius theorem ⇒ $\pi$ is unique, and a probability vector:
$\pi$ encodes the relative ranking of the nodes of the webgraph
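The construction of G can be illustrated on a toy graph. The following is a minimal sketch, not the speaker's code: the function name `google_matrix`, the 4-page graph, the convention that `adj[i, j] = 1` means page j links to page i, and the uniform treatment of pages without out-links are all assumptions made for illustration.

```python
import numpy as np

def google_matrix(adj, alpha=0.85):
    """Build G = alpha*S + (1 - alpha)*E/n from a directed adjacency matrix.

    Convention (an assumption of this sketch): adj[i, j] = 1 if page j links
    to page i, so each column of S is the out-link distribution of one page.
    Columns of pages without out-links are set to uniform, also an assumption.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_degree = adj.sum(axis=0)
    safe = np.where(out_degree > 0, out_degree, 1.0)   # avoid division by zero
    S = np.where(out_degree > 0, adj / safe, 1.0 / n)  # column-normalized
    return alpha * S + (1.0 - alpha) * np.ones((n, n)) / n

# Toy webgraph (hypothetical): 0 -> 1, 2;  1 -> 2;  2 -> 0;  3 -> 0, 2
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
adj = np.zeros((4, 4))
for src, targets in links.items():
    for dst in targets:
        adj[dst, src] = 1.0

G = google_matrix(adj)

# PageRank = eigenvector of G with eigenvalue 1, normalized to sum to 1
eigvals, eigvecs = np.linalg.eig(G)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()
print(np.round(pi, 4))
```

Page 3 has no incoming links, so its rank comes only from the random hopping term and equals $(1-\alpha)/n$.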

This talk
Can (adiabatic) quantum computation help to compute 𝜋 ?

PageRank can be:
- prepared with exponential speedup
- read out with polynomial speedup for the top-ranked $\log n$ pages

Why? The gap of a certain Hamiltonian having the PageRank as its ground state scales as $1/\mathrm{poly}(\log n)$

Numerical evidence: arXiv:1109.6546

Classical PageRank computation

The PageRank is the principal eigenvector of G;
unique eigenvector with maximal eigenvalue 1

𝐺𝜋 = 𝜋
How do you get the PageRank, classically?

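$G = \alpha S + (1-\alpha)\frac{1}{n}E$

Power method: G is a Markov (stochastic) matrix, so repeated multiplication converges to the PageRank: $\lim_{k\to\infty} G^k \pi_0 = \pi$

Guaranteed to converge for any initial probability vector $\pi_0$.

Scaling?

time ~ $sn\,\frac{|\log(\epsilon)|}{|\log(\alpha)|}$

$\epsilon$ = desired accuracy
$s$ = sparsity of the adjacency (or hyperlink) matrix. Typically $s \sim 10$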
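The power method and its iteration count can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `pagerank_power`, the stopping rule (1-norm change between iterates below `eps`), and the 4-page matrix `S` are hypothetical. A dense array is used for brevity; with sparse storage each iteration would cost about $sn$ operations.

```python
import numpy as np

def pagerank_power(S, alpha=0.85, eps=1e-10, max_iter=1000):
    """Power iteration pi <- G pi without forming the dense matrix G.

    S must be column-stochastic. Because pi always sums to 1, the term
    (1 - alpha) * E @ pi / n is just the constant (1 - alpha) / n.
    Returns the estimate and the number of iterations used.
    """
    n = S.shape[0]
    pi = np.full(n, 1.0 / n)               # any probability vector works
    for k in range(1, max_iter + 1):
        new = alpha * (S @ pi) + (1.0 - alpha) / n
        if np.abs(new - pi).sum() < eps:   # 1-norm change between iterates
            return new, k
        pi = new
    return pi, max_iter

# Hypothetical 4-page hyperlink matrix; column j is the out-link
# distribution of page j (0 -> 1, 2;  1 -> 2;  2 -> 0;  3 -> 0, 2).
S = np.array([[0.0, 0.0, 1.0, 0.5],
              [0.5, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.5],
              [0.0, 0.0, 0.0, 0.0]])

pi, iterations = pagerank_power(S)
print(np.round(pi, 4), iterations)
```

Each step shrinks the 1-norm distance between iterates by at least a factor $\alpha$, which is where the $|\log(\epsilon)|/|\log(\alpha)|$ iteration count comes from.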


Markov Chain Monte Carlo:

Uses direct simulation of rapidly mixing random walks to
estimate the PageRank at each node.

time ~ 𝑂[𝑛 log(𝑛)]

[Modulus of the second eigenvalue of G is upper-bounded by $\alpha$
⇒ G is a gapped stochastic matrix
⇒ walk converges in time $O[\log n]$ per node]

Classical computation is already efficient; why do we need quantum?

power method: time ~ $sn\,\frac{|\log(\epsilon)|}{|\log(\alpha)|}$
Markov chain Monte Carlo: time ~ $O[n\log(n)]$

Both scale at least linearly in $n$: updating the PageRank already takes weeks, and this will only get worse.

With the q-adiabatic algo one can prepare the PageRank in time $O[\mathrm{poly}(\log n)]$

Application: run successive PageRanks and compare them in time $O(1)$; use the result to decide whether to run the classical update

Quantum approach

Adiabatic quantum computation of the PageRank vector

Adiabatic quantum computation

$h[s(t)] = [1 - s(t)]\,h_0 + s(t)\,h_P$

$h_0$: initial Hamiltonian; $h_P$: problem Hamiltonian

The q-algorithm for PageRank
𝑡 = 0: prepare ground state of the initial Hamiltonian

Uniform superposition over the complete graph of n nodes.

This requires log 𝑛 qubits so assume 𝑛 is power of 2.

𝑡 = 𝑇: evolve to ground state of the final Hamiltonian

The problem Hamiltonian is $h_P = (I - G)^\dagger (I - G)$



- Positive semidefinite, with 0 the unique min eigenvalue

- If $G\pi = \pi$ then $|\pi\rangle = \pi/\|\pi\|_2$ is the corresponding ground state of $h_P$

Note for experts: since G is not reversible
(doesn’t satisfy detailed balance) we
cannot apply the standard “Szegedy trick”
of quantum random walks (mapping to a
discriminant matrix)



- The amplitudes are not the PageRank entries themselves: $\langle i|\pi\rangle = \pi_i/\|\pi\|_2 \neq \pi_i$

Yet the amplitudes of the final ground state respect the same ranking order as the PageRank, and amplify higher-ranked pages
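These properties of the problem Hamiltonian can be checked numerically on a small example. The sketch below is illustrative only: the 4-page hyperlink matrix `S` is hypothetical, and the reference PageRank is obtained by classical diagonalization of G.

```python
import numpy as np

alpha, n = 0.85, 4
# Hypothetical column-stochastic hyperlink matrix (0 -> 1, 2; 1 -> 2; 2 -> 0; 3 -> 0, 2)
S = np.array([[0.0, 0.0, 1.0, 0.5],
              [0.5, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.5],
              [0.0, 0.0, 0.0, 0.0]])
G = alpha * S + (1.0 - alpha) * np.ones((n, n)) / n

# Problem Hamiltonian h_P = (I - G)^dagger (I - G); G is real, so dagger = transpose
I = np.eye(n)
h_P = (I - G).T @ (I - G)

# eigh returns eigenvalues in ascending order: column 0 is the ground state
evals, evecs = np.linalg.eigh(h_P)
ground = evecs[:, 0]
ground = ground * np.sign(ground.sum())   # fix the arbitrary overall sign

# Classical reference: PageRank as the eigenvalue-1 eigenvector of G
w, v = np.linalg.eig(G)
pi = np.real(v[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

probs = ground ** 2                        # site occupation probabilities
print("lowest eigenvalues:", np.round(evals[:2], 6))
print("amplitudes        :", np.round(ground, 4))
print("PageRank          :", np.round(pi, 4))
print("occupation probs  :", np.round(probs, 4))
```

The occupation probability of the top page is at least as large as its PageRank entry, since $\sum_i \pi_i^2 \le \pi_{\max}$ for any probability vector.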

Efficiency of the q-algorithm

According to the adiabatic theorem, to get

adiabatic error $\varepsilon := 1 - f^2$, with fidelity $f := |\langle\psi(T)|\pi\rangle|$
($|\psi(T)\rangle$: actual final state; $|\pi\rangle$: desired ground state)

for $h[s(t)] = [1 - s(t)]\,h_0 + s(t)\,h_P$

need $T \sim \mathrm{poly}\left(\frac{1}{\min_{s\in[0,1]}(\mathrm{gap})},\ \max_{s\in[0,1]}\left\|\frac{dh}{ds}\right\|,\ \frac{1}{\varepsilon}\right)$

- not necessarily $(\min \mathrm{gap})^{-2}$: can have exponent $-1$ (best case) or $-3$ (worst case)
- scaling of numerator can be important; needs to be checked
DAL, A. Rezakhani, and A. Hamma, J. Math. Phys. (2009)

Testing the q-algo on webgraph models
We tested the algorithm on random webgraph models,
sparse, small-world, scale-free (power law degree distribution):


- preferential attachment model
  links are added at random with a bias for high-degree nodes
  drawback: requires global knowledge of graph
  Degree distribution: $N(d) \propto d^{-3}$

- copy-model
  - start from a small fixed initial graph of constant out-degree
  - each time step:
    - choose a pre-existing "copying vertex" uniformly at random
    - with probability $1 - p$: for each neighbor of the copying vertex, add a link from a newly added vertex to that neighbor
    - with probability $p$: add a link from the newly added vertex to a vertex chosen uniformly at random
  requires only local knowledge of graph; has tuning parameter $p$
  Degree distribution: $N(d) \propto d^{-(2-p)/(1-p)}$
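The copy-model steps can be sketched as a short generator. This is one possible reading, not the code used for the numerics in the talk: here the choice between copying and a random target is made independently for each out-link, the initial graph is a complete directed graph on `d + 1` vertices, and the names `copy_model`, `d`, and `seed` are hypothetical.

```python
import random

def copy_model(n, p=0.5, d=2, seed=0):
    """Return out-links {vertex: [targets]} of a copy-model graph on n vertices.

    Assumptions of this sketch: constant out-degree d, a complete directed
    initial graph on d + 1 vertices, and an independent copy/random choice
    per out-link. Repeated targets (multi-links) are allowed.
    """
    rng = random.Random(seed)
    out = {v: [u for u in range(d + 1) if u != v] for v in range(d + 1)}
    for new in range(d + 1, n):
        proto = rng.randrange(new)            # copying vertex, uniform over existing ones
        links = []
        for target in out[proto]:
            if rng.random() < p:              # probability p: uniformly random target
                target = rng.randrange(new)
            links.append(target)              # probability 1 - p: copied neighbor
        out[new] = links
    return out

graph = copy_model(200, p=0.3)
in_degree = [0] * 200
for targets in graph.values():
    for t in targets:
        in_degree[t] += 1
print("max in-degree:", max(in_degree))
```

Only the out-links of one existing vertex are consulted per step, which is the "local knowledge" property noted above.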

numerical diagonalization

ave. min gap scaling:

[Note: we computed the same for generic sparse random matrices and found gap ~ $1/\mathrm{poly}(n)$ instead]

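$T \sim \mathrm{poly}\left(\frac{1}{\min(\mathrm{gap})},\ \max_{s\in[0,1]}\left\|\frac{dh}{ds}\right\|,\ \frac{1}{\varepsilon}\right)$

Error dependence: preferential attachment, n=16, 1000 graphs; run the Schrödinger equation with different $T$:

$T \sim \varepsilon^{-2}$

Putting the three ingredients together:
- $\delta \sim 1/\mathrm{poly}(\log n)$ (min gap)
- $T \sim \varepsilon^{-2}$
- $\left\|\frac{dh}{ds}\right\| = \mathrm{poly}(\log\log n)$

so $T$ is polylogarithmic in $n$, with exponent $b$ a small integer $> 0$.

Checked and confirmed using the solution of the full Schrödinger equation, for $b = 3$: actual error always less than $\varepsilon$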

So is this really an efficient q-algorithm?

• Problem 1: The Google matrix G is a full matrix...
⇒ $h[s(t)]$ requires many-body interactions...

• Can be reduced to 1- and 2-qubit interactions by using one qubit per node:
go from $\log(n)$ qubits to $n$ qubits (unary representation), i.e., map to the $n$-dim. "single particle excitation" subspace of the $2^n$-dim. Hilbert space.
The couplings of $H(s)$ are the matrix elements of $h[s(t)] = [1 - s(t)]\,h_0 + s(t)\,h_P$; the probability of finding the excitation at site $i$ gives the PageRank of page $i$.

$H(s)$ (in the 1-excitation subspace) and $h(s)$ have the same spectrum ⇒ same $T$ scaling
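The equality of spectra can be checked numerically for a small register. The explicit embedding used below, $H = \sum_{ij} h_{ij}\,\sigma_i^+ \sigma_j^-$, is an assumption of this sketch (the formula on the slide did not survive), chosen because it contains only 1- and 2-qubit terms; the helper names `site_op` and `embed` and the random symmetric test matrix are hypothetical.

```python
import numpy as np
from functools import reduce

raise_op = np.array([[0.0, 0.0], [1.0, 0.0]])   # sigma^+ = |1><0|
lower_op = raise_op.T                            # sigma^- = |0><1|

def site_op(op, site, n):
    """Embed a single-qubit operator at position `site` of an n-qubit register."""
    mats = [np.eye(2)] * n
    mats[site] = op
    return reduce(np.kron, mats)

def embed(h):
    """Assumed embedding H = sum_ij h[i, j] sigma_i^+ sigma_j^- on n qubits."""
    n = h.shape[0]
    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n):
        for j in range(n):
            H += h[i, j] * site_op(raise_op, i, n) @ site_op(lower_op, j, n)
    return H

n = 4
rng = np.random.default_rng(1)
a = rng.normal(size=(n, n))
h = (a + a.T) / 2                                # stand-in for h(s): real symmetric

H = embed(h)
# Basis state with a single excitation on qubit k (qubit 0 is the leftmost factor)
idx = [2 ** (n - 1 - k) for k in range(n)]
rest = [i for i in range(2 ** n) if i not in idx]
block = H[np.ix_(idx, idx)]

print(np.round(np.linalg.eigvalsh(block), 6))
print(np.round(np.linalg.eigvalsh(h), 6))
```

In this embedding the diagonal terms act on one qubit and the off-diagonal terms on two, and the one-excitation block of `H` reproduces `h` entry by entry.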

Measuring the PageRank

• Problem 2: Once the ground state has been prepared one needs to measure the site occupation probabilities $(\pi_i)^2/\|\pi\|^2$

To recover the complete length-$n$ PageRank vector takes at least $n$ measurements (Chernoff bound*)
⇒ back to the classical performance

• Same problem as in the quantum algorithm for solving linear equations [Harrow, Hassidim, Lloyd, PRL (2009)]; actually our algorithm is an instance of solving linear equations, but assumes a lot more structure

* To estimate the $i$-th prob. amplitude with additive error $e_i$ need number of measurements ~ $1/\mathrm{poly}(e_i)$

• However: one is typically interested only in the top-ranked pages
• For these pages we nevertheless obtain (again using the Chernoff bound) a polynomial speed-up for estimating the ranks of the top $\log n$ pages
• This is because of the amplification of the top PageRank entries and the power-law distribution of the PageRank entries
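The effect of amplification on readout can be illustrated with a simulated measurement. This is a sketch under loud assumptions: the PageRank vector is replaced by a hypothetical power law $\pi_i \propto 1/i$, and the values of `n` and `shots` are arbitrary; it does not reproduce the Chernoff-bound analysis itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
top_k = int(np.log2(n))                 # top log n pages (here 10)

# Hypothetical power-law PageRank vector (assumption): pi_i proportional to 1/i
pi = 1.0 / np.arange(1, n + 1)
pi /= pi.sum()

# Measuring the prepared ground state finds the excitation at site i
# with probability pi_i^2 / ||pi||^2
probs = pi ** 2 / np.sum(pi ** 2)

shots = 2000                            # far fewer than n log n samples
counts = rng.multinomial(shots, probs)
estimated_top = np.argsort(counts)[::-1][:top_k]

print("PageRank mass of top pages    :", round(pi[:top_k].sum(), 3))
print("measurement mass of top pages :", round(probs[:top_k].sum(), 3))
print("estimated top pages           :", estimated_top)
```

Under this assumed distribution most measurement outcomes land on the top-ranked pages, so their frequencies are estimated from comparatively few shots.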

Summary of results and applications
• Can map adiabatic PageRank algo to Hamiltonians with 1- and 2-body interactions, with one qubit per node

• Polynomial speed-up for top-$\log n$ set of nodes

• Exponential speedup in preparation of PageRank allows for an efficient decision procedure about updating of the classical PageRank:
  • Prepare pre-perturbation PageRank state $|\pi\rangle$: $T \sim O[\mathrm{poly}(\log n)]$
  • Prepare post-perturbation PageRank state $|\pi'\rangle$: $T' \sim O[\mathrm{poly}(\log n')]$
  • Compute $|\langle\pi|\pi'\rangle|$ using the SWAP test: $\sim O(1)$
  ⇒ Decide whether update needed
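The SWAP-test comparison can be emulated classically to see why its cost does not grow with $n$. In the standard SWAP test the ancilla is measured in state 0 with probability $(1 + |\langle\pi|\pi'\rangle|^2)/2$; the sketch below samples that outcome directly rather than simulating the circuit. The function names, the example vectors, and the shot count are hypothetical.

```python
import numpy as np

def swap_test_p0(a, b):
    """Probability that the SWAP-test ancilla is measured in state 0."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 0.5 * (1.0 + abs(np.vdot(a, b)) ** 2)

def estimate_overlap(a, b, shots, rng):
    """Estimate |<a|b>| from `shots` simulated SWAP-test outcomes."""
    zeros = rng.binomial(shots, swap_test_p0(a, b))
    p0_hat = zeros / shots
    return np.sqrt(max(0.0, 2.0 * p0_hat - 1.0))

rng = np.random.default_rng(0)

# Hypothetical PageRank vectors before and after a perturbation of the graph
pi_old = np.array([0.38, 0.20, 0.38, 0.04])
pi_new = np.array([0.30, 0.25, 0.35, 0.10])

exact = abs(np.vdot(pi_old, pi_new)) / (np.linalg.norm(pi_old) * np.linalg.norm(pi_new))
estimate = estimate_overlap(pi_old, pi_new, shots=100_000, rng=rng)
print(round(exact, 4), round(estimate, 4))
```

The number of shots needed for a given precision depends only on that precision, not on the dimension of the vectors; a threshold on the estimated overlap could then serve as the update criterion.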

Conclusions
• Information retrieval provides a new set of problems for Quantum Computation
• Given the existence of efficient classical algorithms it is non-trivial that QC can provide some form of speedup
• The humongous size of the WWW is an important motivation to look for such a speedup
• Showed tasks for which adiabatic quantum PageRank provides a speedup with respect to classical algorithms

• Why does it work? Sparsity alone seems insufficient.
• Other key features of the webgraph are
  • small-world (each node is reachable from any other in $\log(n)$ steps)
  • degree distribution of nodes is power-law
Which of these is necessary/sufficient?
