CAS CS 132
Markov Chains
Geometric Algorithms
Lecture 12
Objectives
1. Motivate linear dynamical systems
2. Analyze Markov chains and their properties
3. Learn to solve for steady-states of Markov
chains
4. Connect this to graphs and random walks
Keywords
linear dynamical systems
recurrence relations
linear difference equations
state vector
probability vector
stochastic matrix
Markov chain
steady-state vector
random walk
state diagram
Motivation
Change (or Waxing Poetic)
Things change.
Things change from one state of affairs to
another state of affairs.
Things change often in unpredictable ways.
If something changes unpredictably, what can we
say about it?
Dynamical Systems
Dynamical Systems
Definition (Informal). A dynamical system is a thing
(typically with interacting parts) that changes over time.
A dynamical system has possible states which it can be in as
time elapses, and its behavior is defined by an evolution
function.
Examples.
» economics (stocks)
» physical/chemical systems
» populations
» weather
An Aside: Chaos Theory
Complex systems like the
weather or the economy look
nearly random.
But even in chaotic systems
there are underlying patterns
and repeated structures.
Often it's useful to consider
chaotic systems in terms of
global properties.
https://commons.wikimedia.org/wiki/File:Fr137.jpg
Motivating Questions
What does a dynamical system look like "in the
long view?"
Does it reach a kind of equilibrium? (think
heat diffusion)
Or does some part of the system dominate over
time? (think the population of rabbits without
a predator)
(Linear) Dynamical Systems
Linear Dynamical Systems
Definition. A (discrete time) linear dynamical system is
described by an n × n matrix A. Its evolution function is
the matrix transformation x ↦ Ax.
The possible states of the system are vectors in ℝⁿ.
Given an initial state vector v₀, we can determine the state
vector vᵢ of the system after i time steps:
vᵢ = Avᵢ₋₁
A tells us how our system evolves over time.
State Vectors
The state vector vₖ tells us what the system looks like
after k time steps.
This is also called a recurrence relation or a linear
difference equation.
v₁ = Av₀
v₂ = Av₁ = A(Av₀)
v₃ = Av₂ = A(AAv₀)
v₄ = Av₃ = A(AAAv₀)
v₅ = Av₄ = A(AAAAv₀)
⋮
How to: Determining State Vectors
Question. Determine the state vector vᵢ for the linear
dynamical system with matrix A, given the initial state
vector v₀.
Solution. Compute
vᵢ = Aⁱv₀
Matrix Powers in NumPy
There is a function in NumPy for computing matrix powers:
numpy.linalg.matrix_power(a, n)
You can use this when you need to take a large power of
a matrix.
It's much faster than doing each multiplication
individually because it uses the "repeated squaring"
trick.
But be cautious of floating-point error.
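For example, a minimal sketch computing a 20th power (the matrix values are taken from the city/suburb example later in this lecture):

    import numpy as np

    A = np.array([[0.95, 0.03],
                  [0.05, 0.97]])

    # matrix_power computes A^20 via repeated squaring
    # (~5 multiplications instead of 19)
    A20 = np.linalg.matrix_power(A, 20)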
Warm up: Population Dynamics
The Setup
We're working for the census. We have 2023 population
measurements for a city and a suburb which are
geographically coincident.
We find by analyzing previous data that each year:
» 5% of the population moves from city → suburb
» 3% of the population moves from suburb → city
Fundamental Question
Can we make any predictions about the
population of the city and suburb in 2043?
Note: No immigration, emigration, birth, death,
etc. The overall population stays fixed.
Setting up Linear Equations
If city₀ = 2023 city pop. = 600,000
and suburb₀ = 2023 suburb pop. = 400,000
then the pop. in 2024 are given by:
city₁ = (0.95)city₀ + (0.03)suburb₀
suburb₁ = (0.05)city₀ + (0.97)suburb₀
(In each equation, the first term counts the people who
stayed and the second counts the people who left the other
region.)
Setting up a Matrix
In 2024, we expect the population of the city to decrease.

[ city₁   ]   [ 0.95  0.03 ] [ city₀   ]   [ 582,000 ]
[ suburb₁ ] = [ 0.05  0.97 ] [ suburb₀ ] = [ 418,000 ]
Setting up a Matrix
In 2025, we expect the population of the city to continue
to decrease. Will it decrease indefinitely?

[ city₂   ]   [ 0.95  0.03 ] [ city₁   ]   [ 565,440 ]
[ suburb₂ ] = [ 0.05  0.97 ] [ suburb₁ ] = [ 434,560 ]
Setting up a Matrix
This is a linear dynamical system:

[ cityₖ   ]   [ 0.95  0.03 ] [ cityₖ₋₁   ]
[ suburbₖ ] = [ 0.05  0.97 ] [ suburbₖ₋₁ ]

So if we want to guess what the population will look like
in 20 years, we need to compute

[ 0.95  0.03 ]²⁰ [ city₀   ]
[ 0.05  0.97 ]   [ suburb₀ ]
demo
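A sketch of what the demo computes (the 2023 populations come from the setup above; the printed values are rounded):

    import numpy as np

    A = np.array([[0.95, 0.03],
                  [0.05, 0.97]])        # yearly migration matrix
    v0 = np.array([600_000, 400_000])   # 2023 populations: city, suburb

    v20 = np.linalg.matrix_power(A, 20) @ v0   # populations 20 years later
    print(v20)   # roughly [417000, 583000]: the city keeps shrinking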
Markov Chains
Stochastic Matrices
What's special about this matrix?

[ 0.95  0.03 ]
[ 0.05  0.97 ]

» Its entries are nonnegative.
» Its columns sum to 1.
This should make us think probability.
Stochastic Matrices
Definition. An n × n matrix is stochastic if its entries
are nonnegative and its columns sum to 1.
Example.

[ 0.7  0.1  0.3 ]
[ 0.2  0.8  0.3 ]
[ 0.1  0.1  0.4 ]
Markov Chains
Definition. A Markov chain is a linear
dynamical system whose evolution function is
given by a stochastic matrix.
(We can construct a "chain" of state vectors,
where each state vector only depends on the one
before it.)
Key Property of Stochastic Matrices
Stochastic matrices redistribute the "stuff" in a vector.
Theorem. For a stochastic matrix A and a vector v,
(sum of entries of Av) = (sum of entries of v)
Key Property of Stochastic Matrices
The sum of the entries of v can be computed as 1ᵀv = ⟨1, v⟩.
So the previous statement can be written
1ᵀ(Av) = 1ᵀv
Key Property of Stochastic Matrices
Let's verify that if A is stochastic, then 1ᵀ(Av) = 1ᵀv.
(I'll leave it as an exercise.)
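A quick numerical check of this identity, using the stochastic matrix from the earlier example slide and an arbitrary vector:

    import numpy as np

    A = np.array([[0.7, 0.1, 0.3],
                  [0.2, 0.8, 0.3],
                  [0.1, 0.1, 0.4]])   # stochastic: columns sum to 1
    v = np.array([2.0, 5.0, 3.0])

    # A redistributes the entries of v but preserves their total
    print(v.sum(), (A @ v).sum())     # both print 10.0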
More General Solutions
In our example, we analyzed the dynamics of a
particular population.
What if we're interested more generally in the
behavior of the process for any population?
We need to shift from a population vector to a
population distribution vector.
Returning to the Example

[ cityₖ   ]   [ 0.95  0.03 ] [ cityₖ₋₁   ]
[ suburbₖ ] = [ 0.05  0.97 ] [ suburbₖ₋₁ ]

Returning to the Example

[ cityₖ   ]   [ 0.95  0.03 ]ᵏ [ city₀   ]
[ suburbₖ ] = [ 0.05  0.97 ]  [ suburb₀ ]

Returning to the Example

[ cityₖ   ]   [ 0.95  0.03 ]ᵏ [ 600,000 ]
[ suburbₖ ] = [ 0.05  0.97 ]  [ 400,000 ]
Returning to the Example
But what if we start off with a different population?
Do we have to do all our work over again?

[ cityₖ   ]   [ 0.95  0.03 ]ᵏ [ 600,000 ]
[ suburbₖ ] = [ 0.05  0.97 ]  [ 400,000 ]
Returning to the Example
Not really. But rather than thinking in terms of
populations, we need to think about how the population is
distributed:

[ cityₖ   ]   [ 0.95  0.03 ]ᵏ [ 0.6 ]   ← 60% of pop. in city
[ suburbₖ ] = [ 0.05  0.97 ]  [ 0.4 ]   ← 40% of pop. in suburb
Probability Vectors
Definition. A probability vector is a vector
whose entries sum to 1.
They represent
» discrete probability distributions
» distributions of collections of things
These are really the same thing.
Probability Vectors (Example)
The vector [1/3, 1/6, 1/2] represents the distribution
where we choose:
1 with probability 1/3
2 with probability 1/6
3 with probability 1/2
Probability Vectors (Example)
The vector [0.6, 0.4] represented the distribution of the
population, but we can also think of this as: if we choose
a random person from the population, we'll get someone:
in the city with probability 0.6
in the suburbs with probability 0.4
The point
We'll be interested in the dynamics of Markov chains on
probability vectors.
Since stochastic matrices preserve 1ᵀv, they transform one
distribution into another.
Can we say something about how the distribution changes in
the long run?
Steady-State Vectors
Definition. A steady-state vector for a stochastic matrix A
is a probability vector q such that
Aq = q
A steady-state vector is not changed by the stochastic
matrix; it describes an equilibrium distribution.
Returning to the Example
How do we interpret a steady-state vector for
our example?
The populations in the city and the suburb stay
the same over time.
The same number of people are moving into and
out of the city each year.
Fundamental Questions
Do steady states exist?
Are they unique?
How do we find them?
Finding Steady-State Vectors
Let's solve this equation for q:
Aq = q
Aq − q = 0
Aq − Iq = 0
(A − I)q = 0
This is a matrix equation, so we know how to solve it.
How to: Steady-State Vectors
Question. Determine if the Markov chain with stochastic
matrix A has a steady-state vector. If it does, find such
a vector.
Solution. Solve the equation (A − I)x = 0 and find a
solution whose entries sum to 1 (this will be possible
given a free variable).
If there is no such solution, the system does not have a
steady state.
demo
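A sketch of one way to do this numerically. Here the nullspace of A − I is extracted from the SVD, which is just one implementation choice (row reduction works too):

    import numpy as np

    A = np.array([[0.95, 0.03],
                  [0.05, 0.97]])

    # The steady state spans the nullspace of A - I, i.e. the right
    # singular vector whose singular value is (approximately) zero.
    _, _, Vt = np.linalg.svd(A - np.eye(2))
    q = Vt[-1]
    q = q / q.sum()     # rescale so the entries sum to 1
    print(q)            # -> [0.375, 0.625]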
Existence vs Convergence
If (A − I)x = 0 has infinitely many solutions, then the
system has a stable state.
This does not mean:
» the stable state is unique
» the system converges to this state
Convergence
Definition. For a Markov chain with stochastic matrix A,
an initial state v₀ converges to the state v if
lim_{k→∞} Aᵏv₀ = v
As we repeatedly multiply v₀ by A, we get closer and closer
to v (in the limit).
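A small numerical illustration (sketch; the two starting distributions are arbitrary, and 500 steps is far more than needed here):

    import numpy as np

    A = np.array([[0.95, 0.03],
                  [0.05, 0.97]])

    for v0 in (np.array([0.6, 0.4]), np.array([0.1, 0.9])):
        v = v0
        for _ in range(500):   # repeatedly apply A
            v = A @ v
        print(v)               # both land on ~[0.375, 0.625]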
Example of Non-Convergence
Non-Example. The identity matrix I is stochastic and
Iv = v
for any choice of v.
So this system does not have a unique steady state.
And no two distinct initial vectors converge to the same
stable state.
Regular Stochastic Matrices
Definition. A stochastic matrix A is regular if Aᵏ has all
positive entries for some k ≥ 1.
Theorem. A regular stochastic matrix P has a unique steady
state, and every probability vector converges to it.
Mixing
This process of converging to a unique steady
state is called "mixing."
This theorem says, after some amount of mixing,
we'll be close to the stable state, no matter
where we started.
How to: Regular Stochastic Matrices
Question. Show that A is regular, and then find its unique
steady state.
Solution. Find a power of A which has all positive entries,
then solve the equation (A − I)x = 0 as before.
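A sketch of the first step, checking regularity by brute force (the cutoff of 50 powers is a practical bound assumed here, not part of the definition):

    import numpy as np

    def is_regular(A, max_power=50):
        # Return True if some power A^k (1 <= k <= max_power)
        # has all positive entries.
        P = np.eye(A.shape[0])
        for _ in range(max_power):
            P = P @ A
            if np.all(P > 0):
                return True
        return False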
Random Walks
Recall: Adjacency Matrices
Let G be an undirected unweighted graph with its nodes
labeled by numbers 1 through n. We can create the adjacency
matrix A for G as follows:

Aij = { 1   there is an edge between i and j
      { 0   otherwise

Example.

[ 0 1 0 0 1 0 ]
[ 1 0 1 0 1 0 ]
[ 0 1 0 1 0 0 ]
[ 0 0 1 0 1 1 ]
[ 1 1 0 1 0 0 ]
[ 0 0 0 1 0 0 ]
Random Walk
A random walk on an undirected unweighted graph G starting
at v is the following process:
» if v is connected to k nodes, roll a k-sided die
» go to the vertex selected by the die roll (with the
neighbors numbered in some fixed order)
» repeat
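A minimal simulation of this process on the example graph from the adjacency-matrix slide (sketch; nodes are 0-indexed here and the walk length is arbitrary):

    import numpy as np

    # Neighbor lists for the 6-node example graph, 0-indexed.
    neighbors = {0: [1, 4], 1: [0, 2, 4], 2: [1, 3],
                 3: [2, 4, 5], 4: [0, 1, 3], 5: [3]}

    rng = np.random.default_rng(0)
    v = 0
    for _ in range(10):
        v = rng.choice(neighbors[v])   # the k-sided die roll
    print(v)                           # where the walk ended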
Applications of Random Walks
Brownian Motion is a random walk in 3D space.
Random walks are used to simulate complex systems in
physics and in economics.
They are also used to design algorithms.
https://commons.wikimedia.org/wiki/File:Wiener_process_3d.png
General Adjacency Matrices
We can extend the notion of an adjacency matrix to directed
and weighted graphs:

Aij = { wji   there is an edge from j to i
      { 0     otherwise

Example. (For a 3-node graph whose directed edges carry the
weights 1, −2, −3, and 5.)

[  0   0   1 ]
[ −2   0  −3 ]
[  0   5   0 ]
State Diagrams
Definition. A state diagram is a directed weighted graph
whose adjacency matrix is stochastic.
Example. (Two nodes, city and suburb: self-loops with
weights 0.95 and 0.97, an edge city → suburb with weight
0.05, and an edge suburb → city with weight 0.03.)
Naming Convention Clash
The nodes of a state diagram are often called
states.
The vectors which are dynamically updated
according to a linear dynamical system are
called state vectors.
This is an unfortunate naming clash.
Example: Computer System
Imagine a computer
system in which tasks
request service from
disk, network or CPU.
In the long term, which
device is busiest?
This is about finding a
stable state.
How To: State Diagram
Question. Given a state diagram, find the
stable state for the corresponding linear
dynamical system.
Solution. Find the adjacency matrix for the
state diagram and go from there.
Random Walks as Linear Dynamical Systems
Once we have a stochastic matrix, we can reason
about random walks as linear dynamical systems.
What are its steady states?
How do we interpret these steady states?
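For a random walk on an undirected unweighted graph, the stochastic matrix comes from normalizing each column of the adjacency matrix, so that column j is the uniform distribution over node j's neighbors. A sketch (assuming every node has at least one neighbor, so no column sum is zero):

    import numpy as np

    def walk_matrix(adj):
        # Divide each column of the adjacency matrix by its sum
        # (the degree), giving a column-stochastic matrix.
        adj = np.asarray(adj, dtype=float)
        return adj / adj.sum(axis=0, keepdims=True)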
Random Walks on State Diagrams
A random walk on a state diagram starting at v is the
following process:
» choose a node that v is connected to, according to the
distribution given by the edge weights
» go to that node
» repeat
Stable states of linear dynamical systems are stable states
of random walks on state diagrams.
Steady-States of Random Walks
Theorem (Advanced). Let A be the stochastic matrix for the
graph G. The probability that a random walk starting at i
of length k ends on node j is
(Aᵏeᵢ)ⱼ
the j-th entry of the vector Aᵏeᵢ.
A transforms a distribution over length-k walks into one
over length-(k + 1) walks.
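A quick check of the theorem on the example graph (sketch; e₀ below puts all of the probability on the starting node, and the walk length 10 is arbitrary):

    import numpy as np

    adj = np.array([[0, 1, 0, 0, 1, 0],
                    [1, 0, 1, 0, 1, 0],
                    [0, 1, 0, 1, 0, 0],
                    [0, 0, 1, 0, 1, 1],
                    [1, 1, 0, 1, 0, 0],
                    [0, 0, 0, 1, 0, 0]], dtype=float)
    A = adj / adj.sum(axis=0, keepdims=True)   # column-stochastic walk matrix

    e0 = np.zeros(6)
    e0[0] = 1.0                                # start at node 0
    dist = np.linalg.matrix_power(A, 10) @ e0  # distribution after 10 steps
    print(dist, dist.sum())                    # a probability vector; sums to 1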
Steady States of Random Walks
If a random walk goes on for a sufficiently
long time, then the probability that we end up
in a particular place becomes fixed.
If you wander for a sufficiently long time, it
doesn't matter where you started.
Summary
Markov chains allow us to reason about
dynamical systems that are dictated by some
amount of randomness.
Stable states represent global equilibrium.
We can think of Markov chains as random walks
on state diagrams.