Statistics Lab
Rodolfo Metulini
IMT Institute for Advanced Studies, Lucca, Italy
Lesson 5 - Introduction to Bootstrap (and hints on Markov
Chains) - 27.01.2015
Introduction
Let’s assume, for a moment, the Central Limit Theorem
(CLT):
If a random sample of n observations $y_1, y_2, \ldots, y_n$ is drawn from a
population with mean $\mu$ and variance $\sigma^2$, then, for n large enough, the
sampling distribution of the sample mean can be approximated by a normal
density with mean $\mu$ and variance $\sigma^2/n$.
Averages taken from any distribution will be approximately normally
distributed.
The standard deviation of the sample mean decreases as the number of
observations increases.
But... nobody tells us exactly how big the sample has to be.
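A quick way to see the CLT at work is to simulate it. The R sketch below is illustrative (it is not from the original slides; the exponential population and the sizes n and M are arbitrary choices): it draws M samples from a skewed distribution and compares the distribution of their means with the normal approximation.

```r
set.seed(123)
n <- 50    # size of each sample
M <- 2000  # number of samples
# sample means from an exponential population (mean 1, variance 1)
means <- replicate(M, mean(rexp(n, rate = 1)))
hist(means, breaks = 40, freq = FALSE,
     main = "Sampling distribution of the sample mean")
# CLT approximation: normal with mean 1 and sd 1/sqrt(n)
curve(dnorm(x, mean = 1, sd = 1/sqrt(n)), add = TRUE, col = "red")
```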
Why Bootstrap?
1. Sometimes we cannot take advantage of the CLT, because:
Nobody tells us exactly how big the sample has to be.
Empirically, in some cases the sample is really small.
So we should not conjecture any distributional assumption. We just
have the data, and we let the raw data speak.
The bootstrap method attempts to determine the probability
distribution from the data itself, without recourse to the CLT.
2. To better estimate the variance of a parameter, and
consequently obtain more accurate confidence intervals and
hypothesis tests.
Basic Idea of Bootstrap
Use the original sample as if it were the population, draw M
samples from the original sample (the bootstrap samples), and
define the estimator using the bootstrap samples.
Figure: Real World versus Bootstrap World
Structure of Bootstrap
1. Originally, from a list of data (the sample), one computes a
statistic (an estimate).
2. Then, one creates an artificial list of data (a new sample)
by randomly drawing elements from the original list.
3. One computes a new statistic (estimate) from the new
sample.
4. One repeats steps 2 and 3, say, M = 1000 times,
and looks at the distribution of these 1000 statistics.
Types of resampling methods
1. The Monte Carlo algorithm: resample with replacement; the size of each
bootstrap sample must equal the size of the original data set.
2. The jackknife algorithm: resample from the original sample
deleting one value at a time; each sample has size n − 1 (see the sketch below).
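A minimal R sketch of the two schemes (the data vector is made up for illustration):

```r
x <- c(5.1, 4.8, 6.2, 5.5, 4.9)  # hypothetical data
# Monte Carlo (bootstrap) resample: same size as x, drawn with replacement
boot_sample <- sample(x, size = length(x), replace = TRUE)
# jackknife samples: leave one observation out at a time (each has size n - 1)
jack_samples <- lapply(seq_along(x), function(i) x[-i])
```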
Estimation of the sample mean
Suppose we extracted a sample $x = (x_1, x_2, \ldots, x_n)$ from the
population X. Let's say the sample size is small: n = 10.
We can compute the sample mean $\bar{x}_n$ using the values of the
sample x. But, since n is small, the CLT does not hold, so we
cannot say anything about the distribution of the sample mean.
APPROACH: We extract M samples (or sub-samples) of dimension
n from the sample x (with replacement, Monte Carlo).
We can define the bootstrap sample means $\bar{x}_{i,b}$, $\forall i = 1, \ldots, M$.
These become the new sample, of dimension M.
Bootstrap sample mean:
$M_b(X) = \frac{1}{M}\sum_{i=1}^{M} \bar{x}_{i,b}$
Bootstrap sample variance:
$V_b(X) = \frac{1}{M-1}\sum_{i=1}^{M} \left(\bar{x}_{i,b} - M_b(X)\right)^2$
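A minimal R sketch of this procedure, assuming simulated data (the original slides' code chunk is not preserved, so the numbers here are illustrative):

```r
set.seed(1)
x <- rnorm(10, mean = 5, sd = 2)  # small original sample, n = 10
M <- 1000                         # number of bootstrap samples
# bootstrap sample means: resample x with replacement M times
boot_means <- replicate(M, mean(sample(x, size = length(x), replace = TRUE)))
Mb <- mean(boot_means)            # bootstrap sample mean
Vb <- var(boot_means)             # bootstrap sample variance (divides by M - 1)
```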
Bootstrap confidence interval with variance
estimation
Let's take a random sample of size n = 25 from a normal
distribution with mean 10 and standard deviation 3.
We can consider the sampling distribution of the sample mean.
From that, we estimate the intervals.
The bootstrap estimates the standard error by resampling the data in
our original sample.
Instead of repeatedly drawing samples of size n = 25 from the
population, we repeatedly draw new samples of size n = 25 from
our original sample, resampling with replacement.
We can estimate the standard error of the sample mean using the
standard deviation of the bootstrapped sample means.
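A sketch of this experiment in R (the seed is an arbitrary choice):

```r
set.seed(42)
x <- rnorm(25, mean = 10, sd = 3)  # original sample, n = 25
M <- 1000
boot_means <- replicate(M, mean(sample(x, replace = TRUE)))
se_boot <- sd(boot_means)          # bootstrap estimate of the standard error
# normal-based 95% bootstrap confidence interval for the mean
ci <- mean(x) + c(-1, 1) * qnorm(0.975) * se_boot
```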
Bootstrap confidence intervals: formula
The normal-based bootstrap interval uses the bootstrap standard error
$\hat{se}_b = \sqrt{V_b(X)}$:
$[\,\bar{x}_n - z_{1-\alpha/2}\,\hat{se}_b \,;\ \bar{x}_n + z_{1-\alpha/2}\,\hat{se}_b\,]$
Confidence interval with quantiles
Suppose we have a sample of data from an exponential distribution
with parameter λ:
$f(x \mid \lambda) = \lambda e^{-\lambda x}$ (remember: the estimate of λ is
$\hat{\lambda} = 1/\bar{x}_n$).
An alternative to using bootstrap-estimated standard
errors (since estimating the standard error of an
exponential is not straightforward) is the use of bootstrap
quantiles.
We can obtain M bootstrap estimates $\hat{\lambda}_b$ and define $q^*(\alpha)$ as the α
quantile of the bootstrap distribution of the M estimates of λ.
The bootstrap confidence interval for λ will then be:
$[\,2\hat{\lambda} - q^*(1 - \alpha/2)\,;\ 2\hat{\lambda} - q^*(\alpha/2)\,]$
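A sketch in R, assuming simulated exponential data (the true λ and the sample size below are arbitrary):

```r
set.seed(7)
x <- rexp(30, rate = 2)    # sample from an exponential, true lambda = 2
lambda_hat <- 1 / mean(x)  # estimate of lambda
M <- 1000
boot_lambda <- replicate(M, 1 / mean(sample(x, replace = TRUE)))
q <- quantile(boot_lambda, c(0.025, 0.975))  # bootstrap quantiles
# basic bootstrap 95% interval: [2*lambda_hat - q(0.975); 2*lambda_hat - q(0.025)]
ci <- c(2 * lambda_hat - q[2], 2 * lambda_hat - q[1])
```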
Regression model coefficient estimate with Bootstrap
Now we consider the situation where we have data on two variables.
This is the type of data that arises in linear regression models. It does
not make sense to bootstrap the two variables separately, so they remain
linked when bootstrapped.
If our original n = 4 sample contains the observations $(y_1 = 1, x_1 = 3)$,
$(y_2 = 2, x_2 = 6)$, $(y_3 = 4, x_3 = 3)$, and $(y_4 = 6, x_4 = 2)$, we resample these
original couples as pairs.
Recall that the linear regression model is $y_i = \beta_1 + \beta_2 x_i + \epsilon_i$. We are
going to construct a bootstrap interval for the slope coefficient $\beta_2$ (see the sketch after this list):
1. We draw M bootstrap bivariate samples.
2. We compute the OLS $\hat{\beta}_2$ coefficient for each bootstrap sample.
3. We take the bootstrap quantiles, and use the 0.025 (α/2) and
the 0.975 (1 − α/2) quantiles to define the confidence interval for $\hat{\beta}_2$.
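A sketch of the paired bootstrap in R, using the tiny n = 4 sample above (with so few observations, some resamples have constant x and yield an NA slope, hence the na.rm):

```r
y <- c(1, 2, 4, 6)
x <- c(3, 6, 3, 2)
M <- 1000
boot_slopes <- replicate(M, {
  idx <- sample(seq_along(y), replace = TRUE)  # resample (y, x) pairs together
  coef(lm(y[idx] ~ x[idx]))[2]                 # OLS slope on the bootstrap sample
})
ci <- quantile(boot_slopes, c(0.025, 0.975), na.rm = TRUE)  # percentile interval
```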
Regression model coefficient estimate with Bootstrap
(alternative): sampling the residuals
An alternative way to bootstrap-estimate the regression
coefficient is a two-stage method in which:
1. You run the regression on the original sample and resample its
residuals (with replacement) to define M bootstrap residual vectors
(M vectors of dimension n).
2. You add each of those residual vectors to the fitted values to
obtain M new dependent-variable vectors.
3. You estimate M new regression models using the new
dependent variables, obtaining M bootstrapped $\hat{\beta}_2$ coefficients.
The method then uses the (α/2) and (1 − α/2)
quantiles of the bootstrapped $\hat{\beta}_2$ to define the confidence interval
(a sketch follows).
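A minimal sketch of the residual bootstrap, reusing the y and x vectors from the previous sketch:

```r
fit  <- lm(y ~ x)  # stage 1: fit on the original sample
res  <- resid(fit)
yhat <- fitted(fit)
M <- 1000
boot_slopes_res <- replicate(M, {
  y_star <- yhat + sample(res, replace = TRUE)  # new dependent variable
  coef(lm(y_star ~ x))[2]                       # slope from the new regression
})
ci_res <- quantile(boot_slopes_res, c(0.025, 0.975))
```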
References
Efron, B., and Tibshirani, R. (1993). An Introduction to the
Bootstrap (Vol. 57). CRC Press.
Figure: Efron and Tibshirani's foundational book
Routines in R
1. boot, by Brian Ripley.
Functions and datasets for bootstrapping from the book
Bootstrap Methods and Their Applications by A. C. Davison
and D. V. Hinkley (1997, CUP).
2. bootstrap, by Rob Tibshirani.
Software (bootstrap, cross-validation, jackknife) and data for
the book An Introduction to the Bootstrap by B. Efron and
R. Tibshirani, 1993, Chapman and Hall
Markov Chain
Markov Chains are an important method in probability and many
other areas of research.
They are used to model the probability of being in a certain state
in a certain period, given that the state in the past period is
known.
Weather example: what is the Markov probability that the state
tomorrow will be sunny, given that today it is rainy?
The main properties of Markov Chain processes are:
Memory of the process (usually the memory is fixed to 1).
Stationarity of the distribution.
Chart 1
A picture of an easy example of a Markov chain with two possible
states and the transition probabilities reported.
Figure: An example of a two-state Markov chain
Notation
We define a stochastic process {Xt, t = 0, 1, 2, ...} that takes on a
finite or countable number of possible values.
Let the possible values be non-negative integers (i.e. $X_t \in \mathbb{Z}^+$). If
$X_t = i$, then the process is said to be in state i at time t.
The Markov process (in discrete time) is defined as follows:
$P_{ij} = P[X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \ldots, X_0 = i_0] = P[X_{t+1} = j \mid X_t = i], \quad \forall i, j \in \mathbb{Z}^+$
We call $P_{ij}$ a 1-step transition probability because we move from
time t to time t + 1.
It is a first-order Markov chain (memory = 1) because the
probability of being in state j at time t + 1 only depends on the
state at time t.
Notation - 2
The t-step transition probability is
$P^t_{ij} = P[X_{t+k} = j \mid X_k = i], \quad \forall t \geq 0, \; i, j \geq 0$
The Chapman-Kolmogorov equations allow us to compute these
t-step transition probabilities. They state that:
$P^{t+m}_{ij} = \sum_k P^t_{ik} P^m_{kj}, \quad \forall t, m \geq 0, \; \forall i, j \geq 0$
N.B. Basic probability properties:
1. $P_{ij} \geq 0, \; \forall i, j \geq 0$
2. $\sum_{j \geq 0} P_{ij} = 1, \; i = 0, 1, 2, \ldots$
Example: conditional probability
Consider two states: 0 = rain and 1 = no rain.
Define two probabilities:
$\alpha = P_{00} = P[X_{t+1} = 0 \mid X_t = 0]$, the probability it will rain
tomorrow given that it rains today;
$\beta = P_{10} = P[X_{t+1} = 0 \mid X_t = 1]$, the probability it will rain
tomorrow given that it does not rain today. What is the probability it
will rain the day after tomorrow given that it rains today, with α = 0.7
and β = 0.4?
The transition probability matrix will be:
$P = \begin{pmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{pmatrix} = \begin{pmatrix} \alpha & 1-\alpha \\ \beta & 1-\beta \end{pmatrix} = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}$
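A sketch in R: the two-step transition probabilities follow from squaring the transition matrix:

```r
alpha <- 0.7
beta  <- 0.4
P <- matrix(c(alpha, 1 - alpha,
              beta,  1 - beta), nrow = 2, byrow = TRUE)
P2 <- P %*% P  # two-step transition matrix
P2[1, 1]       # P(rain the day after tomorrow | rain today) = 0.61
```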
Example: unconditional probability
What is the unconditional probability it will rain the day after
tomorrow?
We need to define the unconditional, or marginal, distribution of the
state at time t:
$P[X_t = j] = \sum_i P[X_t = j \mid X_0 = i] \, P[X_0 = i] = \sum_i P^t_{ij} \, \alpha_i$,
where $\alpha_i = P[X_0 = i], \; \forall i \geq 0$,
and $P[X_t = j \mid X_0 = i]$ is the conditional probability just computed
before.
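A sketch, reusing P2 from the previous block and assuming, purely for illustration, an initial probability of 0.4 that it rains today:

```r
a0 <- c(0.4, 0.6)            # assumed marginal distribution of today's state
unconditional <- a0 %*% P2   # distribution of the state the day after tomorrow
unconditional[1]             # unconditional probability of rain
```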
Stationary distributions
A stationary distribution π is the probability distribution such that
when the Markov chain reaches the stationary distribution, it
remains in that distribution forever.
It means we are asking this question: what is the probability of being
in a particular state in the long run?
Let's define $\pi_j$ as the limiting probability that the process will be in
state j at time t, or
$\pi_j = \lim_{t \to \infty} P^t_{ij}$
Using Fubini's theorem
(https://www.youtube.com/watch?v=6-sGhUeOOk8), we can
define the stationary distribution as:
$\pi_j = \sum_i \pi_i P_{ij}$, which, for the two-state chain, gives:
$\pi_0 = \frac{\beta}{1 - \alpha + \beta}; \quad \pi_1 = \frac{1 - \alpha}{1 - \alpha + \beta}$
Example: stationary distribution
Back to our example.
We can compute the 2-step, 3-step, ..., n-step transition
distributions, and look at when they reach
convergence.
An alternative method to compute the stationary
distribution consists in using this easy closed form (see the sketch below):
$\pi_0 = \frac{\beta}{1 - \alpha + \beta}; \quad \pi_1 = \frac{1 - \alpha}{1 - \alpha + \beta}$
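A sketch in R, reusing P, alpha and beta from the earlier block; iterating the transition matrix shows both rows converging to the stationary distribution, which matches the closed form:

```r
Pt <- P
for (k in 1:50) Pt <- Pt %*% P    # n-step transition matrix for large n
Pt                                # both rows approach (pi0, pi1)
# closed-form stationary distribution of the two-state chain
pi0 <- beta / (1 - alpha + beta)  # = 0.4 / 0.7, about 0.571
pi1 <- (1 - alpha) / (1 - alpha + beta)
```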
References
Ross, S. M. (2006). Introduction to Probability Models. Elsevier.
Figure: Cover of the 10th edition
Routines in R
markovchain, by Giorgio Alfredo Spedicato.
A package for easily handling discrete Markov chains.
MCMCpack, by Andrew D. Martin, Kevin M. Quinn, and
Jong Hee Park.
Performs Monte Carlo simulations based on the Markov chain
approach.