Nested Sampling for General Bayesian Computation
Presented by WU Changye
12 February 2015
Outline
Nested Sampling
Posterior Simulation
Nested Sampling Termination and Size of N
Numerical Examples
Conclusion
Introduction
In the Bayesian paradigm, the parameter θ follows the prior distribution π and the observations y follow the distribution L(y|θ) given θ; the posterior distribution f(θ|y), which describes the distribution of θ given the observations y, then has the form

f(θ|y) = L(y|θ) π(θ) / ∫_Θ L(y|θ) π(θ) dθ

The objective of nested sampling is to compute the 'evidence'

Z = ∫_Θ L(y|θ) π(θ) dθ
Since θ is a random variable, Z = E_π(L(θ)). For simplicity, let L(θ) denote the likelihood L(y|θ). The cumulative distribution function of L(θ) is

F(λ) = ∫_{L(θ) < λ} π(θ) dθ

Define the measure µ on ℝ induced by the likelihood function and the prior as follows:

µ(A) = P_π(L(θ) ∈ A)
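Since Z = E_π(L(θ)), the most direct estimator simply averages the likelihood over prior draws. Below is a minimal sketch for a 1-D Gaussian prior and likelihood (the decentred example used later in the deck; the value y = 10 is taken from that example):

```python
import numpy as np

rng = np.random.default_rng(0)
y = 10.0                                  # observation, as in the later Gaussian example

def likelihood(theta):
    return np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2.0 * np.pi)

theta = rng.standard_normal(10**6)        # draws from the prior N(0, 1)
Z_hat = likelihood(theta).mean()          # Monte Carlo estimate of Z = E_pi(L(theta))
print(Z_hat)                              # analytic value: exp(-y**2/4)/(2*sqrt(pi)), about 3.9e-12
```

With y = 10 the likelihood mass sits far in the prior tail, so this naive estimator has enormous relative variance; that failure mode is what motivates the construction that follows.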
Lemma 1: E_π(L(θ)) = E_µ(X).
Proof: First let g be the indicator function I_A of a measurable set A ⊂ ℝ. Then

E_π(g(L(θ))) = E_π(I_A(L(θ))) = ∫_{L(θ) ∈ A} π(θ) dθ

On the other hand, µ(dx) = ∫_Θ δ_{L(θ)}(dx) π(θ) dθ, so

E_µ(g(X)) = ∫_ℝ I_A(x) µ(dx) = ∫_Θ [ ∫_ℝ I_A(x) δ_{L(θ)}(dx) ] π(θ) dθ

Therefore,

E_µ(g(X)) = E_π(I_A(L(θ))) = E_π(g(L(θ)))
In the general case, let {g_n} be an increasing sequence of step functions converging to the identity function Id; then {g_n ◦ L} is an increasing sequence of step functions converging to L, and the desired conclusion follows by taking limits (monotone convergence).
Lemma 2: If X is a positive-valued random variable with p.d.f. f and c.d.f. F, then

∫_0^∞ (1 − F(x)) dx = ∫_0^∞ x f(x) dx = E(X).

Proof:

∫_0^∞ (1 − F(x)) dx = ∫_0^∞ (1 − P(X < x)) dx
= ∫_0^∞ P(X ≥ x) dx
= ∫_0^∞ ∫_x^∞ f(y) dy dx
= ∫_0^∞ f(y) ∫_0^y dx dy   (by Fubini)
= ∫_0^∞ y f(y) dy = E(X)
By Lemmas 1 and 2,

Z = E_µ(X) = ∫_0^∞ x dF(x) = ∫_0^∞ (1 − F(x)) dx

Let ϕ^{-1}(x) = 1 − F(x) = P_π{θ : L(θ) > x}. Then

Z = ∫_0^∞ ϕ^{-1}(x) dx = ∫_0^1 ϕ(x) dx

Therefore, the evidence is represented as a one-dimensional integral.
To compute the integral

J = ∫_0^1 ϕ(x) dx

three sampling-based methods are available.
1) Importance sampling: for i = 1, ..., n, draw U_i ∼ U[0,1] and set

Ĵ_1 = (1/n) Σ_{i=1}^n ϕ(U_i)

2) Riemann approximation: for i = 1, ..., n, draw U_i ∼ U[0,1]; let U_(1) ≤ ··· ≤ U_(n) be the order statistics of (U_1, ..., U_n) and set

Ĵ_2 = Σ_{i=1}^{n−1} ϕ(U_(i)) (U_(i+1) − U_(i))

3) A more elaborate method: set x_0 = 1
step 1: for i = 1, ..., N, draw U^1_i ∼ U[0,1] and set x_1 = max{U^1_1, ..., U^1_N}
step 2: for i = 1, ..., N, draw U^2_i ∼ U[0,x_1] and set x_2 = max{U^2_1, ..., U^2_N}
······
step n: for i = 1, ..., N, draw U^n_i ∼ U[0,x_{n−1}] and set x_n = max{U^n_1, ..., U^n_N}

Ĵ_3 = Σ_{i=1}^n ϕ(x_i) (x_{i−1} − x_i)
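A minimal sketch of the three estimators, using an assumed toy integrand ϕ(x) = exp(−10x) (decreasing, as in the setting above), so the estimates can be checked against the exact value (1 − e^{-10})/10 ≈ 0.1:

```python
import numpy as np

rng = np.random.default_rng(1)
phi = lambda x: np.exp(-10.0 * x)        # assumed toy integrand, decreasing on [0, 1]
n, N = 1000, 10

# 1) importance sampling from U[0, 1]
u = rng.uniform(size=n)
J1 = phi(u).mean()

# 2) Riemann approximation over the order statistics
u_sorted = np.sort(u)
J2 = np.sum(phi(u_sorted[:-1]) * np.diff(u_sorted))

# 3) the third method: x_i is the max of N uniforms on [0, x_{i-1}]
x = [1.0]
for _ in range(n):
    x.append(rng.uniform(0.0, x[-1], size=N).max())
x = np.array(x)
J3 = np.sum(phi(x[1:]) * -np.diff(x))    # sum of phi(x_i) * (x_{i-1} - x_i)

print(J1, J2, J3)
```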
Nested sampling adopts the third method, because ϕ is a decreasing function and in many cases it decreases rapidly.
Figure: Graph of ϕ(x) and the trace of (x_i, ϕ(x_i))
First, we consider the distributions of x_1, ..., x_n. For u ∈ [0, 1],

P(x_1 < u) = P(U^1_1 < u, ..., U^1_N < u) = Π_{i=1}^N P(U^1_i < u) = u^N

As a result, the density function of x_1 is

f(x_1) = N x_1^{N−1}

By the same argument,

f(x_k | x_{k−1}) = (N / x_{k−1}) (x_k / x_{k−1})^{N−1}
Let t_k = x_k / x_{k−1}. Then

P(t_k ≤ t) = ∫ P(x_k ≤ t x | x_{k−1} = x) f_{x_{k−1}}(x) dx
= ∫ ∫_0^{tx} f_{x_k | x_{k−1}}(y | x) f_{x_{k−1}}(x) dy dx
= ∫ ∫_0^{tx} (N/x) (y/x)^{N−1} f_{x_{k−1}}(x) dy dx
= ∫ t^N f_{x_{k−1}}(x) dx = t^N

Besides,

P(t_k ≤ t | x_{k−1} = x) = P(x_k ≤ t x | x_{k−1} = x) = t^N

As a result, t_k ⊥ x_{k−1}.
Moreover, a point estimate for x_k can be written entirely in terms of point estimates for the t_k:

x_k = (x_k / x_{k−1}) × (x_{k−1} / x_{k−2}) × ··· × (x_1 / x_0) × x_0 = t_k · t_{k−1} ··· t_1 · x_0 = (Π_{i=1}^k t_i) x_0

More appropriate to the large ranges common to many problems, on the log scale this becomes

log x_k = log( (Π_{i=1}^k t_i) x_0 ) = Σ_{i=1}^k log t_i + log x_0

where the logarithmic shrinkage is distributed as

f(log t) = N e^{N log t},  log t < 0

with mean and variance

E(log t) = −1/N,  V(log t) = 1/N²
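These moments are easy to check by simulation, since t_k is just the maximum of N uniforms; a short sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20
t = rng.uniform(size=(100_000, N)).max(axis=1)   # t_k = max of N uniforms on [0, 1]
log_t = np.log(t)
print(log_t.mean(), -1.0 / N)        # both close to -0.05
print(log_t.var(), 1.0 / N**2)       # both close to 0.0025
```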
Taking the mean as the point estimate for each log t_i finally gives

log(x_k / x_0) = −k/N ± √k/N

Parameterizing x_k in terms of the shrinkage proves immediately advantageous: because the log t_i are independent, the errors in the point estimates tend to cancel, and the estimates for the x_k grow increasingly more accurate (in relative terms) with k. In practice one therefore takes

x_k = exp(−k/N)
Next, we consider the distribution of ϕ(X), where X ∼ U[0, 1]. Consider the random variable X = ϕ^{-1}(L(θ)), where θ ∼ π. Notice that

ϕ^{-1} : [0, L_max] → [0, 1],  λ ↦ P(L(θ) > λ)

For u ∈ [0, 1],

P(X < u) = P(ϕ^{-1}(L(θ)) < u) = P(L(θ) > ϕ(u)) = ϕ^{-1}(ϕ(u)) = u

This means that ϕ^{-1}(L(θ)) follows U[0, 1] and ϕ(X) ∼ L(θ).
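This is the probability integral transform applied to L(θ). Assuming L(θ) has a continuous distribution under π, it can be checked by simulation without knowing ϕ in closed form, since ϕ^{-1}(l) can be estimated by the empirical survival function of an independent prior sample; a sketch for the 1-D Gaussian example:

```python
import numpy as np

rng = np.random.default_rng(3)
y = 10.0
L = lambda th: np.exp(-0.5 * (y - th) ** 2) / np.sqrt(2.0 * np.pi)

ref = np.sort(L(rng.standard_normal(100_000)))   # reference sample of L(theta), theta ~ pi
test = L(rng.standard_normal(10_000))
# empirical phi^{-1}(l): fraction of the reference sample strictly above l
u = 1.0 - np.searchsorted(ref, test, side="right") / ref.size
print(u.mean(), u.var())             # close to 1/2 and 1/12, as for U[0, 1]
```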
Now consider the prior truncated to the region {L(θ) > L_0}:

π̃(θ) ∝ π(θ) if L(θ) > L_0, and 0 otherwise

Let X_0 = ϕ^{-1}(L_0) and X = ϕ^{-1}(L(θ)), where θ ∼ π̃. For u ∈ [0, X_0],

P(X < u) = P(ϕ^{-1}(L(θ)) < u | L(θ) > L_0)
= P(L(θ) > ϕ(u)) / P(L(θ) > L_0)
= ϕ^{-1}(ϕ(u)) / X_0
= u / X_0

so X ∼ U[0, X_0]. As a result, ϕ(X) ∼ L(θ), where X ∼ U[0, X_0] and θ ∼ π̃.
Algorithm
The algorithm based on the method discussed in the previous section is described below:
– Iteration 1: sample N points θ_{1,i} independently from the prior π(θ), determine θ_1 = arg min_{1≤i≤N} L(θ_{1,i}), and set ϕ_1 = L(θ_1).
– Iteration 2: obtain the N current values θ_{2,i} by keeping the θ_{1,i}'s except for θ_1, which is replaced by a draw from the prior distribution π conditional on L(θ) ≥ ϕ_1; then select θ_2 = arg min_{1≤i≤N} L(θ_{2,i}) and set ϕ_2 = L(θ_2).
– Iterate the above step until a given stopping rule is satisfied, for instance when observing very small changes in the approximation Ẑ or when reaching the maximal value of L(θ) when it is known (see the sketch below).
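Below is a minimal sketch of this algorithm for a 1-D version of the decentred Gaussian example introduced later (with an assumed y = 3 rather than 10, so that the constrained draw can use naive rejection from the prior), using the deterministic point estimates x_i = exp(−i/N) derived above and a fixed number of iterations J:

```python
import numpy as np

rng = np.random.default_rng(4)
y, N, J = 3.0, 100, 800                  # assumed toy setting

def log_L(theta):
    return -0.5 * (y - theta) ** 2 - 0.5 * np.log(2.0 * np.pi)

def draw_constrained(log_L_min):
    # draw from the prior N(0, 1) conditional on L(theta) > L_min, by rejection
    while True:
        cand = rng.standard_normal(1000)
        ok = cand[log_L(cand) > log_L_min]
        if ok.size > 0:
            return ok[0]

live = rng.standard_normal(N)            # iteration 1: N points from the prior
live_logL = log_L(live)
x = np.exp(-np.arange(J + 1) / N)        # x_0 = 1, x_i = exp(-i/N)
Z_hat, dead = 0.0, []

for i in range(1, J + 1):
    worst = live_logL.argmin()                        # lowest-likelihood live point
    phi_i = np.exp(live_logL[worst])
    Z_hat += phi_i * (x[i - 1] - x[i])                # accumulate phi_i (x_{i-1} - x_i)
    dead.append((live[worst], phi_i, x[i - 1] - x[i]))
    live[worst] = draw_constrained(live_logL[worst])  # replacement under L > phi_i
    live_logL[worst] = log_L(live[worst])

print(Z_hat)     # analytic Z = exp(-y**2/4)/(2*sqrt(pi)), about 0.0297 here
```

Rejection from the prior is only viable in this toy case; the MCMC alternative discussed later would replace `draw_constrained` in higher dimensions.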
The resulting evidence estimate after J iterations is

Ẑ = Σ_{i=1}^J ϕ_i (x_{i−1} − x_i)
By-product of Nested Sampling
Skilling indicates that nested sampling provides simulations from the posterior distribution at no extra cost: "the existing sequence of points θ_1, θ_2, θ_3, ... already gives a set of posterior representatives, provided the i'th is assigned the appropriate importance weight ω_i L_i". The posterior expectation of f(θ) is

E(f(θ) | y) = ∫_Θ π(θ) L(θ) f(θ) dθ / ∫_Θ π(θ) L(θ) dθ

We can use a single run of nested sampling to obtain estimators of both the numerator and the denominator, the latter being the evidence Z. The estimator of the numerator is

Σ_{i=1}^j (x_{i−1} − x_i) ϕ_i f(θ_i)    (1)
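A sketch of the resulting ratio estimator, taking as input the per-iteration triples (θ_i, ϕ_i, x_{i−1} − x_i) recorded by a nested sampling run (for instance the `dead` list in the sketch above):

```python
import numpy as np

def posterior_expectation(theta, phi, dx, f):
    """Estimate E(f(theta) | y) from a nested sampling run.

    theta, phi, dx: arrays of theta_i, phi_i = L(theta_i) and x_{i-1} - x_i.
    """
    w = phi * dx                              # importance weights omega_i L_i
    return np.sum(w * f(theta)) / np.sum(w)   # estimator (1) divided by Z_hat
```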
Lemma 3 (N. Chopin & C. P. Robert): Let f̄(l) = E_π{f(θ) | L(θ) = l} for l > 0. Then, if f̄ is absolutely continuous,

∫_0^1 ϕ(x) f̄(ϕ(x)) dx = ∫ π(θ) L(θ) f(θ) dθ

Proof: Let ψ : x ↦ x f̄(x). Then

∫ π(θ) L(θ) f(θ) dθ = E_π[ψ{L(θ)}]
= ∫_0^{+∞} P_π(ψ{L(θ)} > l) dl
= ∫_0^{+∞} ϕ^{-1}(ψ^{-1}(l)) dl = ∫_0^1 ψ(ϕ(x)) dx
Termination
The author suggests terminating when

max(L_1, ..., L_N) X_j < f Ẑ_j

where f is some small fraction: the evidence still to be accumulated beyond iteration j is at most the largest live likelihood times the remaining prior mass X_j, so the rule stops once this bound becomes negligible relative to the current estimate, as in the sketch below.
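In code this is a one-line check per iteration; a sketch with an assumed fraction f = 10^{-3}, which could replace the fixed iteration count in the earlier algorithm sketch:

```python
import numpy as np

def should_terminate(live_logL, X_j, Z_hat, f=1e-3):
    # the evidence still to come is at most max_i L_i times the remaining
    # prior mass X_j; stop once this bound is below a fraction f of Z_hat
    return np.exp(live_logL).max() * X_j < f * Z_hat
```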
N?
The larger N is, the smaller the variability of the approximation: each iteration shrinks log x by −1/N ± 1/N, so after the roughly NH iterations needed to reach the posterior bulk (H being the information), the accumulated uncertainty in log Ẑ is of order √(H/N).
How to sample N points from the constrained parameter space
Use an MCMC method that constructs a Markov chain whose invariant distribution is the prior truncated to {θ : L(θ) > L_0}; a sketch of such a kernel follows.
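A sketch of this idea for a standard normal prior: random-walk Metropolis targeting the prior, with the hard constraint L(θ) > L_0 acting as an indicator (the chain is started from a surviving live point, which already satisfies the constraint):

```python
import numpy as np

rng = np.random.default_rng(5)

def constrained_step(theta, log_L, log_L_min, n_steps=50, scale=0.5):
    """Random-walk Metropolis targeting the N(0, 1) prior truncated to L > L_min."""
    for _ in range(n_steps):
        prop = theta + scale * rng.standard_normal()
        # prior ratio for a symmetric proposal; the constraint vetoes invalid moves
        log_alpha = 0.5 * (theta**2 - prop**2)
        if log_L(prop) > log_L_min and np.log(rng.uniform()) < log_alpha:
            theta = prop
    return theta
```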
A decentred Gaussian example
The prior is

π(θ) = Π_{k=1}^d (1/√(2π)) exp(−(θ^(k))² / 2)

and the likelihood is

L(y|θ) = Π_{k=1}^d (1/√(2π)) exp(−(y_k − θ^(k))² / 2)

In this example, we can calculate the evidence analytically:

Z = ∫_{ℝ^d} L(θ) π(θ) dθ = exp(−Σ_{k=1}^d y_k² / 4) / (2^d π^{d/2})
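The closed form makes this example a convenient benchmark; a small helper evaluating log Z for the settings used in the figures below:

```python
import numpy as np

def log_Z(y):
    """Analytic log-evidence of the decentred Gaussian example."""
    y = np.asarray(y, dtype=float)
    d = y.size
    return -np.sum(y**2) / 4.0 - d * np.log(2.0) - 0.5 * d * np.log(np.pi)

print(log_Z([10.0]))       # d = 1, y = 10: about -26.27
print(log_Z([3.0] * 5))    # d = 5, y = (3, 3, 3, 3, 3): about -17.58
```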
Figure: Graph of ϕ(x) and the trace of (x_i, ϕ(x_i)) with d = 1 and y = 10.
Figure: The prior distribution and the likelihood with d = 1 and y = 10.
Figure: Box-plot of log Ẑ − log Z with d = 1 and y = 10, for nested sampling and Monte Carlo.
Figure: Box-plot of log Ẑ − log Z with d = 5 and y = (3, 3, 3, 3, 3).
A Probit Model
We consider the arsenic dataset and a probit model studied in Chapter 5 of Gelman & Hill (2006). The observations are independent Bernoulli variables y_i such that P(y_i = 1 | x_i) = Φ(x_i^T θ), where x_i is a vector of d covariates, θ is a parameter vector of size d, and Φ denotes the standard normal distribution function. In this particular example, d = 7.
The prior is

θ ∼ N(0, 10² I_d)

and the likelihood is

L(θ) = Π_{i=1}^n Φ(x_i^T θ)^{y_i} (1 − Φ(x_i^T θ))^{1−y_i}
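A sketch of this likelihood in log form, as a sampler would evaluate it; the n × d design matrix X and the binary response vector y are assumed inputs:

```python
import numpy as np
from scipy.stats import norm

def probit_log_likelihood(theta, X, y):
    """log L(theta) for the probit model, with X of shape (n, d) and y in {0, 1}^n."""
    eta = X @ theta
    # norm.logcdf keeps the evaluation stable when Phi(x_i^T theta) is near 0 or 1,
    # using 1 - Phi(z) = Phi(-z)
    return np.sum(y * norm.logcdf(eta) + (1 - y) * norm.logcdf(-eta))
```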
Figure: Box-plot of log Ẑ with N = 20 for HMC and random-walk MCMC. The blue line marks the true value of log Z (Chib's method).
Posterior Samples
We use the Gaussian example to illustrate this result. Let f(θ) = exp(−3θ + 9d/2).
Figure: Box-plot of the errors log Ẑ − log Z and log Ê(f) − log E(f)
Conclusion
– Nested sampling reverses the accepted approach to Bayesian
computation by putting the evidence first.
– Nested sampling samples more sparsely from the prior in regions
where the likelihood is low and more densely where the likelihood
is high, resulting in greater efficiency than a sampler that draws
directly from the prior.
– The procedure runs with an evolving collection of N points,
where N can be chosen small for speed or large for accuracy.
– Nested sampling always reduces a multidimensional integral to
the integral of a one-dimensional monotonic function, no matter
how many dimensions θ occupies, and no matter how strange the
shape of the likelihood function L(θ) is.
Problems
– How to generate N independent points in the constrained parameter space is an important problem; techniques for doing so effectively and efficiently may vary from problem to problem.
– Termination is another practical difficulty.
Thank you!