Semi-automatic ABC: A Discussion

Christian P. Robert

Université Paris-Dauphine, IuF, & CREST
http://xianblog.wordpress.com

November 2, 2011

LaTeX code borrowed from arXiv:1004.1112v2

Approximate Bayesian computation

      Approximate Bayesian computation (recap)

      Summary statistic selection

Regular Bayesian computation issues


      When faced with a non-standard posterior distribution

          π(θ|y) ∝ π(θ) L(θ|y)

      the standard solution is to use simulation (Monte Carlo) to
      produce a sample
          θ1, . . . , θT
      from π(θ|y) (or approximately by Markov chain Monte Carlo
      methods)
                                              [Robert & Casella, 2004]
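
As a concrete reminder (my toy illustration, not from the slides): a minimal random-walk Metropolis-Hastings sampler for a Gaussian model with a standard normal prior, producing θ1, . . . , θT approximately from π(θ|y):

    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.normal(0.5, 1.0, size=20)         # data: y_i ~ N(theta, 1)

    def log_post(theta):
        # log pi(theta) + log L(theta|y), up to an additive constant
        return -0.5 * theta**2 - 0.5 * np.sum((y - theta) ** 2)

    T, scale, theta = 10_000, 0.5, 0.0
    sample = np.empty(T)
    for t in range(T):
        prop = theta + scale * rng.normal()   # random-walk proposal
        if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
            theta = prop                      # Metropolis acceptance
        sample[t] = theta

    # conjugate check: the posterior mean is n*ybar/(n+1) under a N(0,1) prior
    print(sample[1000:].mean())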

Intractable likelihoods



      Cases when the likelihood function f(y|θ) is unavailable and when
      the completion step

          f(y|θ) = ∫_Z f(y, z|θ) dz

      is impossible or too costly because of the dimension of z

      MCMC cannot be implemented!

The ABC method

      Bayesian setting: target is π(θ)f(x|θ)
      When likelihood f(x|θ) not in closed form, likelihood-free rejection
      technique:

      ABC algorithm
      For an observation y ∼ f(y|θ), under the prior π(θ), keep jointly
      simulating

          θ′ ∼ π(θ),   z ∼ f(z|θ′),

      until the auxiliary variable z is equal to the observed value, z = y.

                                    [Rubin, 1984; Tavaré et al., 1997]
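
For discrete data this exact-match sampler is directly usable; a minimal sketch (my illustration: Binomial data under a uniform prior, where the output provably follows the Beta posterior):

    import numpy as np

    rng = np.random.default_rng(1)
    n, y = 10, 7                              # observed Binomial(n, theta) count

    def exact_abc_draw():
        while True:                           # keep jointly simulating ...
            theta = rng.uniform()             # theta' ~ pi = U(0, 1)
            z = rng.binomial(n, theta)        # z ~ f(z|theta')
            if z == y:                        # ... until z equals the observation
                return theta

    post = np.array([exact_abc_draw() for _ in range(5_000)])
    print(post.mean())                        # ~ (y+1)/(n+2) = 2/3, the Beta(8, 4) mean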

A as approximative


      When y is a continuous random variable, equality z = y is replaced
      with a tolerance condition,

          ρ(y, z) ≤ ε

      where ρ is a distance
      Output distributed from

          π(θ) Pθ{ρ(y, z) < ε} ∝ π(θ | ρ(y, z) < ε)

ABC algorithm


      Algorithm 1 Likelihood-free rejection sampler
        for i = 1 to N do
          repeat
            generate θ′ from the prior distribution π(·)
            generate z from the likelihood f(·|θ′)
          until ρ{η(z), η(y)} ≤ ε
          set θi = θ′
        end for

      where η(y) defines a (generally insufficient) statistic
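
A direct Python transcription of Algorithm 1 (a sketch with ingredients of my own choosing: Gaussian model, sample mean as η, absolute difference as ρ):

    import numpy as np

    rng = np.random.default_rng(2)
    y = rng.normal(1.0, 1.0, size=50)         # observed data

    eta = np.mean                             # summary statistic eta(.)
    rho = lambda a, b: abs(a - b)             # distance rho(., .)
    eps, N = 0.1, 1_000                       # tolerance and ABC sample size

    sample = np.empty(N)
    for i in range(N):
        while True:                           # repeat ...
            theta = rng.normal(0.0, 5.0)      # generate theta' from the prior pi(.)
            z = rng.normal(theta, 1.0, 50)    # generate z from the likelihood f(.|theta')
            if rho(eta(z), eta(y)) <= eps:    # ... until rho{eta(z), eta(y)} <= eps
                break
        sample[i] = theta                     # set theta_i = theta'

    print(sample.mean(), sample.std())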

Output

      The likelihood-free algorithm samples from the marginal in z of:

          πε(θ, z|y) = π(θ) f(z|θ) I_{Aε,y}(z) / ∫_{Aε,y×Θ} π(θ) f(z|θ) dz dθ ,

      where Aε,y = {z ∈ D : ρ(η(z), η(y)) < ε}.
      The idea behind ABC is that the summary statistics coupled with a
      small tolerance should provide a good approximation of the
      posterior distribution:

          πε(θ|y) = ∫ πε(θ, z|y) dz ≈ π(θ|η(y)) .

                                                    [Not guaranteed!]
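
Spelling out the marginalisation behind this display (a short check of my own, under the usual regularity conditions), in LaTeX:

    \pi_\epsilon(\theta|y)
      = \int \pi_\epsilon(\theta, z|y)\,\mathrm{d}z
      \propto \pi(\theta)\int_{A_{\epsilon,y}} f(z|\theta)\,\mathrm{d}z
      = \pi(\theta)\,\mathbb{P}_\theta\{\rho(\eta(z),\eta(y)) < \epsilon\}
      \;\xrightarrow[\epsilon\to 0]{}\; \pi(\theta|\eta(y)),

which recovers the earlier "Output distributed from" display and shows why the small-ε limit is π(θ|η(y)) rather than π(θ|y) when η is not sufficient.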

Summary statistic selection



      Approximate Bayesian computation (recap)

      Summary statistic selection
         F&P’s setting
         Noisy ABC
         Optimal summary statistic

F&P’s ABC

      Use of a summary statistic S(·), an importance proposal g(·), a
      kernel K(·) ≤ 1 and a bandwidth h > 0 such that

          (θ, ysim) ∼ g(θ) f(ysim|θ)

      is accepted with probability (hence the bound)

          K[{S(ysim) − sobs}/h]

                    [or is it K[{S(ysim) − sobs}]/h, cf (2)? typo]

      with the corresponding importance weight defined by

          π(θ)/g(θ)
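
One accepted draw of this scheme, as I read it (a sketch assuming the K[{S(ysim) − sobs}/h] version of the acceptance probability; the uniform kernel and Gaussian toy model/proposal are my choices):

    import numpy as np

    rng = np.random.default_rng(3)

    def K(u):                                 # kernel bounded by 1: uniform on [-1, 1]
        return float(abs(u) <= 1.0)

    def norm_pdf(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    def fp_draw(s_obs, h):
        while True:
            theta = rng.normal(1.0, 1.0)              # theta ~ g(.), here N(1, 1)
            y_sim = rng.normal(theta, 1.0, size=20)   # y_sim ~ f(.|theta)
            if rng.uniform() < K((np.mean(y_sim) - s_obs) / h):  # accept w.p. K[{S - s_obs}/h]
                w = norm_pdf(theta, 0.0, 5.0) / norm_pdf(theta, 1.0, 1.0)  # pi(theta)/g(theta)
                return theta, w

    draws = [fp_draw(s_obs=1.0, h=0.1) for _ in range(2_000)]
    thetas, ws = map(np.array, zip(*draws))
    print(np.average(thetas, weights=ws))     # self-normalised importance estimate of E(theta|s_obs)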

Errors, errors, and errors


      Three levels of approximation

          π(θ|yobs) by π(θ|sobs): loss of information

          π(θ|sobs) by

              πABC(θ|sobs) = ∫ π(s) K[{s − sobs}/h] π(θ|s) ds / ∫ π(s) K[{s − sobs}/h] ds :

          noisy observations

          πABC(θ|sobs) by importance Monte Carlo based on N
          simulations, represented by var(a(θ)|sobs)/Nacc [expected
          number of acceptances]

Average acceptance asymptotics


      For the average acceptance probability/approximate likelihood

          p(θ|sobs) = ∫ f(ysim|θ) K[{S(ysim) − sobs}/h] dysim ,

      overall acceptance probability

          p(sobs) = ∫ p(θ|sobs) π(θ) dθ = π(sobs) h^d + o(h^d)

                                                          [Lemma 1]

Optimal importance proposal




      Best choice of importance proposal in terms of effective sample size

          g*(θ|sobs) ∝ π(θ) p(θ|sobs)^{1/2}

                                  [Not particularly useful in practice]

Calibration of h

           “This result gives insight into how S(·) and h affect the
           Monte Carlo error. To minimize Monte Carlo error, we
           need h^d to be not too small. Thus ideally we want S(·)
           to be a low dimensional summary of the data that is
           sufficiently informative about θ that π(θ|sobs ) is close, in
           some sense, to π(θ|yobs )” (p.5)

      Constraint on h only addresses one term in the approximation
      error and acceptance probability
      h large prevents π(θ|sobs) from being close to πABC(θ|sobs)
      d small prevents π(θ|sobs) from being close to π(θ|yobs)

Calibrated ABC



      Definition
      For 0 < q < 1 and a subset A, fix [one specific?/all?] event Eq(A)
      with PrABC(θ ∈ Eq(A)|sobs) = q. Then ABC is calibrated if

          Pr(θ ∈ A|Eq(A)) = q

      Why calibrated and not exact?

Calibrated ABC



      Theorem
      Noisy ABC, where

          sobs = S(yobs) + hε ,   ε ∼ K(·) ,

      is calibrated
                                                  [Wilkinson, 2008]

      no condition on h
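
Operationally, noisy ABC changes a single line relative to standard ABC: the observed summary is jittered once before the run. A sketch (my toy setup, with a Gaussian kernel K):

    import numpy as np

    rng = np.random.default_rng(4)
    y_obs = rng.normal(1.0, 1.0, size=50)
    h = 0.2

    # the one change: s_obs = S(y_obs) + h * eps with eps ~ K(.), here K = N(0, 1)
    s_obs = np.mean(y_obs) + h * rng.normal()

    # then run any standard ABC sampler against the jittered target
    accepted = [theta for theta in rng.normal(0.0, 5.0, size=50_000)
                if abs(np.mean(rng.normal(theta, 1.0, 50)) - s_obs) <= 0.1]
    print(len(accepted), np.mean(accepted))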

Calibrated ABC


      Theorem
      For noisy ABC, the expected noisy-ABC log-likelihood,

          E{log[p(θ|sobs)]} = ∫∫ log[p(θ|S(yobs) + hε)] π(yobs|θ0) K(ε) dyobs dε ,

      has its maximum at θ = θ0.

                                  [Last line of proof contains a typo]

      True for any choice of summary statistic?
                                  [Imposes at least identifiability...]
      Relevant in asymptotia and not for the data

Calibrated ABC



      Corollary
      For noisy ABC, the ABC posterior converges onto a point mass on
      the true parameter value as m → ∞.

      For standard ABC, not always the case (unless h goes to zero).
      Strength of regularity conditions (c1) and (c2) in Bernardo
      & Smith, 1994?
                                                 [constraints on posterior]
      Some condition upon summary statistic?

Loss motivated statistic

      Under quadratic loss function,

      Theorem
        (i) The minimal posterior error E[L(θ, θ̂)|yobs] occurs when
            θ̂ = E(θ|yobs) (!)
       (ii) When h → 0, EABC(θ|sobs) converges to E(θ|yobs)
      (iii) If S(yobs) = E[θ|yobs], then for θ̂ = EABC[θ|sobs]

                E[L(θ, θ̂)|yobs] = trace(AΣ) + h² ∫ xᵀ A x K(x) dx + o(h²) .

      Measure-theoretic difficulties?
      dependence of sobs on h makes me uncomfortable
      Relevant for choice of K?
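
Point (i) is the classical fact that the posterior mean minimises posterior expected quadratic loss; a quick numerical check on an arbitrary sample standing in for the posterior (my illustration, with A = 1 so L(θ, θ̂) = (θ − θ̂)²):

    import numpy as np

    rng = np.random.default_rng(5)
    post = rng.gamma(3.0, 2.0, size=100_000)  # any sample standing in for pi(theta|y_obs)

    grid = np.linspace(post.mean() - 1.0, post.mean() + 1.0, 201)
    risk = [np.mean((post - a) ** 2) for a in grid]   # E[(theta - a)^2 | y_obs]
    print(grid[np.argmin(risk)], post.mean())         # argmin sits at the posterior mean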

Optimal summary statistic

           “We take a different approach, and weaken the
           requirement for πABC to be a good approximation to
           π(θ|yobs ). We argue for πABC to be a good
           approximation solely in terms of the accuracy of certain
           estimates of the parameters.” (p.5)

      From this result, F&P derive their choice of summary statistic,

          S(y) = E(θ|y)

              [almost sufficient, in the sense that EABC[θ|S(yobs)] = E[θ|yobs]]

      and suggest

          h = O(N^{−1/(2+d)})   and   h = O(N^{−1/(4+d)})

      as optimal bandwidths for noisy and standard ABC.

Caveat



      Since E(θ|yobs) is usually unavailable, F&P suggest
        (i) use a pilot run of ABC to determine a region of non-negligible
            posterior mass;
       (ii) simulate sets of parameter values and data;
      (iii) use the simulated sets of parameter values and data to
            estimate the summary statistic; and
      (iv) run ABC with this choice of summary statistic.

Approximating the summary statistic




      Like Beaumont et al. (2002) and Blum and François (2010), F&P
      use a linear regression to approximate E(θ|yobs):

          θi = β0^(i) + β^(i) f(yobs) + εi
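
Putting steps (ii)-(iv) of the previous slide together with this regression (a sketch of my own: scalar θ, Gaussian model, f(y) taken as powers of the data mean, and the pilot run of step (i) skipped by simulating straight from the prior):

    import numpy as np

    rng = np.random.default_rng(6)

    def simulate(theta, n=50):
        return rng.normal(theta, 1.0, size=n)

    # (ii) simulate sets of parameter values and data
    thetas = rng.normal(0.0, 5.0, size=2_000)
    X = np.array([[1.0, d.mean(), d.mean() ** 2] for d in map(simulate, thetas)])

    # (iii) regress theta_i on f(y_i): theta_i = beta0 + beta f(y_i) + eps_i
    beta, *_ = np.linalg.lstsq(X, thetas, rcond=None)

    def S(y):                                 # estimated summary, approximating E(theta|y)
        return np.array([1.0, y.mean(), y.mean() ** 2]) @ beta

    # (iv) run ABC with this choice of summary statistic
    y_obs = rng.normal(1.0, 1.0, size=50)
    s_obs = S(y_obs)
    accepted = [t for t in rng.normal(0.0, 5.0, size=20_000)
                if abs(S(simulate(t)) - s_obs) <= 0.1]
    print(len(accepted), np.mean(accepted))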

Applications



      The paper’s second half covers:
          g-and-k distribution
          stochastic kinetic biochemical networks
          Lotka-Volterra model
          Ricker map ecological model
          M/G/1 queue
          tuberculosis bacteria genotype data

Questions

      dependence on h and S(·) in the early stage
      reduction of Bayesian inference to point estimation
      approximation error in step (iii) not accounted for
      not parameterisation invariant
      practice shows that proper approximation to genuine posterior
      distributions stems from using a (much) larger number of
      summary statistics than the dimension of the parameter
      the validity of the approximation to the optimal summary
      statistic depends on the quality of the pilot run
      important inferential issues like model choice are not covered
      by this approach
