Automatic Bayesian method for Numerical Integration
Jagadeeswaran Rathinavel, Fred J. Hickernell
Department of Applied Mathematics, Illinois Institute of Technology
jrathin1@iit.edu
Thanks to the CASSC 2017 organizers
Introduction Bayesian Cubature Simulation results Conclusion References
Multivariate Integration

Approximate the d-dimensional integral over $[0,1)^d$,
$$\mu := \mathbb{E}[f(X)] := \int_{[0,1)^d} f(x)\,\nu(dx),$$
by a simple cubature rule,
$$\mu \approx \hat{\mu} := \sum_{j=0}^{n-1} w_j f(x_j) = \int_{[0,1)^d} f(x)\,\hat{\nu}(dx),$$
using points $(x_j)_{j=0}^{n-1}$ and associated weights $(w_j)_{j=0}^{n-1}$. The approximation error is then
$$\mu - \hat{\mu} = \int_{[0,1)^d} f(x)\,(\nu - \hat{\nu})(dx).$$
We use an extensible point set and an algorithm that can add more points if needed.
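A cubature rule of this form is simply a weighted sum of function values. A minimal sketch (the trapezoid-style points and weights below are illustrative, not the method of these slides):

```python
def cubature(f, points, weights):
    """Generic cubature rule: mu_hat = sum_j w_j * f(x_j)."""
    return sum(w * f(x) for x, w in zip(points, weights))

# Illustrative 1-d example: composite trapezoid rule on [0, 1]
estimate = cubature(lambda x: x * x, [0.0, 0.5, 1.0], [0.25, 0.5, 0.25])
```

The whole question of the talk is how to pick the points and weights so that this sum is close to the true integral.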
Motivating Example

How can we measure the water volume of a creek over a given area (e.g., 10 sq ft)?

Figure: (Nuyens, 2007)
What is multivariate integration?

Figure: (Nuyens, 2007)

A d-dimensional integral:
$$\int_{[0,1)^d} f(x)\,dx = \int_0^1 \int_0^1 \cdots \int_0^1 f(x_1, x_2, \ldots, x_d)\,dx_1\,dx_2 \cdots dx_d$$

Grid points: curse of dimensionality.
IID Monte Carlo: converges as $O(n^{-1/2})$, with $\hat{\mu} = \frac{1}{n}\sum_{j=0}^{n-1} f(x_j)$.
Typical quasi-Monte Carlo: converges as $O(n^{-1+\epsilon})$, with the $x_j$ chosen carefully (low-discrepancy points).

Can we do better?
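For contrast, the plain IID Monte Carlo estimator $\hat{\mu} = \frac{1}{n}\sum_j f(x_j)$ above can be sketched as:

```python
import random

def mc_estimate(f, d, n, seed=0):
    """IID Monte Carlo estimate of the integral of f over [0,1)^d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]  # one uniform sample in [0,1)^d
        total += f(x)
    return total / n  # converges at the O(n^{-1/2}) Monte Carlo rate
```

For $f(x) = x_1 + x_2$ the true integral over $[0,1)^2$ is 1, and the estimate approaches it only at the slow $n^{-1/2}$ rate.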
Algorithm

1: procedure AutoCubature(f, errtol)  ▷ Integrate within the error tolerance
2:   n ← 2^8
3:   do
4:     Generate (x_i)_{i=0}^{n-1}
5:     Sample (f(x_i))_{i=0}^{n-1}
6:     Compute err_n
7:     n ← 2 × n
8:   while err_n > errtol  ▷ Iterate until the error tolerance is met
9:   Compute weights (w_i)_{i=0}^{n-1}
10:  Compute the integral estimate \hat{\mu}_n
11:  return \hat{\mu}_n  ▷ Integral estimate
12: end procedure

Problem:
How to choose $(x_j)_{j=0}^{n-1}$ and $(w_j)_{j=0}^{n-1}$ to make $|\mu - \hat{\mu}|$ small?
Given an error tolerance errtol, how big must n be to guarantee $|\mu - \hat{\mu}| \le$ errtol?
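The doubling loop of AutoCubature can be sketched as follows; `generate_points` and `compute_error` are hypothetical stand-ins for the lattice-point generator and the error bound developed later, and equal weights are used here purely for illustration.

```python
def auto_cubature(f, err_tol, generate_points, compute_error, max_n=2**20):
    """Double n until the error bound err_n meets the tolerance err_tol."""
    n = 2**8
    while True:
        x = generate_points(n)            # (x_i)_{i=0}^{n-1}
        y = [f(xi) for xi in x]           # sample the integrand
        err = compute_error(x, y)         # err_n
        if err <= err_tol or n >= max_n:  # stop when the tolerance (or cap) is met
            break
        n *= 2
    mu_hat = sum(y) / n                   # equal-weight estimate, for illustration
    return mu_hat, err, n
```

With equally spaced points and an exact error oracle for $f(t) = t$, the loop stops at the first n whose bound meets the tolerance.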
Bayesian Trio Identity

Random f: $f \sim \mathcal{GP}(0, s^2 C_\theta)$, a Gaussian process from the sample space $\mathcal{F}$ with zero mean and covariance kernel $s^2 C_\theta$, $C_\theta : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$. Then define
$$c_0 = \int_{\mathcal{X}\times\mathcal{X}} C(x,t)\,\nu(dx)\,\nu(dt), \qquad c = \left(\int_{\mathcal{X}} C(x_i,t)\,\nu(dt)\right)_{i=0}^{n-1},$$
$$\mathsf{C} = \left(C(x_i,x_j)\right)_{i,j=0}^{n-1}, \qquad w = (w_i)_{i=0}^{n-1},$$
and the error factors as
$$\mu - \hat{\mu} = \underbrace{\frac{\int_{\mathcal{X}} f(x)\,(\nu-\hat{\nu})(dx)}{s\sqrt{c_0 - 2c^T w + w^T \mathsf{C} w}}}_{\mathrm{ALN}^B(f,\,\nu-\hat{\nu})} \times \underbrace{\sqrt{c_0 - 2c^T w + w^T \mathsf{C} w}}_{\mathrm{DSC}^B(\nu-\hat{\nu})} \times \underbrace{s}_{\mathrm{VAR}^B(f)}.$$

The scale parameter, s, and the shape parameter, θ, must be estimated.
Choosing $w = \mathsf{C}^{-1} c$ minimizes $\mathrm{DSC}^B(\nu-\hat{\nu})$ and makes
$$\mathrm{ALN}^B(f,\,\nu-\hat{\nu}) \,\big|\, (f(x_i) = y_i)_{i=0}^{n-1} \sim \mathcal{N}(0,1).$$

Ref: Diaconis (1988), O'Hagan (1991), Ritter (2000), Rasmussen (2003) and others.
Covariance kernel

Shift-invariant kernel:
$$C_\theta(x,t) := \sum_{m \in \mathbb{Z}^d} \lambda_{m,\theta}\, e^{\sqrt{-1}\,2\pi m^T (x-t)}, \qquad 0 \le |x - t| \le 1.$$
When
$$\lambda_{m,\theta} := \prod_{l=1}^{d} \frac{\theta_l^{\mathbb{1}\{m_l \neq 0\}}}{\max(|m_l|, 1)^r}, \quad \text{with } \lambda_{0,\theta} = 1,\ \theta \in (0,1]^d,\ r \in 2\mathbb{N},$$
the kernel has the closed form
$$C_\theta(x,t) = \prod_{l=1}^{d} \left(1 - \theta_l\, \frac{(2\pi\sqrt{-1})^r}{r!}\, B_r(|x_l - t_l|)\right), \qquad \theta \in (0,1]^d,\ r \in 2\mathbb{N},$$
where $B_r$ is the Bernoulli polynomial of order r (Olver et al., 2013):
$$B_r(x) = \frac{-r!}{(2\pi\sqrt{-1})^r} \sum_{k=-\infty,\,k\neq 0}^{\infty} \frac{e^{2\pi\sqrt{-1}\,kx}}{k^r}, \qquad \begin{cases} 0 < x < 1 & \text{for } r = 1, \\ 0 \le x \le 1 & \text{for } r = 2, 3, \ldots \end{cases}$$

Given $(x_i : i = 0, \ldots, n-1)$, the symmetric kernel matrix is formed:
$$\mathsf{C}_\theta = \left(C_\theta(x_i, x_j)\right)_{i,j=0}^{n-1}.$$
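For r = 2, since $(2\pi\sqrt{-1})^2/2! = -2\pi^2$ and $B_2(x) = x^2 - x + \tfrac{1}{6}$, each factor reduces to $1 + 2\pi^2 \theta_l B_2(|x_l - t_l|)$. A sketch of this kernel and its matrix, assuming the closed form above with inputs in $[0,1)^d$:

```python
import numpy as np

def kernel_r2(x, t, theta):
    """Product kernel for r = 2: prod_l (1 + 2*pi^2*theta_l*B_2(|x_l - t_l|))."""
    delta = np.abs(np.asarray(x) - np.asarray(t))
    b2 = delta**2 - delta + 1.0 / 6.0   # Bernoulli polynomial B_2
    return float(np.prod(1.0 + 2.0 * np.pi**2 * np.asarray(theta) * b2))

def kernel_matrix(points, theta):
    """Symmetric kernel matrix C_theta = (C_theta(x_i, x_j))_{i,j}."""
    return np.array([[kernel_r2(xi, xj, theta) for xj in points] for xi in points])
```

Since $B_2$ integrates to zero over $[0,1]$, this kernel has unit integral in each coordinate, matching $\lambda_{0,\theta} = 1$.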
Quasi-Monte Carlo: Sample "more uniformly"
Rank-1 Lattice rules: Low-discrepancy point set

Given the "generating vector" g, the n rank-1 lattice points are constructed as
$$\left\{ \frac{k g}{n} \right\}_{k=0}^{n-1},$$
and the lattice rule approximation is
$$\frac{1}{n} \sum_{k=0}^{n-1} f\!\left(\left\{ \frac{k g}{n} \right\}\right),$$
with $\{\cdot\}$ the fractional part, i.e., the 'modulo 1' operator.
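A sketch of this construction; the generating vector `g` in the test below is illustrative, not an optimized one.

```python
import numpy as np

def rank1_lattice(n, g):
    """Rank-1 lattice points {k*g/n mod 1} for k = 0, ..., n-1."""
    k = np.arange(n).reshape(-1, 1)        # column of indices
    return (k * np.asarray(g) / n) % 1.0   # componentwise fractional part

def lattice_rule(f, n, g):
    """Equal-weight lattice rule approximation of int_{[0,1)^d} f(x) dx."""
    return float(np.mean([f(x) for x in rank1_lattice(n, g)]))
```

When each component of g is coprime to n, every coordinate of the point set is a permutation of $\{0, 1/n, \ldots, (n-1)/n\}$, which is what makes the points well spread.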
Shift-invariant kernel + lattice points = symmetric circulant kernel matrix.
Bayesian Cubature

$$\mu = \int_{\mathcal{X}} f(x)\,\nu(dx) \approx \hat{\mu}_n = \sum_{i=0}^{n-1} w_i f(x_i)$$

Assume $f \sim \mathcal{GP}(0, s^2 C_\theta)$. Then
$$\mu - \hat{\mu}_n \,\big|\, (f(x_i) = y_i)_{i=0}^{n-1} \sim \mathcal{N}\!\left( y^T (\mathsf{C}^{-1} c - w),\; s^2 (c_0 - c^T \mathsf{C}^{-1} c) \right).$$

Choosing $w = \mathsf{C}_{\hat{\theta}}^{-1} c_{\hat{\theta}}$ is optimal:
$$\mathbb{P}\left[ |\mu - \hat{\mu}_n| \le \mathrm{err}_n \right] \ge 99\% \quad \text{for} \quad \mathrm{err}_n = 2.58 \sqrt{ \left( c_{\hat{\theta},0} - c_{\hat{\theta}}^T \mathsf{C}_{\hat{\theta}}^{-1} c_{\hat{\theta}} \right) \frac{y^T \mathsf{C}_{\hat{\theta}}^{-1} y}{n} }.$$

MLE: $\hat{\theta} = \operatorname*{argmin}_{\theta} \dfrac{y^T \mathsf{C}_\theta^{-1} y}{[\det(\mathsf{C}_\theta^{-1})]^{1/n}}$, where $y = (f(x_i))_{i=0}^{n-1}$.

Computing $\mathsf{C}^{-1}$ typically requires $O(n^3)$ operations. But we use a covariance kernel for which the matrix $\mathsf{C}$ is symmetric circulant, so operations on $\mathsf{C}$ require only $O(n \log n)$ operations.
Optimal Shape parameter θ

Maximum likelihood estimate of θ:
$$\hat{\theta} = \operatorname*{argmin}_{\theta} \frac{y^T \mathsf{C}_\theta^{-1} y}{[\det(\mathsf{C}_\theta^{-1})]^{1/n}},$$
simplified (Bernstein, 2009) to
$$\hat{\theta} = \operatorname*{argmin}_{\theta} \left[ \log\left( \sum_{i=0}^{n-1} \frac{|z_i|^2}{\gamma_i} \right) + \frac{1}{n} \sum_{i=0}^{n-1} \log(\gamma_i) \right],$$
where $(\gamma_i)$ are the eigenvalues of $\mathsf{C}_\theta$:
$$z := (z_i)_{i=0}^{n-1} = \mathrm{DFT}(y), \qquad \gamma := (\gamma_i)_{i=0}^{n-1} = \mathrm{DFT}\!\left( (C(x_i, x_0))_{i=0}^{n-1} \right).$$
Only $O(n \log n)$ operations are needed to estimate $\hat{\theta}$.
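This objective can be sketched with the FFT; here `first_row` stands for $(C_\theta(x_i, x_0))_{i=0}^{n-1}$, the first row of the circulant matrix. NumPy's unnormalized DFT convention only shifts the objective by an additive constant in θ, so the argmin is unchanged.

```python
import numpy as np

def mle_objective(y, first_row):
    """MLE objective log(sum |z_i|^2/gamma_i) + (1/n) sum log(gamma_i)
    for a symmetric circulant kernel matrix with first row `first_row`."""
    n = len(y)
    z = np.fft.fft(y)                    # z = DFT(y)
    gamma = np.fft.fft(first_row).real   # eigenvalues of the circulant matrix
    return float(np.log(np.sum(np.abs(z) ** 2 / gamma)) + np.sum(np.log(gamma)) / n)
```

Because the γ_i are exactly the eigenvalues of the circulant matrix, this agrees (up to the constant log n) with the dense computation $\log(y^T \mathsf{C}^{-1} y) + \frac{1}{n}\log\det \mathsf{C}$, at FFT cost instead of $O(n^3)$.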
Cubature rule: Computing $\hat{\mu}$ efficiently

Define
$$\mathsf{C}_\theta = \left( C_\theta(x_i, x_j) \right)_{i,j=0}^{n-1}, \qquad c_\theta = \left( \int_{[0,1)^d} C_\theta(x_i, x)\,dx \right)_{i=0}^{n-1}.$$
The approximate mean $\hat{\mu}$ is
$$\hat{\mu} = w^T y = \underbrace{(\mathsf{C}_\theta^{-1} c_\theta)^T}_{w^T} y,$$
which simplifies further using the shift-invariance and circulant-matrix properties (Bernstein, 2009):
$$\hat{\mu} = \int_{[0,1]^d} C(x, x_0)\,dx \times \left( \sum_{i=0}^{n-1} C(x_0, x_i) \right)^{-1} \times \sum_{i=0}^{n-1} y_i.$$
Only $O(n)$ operations are needed to compute $\hat{\mu}$.
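For the shift-invariant kernels above, $\lambda_{0,\theta} = 1$ implies $\int_{[0,1]^d} C(x, x_0)\,dx = 1$, so the $O(n)$ estimate collapses to a ratio of two sums; a sketch:

```python
import numpy as np

def mu_hat_fast(y, first_row):
    """O(n) cubature estimate for a circulant kernel with unit integral:
    mu_hat = (sum_i y_i) / (sum_i C(x_0, x_i))."""
    return float(np.sum(y) / np.sum(first_row))
```

No matrix inversion is needed: the circulant structure turns $\mathsf{C}_\theta^{-1} c_\theta$ into a scalar multiple of the all-ones vector.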
Computing the error bound efficiently

Let's simplify
$$\mathrm{err}_n = 2.58 \sqrt{ \left( c_{\hat{\theta},0} - c_{\hat{\theta}}^T \mathsf{C}_{\hat{\theta}}^{-1} c_{\hat{\theta}} \right) \frac{y^T \mathsf{C}_{\hat{\theta}}^{-1} y}{n} }$$
using the facts about the shift-invariant kernel and rank-1 lattice points (Bernstein, 2009), where
$$z := (z_i)_{i=0}^{n-1} = \mathrm{DFT}(y), \qquad \gamma := (\gamma_i)_{i=0}^{n-1} = \mathrm{DFT}\!\left( (C(x_i, x_0))_{i=0}^{n-1} \right).$$
Finally,
$$\mathrm{err}_n = 2.58 \sqrt{ \left( 1 - \frac{1}{\sum_{i=0}^{n-1} C(x_0, x_i)} \right) \frac{1}{n} \sum_{i=0}^{n-1} \frac{|z_i|^2}{\gamma_i} }.$$
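The final expression is a few FFTs and sums; a sketch mirroring the formula above, with `first_row` standing for $(C(x_i, x_0))_{i=0}^{n-1}$:

```python
import numpy as np

def err_bound(y, first_row, quantile=2.58):
    """O(n log n) sketch of the error bound err_n for a circulant kernel."""
    n = len(y)
    z = np.fft.fft(y)                       # z = DFT(y)
    gamma = np.fft.fft(first_row).real      # eigenvalues of the circulant matrix
    disc2 = 1.0 - 1.0 / np.sum(first_row)   # 1 - 1/sum_i C(x_0, x_i)
    return float(quantile * np.sqrt(disc2 * np.sum(np.abs(z) ** 2 / gamma) / n))
```

This is the quantity the doubling loop of AutoCubature compares against errtol at each iteration.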
Periodization Transforms

Our algorithm works best with periodic functions.

Baker: $\tilde{f}(t) = f\!\left( 1 - 2\left| t - \tfrac{1}{2} \right| \right)$
$C^0$: $\tilde{f}(t) = f\!\left( 3t^2 - 2t^3 \right) \prod_{j=1}^{d} 6 t_j (1 - t_j)$
$C^1$: $\tilde{f}(t) = f\!\left( t^3 (10 - 15t + 6t^2) \right) \prod_{j=1}^{d} 30\, t_j^2 (1 - t_j)^2$
Sidi's $C^1$: $\tilde{f}(t) = f\!\left( \left( t_j - \frac{\sin(2\pi t_j)}{2\pi} \right)_{j=1}^{d} \right) \prod_{j=1}^{d} (1 - \cos(2\pi t_j))$

(The transforms are applied componentwise; the product is the Jacobian factor.)
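The baker (tent) transform, for instance, is a one-liner; a sketch assuming componentwise application:

```python
import numpy as np

def baker_transform(f):
    """Periodize f via the tent map t -> 1 - 2|t - 1/2|, applied componentwise."""
    return lambda t: f(1.0 - 2.0 * np.abs(np.asarray(t, dtype=float) - 0.5))
```

The tent map matches function values at 0 and 1 in every coordinate, so the transformed integrand is continuous and periodic on $[0,1)^d$ while leaving the integral unchanged.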
Test functions for numerical integration:

Multivariate Normal (MVN):
$$\mu = \int_{[a,b]} \frac{\exp\!\left( -\tfrac{1}{2} t^T \Sigma^{-1} t \right)}{\sqrt{(2\pi)^d \det(\Sigma)}}\,dt \;\overset{\text{Genz (1993)}}{=}\; \int_{[0,1]^{d-1}} f(x)\,dx$$

Keister: $\mu = \int_{\mathbb{R}^d} \cos(\|x\|) \exp(-\|x\|^2)\,dx, \quad d = 1, 2, \ldots$

Exp(Cos): $\mu = \int_{(0,1]^d} \exp(\cos(x))\,dx, \quad d = 1, 2, \ldots$
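The Keister integrand, for example, depends on x only through its Euclidean norm; a sketch:

```python
import numpy as np

def keister_integrand(x):
    """cos(||x||) * exp(-||x||^2), the Keister test integrand over R^d."""
    r = np.linalg.norm(np.asarray(x, dtype=float))
    return float(np.cos(r) * np.exp(-r * r))
```

In practice the integral over $\mathbb{R}^d$ is mapped to the unit cube by a change of variables before the cubature rule is applied.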
Integrating Multivariate Normal Probability
Figure: simulation results (plot omitted)

Exponential of Cosine
Figure: simulation results (plot omitted)

Keister Integral Example
Figure: simulation results (plot omitted)
Summary

Automatic Bayesian cubature with O(n) cubature cost and O(n log n) MLE cost.
It has the advantages of a kernel method and the low computational cost of quasi-Monte Carlo.
It is scalable to the complexity of the integrand: the kernel order and lattice points can be chosen to suit the smoothness of the integrand.

More about Guaranteed Automatic Algorithms (GAIL): http://gailgithub.github.io/GAIL_Dev/
These slides will also be available there.
Future work

Dimension independence
Tighter error bounds
Roundoff error
Lattice-point optimization specific to the kernel
Choosing the kernel order automatically as part of the MLE
Can we compute n directly instead of in a loop?
Choosing the periodization transform automatically
Thank you
References

Bernstein, Dennis S. 2009. Matrix Mathematics: Theory, Facts, and Formulas, Princeton University Press.
Diaconis, P. 1988. Bayesian numerical analysis, Statistical Decision Theory and Related Topics IV, Papers from the 4th Purdue Symp., West Lafayette, Indiana, 1986, pp. 163–175.
Genz, A. 1993. Comparison of methods for the computation of multivariate normal probabilities, Computing Science and Statistics 25, 400–405.
Nuyens, Dirk. 2007. Fast construction of good lattice rules, Ph.D. Thesis.
O'Hagan, A. 1991. Bayes–Hermite quadrature, J. Statist. Plann. Inference 29, 245–260.
Olver, F. W. J., D. W. Lozier, R. F. Boisvert, C. W. Clark, and A. B. O. Dalhuis. 2013. Digital Library of Mathematical Functions.
Rasmussen, C. E. 2003. Bayesian Monte Carlo, Advances in Neural Information Processing Systems, pp. 489–496.
Ritter, K. 2000. Average-Case Analysis of Numerical Problems, Lecture Notes in Mathematics, vol. 1733, Springer-Verlag, Berlin.

More Related Content

PDF
Optimal interval clustering: Application to Bregman clustering and statistica...
PDF
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
PDF
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
PDF
Lesson 28: The Fundamental Theorem of Calculus
PDF
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
PDF
Learning Sparse Representation
PDF
ABC based on Wasserstein distances
PDF
Can we estimate a constant?
Optimal interval clustering: Application to Bregman clustering and statistica...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
Lesson 28: The Fundamental Theorem of Calculus
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
Learning Sparse Representation
ABC based on Wasserstein distances
Can we estimate a constant?

What's hot (20)

PDF
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
PDF
Tulane March 2017 Talk
PDF
Delayed acceptance for Metropolis-Hastings algorithms
PDF
ABC convergence under well- and mis-specified models
PDF
Slides: Jeffreys centroids for a set of weighted histograms
PDF
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
PDF
Multiple estimators for Monte Carlo approximations
PDF
Geodesic Method in Computer Vision and Graphics
PDF
accurate ABC Oliver Ratmann
PDF
Lesson 31: Evaluating Definite Integrals
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
Low Complexity Regularization of Inverse Problems
PDF
Proximal Splitting and Optimal Transport
PDF
ABC with Wasserstein distances
PDF
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...
PDF
Coordinate sampler: A non-reversible Gibbs-like sampler
PDF
comments on exponential ergodicity of the bouncy particle sampler
PDF
Maximum likelihood estimation of regularisation parameters in inverse problem...
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
Tulane March 2017 Talk
Delayed acceptance for Metropolis-Hastings algorithms
ABC convergence under well- and mis-specified models
Slides: Jeffreys centroids for a set of weighted histograms
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
Multiple estimators for Monte Carlo approximations
Geodesic Method in Computer Vision and Graphics
accurate ABC Oliver Ratmann
Lesson 31: Evaluating Definite Integrals
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Low Complexity Regularization of Inverse Problems
Proximal Splitting and Optimal Transport
ABC with Wasserstein distances
Efficient Simulations for Contamination of Groundwater Aquifers under Uncerta...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
2018 MUMS Fall Course - Mathematical surrogate and reduced-order models - Ral...
Coordinate sampler: A non-reversible Gibbs-like sampler
comments on exponential ergodicity of the bouncy particle sampler
Maximum likelihood estimation of regularisation parameters in inverse problem...
Ad

Similar to Automatic Bayesian method for Numerical Integration (20)

PDF
Automatic bayesian cubature
PDF
SIAM - Minisymposium on Guaranteed numerical algorithms
PDF
Georgia Tech 2017 March Talk
PDF
SIAM CSE 2017 talk
PDF
Monte Carlo Methods 2017 July Talk in Montreal
PDF
Mines April 2017 Colloquium
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
QMC Error SAMSI Tutorial Aug 2017
PDF
Kernel Bayes Rule
PPTX
NICE Implementations of Variational Inference
PPTX
NICE Research -Variational inference project
PPTX
A machine learning method for efficient design optimization in nano-optics
PDF
Variational inference
ODP
Iwsmbvs
PDF
A Solution Manual and Notes for The Elements of Statistical Learning.pdf
PDF
ABC workshop: 17w5025
PDF
11 Machine Learning Important Issues in Machine Learning
PPTX
A machine learning method for efficient design optimization in nano-optics
PDF
Asymptotics of ABC, lecture, Collège de France
PDF
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Automatic bayesian cubature
SIAM - Minisymposium on Guaranteed numerical algorithms
Georgia Tech 2017 March Talk
SIAM CSE 2017 talk
Monte Carlo Methods 2017 July Talk in Montreal
Mines April 2017 Colloquium
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
QMC Error SAMSI Tutorial Aug 2017
Kernel Bayes Rule
NICE Implementations of Variational Inference
NICE Research -Variational inference project
A machine learning method for efficient design optimization in nano-optics
Variational inference
Iwsmbvs
A Solution Manual and Notes for The Elements of Statistical Learning.pdf
ABC workshop: 17w5025
11 Machine Learning Important Issues in Machine Learning
A machine learning method for efficient design optimization in nano-optics
Asymptotics of ABC, lecture, Collège de France
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Ad

Recently uploaded (20)

PPTX
2. Earth - The Living Planet Module 2ELS
PPT
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
PDF
Placing the Near-Earth Object Impact Probability in Context
PDF
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
PDF
The scientific heritage No 166 (166) (2025)
PDF
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
PPTX
neck nodes and dissection types and lymph nodes levels
PPTX
The KM-GBF monitoring framework – status & key messages.pptx
PPTX
Introduction to Cardiovascular system_structure and functions-1
PPTX
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
PPTX
Derivatives of integument scales, beaks, horns,.pptx
PDF
. Radiology Case Scenariosssssssssssssss
PDF
SEHH2274 Organic Chemistry Notes 1 Structure and Bonding.pdf
DOCX
Viruses (History, structure and composition, classification, Bacteriophage Re...
PPTX
ECG_Course_Presentation د.محمد صقران ppt
PPTX
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...
PPTX
famous lake in india and its disturibution and importance
PPTX
Introduction to Fisheries Biotechnology_Lesson 1.pptx
PPTX
2. Earth - The Living Planet earth and life
PDF
AlphaEarth Foundations and the Satellite Embedding dataset
2. Earth - The Living Planet Module 2ELS
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
Placing the Near-Earth Object Impact Probability in Context
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
The scientific heritage No 166 (166) (2025)
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
neck nodes and dissection types and lymph nodes levels
The KM-GBF monitoring framework – status & key messages.pptx
Introduction to Cardiovascular system_structure and functions-1
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
Derivatives of integument scales, beaks, horns,.pptx
. Radiology Case Scenariosssssssssssssss
SEHH2274 Organic Chemistry Notes 1 Structure and Bonding.pdf
Viruses (History, structure and composition, classification, Bacteriophage Re...
ECG_Course_Presentation د.محمد صقران ppt
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...
famous lake in india and its disturibution and importance
Introduction to Fisheries Biotechnology_Lesson 1.pptx
2. Earth - The Living Planet earth and life
AlphaEarth Foundations and the Satellite Embedding dataset

Automatic Bayesian method for Numerical Integration

  • 1. Automatic Bayesian method for Numerical Integration Jagadeeswaran Rathinavel, Fred J. Hickernell Department of Applied Mathematics, Illinois Institute of Technology jrathin1@iit.edu Thanks to the CASSC 2017 organizers
  • 2. Introduction Bayesian Cubature Simulation results Conclusion References Multivariate Integration Approximate the d-dimensional integral over [0, 1)d µ := E[f(X)] := ż [0,1)d f(x) ν(dx) by a simple cubature rule µ « ^µ := n´1ÿ j=0 f(xj)wj = ż [0,1)d f(x) ^ν(dx) using points (xj)n´1 j=0 and associated weights wj. Then the approximation error µ ´ ^µ = ş [0,1)d f(x) (ν ´ ^ν)(dx) use extensible pointset and an algorithm that allows to add more points if needed. 2/22
  • 3. Introduction Bayesian Cubature Simulation results Conclusion References Motivating Example How to measure the water volume of a creek in a given area (For Ex: 10sqft) ? Figure: (Nuyens, 2007) 3/22
  • 4. Introduction Bayesian Cubature Simulation results Conclusion References What is multivariate integration? Figure: (Nuyens, 2007) A d-dim integral: ż [0,1)d f(x) dx = ż 1 0 ż 1 0 . . . ż 1 0 f(x1, x2, . . . xd) dx1 dx2 . . . dxd 4/22
  • 5. Introduction Bayesian Cubature Simulation results Conclusion References What is multivariate integration? Figure: (Nuyens, 2007) A d-dim integral: ż [0,1)d f(x) dx = ż 1 0 ż 1 0 . . . ż 1 0 f(x1, x2, . . . xd) dx1 dx2 . . . dxd Grid points: Curse of dimensionality 4/22
  • 6. Introduction Bayesian Cubature Simulation results Conclusion References What is multivariate integration? Figure: (Nuyens, 2007) A d-dim integral: ż [0,1)d f(x) dx = ż 1 0 ż 1 0 . . . ż 1 0 f(x1, x2, . . . xd) dx1 dx2 . . . dxd Grid points: Curse of dimensionality IID Monte Carlo: Converges O(n´ 1 2 ) ^µ = 1 n řn´1 j=0 f(xj) 4/22
  • 7. Introduction Bayesian Cubature Simulation results Conclusion References What is multivariate integration? Figure: (Nuyens, 2007) A d-dim integral: ż [0,1)d f(x) dx = ż 1 0 ż 1 0 . . . ż 1 0 f(x1, x2, . . . xd) dx1 dx2 . . . dxd Grid points: Curse of dimensionality IID Monte Carlo: Converges O(n´ 1 2 ) ^µ = 1 n řn´1 j=0 f(xj) Typical Quasi Monte Carlo: Converges O(n´1+ ) xj chosen carefully (low-discrepancy points) 4/22
  • 8. Introduction Bayesian Cubature Simulation results Conclusion References What is multivariate integration? Figure: (Nuyens, 2007) A d-dim integral: ż [0,1)d f(x) dx = ż 1 0 ż 1 0 . . . ż 1 0 f(x1, x2, . . . xd) dx1 dx2 . . . dxd Grid points: Curse of dimensionality IID Monte Carlo: Converges O(n´ 1 2 ) ^µ = 1 n řn´1 j=0 f(xj) Typical Quasi Monte Carlo: Converges O(n´1+ ) xj chosen carefully (low-discrepancy points) Can we do better? 4/22
  • 9. Introduction Bayesian Cubature Simulation results Conclusion References Algorithm 1: procedure AutoCubature(f, errtol) Ź Integrate within the error tolerance 2: n Ð 28 3: do 4: Generate (xi)n´1 i=0 5: Sample (f(xi))n´1 i=0 6: Compute errn 7: n Ð 2 ˆ n 8: while errn ą errtol Ź Iterate till error tolerance is met 9: Compute weights (wi)n´1 i=0 10: Compute integral ^µn 11: return ^µn Ź Integral estimate ^µn 12: end procedure Problem: How to choose (xj)n´1 j=0 and (wj)n´1 j=0 to make |µ ´ ^µ| small? Given error tolerance errtol, how big must ‘n’ be to guarantee |µ ´ ^µ| ď errtol 5/22
  • 10. Introduction Bayesian Cubature Simulation results Conclusion References Algorithm 1: procedure AutoCubature(f, errtol) Ź Integrate within the error tolerance 2: n Ð 28 3: do 4: Generate (xi)n´1 i=0 5: Sample (f(xi))n´1 i=0 6: Compute errn 7: n Ð 2 ˆ n 8: while errn ą errtol Ź Iterate till error tolerance is met 9: Compute weights (wi)n´1 i=0 10: Compute integral ^µn 11: return ^µn Ź Integral estimate ^µn 12: end procedure Problem: How to choose (xj)n´1 j=0 and (wj)n´1 j=0 to make |µ ´ ^µ| small? Given error tolerance errtol, how big must ‘n’ be to guarantee |µ ´ ^µ| ď errtol 5/22
  • 11. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Trio Identity Random f : f „ GP(0, s2 Cθ), a Gaussian process from the sample space F with zero mean and covariance kernel, s2 Cθ, Cθ : X ˆ X Ñ R. Then 6/22
  • 12. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Trio Identity Random f : f „ GP(0, s2 Cθ), a Gaussian process from the sample space F with zero mean and covariance kernel, s2 Cθ, Cθ : X ˆ X Ñ R. Then c0 = ż XˆX C(x, t) ν(dx)ν(dt), c = ż X C(xi, t) ν(dt) n´1 i=0 C = C(xi, xj) n´1 i,j=0 , w = wi n´1 i=0 , µ ´ ^µ = ş X f(x) (ν ´ ^ν)(dx) s a c0 ´ 2cTw + wTCwloooooooooooooomoooooooooooooon ALNB (f, ν ´ ^ν) a c0 ´ 2cTw + wTCwlooooooooooooomooooooooooooon DSCB (ν ´ ^ν) sloomoon VARB (f) 6/22
  • 13. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Trio Identity Random f : f „ GP(0, s2 Cθ), a Gaussian process from the sample space F with zero mean and covariance kernel, s2 Cθ, Cθ : X ˆ X Ñ R. Then c0 = ż XˆX C(x, t) ν(dx)ν(dt), c = ż X C(xi, t) ν(dt) n´1 i=0 C = C(xi, xj) n´1 i,j=0 , w = wi n´1 i=0 , µ ´ ^µ = ş X f(x) (ν ´ ^ν)(dx) s a c0 ´ 2cTw + wTCwloooooooooooooomoooooooooooooon ALNB (f, ν ´ ^ν) a c0 ´ 2cTw + wTCwlooooooooooooomooooooooooooon DSCB (ν ´ ^ν) sloomoon VARB (f) The scale parameter, s, and shape parameter, θ, should be estimated. w = C´1 c minimizes DSCB (ν ´ ^ν) makes ALNB (f, ν ´ ^ν) ˇ ˇ(f(xi) = yi)n´1 i=0 „ N(0, 1) Ref: Diaconis (1988), O’Hagan (1991), Ritter (2000), Rasmussen (2003) and others 6/22
  • 14. Introduction Bayesian Cubature Simulation results Conclusion References Covariance kernel shift invariant kernel Cθ(x, t) := ÿ mPZd λm,θ e ? ´12πmT (x´t) , 0 ď |x ´ t| ď 1 7/22
  • 15. Introduction Bayesian Cubature Simulation results Conclusion References Covariance kernel shift invariant kernel Cθ(x, t) := ÿ mPZd λm,θ e ? ´12πmT (x´t) , 0 ď |x ´ t| ď 1 when λm,θ := dź l=1 1 max(|ml| θ , 1)r θď1 , with λ0,θ = 1, θ P (0, 1]d , r P 2N Cθ(x, t) = dź l=1 1 ´ θr l (2π ? ´1)r r! Br(|xl ´ tl|), θ P (0, 1]d , r P 2N 7/22
  • 16. Introduction Bayesian Cubature Simulation results Conclusion References Covariance kernel shift invariant kernel Cθ(x, t) := ÿ mPZd λm,θ e ? ´12πmT (x´t) , 0 ď |x ´ t| ď 1 when λm,θ := dź l=1 1 max(|ml| θ , 1)r θď1 , with λ0,θ = 1, θ P (0, 1]d , r P 2N Cθ(x, t) = dź l=1 1 ´ θr l (2π ? ´1)r r! Br(|xl ´ tl|), θ P (0, 1]d , r P 2N where Br is Bernoulli polynomial of order r (Olver et al., 2013). Br(x) = ´r! (2π ? ´1)r ∞ÿ k‰0,k=´∞ e2π ? ´1kx kr # for r = 1, 0 ă x ă 1 for r = 2, 3, . . . 0 ď x ď 1 7/22
  • 17. Introduction Bayesian Cubature Simulation results Conclusion References Covariance kernel shift invariant kernel Cθ(x, t) := ÿ mPZd λm,θ e ? ´12πmT (x´t) , 0 ď |x ´ t| ď 1 when λm,θ := dź l=1 1 max(|ml| θ , 1)r θď1 , with λ0,θ = 1, θ P (0, 1]d , r P 2N Cθ(x, t) = dź l=1 1 ´ θr l (2π ? ´1)r r! Br(|xl ´ tl|), θ P (0, 1]d , r P 2N where Br is Bernoulli polynomial of order r (Olver et al., 2013). Br(x) = ´r! (2π ? ´1)r ∞ÿ k‰0,k=´∞ e2π ? ´1kx kr # for r = 1, 0 ă x ă 1 for r = 2, 3, . . . 0 ď x ď 1 Given (xi : i = 0, ..., n ´ 1), the symmetric kernel matrix is formed Cθ = Cθ(xi, xj) n´1 i,j=0 7/22
  • 18. Introduction Bayesian Cubature Simulation results Conclusion References Quais Monte-Carlo : Sample “more uniformly” 8/22
  • 19. Introduction Bayesian Cubature Simulation results Conclusion References Rank-1 Lattice rules : Low discrepancy point set Given the “generating vector” g, the construction of n - Rank-1 lattice points is given by " kg n * n´1 k=0 then the Lattice rule approximation is 1 n n´1ÿ k=0 f " kg n * with t.u the fractional part, i.e, ‘modulo 1’ operator. 9/22
  • 20. Introduction Bayesian Cubature Simulation results Conclusion References Rank-1 Lattice rules : Low discrepancy point set Shift invariant kernel + Lattice points = ’Symmetric circulant kernel’ matrix 9/22
  • 21. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Cubature µ = ż X f(x)ν(dx) « ^µn = n´1ÿ i=0 wif(xi) 10/22
  • 22. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Cubature µ = ż X f(x)ν(dx) « ^µn = n´1ÿ i=0 wif(xi) Assume f „ GP(0, s2 Cθ) µ ´ ^µn ˇ ˇ(f(xi) = yi)n´1 i=0 „ N yT (C´1 c ´ w), s2 (c0 ´ cT C´1 c) 10/22
  • 23. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Cubature µ = ż X f(x)ν(dx) « ^µn = n´1ÿ i=0 wif(xi) Assume f „ GP(0, s2 Cθ) µ ´ ^µn ˇ ˇ(f(xi) = yi)n´1 i=0 „ N yT (C´1 c ´ w), s2 (c0 ´ cT C´1 c) Choosing w = C´1 ^θ c^θ is optimal P[|µ ´ ^µn| ď errn] ě 99% for errn = 2.58 d c^θ,0 ´ cT ^θ C´1 ^θ c^θ yTC´1 ^θ y n MLE ^θ = argmin θ yT C´1 θ y [det(C´1 θ )]1/n , where y = f(xi) n´1 i=0 . 10/22
  • 24. Introduction Bayesian Cubature Simulation results Conclusion References Bayesian Cubature µ = ż X f(x)ν(dx) « ^µn = n´1ÿ i=0 wif(xi) Assume f „ GP(0, s2 Cθ) µ ´ ^µn ˇ ˇ(f(xi) = yi)n´1 i=0 „ N yT (C´1 c ´ w), s2 (c0 ´ cT C´1 c) Choosing w = C´1 ^θ c^θ is optimal P[|µ ´ ^µn| ď errn] ě 99% for errn = 2.58 d c^θ,0 ´ cT ^θ C´1 ^θ c^θ yTC´1 ^θ y n MLE ^θ = argmin θ yT C´1 θ y [det(C´1 θ )]1/n , where y = f(xi) n´1 i=0 . C´1 typically requires O(n3 ) operations. But with covariance kernel C for which matrix C is symmetric circulant. So operations on C require only O(n log(n)) operations. 10/22
  • 25. Introduction Bayesian Cubature Simulation results Conclusion References Optimal Shape parameter θ Maximum likelihood estimate of θ by argmin θ yT C´1 θ y [det(C´1 θ )]1/n 11/22
  • 26. Introduction Bayesian Cubature Simulation results Conclusion References Optimal Shape parameter θ Maximum likelihood estimate of θ by argmin θ yT C´1 θ y [det(C´1 θ )]1/n simplified to (Bernstein, 2009) argmin θ log n´1ÿ i=0 |zi|2 γi + 1 n n´1ÿ i=0 log(γi) where (γi) eigen values of Cθ z := (zi)n´1 i=0 = DFT y , γ := (γi)n´1 i=0 = DFT C(xi, x0) n´1 i=0 O(n log(n)) operations to estimate the ^θ 11/22
Cubature Rule: Computing $\hat{\mu}$ Efficiently
Define
$$C_\theta = \bigl(C_\theta(x_i, x_j)\bigr)_{i,j=0}^{n-1}, \qquad c_\theta = \Biggl(\int_{[0,1)^d} C_\theta(x_i, x)\,dx\Biggr)_{i=0}^{n-1}$$
The approximate mean is
$$\hat{\mu} = w^T y = \underbrace{\bigl(C_\theta^{-1} c_\theta\bigr)^T}_{w^T}\, y$$
Further simplified using the shift-invariance of the kernel and the circulant-matrix property (Bernstein, 2009):
$$\hat{\mu} = \int_{[0,1]^d} C(x, x_0)\,dx \;\times\; \Biggl(\sum_{i=0}^{n-1} C(x_0, x_i)\Biggr)^{-1} \times\; \sum_{i=0}^{n-1} y_i$$
Only $O(n)$ operations are needed to compute $\hat{\mu}$.
12/22
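The simplified formula is a one-liner. This sketch assumes, as before, a kernel of the form $1 + \theta B_2(\{x - t\})$, for which $\int_{[0,1]} C(x, x_0)\,dx = 1$:

```python
import numpy as np

def fast_cubature(y, first_row, kernel_integral=1.0):
    """O(n) evaluation of the simplified cubature
        mu_hat = (int C(x, x_0) dx) * (sum_i C(x_0, x_i))^{-1} * sum_i y_i.
    first_row holds (C(x_0, x_i))_i; kernel_integral is int C(x, x_0) dx,
    which equals 1 for kernels of the form 1 + theta * B2({x - t})."""
    return kernel_integral * y.sum() / first_row.sum()
```

For circulant Gram matrices this returns the same value as the dense $w = C^{-1}c$ solve, at a fraction of the cost.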
Computing the Error Bound Efficiently
Simplify
$$\mathrm{err}_n = 2.58 \sqrt{\bigl(c_{\hat{\theta},0} - c_{\hat{\theta}}^T C_{\hat{\theta}}^{-1} c_{\hat{\theta}}\bigr)\,\frac{y^T C_{\hat{\theta}}^{-1} y}{n}}$$
using the shift-invariance of the kernel and rank-1 lattice points (Bernstein, 2009), where
$$z := (z_i)_{i=0}^{n-1} = \mathrm{DFT}(y), \qquad \gamma := (\gamma_i)_{i=0}^{n-1} = \mathrm{DFT}\bigl((C(x_i, x_0))_{i=0}^{n-1}\bigr)$$
Finally,
$$\mathrm{err}_n = 2.58 \sqrt{\Biggl(1 - \frac{1}{\sum_{i=0}^{n-1} C(x_0, x_i)}\Biggr)\;\frac{1}{n}\sum_{i=0}^{n-1} \frac{|z_i|^2}{\gamma_i}}$$
13/22
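A direct FFT transcription of the final formula (an illustrative sketch under the same circulant-kernel assumptions as above, not the GAIL routine):

```python
import numpy as np

def error_bound(y, first_row):
    """99% error bound
      err_n = 2.58 * sqrt((1 - 1/sum_i C(x_0,x_i)) * (1/n) * sum_i |z_i|^2/gamma_i),
    computed in O(n log n).  The quantity (1/n) * sum_i |z_i|^2/gamma_i equals
    y^T C^{-1} y for a circulant Gram matrix with eigenvalues gamma."""
    n = len(y)
    z = np.fft.fft(y)
    gamma = np.fft.fft(first_row).real
    scaled_norm = np.sum(np.abs(z) ** 2 / gamma) / n
    return 2.58 * np.sqrt((1.0 - 1.0 / first_row.sum()) * scaled_norm)
```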
Periodization Transforms
Our algorithm works best with periodic functions.
Baker: $\tilde{f}(t) = f\bigl(1 - 2\,|t - \tfrac{1}{2}|\bigr)$
$C^0$: $\tilde{f}(t) = f\bigl(3t^2 - 2t^3\bigr)\,\prod_{j=1}^{d} 6\,t_j (1 - t_j)$
$C^1$: $\tilde{f}(t) = f\bigl(t^3 (10 - 15t + 6t^2)\bigr)\,\prod_{j=1}^{d} 30\,t_j^2 (1 - t_j)^2$
Sidi's $C^1$: $\tilde{f}(t) = f\Bigl(\bigl(t_j - \tfrac{\sin(2\pi t_j)}{2\pi}\bigr)_{j=1}^{d}\Bigr)\,\prod_{j=1}^{d} \bigl(1 - \cos(2\pi t_j)\bigr)$
14/22
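The first two transforms can be sketched in one coordinate; both are changes of variables that preserve the integral over $[0,1]$ while making the transformed integrand match at the boundary:

```python
import numpy as np

def baker(f):
    """Baker's (tent) transform: f~(t) = f(1 - 2|t - 1/2|).
    Preserves the integral over [0,1] and removes the boundary jump."""
    return lambda t: f(1.0 - 2.0 * np.abs(t - 0.5))

def c0(f):
    """C^0 transform (one coordinate): f~(t) = f(3t^2 - 2t^3) * 6t(1 - t).
    Change of variables x = 3t^2 - 2t^3, dx = 6t(1 - t) dt, so the integral
    over [0,1] is preserved while f~ vanishes at both endpoints."""
    return lambda t: f(3.0 * t**2 - 2.0 * t**3) * 6.0 * t * (1.0 - t)
```

For example, a plain rectangle rule applied to `c0(lambda x: x)` on a lattice converges to $1/2$ far faster than the same rule applied to the untransformed integrand.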
Test Functions for Numerical Integration
Multivariate normal (MVN):
$$\mu = \int_{[a,b]} \frac{\exp\bigl(-\tfrac{1}{2} t^T \Sigma^{-1} t\bigr)}{\sqrt{(2\pi)^d \det(\Sigma)}}\,dt \;\overset{\text{Genz (1993)}}{=}\; \int_{[0,1]^{d-1}} f(x)\,dx$$
Keister:
$$\mu = \int_{\mathbb{R}^d} \cos(\lVert x \rVert)\,\exp(-\lVert x \rVert^2)\,dx, \quad d = 1, 2, \ldots$$
Exp(Cos):
$$\mu = \int_{(0,1]^d} \exp(\cos(x))\,dx, \quad d = 1, 2, \ldots$$
15/22
Integrating Multivariate Normal Probability (figure) 16/22
Exponential of Cosine (figure) 17/22
Keister Integral Example (figure) 18/22
Summary
Automatic Bayesian cubature with $O(n)$ cubature cost and $O(n \log n)$ MLE cost
Combines the advantages of a kernel method with the low computational cost of quasi-Monte Carlo
Scalable to the complexity of the integrand, i.e., the kernel order and lattice points can be chosen to suit the smoothness of the integrand
More about the Guaranteed Automatic Integration Library (GAIL): http://gailgithub.github.io/GAIL_Dev/
These slides will also be available here.
19/22
Future Work
Dimension independence
Tighter error bound
Roundoff error
Lattice-point optimization specific to the kernel
Choosing the kernel order automatically as part of the MLE
Can we compute $n$ directly instead of in a loop?
Choosing the periodization transform automatically
20/22
References I
Bernstein, Dennis S. 2009. Matrix Mathematics: Theory, Facts, and Formulas, Princeton University Press.
Diaconis, P. 1988. Bayesian numerical analysis, Statistical Decision Theory and Related Topics IV, papers from the 4th Purdue Symp., West Lafayette, Indiana, 1986, pp. 163–175.
Genz, A. 1993. Comparison of methods for the computation of multivariate normal probabilities, Computing Science and Statistics 25, 400–405.
Nuyens, Dirk. 2007. Fast construction of good lattice rules, Ph.D. Thesis.
O'Hagan, A. 1991. Bayes–Hermite quadrature, J. Statist. Plann. Inference 29, 245–260.
Olver, F. W. J., D. W. Lozier, R. F. Boisvert, C. W. Clark, and A. B. O. Dalhuis. 2013. Digital Library of Mathematical Functions.
Rasmussen, C. E. 2003. Bayesian Monte Carlo, Advances in Neural Information Processing Systems, pp. 489–496.
Ritter, K. 2000. Average-Case Analysis of Numerical Problems, Lecture Notes in Mathematics, vol. 1733, Springer-Verlag, Berlin.
22/22