Numerical Smoothing and Hierarchical Approximations
for Efficient Option Pricing and Density Estimation
Chiheb Ben Hammouda
Christian Bayer Raúl Tempone
Center for Uncertainty Quantification
Stochastic Numerics and Statistical Learning: Theory and
Applications, KAUST
May 23, 2022
Related Manuscripts to the Talk
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical Smoothing with Hierarchical Adaptive Sparse Grids
and Quasi-Monte Carlo Methods for Efficient Option Pricing”. In:
arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative
Finance Journal (2022).
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Multilevel Monte Carlo combined with numerical smoothing for
robust and efficient option pricing and density estimation”. In:
arXiv preprint arXiv:2003.05708 (2022)
Outline
1 Motivation and Framework
2 The Numerical Smoothing Idea
3 The Smoothness Theorem
4 Combining Numerical Smoothing with ASGQ, QMC and MLMC
5 Numerical Experiments and Results
6 Conclusions
Option Pricing as a High-Dimensional, non-Smooth
Numerical Integration Problem
Option pricing in finance often corresponds to a high-dimensional integration problem with a non-smooth integrand.
▸ High-dimensional: (i) time discretization of an SDE; (ii) a large number of underlying assets; (iii) path dependence on the whole trajectory of the underlying.
▸ Non-smooth: e.g., call options $(S_T - K)^+$, digital options $\mathbf{1}_{S_T > K}$, . . .
Methods like quasi-Monte Carlo (QMC) or (adaptive) sparse grid quadrature (ASGQ) rely critically on the smoothness of the integrand and on the dimension of the problem.
Monte Carlo (MC) does not care (in terms of convergence rates), BUT multilevel Monte Carlo (MLMC) does.
Figure 1.1: Integrand of a two-dimensional European basket option (Black–Scholes model).
Framework
Approximate efficiently $E[g(X(T))]$
Given a (smooth) $\varphi: \mathbb{R}^d \to \mathbb{R}$, the payoff $g: \mathbb{R}^d \to \mathbb{R}$ has either jumps or kinks:
▸ Hockey-stick functions: $g(x) = \max(\varphi(x), 0)$ (put or call payoffs, . . . ).
▸ Indicator functions: $g(x) = \mathbf{1}_{\varphi(x) \ge 0}$ (digital options, financial Greeks, distribution functions, . . . ).
▸ Dirac delta functions: $g(x) = \delta_{\varphi(x) = 0}$ (density estimation, financial Greeks, . . . ).
The process $X$ is approximated (via a discretization scheme) by $\overline{X}$, e.g.,
▸ a one/multi-dimensional geometric Brownian motion (GBM) process;
▸ a multi-dimensional stochastic volatility model, e.g., the Heston model
$$dX_t = \mu X_t\, dt + \sqrt{v_t}\, X_t\, dW^X_t, \qquad dv_t = \kappa(\theta - v_t)\, dt + \xi \sqrt{v_t}\, dW^v_t,$$
where $(W^X_t, W^v_t)$ are correlated Wiener processes with correlation $\rho$.
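To make the framework concrete, here is a minimal Python sketch (my own illustration, not part of the talk) of a full-truncation forward-Euler discretization of the Heston model above, together with a plain-MC estimate of the non-smooth digital payoff $\mathbf{1}_{S_T > K}$; all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def heston_euler_paths(S0, v0, mu, kappa, theta, xi, rho, T, N, M, rng):
    """Full-truncation Euler discretization of the Heston model.
    Returns terminal asset values S_T for M simulated paths (illustrative sketch)."""
    dt = T / N
    S = np.full(M, float(S0))
    v = np.full(M, float(v0))
    for _ in range(N):
        Z1 = rng.standard_normal(M)
        Z2 = rng.standard_normal(M)
        dWX = np.sqrt(dt) * Z1
        dWv = np.sqrt(dt) * (rho * Z1 + np.sqrt(1.0 - rho**2) * Z2)
        v_plus = np.maximum(v, 0.0)          # full truncation keeps the variance usable
        S = S + mu * S * dt + np.sqrt(v_plus) * S * dWX
        v = v + kappa * (theta - v_plus) * dt + xi * np.sqrt(v_plus) * dWv
    return S

rng = np.random.default_rng(0)
S_T = heston_euler_paths(S0=100.0, v0=0.04, mu=0.0, kappa=1.0, theta=0.04,
                         xi=0.5, rho=-0.7, T=1.0, N=64, M=10**5, rng=rng)
K = 100.0
print("digital price (plain MC):", np.mean(S_T > K))   # non-smooth payoff 1_{S_T > K}
```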
How Does Regularity Affect
QMC, ASGQ, and MLMC Performance?
Randomized QMC (rQMC)
A (rank-1) lattice rule (Sloan 1985; Nuyens 2014) with $n$ points:
$$Q_n(g) := \frac{1}{n} \sum_{k=0}^{n-1} g\!\left(\frac{k\mathbf{z} \bmod n}{n}\right),$$
where $\mathbf{z} = (z_1, \dots, z_d) \in \mathbb{N}^d$ is the generating vector.
A randomly shifted lattice rule:
$$Q_{n,q}(g) = \frac{1}{q} \sum_{i=0}^{q-1} Q^{(i)}_n(g) = \frac{1}{q} \sum_{i=0}^{q-1} \left( \frac{1}{n} \sum_{k=0}^{n-1} g\!\left(\frac{(k\mathbf{z} + \Delta^{(i)}) \bmod n}{n}\right) \right),$$
where $\{\Delta^{(i)}\}_{i=1}^q$ are independent random shifts, and $M_{\mathrm{rQMC}} = q \times n$.
Note: see the previous talk by Bruno Tuffin for further details about QMC.
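A minimal Python sketch (my own illustration, not from the talk) of the randomly shifted lattice rule: it uses the standard formulation with the random shift applied modulo 1 in $[0,1)^d$, and the classical two-dimensional Fibonacci generating vector ($n = 233$, $\mathbf{z} = (1, 144)$) purely as an example.

```python
import numpy as np

def shifted_lattice_estimate(g, z, n, q, rng):
    """Randomly shifted rank-1 lattice rule: q independent shifts of an n-point lattice.
    Returns the mean estimate and a 95% confidence half-width over the q shifts."""
    k = np.arange(n)[:, None]                       # (n, 1)
    base = (k * z[None, :] % n) / n                 # rank-1 lattice points in [0,1)^d
    estimates = []
    for _ in range(q):
        shift = rng.random(len(z))
        pts = (base + shift) % 1.0                  # randomly shifted copy of the lattice
        estimates.append(np.mean(g(pts)))
    estimates = np.asarray(estimates)
    return estimates.mean(), 1.96 * estimates.std(ddof=1) / np.sqrt(q)

# Smooth 2D test integrand with known integral 1 over [0,1]^2.
g = lambda x: np.prod(1.0 + (x - 0.5), axis=1)
rng = np.random.default_rng(0)
mean, hw = shifted_lattice_estimate(g, z=np.array([1, 144]), n=233, q=32, rng=rng)
print(mean, "+/-", hw)
```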
How Does Regularity Affect QMC Performance?
The analysis in (Niederreiter 1992) shows that the convergence rate of rQMC is $O\!\left(M_{\mathrm{rQMC}}^{-1/2-\delta}\, (\log M_{\mathrm{rQMC}})^d\right)$, where $0 \le \delta \le \frac{1}{2}$ is related to the degree of regularity of the integrand $g$.
(Sloan and Woźniakowski 1998; Dick, Kuo, and Sloan 2013) show that convergence rates of $O\!\left(M_{\mathrm{rQMC}}^{-1}\right)$ can be observed for the lattice rule if the integrand $g$ belongs to $W_{d,\gamma}$ equipped with the norm
$$\|g\|^2_{W_{d,\gamma}} = \sum_{\alpha \subseteq \{1:d\}} \frac{1}{\gamma_\alpha} \int_{[0,1]^{|\alpha|}} \left( \int_{[0,1]^{d-|\alpha|}} \frac{\partial^{|\alpha|} g}{\partial y_\alpha}(y)\, dy_{-\alpha} \right)^{2} dy_\alpha, \qquad (1)$$
where $y_\alpha := (y_j)_{j \in \alpha}$ and $y_{-\alpha} := (y_j)_{j \in \{1:d\} \setminus \alpha}$.
Notation
$W_{d,\gamma}$: a $d$-dimensional weighted Sobolev space of functions with square-integrable mixed first derivatives.
$\gamma := \{\gamma_\alpha > 0 : \alpha \subseteq \{1,2,\dots,d\}\}$: a given collection of weights; $d$: the dimension of the problem.
ASGQ (I)
Given $F: \mathbb{R}^d \to \mathbb{R}$ and a multi-index $\beta \in \mathbb{N}^d_+$, let $F_\beta := Q^{m(\beta)}[F]$ be a quadrature operator based on a Cartesian quadrature grid ($m(\beta_n)$ points along $y_n$, with $m: \mathbb{N} \to \mathbb{N}$ a strictly increasing function).
Note: approximating $E[F]$ by $F_\beta$ is not an appropriate option due to the well-known curse of dimensionality.
Idea: a quadrature estimate of $E[F]$ is
$$\mathcal{M}^{I_\ell}[F] = \sum_{\beta \in I_\ell} \Delta[F_\beta],$$
where
▸ the mixed (first-order tensor) difference operators: $\Delta[F_\beta] = \otimes_{i=1}^d \Delta_i F_\beta$;
▸ the first-order difference operators: $\Delta_i F_\beta = F_\beta - F_{\beta - e_i}$ if $\beta_i > 1$, and $\Delta_i F_\beta = F_\beta$ if $\beta_i = 1$, where $e_i$ denotes the $i$th $d$-dimensional unit vector.
For instance, when $d = 2$,
$$\Delta F_\beta = \Delta_2 \Delta_1 F_{(\beta_1,\beta_2)} = \Delta_2\!\left(F_{(\beta_1,\beta_2)} - F_{(\beta_1-1,\beta_2)}\right) = F_{(\beta_1,\beta_2)} - F_{(\beta_1,\beta_2-1)} - F_{(\beta_1-1,\beta_2)} + F_{(\beta_1-1,\beta_2-1)}.$$
Note: see the previous talk by Lorenzo Tamellini for further details about ASGQ in relation to MISC, which is a more general version.
ASGQ (II)
$$E[F] \approx \mathcal{M}^{I_\ell}[F] = \sum_{\beta \in I_\ell} \Delta[F_\beta].$$
Product approach: $I_\ell = \{\|\beta\|_\infty \le \ell;\ \beta \in \mathbb{N}^d_+\}$.
Regular sparse grids: $I_\ell = \{\|\beta\|_1 \le \ell + d - 1;\ \beta \in \mathbb{N}^d_+\}$.
Adaptive sparse grids quadrature (ASGQ) (Gerstner and Griebel 1998): the construction of $I_\ell = I^{\mathrm{ASGQ}}$ is done a posteriori and adaptively by profit thresholding, $I^{\mathrm{ASGQ}} = \{\beta \in \mathbb{N}^d_+ : P_\beta \ge T\}$.
▸ Profit of a hierarchical surplus: $P_\beta = \frac{|\Delta E_\beta|}{\Delta W_\beta}$.
▸ Error contribution: $\Delta E_\beta = |\mathcal{M}^{I \cup \{\beta\}} - \mathcal{M}^{I}|$.
▸ Work contribution: $\Delta W_\beta = \mathrm{Work}[\mathcal{M}^{I \cup \{\beta\}}] - \mathrm{Work}[\mathcal{M}^{I}]$.
Figure 1.2: Left: product grids $\Delta_{\beta_1} \otimes \Delta_{\beta_2}$ for $1 \le \beta_1, \beta_2 \le 3$. Right: the corresponding sparse grids construction.
Figure 1.3: Illustration of an ASG grid.
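The following Python sketch (my own simplified illustration under stated assumptions, not the authors' code) puts the pieces above together on $[0,1]^d$: the tensor-product quadrature $F_\beta$, the mixed difference $\Delta[F_\beta]$ via inclusion–exclusion, and a greedy, admissibility-preserving index-set construction. The profit here uses only the error contribution $|\Delta E_\beta|$ and ignores the work term for brevity.

```python
import numpy as np
from itertools import product

def tensor_quad(f, beta, m=lambda b: b):
    """Full tensor-product Gauss-Legendre quadrature of f on [0,1]^d,
    with m(beta_i) points along dimension i."""
    rules = []
    for b in beta:
        x, w = np.polynomial.legendre.leggauss(m(b))
        rules.append(((x + 1.0) / 2.0, w / 2.0))      # map nodes/weights from [-1,1] to [0,1]
    nodes = np.array(list(product(*[r[0] for r in rules])))
    weights = np.prod(np.array(list(product(*[r[1] for r in rules]))), axis=1)
    return float(np.dot(weights, [f(z) for z in nodes]))

def mixed_difference(f, beta, cache):
    """Delta[F_beta]: tensorized first-order differences, evaluated by
    inclusion-exclusion over the backward neighbours of beta."""
    d, total = len(beta), 0.0
    for e in product([0, 1], repeat=d):
        nb = tuple(b - s for b, s in zip(beta, e))
        if min(nb) < 1:
            continue
        if nb not in cache:
            cache[nb] = tensor_quad(f, nb)
        total += (-1) ** sum(e) * cache[nb]
    return total

def asgq(f, d, max_indices=40):
    """Greedy adaptive sparse-grid quadrature (Gerstner-Griebel flavour): repeatedly add the
    admissible multi-index with the largest |Delta E_beta| (work term of the profit ignored)."""
    cache, contrib = {}, {}
    start = (1,) * d
    contrib[start] = mixed_difference(f, start, cache)
    active = {start}
    while active and len(contrib) < max_indices:
        beta = max(active, key=lambda b: abs(contrib[b]))
        active.remove(beta)
        for i in range(d):
            fwd = tuple(b + (1 if j == i else 0) for j, b in enumerate(beta))
            admissible = all(
                tuple(c - (1 if j == k else 0) for j, c in enumerate(fwd)) in contrib
                for k in range(d) if fwd[k] > 1)
            if fwd not in contrib and admissible:     # keep the index set downward closed
                contrib[fwd] = mixed_difference(f, fwd, cache)
                active.add(fwd)
    return sum(contrib.values())

# Smooth test integrand on [0,1]^4 with known integral sin(1)^4 ~ 0.5010
f = lambda x: float(np.prod(np.cos(x)))
print("ASGQ estimate:", asgq(f, d=4), "   exact:", np.sin(1.0) ** 4)
```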
How Does Regularity Affect ASGQ Performance?
Product approach: $E_Q(M) = O(M^{-r/d})$ (for functions with bounded total derivatives up to order $r$).
Adaptive sparse grids quadrature (ASGQ): from the analysis in (Chen 2018; Ernst, Sprungk, and Tamellini 2018),
$E_Q(M) = O(M^{-p/2})$, where¹ $p$ is independent of the problem dimension and is related to the order up to which the weighted mixed derivatives are bounded.
Notation: $M$: number of quadrature points; $E_Q$: quadrature error.
¹ $p$ characterizes the relation between the regularity of the integrand and the anisotropy of the integrand with respect to the different dimensions.
Multilevel Monte Carlo (MLMC)
(Heinrich 2001; Kebaier et al. 2005; Giles 2008b)
Aim: improve the MC complexity when estimating $E[g(X(T))]$.
Setting:
▸ A hierarchy of nested meshes of $[0,T]$, indexed by $\{\ell\}_{\ell=0}^L$.
▸ $\Delta t_\ell = K^{-\ell} \Delta t_0$: the time step size for levels $\ell \ge 1$; $K > 1$, $K \in \mathbb{N}$.
▸ $X_\ell := X_{\Delta t_\ell}$: the approximate process generated using a step size of $\Delta t_\ell$.
MLMC idea
$$E[g(X_L(T))] = E[g(X_0(T))] + \sum_{\ell=1}^{L} E[g(X_\ell(T)) - g(X_{\ell-1}(T))], \qquad (2)$$
$$\mathrm{Var}[g(X_0(T))] \gg \mathrm{Var}[g(X_\ell(T)) - g(X_{\ell-1}(T))] \searrow \text{ as } \ell \nearrow, \qquad M_0 \gg M_\ell \searrow \text{ as } \ell \nearrow.$$
MLMC estimator: $\hat{Q} := \sum_{\ell=0}^{L} \hat{Q}_\ell$, with
$$\hat{Q}_0 := \frac{1}{M_0} \sum_{m_0=1}^{M_0} g\big(X_{0,[m_0]}(T)\big); \qquad \hat{Q}_\ell := \frac{1}{M_\ell} \sum_{m_\ell=1}^{M_\ell} \Big( g\big(X_{\ell,[m_\ell]}(T)\big) - g\big(X_{\ell-1,[m_\ell]}(T)\big) \Big), \quad 1 \le \ell \le L.$$
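A minimal Python sketch (my own illustration, not from the talk) of the MLMC estimator for a Lipschitz (call) payoff under GBM with Euler time stepping and refinement factor $K = 2$; the coarse and fine paths on each level share the same Brownian increments, which is the standard coupling. Parameters are illustrative.

```python
import numpy as np

def mlmc_level(level, M, S0=100.0, K=100.0, mu=0.05, sigma=0.2, T=1.0, N0=2, rng=None):
    """One MLMC level for a call payoff under GBM with Euler time stepping.
    Coarse and fine paths share the same Brownian increments (the standard MLMC coupling)."""
    rng = rng or np.random.default_rng()
    Nf = N0 * 2 ** level
    dtf = T / Nf
    dW = np.sqrt(dtf) * rng.standard_normal((M, Nf))
    Sf = np.full(M, S0)
    for n in range(Nf):                                  # fine path
        Sf = Sf + mu * Sf * dtf + sigma * Sf * dW[:, n]
    payoff_f = np.maximum(Sf - K, 0.0)
    if level == 0:
        return payoff_f
    Sc = np.full(M, S0)
    dtc = 2 * dtf
    dWc = dW[:, ::2] + dW[:, 1::2]                       # coarse increments from the fine ones
    for n in range(Nf // 2):                             # coarse path
        Sc = Sc + mu * Sc * dtc + sigma * Sc * dWc[:, n]
    return payoff_f - np.maximum(Sc - K, 0.0)

rng = np.random.default_rng(1)
L, M = 5, 2 * 10**4
Ys = [mlmc_level(l, M, rng=rng) for l in range(L + 1)]
print("MLMC estimate:", sum(Y.mean() for Y in Ys))
print("level variances:", [float(np.round(Y.var(), 6)) for Y in Ys])  # should decay with level
```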
How Does Regularity Affect MLMC Complexity?
Complexity analysis for MLMC
MLMC complexity (Cliffe, Giles, Scheichl, and Teckentrup 2011):
$$O\!\left(\mathrm{TOL}^{-2 - \max\left(0, \frac{\gamma - \beta}{\alpha}\right)}\, \log(\mathrm{TOL})^{2 \times \mathbf{1}_{\{\beta = \gamma\}}}\right). \qquad (3)$$
i) Weak rate: $\left|E[g(X_\ell(T)) - g(X(T))]\right| \le c_1 2^{-\alpha \ell}$.
ii) Variance decay rate: $V_\ell := \mathrm{Var}[g(X_\ell(T)) - g(X_{\ell-1}(T))] \le c_2 2^{-\beta \ell}$.
iii) Work growth rate: $W_\ell \le c_3 2^{\gamma \ell}$ ($W_\ell$: expected cost).
For forward Euler ($\gamma = 1$): if $g$ is Lipschitz, then $V_\ell \simeq \Delta t_\ell$ due to the strong rate $1/2$, that is, $\beta = \gamma$. Otherwise $\beta < \gamma$ ⇒ worst-case complexity.
Higher-order schemes, e.g., the Milstein scheme, may lead to better complexities even for non-Lipschitz observables (Giles, Debrabant, and Rößler 2013; Giles 2015). However,
▸ for moderate/high-dimensional dynamics, coupling issues may arise and the scheme becomes computationally expensive;
▸ the robustness of the MLMC estimator deteriorates because the kurtosis explodes as $\Delta t_\ell$ decreases: $O(\Delta t_\ell^{-1})$, compared with $O(\Delta t_\ell^{-1/2})$ for Euler without smoothing (Giles, Nagapetyan, and Ritter 2015) and $O(1)$ in (Bayer, Ben Hammouda, and Tempone 2022) (see next slides).
How Does Regularity Affect MLMC Robustness?
For non-Lipschitz payoffs: the kurtosis $\kappa_\ell := \frac{E[(Y_\ell - E[Y_\ell])^4]}{(\mathrm{Var}[Y_\ell])^2}$ is of $O(\Delta t_\ell^{-1/2})$ for the Euler scheme and $O(\Delta t_\ell^{-1})$ for the Milstein scheme.
Large kurtosis problem: discussed previously in (Ben Hammouda, Moraes, and Tempone 2017; Ben Hammouda, Ben Rached, and Tempone 2020) ⇒ expensive cost for reliable/robust estimates of sample statistics.
Why is large kurtosis bad?
$$\sigma_{S^2}(Y_\ell) = \frac{\mathrm{Var}[Y_\ell]}{\sqrt{M_\ell}} \sqrt{(\kappa_\ell - 1) + \frac{2}{M_\ell - 1}}; \qquad \text{Note: reliable estimates require } M_\ell \gg \kappa_\ell.$$
Why are accurate variance estimates $V_\ell = \mathrm{Var}[Y_\ell]$ important?
$$M^*_\ell \propto \sqrt{V_\ell W_\ell^{-1}} \sum_{\ell=0}^{L} \sqrt{V_\ell W_\ell}.$$
Notation
$Y_\ell := g(X_\ell(T)) - g(X_{\ell-1}(T))$;
$\sigma_{S^2}(Y_\ell)$: standard deviation of the sample variance of $Y_\ell$; $\kappa_\ell$: the kurtosis; $V_\ell = \mathrm{Var}[Y_\ell]$; $M_\ell$: number of samples;
$M^*_\ell$: optimal number of samples per level; $W_\ell$: cost per sample path.
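To make the two formulas above concrete, here is a small Python sketch (my own illustration; the factor $2\,\mathrm{TOL}^{-2}$ in the allocation follows the usual Giles-type splitting of the error budget and is an assumption, not from the slides).

```python
import numpy as np

def optimal_samples(V, W, tol):
    """Optimal per-level sample sizes M_l ~ sqrt(V_l / W_l) * sum_l sqrt(V_l W_l),
    scaled by 2/tol^2 (one common choice); the variance estimates V_l feed into this."""
    V, W = np.asarray(V, float), np.asarray(W, float)
    return np.ceil(2.0 * tol**-2 * np.sqrt(V / W) * np.sum(np.sqrt(V * W))).astype(int)

def relative_error_of_variance_estimate(kappa, M):
    """Relative std of the sample variance: sqrt((kappa - 1) + 2/(M - 1)) / sqrt(M)."""
    return np.sqrt((kappa - 1.0) + 2.0 / (M - 1.0)) / np.sqrt(M)

# With kurtosis O(1), a thousand samples pin down V_l to a few percent;
# with kappa ~ 700 (digital payoff, no smoothing) the same M_l gives ~80% relative error.
for kappa in (3.0, 700.0):
    print(kappa, relative_error_of_variance_estimate(kappa, M=1000))
print(optimal_samples(V=[1e-2, 2.5e-3, 6e-4], W=[1, 2, 4], tol=1e-3))
```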
Previous Successful Smoothing Techniques
for QMC and ASGQ
Mollification, e.g., by convolution with a Gaussian kernel or manually;
(-) additional error that may scale exponentially with the dimension.
Mapping the problem to the frequency space (e.g., (Bayer, Ben Hammouda, Papapantoleon, Samet, and Tempone 2022)): better regularity compared to the physical space;
(-) applicable when the Fourier transform of the density function is available and cheap to compute.
Bias-free mollification (e.g., (Bayer, Siebenmorgen, and Tempone 2018; Bayer, Ben Hammouda, and Tempone 2020)) by taking conditional expectations or exact integration over a subset of the integration variables;
(-) not always possible.
Example 1.1 ((Romano and Touzi 1997))
$(S_t, v_t)$ stochastic volatility model, $S$ driven by the Brownian motion $Z := \rho W + \sqrt{1 - \rho^2}\, B$, with $v$ being $\sigma(W_\cdot)$-measurable. Then
$$E[g(S_T)] = E\!\left[g\!\left(S_0 \exp\!\left(\int_0^T \sqrt{v_t}\, dZ_t - \frac{1}{2}\int_0^T v_t\, dt\right)\right)\right] = E\big[E[g(S_T) \mid \sigma(W_\cdot)]\big]$$
$$= E\!\left[C_{\mathrm{BS}}\!\left(S_0 = S_0 \exp\!\left(\rho \int_0^T \sqrt{v_t}\, dW_t - \frac{\rho^2}{2}\int_0^T v_t\, dt\right),\ \sigma^2 = (1 - \rho^2)\int_0^T v_t\, dt\right)\right].$$
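A minimal Python sketch (my own illustration under a zero-rate assumption; function names and parameters are illustrative, not the authors' code) of the Romano–Touzi conditional-expectation smoothing above for a call under Heston: only the variance path (driven by $W$) is simulated, and the inner expectation is the closed-form Black–Scholes price.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S0, K, total_var):
    """Black-Scholes call price with zero rates, written in terms of the total variance sigma^2*T."""
    sd = np.sqrt(total_var)
    d1 = (np.log(S0 / K) + 0.5 * total_var) / sd
    return S0 * norm.cdf(d1) - K * norm.cdf(d1 - sd)

def smoothed_call_heston(S0, K, v0, kappa, theta, xi, rho, T, N, M, rng):
    """Romano-Touzi conditioning: simulate only the variance path (driven by W), then
    evaluate the Black-Scholes formula in closed form instead of sampling the asset."""
    dt = T / N
    v = np.full(M, v0)
    int_v, int_sqrt_v_dW = np.zeros(M), np.zeros(M)
    for _ in range(N):
        dW = np.sqrt(dt) * rng.standard_normal(M)
        v_plus = np.maximum(v, 0.0)
        int_v += v_plus * dt
        int_sqrt_v_dW += np.sqrt(v_plus) * dW
        v = v + kappa * (theta - v_plus) * dt + xi * np.sqrt(v_plus) * dW
    eff_spot = S0 * np.exp(rho * int_sqrt_v_dW - 0.5 * rho**2 * int_v)
    eff_var = (1.0 - rho**2) * int_v
    return bs_call(eff_spot, K, eff_var)

rng = np.random.default_rng(2)
samples = smoothed_call_heston(S0=100.0, K=100.0, v0=0.04, kappa=1.0, theta=0.04,
                               xi=0.3, rho=-0.7, T=1.0, N=64, M=10**5, rng=rng)
print("price:", samples.mean(), " MC std error:", samples.std() / np.sqrt(len(samples)))
```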
Previous Successful Smoothing Techniques
for MLMC
Implicit smoothing based on conditional expectation combined with the Milstein scheme (e.g., (Giles, Debrabant, and Rößler 2013));
(-) not always possible, and it inherits the drawbacks of the Milstein scheme.
Parametric smoothing: carefully constructing a regularized version of the observable (e.g., (Giles, Nagapetyan, and Ritter 2015));
(-) possible deterioration of the strong convergence behavior, and an additional bias that may increase exponentially with the dimension (see next slides).
Malliavin calculus integration by parts (e.g., (Altmayer and Neuenkirch 2015)): splitting the payoff function into a smooth component (treated by standard MLMC) and a compactly supported discontinuous part (treated via Malliavin MLMC).
Adaptive sampling (e.g., (Haji-Ali, Spence, and Teckentrup 2021)).
Numerical Smoothing Steps
Guiding example: $g: \mathbb{R}^d \to \mathbb{R}$ payoff function, $\overline{X}^{\Delta t}_T$ ($\Delta t = \frac{T}{N}$) the Euler discretization of a $d$-dimensional SDE, e.g.,
$$dX^{(i)}_t = a_i(X_t)\, dt + \sum_{j=1}^d b_{ij}(X_t)\, dW^{(j)}_t,$$
where $\{W^{(j)}\}_{j=1}^d$ are standard Brownian motions.
$$\overline{X}^{\Delta t}_T = \overline{X}^{\Delta t}_T\big(\underbrace{\Delta W^{(1)}_1, \dots, \Delta W^{(1)}_N, \dots, \Delta W^{(d)}_1, \dots, \Delta W^{(d)}_N}_{:= \Delta \mathbf{W}}\big). \qquad E\big[g\big(\overline{X}^{\Delta t}_T\big)\big] = ?$$
1 Identify a hierarchical representation of the integration variables:
(a) $\overline{X}^{\Delta t}_T(\Delta \mathbf{W}) \equiv \overline{X}^{\Delta t}_T(\mathbf{Z})$, $\mathbf{Z} = (Z_i)_{i=1}^{dN} \sim \mathcal{N}(0, I_{dN})$,
such that "$\mathbf{Z}_1 := (Z^{(1)}_1, \dots, Z^{(d)}_1)$ substantially contributes even for $\Delta t \to 0$",
e.g., Brownian bridges of $\Delta \mathbf{W}$ / the Haar wavelet construction of $W$ (a small sketch of the bridge construction follows below).
Note: this differs from previous techniques, which smooth out only at the final step with respect to $\Delta \mathbf{W}$, so that the smoothing effect vanishes as $\Delta t \to 0$.
(b) Design a sub-optimal smoothing direction ($A$: a rotation matrix that is easy to construct and whose structure depends on the payoff $g$):
$$\mathbf{Y} = A \mathbf{Z}_1.$$
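A minimal Python sketch (my own illustration) of step 1(a): the Brownian-bridge construction that maps i.i.d. Gaussians $\mathbf{Z}$ to the increments $\Delta W$ hierarchically, so that the first coordinate fixes $W_T$ and keeps an $O(1)$ influence on the path as $\Delta t \to 0$.

```python
import numpy as np

def brownian_bridge_increments(Z, T=1.0):
    """Hierarchical (Brownian-bridge) construction of Brownian increments on N steps.
    Z[0] alone fixes W_T = sqrt(T) * Z[0], so the first coordinate keeps an O(1) contribution
    to the path (and hence to the payoff) no matter how small the time step gets."""
    N = len(Z)
    t = np.linspace(0.0, T, N + 1)
    W = np.zeros(N + 1)
    W[-1] = np.sqrt(T) * Z[0]
    known = [0, N]                       # indices of grid points already fixed
    j = 1
    while len(known) < N + 1:
        new_known = []
        for a, b in zip(known[:-1], known[1:]):
            if b - a > 1:
                m = (a + b) // 2         # midpoint, conditioned on the two fixed endpoints
                mean = W[a] + (t[m] - t[a]) / (t[b] - t[a]) * (W[b] - W[a])
                var = (t[m] - t[a]) * (t[b] - t[m]) / (t[b] - t[a])
                W[m] = mean + np.sqrt(var) * Z[j]
                j += 1
                new_known.append(m)
        known = sorted(known + new_known)
    return np.diff(W)                    # increments Delta W_1, ..., Delta W_N

rng = np.random.default_rng(3)
dW = brownian_bridge_increments(rng.standard_normal(8))
print(dW, dW.sum())                      # the sum is W_T, controlled by Z[0] only
```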
Numerical Smoothing Steps
2 Pre-integrate with respect to the smoothing direction $y_1$:
$$E[g(X(T))] \approx E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big] = \int_{\mathbb{R}^{d \times N}} G(\mathbf{z})\, \rho_{d \times N}(\mathbf{z})\, dz^{(1)}_1 \cdots dz^{(1)}_N \cdots dz^{(d)}_1 \cdots dz^{(d)}_N$$
$$= \int_{\mathbb{R}^{dN-1}} I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{d-1}(y_{-1})\, dy_{-1}\, \rho_{dN-d}\big(z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, dz^{(1)}_{-1} \cdots dz^{(d)}_{-1}$$
$$= E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] \approx E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big], \qquad (4)$$
with
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{\mathbb{R}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$= \int_{-\infty}^{y^*_1} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1 + \int_{y^*_1}^{+\infty} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$\approx \bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag}}} \eta_k\, G\big(\zeta_k(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big). \qquad (5)$$
3 Compute the remaining $(dN-1)$-dimensional integral in (4) by ASGQ, QMC, or MLMC.
Notation
$G := g \circ \Phi \circ (\psi^{(1)}, \dots, \psi^{(d)})$ maps the $N \times d$ Gaussian random inputs to $g(\overline{X}^{\Delta t}(T))$, where
▸ $\psi^{(j)}: (Z^{(j)}_1, \dots, Z^{(j)}_N) \mapsto (B^{(j)}_1, \dots, B^{(j)}_N)$ denotes the mapping of the Brownian bridge construction;
▸ $\Phi: (\Delta t, B) \mapsto \overline{X}^{\Delta t}(T)$ denotes the mapping of the time-stepping scheme.
$y^*_1(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1})$: the exact discontinuity location, i.e., such that $\varphi\big(\overline{X}^{\Delta t}(T)\big) = P\big(y^*_1; y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = 0$;
$\bar{y}^*_1(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1})$: the approximated discontinuity location obtained via root finding;
$M_{\mathrm{Lag}}$: the number of Laguerre quadrature points $\zeta_k \in \mathbb{R}$, with corresponding weights $\eta_k$;
$\rho_{d \times N}(\mathbf{z}) = \frac{1}{(2\pi)^{dN/2}} e^{-\frac{1}{2}\mathbf{z}^T\mathbf{z}}$.
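A one-dimensional Python sketch (my own illustration, not the authors' code) of the pre-integration step (5) in the simplest possible case, a digital payoff under a one-step GBM map: Newton's method locates the discontinuity $\bar{y}^*_1$ and Gauss–Laguerre quadrature handles the remaining smooth semi-infinite integral; the exact value $\Phi(-y^*_1)$ is printed for comparison. Model parameters and tolerances are illustrative.

```python
import numpy as np
from scipy.optimize import newton
from scipy.stats import norm

# One-dimensional illustration of the pre-integration step: for a digital payoff under the
# one-step GBM map S_T(y) = S0*exp((mu - 0.5*sigma^2)*T + sigma*sqrt(T)*y),
# locate the jump location y* by Newton's method, then integrate the smooth tail by Gauss-Laguerre.
S0, K, mu, sigma, T = 100.0, 110.0, 0.0, 0.2, 1.0
S_T = lambda y: S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * y)

# Root of P(y) := log(S_T(y)/K); its derivative is constant here, so Newton converges at once.
P = lambda y: np.log(S_T(y) / K)
y_star = newton(P, x0=0.0, fprime=lambda y: sigma * np.sqrt(T), tol=1e-10)

# I = E[1_{S_T > K}] = int_{y*}^inf phi(y) dy, approximated with M_Lag Gauss-Laguerre points:
# substitute y = y* + u so that the Laguerre weight e^{-u} can be factored out.
M_Lag = 32
u, w = np.polynomial.laguerre.laggauss(M_Lag)
smoothed = np.sum(w * np.exp(u) * norm.pdf(y_star + u))

print("smoothed pre-integration:", smoothed)
print("exact value Phi(-y*):   ", norm.cdf(-y_star))
```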
Extending Numerical Smoothing for
Multiple Discontinuities
Multiple discontinuities arise due to the payoff structure or the use of Richardson extrapolation.
For $R$ different ordered roots, e.g., $\{y^*_i\}_{i=1}^R$, the smoothed integrand is
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{-\infty}^{y^*_1} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1 + \int_{y^*_R}^{+\infty} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1$$
$$+ \sum_{i=1}^{R-1} \int_{y^*_i}^{y^*_{i+1}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1,$$
and its approximation $\bar{I}$ is given by
$$\bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag},1}} \eta^{\mathrm{Lag}}_k\, G\big(\zeta^{\mathrm{Lag}}_{k,1}(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) + \sum_{k=0}^{M_{\mathrm{Lag},R}} \eta^{\mathrm{Lag}}_k\, G\big(\zeta^{\mathrm{Lag}}_{k,R}(\bar{y}^*_R), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)$$
$$+ \sum_{i=1}^{R-1} \left( \sum_{k=0}^{M_{\mathrm{Leg},i}} \eta^{\mathrm{Leg}}_k\, G\big(\zeta^{\mathrm{Leg}}_{k,i}(\bar{y}^*_i, \bar{y}^*_{i+1}), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) \right),$$
where $\{\bar{y}^*_i\}_{i=1}^R$ are the approximated discontinuity locations; $M_{\mathrm{Lag},1}$ and $M_{\mathrm{Lag},R}$ are the numbers of Laguerre quadrature points $\zeta^{\mathrm{Lag}}_{\cdot,\cdot} \in \mathbb{R}$ with corresponding weights $\eta^{\mathrm{Lag}}_\cdot$; $\{M_{\mathrm{Leg},i}\}_{i=1}^{R-1}$ are the numbers of Legendre quadrature points $\zeta^{\mathrm{Leg}}_{\cdot,\cdot}$ with corresponding weights $\eta^{\mathrm{Leg}}_\cdot$.
$\bar{I}$ can be approximated further depending on (i) the decay of $G \times \rho_1$ in the semi-infinite domains and (ii) how close the roots are to each other.
Extending Numerical Smoothing for
Density Estimation
Goal: approximate the density $\rho_X$ at $u$, for a stochastic process $X$:
$$\rho_X(u) = E[\delta(X - u)], \quad \delta \text{ the Dirac delta function.}$$
Note: without any smoothing technique (regularization, kernel density, . . . ), MC and MLMC fail due to the infinite variance caused by the singularity of $\delta$.
Strategy in (Bayer, Hammouda, and Tempone 2022):
1 Conditioning with respect to the Brownian bridge:
$$\rho_X(u) = \frac{1}{\sqrt{2\pi}}\, E\!\left[\exp\!\left(-\big(Y^*_1(u)\big)^2/2\right) \left|\frac{dY^*_1}{du}(u)\right|\right] \approx \frac{1}{\sqrt{2\pi}}\, E\!\left[\exp\!\left(-\big(\bar{Y}^*_1(u)\big)^2/2\right) \left|\frac{d\bar{Y}^*_1}{du}(u)\right|\right], \qquad (6)$$
where $Y^*_1(u; Z_{-1})$ is the exact discontinuity and $\bar{Y}^*_1(u; Z_{-1})$ is the approximated discontinuity.
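A one-dimensional Python sketch (my own illustration, not the authors' code) of the density estimator (6) for the Euler-discretized GBM terminal value: the outer expectation is over the Brownian bridge, the discontinuity $\bar{Y}^*_1(u)$ is found by root finding, and $dY^*_1/du$ comes from the inverse-function theorem (here approximated by a finite difference). The printed reference value is the continuous-time lognormal density, so the two agree only up to the $O(\Delta t)$ discretization bias.

```python
import numpy as np
from scipy.optimize import brentq

def bridge_increments(xi, dt, T):
    """Exact Brownian-bridge increments: take an auxiliary Brownian path built from xi and
    remove its linear (terminal-value) part; the result is independent of W_T."""
    dW_aux = np.sqrt(dt) * xi
    return dW_aux - dW_aux.sum() * (dt / T)

def smoothed_density_gbm(u, M=2000, N=16, mu=0.0, sigma=0.2, T=1.0, S0=1.0, seed=4):
    """Numerical-smoothing density estimator (6) for the Euler-GBM terminal value (1D sketch):
    condition on the Brownian bridge, solve S_T(y1) = u for the coarsest Gaussian y1 by root
    finding, and apply the exact conditional-expectation (inverse-function) formula."""
    rng = np.random.default_rng(seed)
    dt = T / N
    vals = np.empty(M)
    for m in range(M):
        b = bridge_increments(rng.standard_normal(N), dt, T)
        S_T = lambda y1: S0 * np.prod(1.0 + mu * dt + sigma * (b + (dt / np.sqrt(T)) * y1))
        y_star = brentq(lambda y: S_T(y) - u, -12.0, 12.0)        # approximated discontinuity
        h = 1e-5                                                  # dY*/du = 1 / (dS_T/dy1)
        dSdy = (S_T(y_star + h) - S_T(y_star - h)) / (2.0 * h)
        vals[m] = np.exp(-0.5 * y_star**2) / (np.sqrt(2.0 * np.pi) * abs(dSdy))
    return vals.mean(), vals.std() / np.sqrt(M)

est, se = smoothed_density_gbm(u=1.0)
exact = np.exp(-0.5 * (0.5 * 0.2**2 / 0.2) ** 2) / (1.0 * 0.2 * np.sqrt(2.0 * np.pi))
print("smoothed estimate:", est, "+/-", se, "  lognormal (continuous-time) density:", exact)
```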
Why not Kernel Density Techniques
in Multiple Dimensions?
Kernel density techniques are similar to approaches based on parametric regularization as in (Giles, Nagapetyan, and Ritter 2015): this class of approaches has a pointwise error that increases exponentially with respect to the dimension of the state vector $X$.
For a $d$-dimensional problem, a kernel density estimator with bandwidth matrix $H = \mathrm{diag}(h, \dots, h)$ has
$$\mathrm{MSE} \approx c_1 M^{-1} h^{-d} + c_2 h^4, \qquad (7)$$
where $M$ is the number of samples and $c_1$, $c_2$ are constants.
Our approach in high dimension: for $u \in \mathbb{R}^d$,
$$\rho_X(u) = E[\delta(X - u)] = E\big[\rho_d\big(Y^*(u)\big)\, |\det(J(u))|\big] \approx E\big[\rho_d\big(\bar{Y}^*(u)\big)\, |\det(\bar{J}(u))|\big], \qquad (8)$$
where
▸ $Y^*(u; \cdot)$: the exact discontinuity; $\bar{Y}^*(u; \cdot)$: the approximated discontinuity;
▸ $J$ is the Jacobian matrix, with $J_{ij} = \frac{\partial y^*_i}{\partial u_j}$; $\rho_d(\cdot)$ is the multivariate Gaussian density.
Thanks to the exact conditional expectation with respect to the Brownian bridge, the smoothing error in our approach is insensitive to the dimension of the problem.
Notations
The Haar basis functions $\psi_{n,k}$ of $L^2([0,1])$ with support $[2^{-n}k, 2^{-n}(k+1)]$:
$$\psi_{-1}(t) := \mathbf{1}_{[0,1]}(t); \qquad \psi_{n,k}(t) := 2^{n/2}\, \psi(2^n t - k), \quad n \in \mathbb{N}_0,\ k = 0, \dots, 2^n - 1,$$
where $\psi(\cdot)$ is the Haar mother wavelet:
$$\psi(t) := \begin{cases} 1, & 0 \le t < \tfrac{1}{2}, \\ -1, & \tfrac{1}{2} \le t < 1, \\ 0, & \text{else.} \end{cases}$$
Grid $\mathcal{D}^n := \{t^n_\ell \mid \ell = 0, \dots, 2^{n+1}\}$ with $t^n_\ell := \frac{\ell}{2^{n+1}} T$. Observe: the Haar basis functions up to level $n$ are piecewise constant with points of discontinuity given by $\mathcal{D}^n$.
For i.i.d. standard normal random variables $Z_{-1}$, $Z_{n,k}$, $n \in \mathbb{N}_0$, $k = 0, \dots, 2^n - 1$, we define the (truncated) standard Brownian motion
$$W^N_t := Z_{-1}\Psi_{-1}(t) + \sum_{n=0}^{N} \sum_{k=0}^{2^n - 1} Z_{n,k}\Psi_{n,k}(t),$$
where $\Psi_{-1}(\cdot)$ and $\Psi_{n,k}(\cdot)$ are the antiderivatives of the Haar basis functions.
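A small Python sketch (my own illustration, taking $T = 1$) of the truncated Wiener path above: $\Psi_{-1}(t) = t$ and the $\Psi_{n,k}$ are scaled Schauder hat functions, so the endpoint $W^N_1$ is controlled by $Z_{-1}$ alone, which is exactly the coordinate used for the pre-integration.

```python
import numpy as np

def haar_antiderivative(n, k, t):
    """Antiderivative Psi_{n,k} of the Haar function psi_{n,k} (a scaled Schauder hat)."""
    left, mid, right = k * 2.0**-n, (k + 0.5) * 2.0**-n, (k + 1) * 2.0**-n
    up = np.clip(t, left, mid) - left            # integral of +2^{n/2} over [left, mid]
    down = np.clip(t, mid, right) - mid          # integral of -2^{n/2} over [mid, right]
    return 2.0 ** (n / 2.0) * (up - down)

def truncated_brownian_motion(Z_minus1, Z, t):
    """W^N_t = Z_{-1} * t + sum_{n<=N} sum_k Z_{n,k} * Psi_{n,k}(t) on [0,1] (T = 1 assumed).
    Z is a list of arrays, Z[n] holding the 2^n coefficients of level n."""
    W = Z_minus1 * t                             # Psi_{-1}(t) = t
    for n, Zn in enumerate(Z):
        for k, z in enumerate(Zn):
            W = W + z * haar_antiderivative(n, k, t)
    return W

rng = np.random.default_rng(5)
N = 6
Z_minus1 = rng.standard_normal()
Z = [rng.standard_normal(2**n) for n in range(N + 1)]
t = np.linspace(0.0, 1.0, 2**(N + 1) + 1)
W = truncated_brownian_motion(Z_minus1, Z, t)
print("W_1 =", W[-1], " equals Z_{-1} =", Z_minus1)   # only Z_{-1} contributes to the endpoint
```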
Notations
We define the corresponding increments for any function or process $F$ as
$$\Delta^N_\ell F := F(t^N_{\ell+1}) - F(t^N_\ell).$$
The solution of the Euler scheme for $dX_t = b(X_t)\, dW_t$, $X_0 = x \in \mathbb{R}$, along $\mathcal{D}^N$ is
$$X^N_{\ell+1} := X^N_\ell + b(X^N_\ell)\, \Delta^N_\ell W, \quad \ell = 0, \dots, 2^N - 1, \qquad (9)$$
with $X^N_0 := X_0 = x$; for convenience, we also define $X^N_T := X^N_{2^N}$.
We define the deterministic function $H^N: \mathbb{R}^{2^{N+1}-1} \to \mathbb{R}$ as
$$H^N(z^N) := E_{Z_{-1}}\big[g\big(X^N_T(Z_{-1}, z^N)\big)\big], \qquad (10)$$
where $Z^N := (Z_{n,k})_{n=0,\dots,N,\ k=0,\dots,2^n-1}$.
Assumptions
Assumption 3.1
There are positive random variables (rdvs) $C_p$ with finite moments of all orders such that
$$\forall N \in \mathbb{N},\ \forall \ell_1, \dots, \ell_p \in \{0, \dots, 2^N - 1\}: \qquad \left| \frac{\partial^p X^N_T}{\partial X^N_{\ell_1} \cdots \partial X^N_{\ell_p}} \right| \le C_p \quad \text{a.s.};$$
this means thatᵃ $\frac{\partial^p X^N_T}{\partial X^N_{\ell_1} \cdots \partial X^N_{\ell_p}} = O_{\mathrm{P}}(1)$.
ᵃ For sequences of rdvs $F_N$ and $G_N$, we write $F_N = O_{\mathrm{P}}(G_N)$ if there is a rdv $C$ with finite moments of all orders such that for all $N$ we have $|F_N| \le C\,|G_N|$ a.s.
Assumption 3.1 is natural because it is fulfilled if the diffusion coefficient $b(\cdot)$ is smooth; this situation is valid for many option pricing models.
Assumptions
Assumption 3.2
For any $p \in \mathbb{N}$, we haveᵃ
$$\left( \frac{\partial X^N_T}{\partial y}(Z_{-1}, Z^N) \right)^{-p} = O_{\mathrm{P}}(1).$$
ᵃ $y := z_{-1}$.
In (Bayer, Ben Hammouda, and Tempone 2022), we give sufficient conditions under which this assumption is valid. For instance, Assumption 3.2 holds for
▸ one-dimensional models with a linear or constant diffusion;
▸ multivariate models with a linear drift and constant diffusion, including the multivariate lognormal model (see (Bayer, Siebenmorgen, and Tempone 2018)).
There may be some cases in which Assumption 3.2 is not fulfilled, e.g., $X_T = W_T^2$. However, our method works well in such cases since we have $g(X_T) = G(y_1^2)$.
The Smoothness Theorem
Theorem 3.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Assume that $X^N_T$, defined by (9), satisfies Assumptions 3.1 and 3.2. Then, for any $p \in \mathbb{N}$ and indices $n_1, \dots, n_p$ and $k_1, \dots, k_p$ (satisfying $0 \le k_j < 2^{n_j}$), the function $H^N$ defined in (10) satisfies (with constants independent of $n_j, k_j$)ᵃ
$$\frac{\partial^p H^N}{\partial z_{n_1,k_1} \cdots \partial z_{n_p,k_p}}(z^N) = O_{\mathrm{P}}\!\left(2^{-\sum_{j=1}^p n_j / 2}\right).$$
In particular, $H^N$ is of class $C^\infty$.
ᵃ The constants increase in $p$ and $N$.
Sketch of the Proof
1 We consider a mollified version $g_\delta$ of $g$ and the corresponding function $H^N_\delta$ (defined by replacing $g$ with $g_\delta$ in (10)).
2 We prove that we can interchange integration and differentiation, which implies
$$\frac{\partial H^N_\delta(z^N)}{\partial z_{n,k}} = E\!\left[g'_\delta\big(X^N_T(Z_{-1}, z^N)\big)\, \frac{\partial X^N_T(Z_{-1}, z^N)}{\partial z_{n,k}}\right].$$
3 Multiplying and dividing by $\frac{\partial X^N_T(Z_{-1}, z^N)}{\partial y}$ and replacing the expectation by an integral w.r.t. the standard normal density, we obtain
$$\frac{\partial H^N_\delta(z^N)}{\partial z_{n,k}} = \int_{\mathbb{R}} \frac{\partial g_\delta\big(X^N_T(y, z^N)\big)}{\partial y} \left( \frac{\partial X^N_T}{\partial y}(y, z^N) \right)^{-1} \frac{\partial X^N_T}{\partial z_{n,k}}(y, z^N)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy. \qquad (11)$$
4 We show that integration by parts is possible, and then we can discard the mollified version to obtain the smoothness of $H^N$, because
$$\frac{\partial H^N(z^N)}{\partial z_{n,k}} = -\int_{\mathbb{R}} g\big(X^N_T(y, z^N)\big)\, \frac{\partial}{\partial y}\!\left[ \left( \frac{\partial X^N_T}{\partial y}(y, z^N) \right)^{-1} \frac{\partial X^N_T}{\partial z_{n,k}}(y, z^N)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}} \right] dy.$$
5 The proof relies on successively applying the technique above of dividing by $\frac{\partial X^N_T}{\partial y}$ and then integrating by parts.
Overcoming the High Dimensionality
for ASGQ and QMC
We combine ASGQ and QMC with hierarchical representations, as in (Bayer, Ben Hammouda, and Tempone 2020):
Brownian bridges as the Wiener path generation method ⇒ ↘ the effective dimension of the problem.
Richardson extrapolation ⇒ faster convergence of the weak error ⇒ ↘ the number of time steps needed to achieve a certain error tolerance ⇒ a smaller total dimension of the input space.
Error Discussion for ASGQ
$Q^{\mathrm{ASGQ}}_N$: the ASGQ estimator.
$$E[g(X(T))] - Q^{\mathrm{ASGQ}}_N = \underbrace{E[g(X(T))] - E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big]}_{\text{Error I: bias or weak error } O(\Delta t)}$$
$$+ \underbrace{E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big]}_{\text{Error II: numerical smoothing error } O(M^{-s}_{\mathrm{Lag}}) + O(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton}})}$$
$$+ \underbrace{E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - Q^{\mathrm{ASGQ}}_N}_{\text{Error III: ASGQ error } O(M^{-p/2}_{\mathrm{ASGQ}})}. \qquad (12)$$
Notations
$M_{\mathrm{ASGQ}}$: the number of quadrature points used by the ASGQ estimator, and $p > 0$.
$\bar{y}^*_1$: the approximated location of the non-smoothness obtained by Newton iteration ⇒ $|y^*_1 - \bar{y}^*_1| = \mathrm{TOL}_{\mathrm{Newton}}$.
$\kappa \ge 0$ ($\kappa = 0$: Heaviside payoff (digital option); $\kappa = 1$: call or put payoffs).
$M_{\mathrm{Lag}}$: the number of points used by the Laguerre quadrature for the one-dimensional pre-integration step.
$s > 0$: the derivatives of $G$ with respect to $y_1$ are bounded up to order $s$.
Work and Complexity Discussion for ASGQ
An optimal performance of ASGQ is given by
$$\begin{cases} \min\limits_{(M_{\mathrm{ASGQ}},\, M_{\mathrm{Lag}},\, \mathrm{TOL}_{\mathrm{Newton}})} \mathrm{Work}_{\mathrm{ASGQ}} \propto M_{\mathrm{ASGQ}} \times M_{\mathrm{Lag}} \times \Delta t^{-1} \\ \text{s.t. } E_{\mathrm{total,ASGQ}} = \mathrm{TOL}, \end{cases} \qquad (13)$$
where
$$E_{\mathrm{total,ASGQ}} := E[g(X(T))] - Q^{\mathrm{ASGQ}}_N = O(\Delta t) + O\big(M^{-p/2}_{\mathrm{ASGQ}}\big) + O\big(M^{-s}_{\mathrm{Lag}}\big) + O\big(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton}}\big).$$
We show in (Bayer, Ben Hammouda, and Tempone 2022) that, under certain conditions on the regularity parameters $s$ and $p$ ($p, s \gg 1$),
▸ $\mathrm{Work}_{\mathrm{ASGQ}} = O(\mathrm{TOL}^{-1})$ (for the best case), compared with $\mathrm{Work}_{\mathrm{MC}} = O(\mathrm{TOL}^{-3})$ (for the best case of MC).
Multilevel Monte Carlo with Numerical Smoothing
Recall our QoI:
$$E[g(X(T))] \approx E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big] = \int_{\mathbb{R}^{d \times N}} G(\mathbf{z})\, \rho_{d \times N}(\mathbf{z})\, dz^{(1)}_1 \cdots dz^{(1)}_N \cdots dz^{(d)}_1 \cdots dz^{(d)}_N$$
$$= \int_{\mathbb{R}^{dN-1}} I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{d-1}(y_{-1})\, dy_{-1}\, \rho_{dN-d}\big(z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, dz^{(1)}_{-1} \cdots dz^{(d)}_{-1}$$
$$= E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] \approx E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big],$$
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{\mathbb{R}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$= \int_{-\infty}^{y^*_1} G\big(y_1, \dots\big)\, \rho_{y_1}(y_1)\, dy_1 + \int_{y^*_1}^{+\infty} G\big(y_1, \dots\big)\, \rho_{y_1}(y_1)\, dy_1 \approx \bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag}}} \eta_k\, G\big(\zeta_k(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big).$$
Denote by $\bar{I}_\ell := \bar{I}_\ell\big(y^\ell_{-1}, z^{(1),\ell}_{-1}, \dots, z^{(d),\ell}_{-1}\big)$ the level-$\ell$ Euler approximation of $I$, computed with: step size $\Delta t_\ell$; $M_{\mathrm{Lag},\ell}$ Laguerre quadrature points; $\mathrm{TOL}_{\mathrm{Newton},\ell}$ as the tolerance of the Newton method at level $\ell$.
MLMC estimator
$$\hat{Q}^{\mathrm{MLMC}} := \sum_{\ell=0}^{L} \hat{Q}_\ell, \qquad (14)$$
with
$$\hat{Q}_0 := \frac{1}{M_0} \sum_{m_0=1}^{M_0} \bar{I}_{0,[m_0]}; \qquad \hat{Q}_\ell := \frac{1}{M_\ell} \sum_{m_\ell=1}^{M_\ell} \big( \bar{I}_{\ell,[m_\ell]} - \bar{I}_{\ell-1,[m_\ell]} \big), \quad 1 \le \ell \le L.$$
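To connect the estimator (14) with the earlier MLMC sketch, here is a minimal Python illustration (my own, not the authors' code) of the smoothed level differences $\bar{I}_\ell - \bar{I}_{\ell-1}$ for a digital option under Euler-discretized GBM. Two simplifications are assumed: the root finding uses a bracketing solver instead of Newton iteration, and the one-dimensional pre-integration is exact here ($\Phi(-\bar{y}^*_1)$) because the payoff is an indicator.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def smoothed_digital_level(level, M, S0=100.0, K=100.0, mu=0.0, sigma=0.2, T=1.0, N0=8, seed=0):
    """One MLMC level with numerical smoothing for a digital payoff under Euler-GBM (sketch).
    Conditional on the Brownian bridge, the coarsest Gaussian y1 (which fixes W_T) is
    integrated out exactly: the smoothed payoff is Phi(-y1*), with y1* found by root finding.
    Fine and coarse approximations share the same bridge (pairwise-summed increments)."""
    rng = np.random.default_rng(seed + level)
    Nf = N0 * 2**level
    dtf = T / Nf
    out = np.empty(M)

    def smoothed(b, dt):                 # Phi(-y1*) given the bridge increments b
        S_T = lambda y1: S0 * np.prod(1.0 + mu * dt + sigma * (b + (dt / np.sqrt(T)) * y1))
        return norm.cdf(-brentq(lambda y: S_T(y) - K, -12.0, 12.0))

    for m in range(M):
        aux = np.sqrt(dtf) * rng.standard_normal(Nf)
        bf = aux - aux.sum() * (dtf / T)                  # exact Brownian-bridge increments (fine)
        out[m] = smoothed(bf, dtf)
        if level > 0:
            out[m] -= smoothed(bf[::2] + bf[1::2], 2 * dtf)   # coupled coarse approximation
    return out

for level in range(5):
    Y = smoothed_digital_level(level, M=1000)
    print(f"level {level}:  mean = {Y.mean():+.5f}   variance = {Y.var():.2e}")
# The level variances decay with level and the kurtosis stays bounded, cf. Corollaries 4.1-4.3.
```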
Multilevel Monte Carlo with Numerical Smoothing:
Variance Decay, Complexity and Robustness
Corollary 4.1 ((Bayer, Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, $V_\ell := \mathrm{Var}[\bar{I}_\ell - \bar{I}_{\ell-1}] = O(\Delta t_\ell)$, compared with $O\big(\Delta t_\ell^{1/2}\big)$ for MLMC without smoothing.
Corollary 4.2 ((Bayer, Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, the complexity of MLMC combined with numerical smoothing using the Euler discretization is $O\big(\mathrm{TOL}^{-2}(\log(\mathrm{TOL}))^2\big)$, compared with $O\big(\mathrm{TOL}^{-2.5}\big)$ for MLMC without smoothing.
Corollary 4.3 ((Bayer, Hammouda, and Tempone 2022))
Let $\kappa_\ell$ be the kurtosis of the random variable $Y_\ell := \bar{I}_\ell - \bar{I}_{\ell-1}$; then, under Assumptions 3.1 and 3.2, we obtain $\kappa_\ell = O(1)$, compared with $O\big(\Delta t_\ell^{-1/2}\big)$ for MLMC without smoothing.
Work Discussion for MLMC
$\hat{Q}^{\mathrm{MLMC}}$: the MLMC estimator, as defined in (14).
$$E[g(X(T))] - \hat{Q}^{\mathrm{MLMC}} = \underbrace{E[g(X(T))] - E\big[g\big(\overline{X}^{\Delta t_L}(T)\big)\big]}_{\text{Error I: bias or weak error } O(\Delta t_L)}$$
$$+ \underbrace{E\big[I_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - E\big[\bar{I}_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big]}_{\text{Error II: numerical smoothing error } O(M^{-s}_{\mathrm{Lag},L}) + O(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton},L})}$$
$$+ \underbrace{E\big[\bar{I}_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - \hat{Q}^{\mathrm{MLMC}}}_{\text{Error III: MLMC statistical error } O\left(\sqrt{\sum_{\ell=L_0}^{L} \sqrt{M_{\mathrm{Lag},\ell}} + \log\left(\mathrm{TOL}^{-1}_{\mathrm{Newton},\ell}\right)}\right)}. \qquad (15)$$
ASGQ Quadrature Error Convergence
Figure 5.1: Digital option under the Heston model: comparison of the relative quadrature error convergence (relative quadrature error versus $M_{\mathrm{ASGQ}}$) for the ASGQ method with and without numerical smoothing. (a) Without Richardson extrapolation ($N = 8$); (b) with Richardson extrapolation ($N_{\text{fine level}} = 8$).
QMC Error Convergence
Figure 5.2: Comparison of the 95% statistical error convergence (error versus number of QMC samples) for rQMC with and without numerical smoothing. (a) Digital option under Heston: fitted slopes −0.52 without smoothing and −0.85 with numerical smoothing; (b) digital option under GBM: fitted slopes −0.64 without smoothing and −0.92 with numerical smoothing.
Errors in the Numerical Smoothing
Figure 5.3: Call option under GBM with $N = 4$: the relative numerical smoothing error for a fixed number of ASGQ points $M_{\mathrm{ASGQ}} = 10^3$, plotted against (a) different values of $M_{\mathrm{Lag}}$ with a fixed Newton tolerance $\mathrm{TOL}_{\mathrm{Newton}} = 10^{-10}$ (observed slope ≈ −4.02), and (b) different values of $\mathrm{TOL}_{\mathrm{Newton}}$ with a fixed number of Laguerre quadrature points $M_{\mathrm{Lag}} = 128$ (error on the order of $10^{-4}$).
Comparing ASGQ with MC
Figure 5.4: Digital option under Heston: computational work versus total relative error for ASGQ with numerical smoothing and for MC, in the different configurations with respect to the level of Richardson extrapolation (MC without smoothing, with and without Richardson extrapolation; ASGQ with smoothing, with and without Richardson extrapolation).
Numerical Results for ASGQ
Consider digital/call/basket options under the Heston or discretized GBM models.
ASGQ with numerical smoothing is 10–100× faster than MC in dimensions around 20.

Example                                    Total relative error    CPU time (ASGQ/MC) in %
Single digital option (GBM)                0.4%                    0.2%
Single call option (GBM)                   0.5%                    0.3%
Single digital option (Heston)             0.4%                    3.2%
Single call option (Heston)                0.5%                    0.4%
4-dimensional basket call option (GBM)     0.8%                    7.4%

Table 1: Summary of the relative errors and computational gains achieved using ASGQ with numerical smoothing compared to the MC method to reach a given error tolerance. The CPU time ratios are computed for the best configuration with Richardson extrapolation for each method.
Numerical Results for MLMC
Method                                              κ_L    α    β     γ    Numerical complexity
Without smoothing, digital under GBM                709    1    1/2   1    O(TOL^-2.5)
With numerical smoothing, digital under GBM           3    1    1     1    O(TOL^-2 (log TOL)^2)
Without smoothing, digital under Heston             245    1    1/2   1    O(TOL^-2.5)
With numerical smoothing, digital under Heston        7    1    1     1    O(TOL^-2 (log TOL)^2)
With numerical smoothing, GBM density                 5    1    1     1    O(TOL^-2 (log TOL)^2)
With numerical smoothing, Heston density              8    1    1     1    O(TOL^-2 (log TOL)^2)

Table 2: Summary of the MLMC numerical results observed for the different examples. κ_L is the kurtosis at the deepest levels of MLMC; (α, β, γ) are the weak, variance decay, and work rates, respectively. TOL is the user-selected MLMC tolerance.
Digital Option under the Heston Model:
MLMC Without Smoothing
Figure 5.5: Digital option under Heston: convergence plots for MLMC without smoothing (per-level weak error, variance, cost, and kurtosis); the per-level kurtosis grows into the hundreds at the deepest levels.
Digital Option under the Heston Model:
MLMC With Numerical Smoothing
Figure 5.6: Digital option under Heston: convergence plots for MLMC with numerical smoothing; the per-level kurtosis stays of order ten across levels.
Digital Option under the Heston Model:
Numerical Complexity Comparison
Figure 5.7: Digital option under Heston: comparison of the numerical complexity ($E[W]$ versus TOL) of (i) standard MLMC and (ii) MLMC with numerical smoothing. The numerical computational cost improves from $O(\mathrm{TOL}^{-5/2})$ without smoothing to $O(\mathrm{TOL}^{-2})$ with numerical smoothing.
Conclusions in the Context of
Deterministic Quadrature Methods
1 We introduce a novel numerical smoothing technique, combined with ASGQ/QMC, the hierarchical Brownian bridge, and Richardson extrapolation.
2 Our analysis and numerical experiments show that the novel approach substantially outperforms the MC method for high-dimensional cases and for dynamics where a time discretization is needed.
3 We provide a smoothness analysis for the smoothed integrand in the time-stepping setting, together with a related error discussion of our approach.
4 We illustrate numerically that traditional schemes for the Heston dynamics have low regularity, and we use an alternative smooth scheme to simulate the volatility process based on a sum of Ornstein–Uhlenbeck (OU) processes.
5 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical Smoothing with Hierarchical Adaptive Sparse Grids and
Quasi-Monte Carlo Methods for Efficient Option Pricing”. In: arXiv
preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance
Journal (2022)
Conclusions in the Context of
MLMC Methods
1 We propose a numerical smoothing approach that can be combined with the MLMC estimator for efficient option pricing and density estimation.
2 Compared to the case without smoothing,
▸ we significantly reduce the kurtosis at the deep levels of MLMC, which improves the robustness of the estimator;
▸ we improve the strong convergence rate ⇒ improvement of the MLMC complexity from $O(\mathrm{TOL}^{-2.5})$ to $O(\mathrm{TOL}^{-2}\log(\mathrm{TOL})^2)$,
without the need for higher-order schemes such as the Milstein scheme, as in (Giles 2008a; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan, and Ritter 2015), our numerical smoothing approach
▸ does not deteriorate the strong convergence behavior;
▸ when estimating densities, has a pointwise error that does not increase exponentially with respect to the dimension of the state vector.
4 Our approach can be extended (i) to many model dynamics and payoff structures; (ii) to computing financial Greeks; (iii) to computing distribution functions; (iv) to risk estimation.
5 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Multilevel
Monte Carlo combined with numerical smoothing for robust and efficient
option pricing and density estimation”. In: arXiv preprint
arXiv:2003.05708 (2022)
Related References
Thank you for your attention!
[1] C. Bayer, C. Ben Hammouda, R. Tempone. Numerical Smoothing with
Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for
Efficient Option Pricing, arXiv:2111.01874 (2021), to appear in
Quantitative Finance.
[2] C. Bayer, C. Ben Hammouda, R. Tempone. Multilevel Monte Carlo
combined with numerical smoothing for robust and efficient option pricing
and density estimation, arXiv:2003.05708 (2022).
[3] C. Bayer, C. Ben Hammouda, A. Papapantoleon, M. Samet, R. Tempone.
Optimal Damping with Hierarchical Adaptive Quadrature for Efficient
Fourier Pricing of Multi-Asset Options in Lévy Models, arXiv:2203.08196
(2022).
[4] C. Bayer, C. Ben Hammouda, R. Tempone. Hierarchical adaptive sparse
grids and quasi-Monte Carlo for option pricing under the rough Bergomi
model, Quantitative Finance, 2020.
More Related Content

PDF
ICCF_2022_talk.pdf
PDF
MCQMC_talk_Chiheb_Ben_hammouda.pdf
PDF
Talk_HU_Berlin_Chiheb_benhammouda.pdf
PDF
Presentation.pdf
PDF
talk_NASPDE.pdf
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
PDF
Talk iccf 19_ben_hammouda
PDF
Numerical smoothing and hierarchical approximations for efficient option pric...
ICCF_2022_talk.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdf
Presentation.pdf
talk_NASPDE.pdf
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Talk iccf 19_ben_hammouda
Numerical smoothing and hierarchical approximations for efficient option pric...

Similar to Numerical Smoothing and Hierarchical Approximations for E cient Option Pricing and Density Estimation (20)

PDF
International journal of engineering and mathematical modelling vol2 no1_2015_1
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
PDF
Sparse data formats and efficient numerical methods for uncertainties in nume...
PDF
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
PDF
CDT 22 slides.pdf
PDF
Automatic bayesian cubature
PDF
PhD defense talk slides
PDF
Presentation.pdf
PDF
Fourier_Pricing_ICCF_2022.pdf
PDF
A kernel-free particle method: Smile Problem Resolved
PDF
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
Triggering patterns of topology changes in dynamic attributed graphs
PPT
Jörg Stelzer
PDF
Random Matrix Theory and Machine Learning - Part 4
PDF
KAUST_talk_short.pdf
PDF
NCE, GANs & VAEs (and maybe BAC)
PDF
The Sample Average Approximation Method for Stochastic Programs with Integer ...
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
International journal of engineering and mathematical modelling vol2 no1_2015_1
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Sparse data formats and efficient numerical methods for uncertainties in nume...
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
CDT 22 slides.pdf
Automatic bayesian cubature
PhD defense talk slides
Presentation.pdf
Fourier_Pricing_ICCF_2022.pdf
A kernel-free particle method: Smile Problem Resolved
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Triggering patterns of topology changes in dynamic attributed graphs
Jörg Stelzer
Random Matrix Theory and Machine Learning - Part 4
KAUST_talk_short.pdf
NCE, GANs & VAEs (and maybe BAC)
The Sample Average Approximation Method for Stochastic Programs with Integer ...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Ad

Recently uploaded (20)

PPTX
The Effect of Human Resource Management Practice on Organizational Performanc...
PPTX
_ISO_Presentation_ISO 9001 and 45001.pptx
PPTX
Sustainable Forest Management ..SFM.pptx
PPTX
Tour Presentation Educational Activity.pptx
PDF
COLEAD A2F approach and Theory of Change
PDF
oil_refinery_presentation_v1 sllfmfls.pdf
PPTX
water for all cao bang - a charity project
PPTX
fundraisepro pitch deck elegant and modern
DOCX
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
PPTX
An Unlikely Response 08 10 2025.pptx
PPTX
Anesthesia and it's stage with mnemonic and images
PPTX
Hydrogel Based delivery Cancer Treatment
PDF
Presentation1 [Autosaved].pdf diagnosiss
PPTX
Tablets And Capsule Preformulation Of Paracetamol
DOC
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
PPTX
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
PPT
First Aid Training Presentation Slides.ppt
PPTX
Emphasizing It's Not The End 08 06 2025.pptx
PPTX
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
PPTX
MERISTEMATIC TISSUES (MERISTEMS) PPT PUBLIC
The Effect of Human Resource Management Practice on Organizational Performanc...
_ISO_Presentation_ISO 9001 and 45001.pptx
Sustainable Forest Management ..SFM.pptx
Tour Presentation Educational Activity.pptx
COLEAD A2F approach and Theory of Change
oil_refinery_presentation_v1 sllfmfls.pdf
water for all cao bang - a charity project
fundraisepro pitch deck elegant and modern
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
An Unlikely Response 08 10 2025.pptx
Anesthesia and it's stage with mnemonic and images
Hydrogel Based delivery Cancer Treatment
Presentation1 [Autosaved].pdf diagnosiss
Tablets And Capsule Preformulation Of Paracetamol
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
First Aid Training Presentation Slides.ppt
Emphasizing It's Not The End 08 06 2025.pptx
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
MERISTEMATIC TISSUES (MERISTEMS) PPT PUBLIC
Ad

Numerical Smoothing and Hierarchical Approximations for E cient Option Pricing and Density Estimation

  • 1. Numerical Smoothing and Hierarchical Approximations for Efficient Option Pricing and Density Estimation Chiheb Ben Hammouda Christian Bayer Raúl Tempone Center for Uncertainty Quantification Center for Un Quantification nter for Uncertainty Quantification Logo Lock-up Stochastic Numerics and Statistical Learning: Theory and Applications, KAUST May 23, 2022
  • 2. Related Manuscripts to the Talk Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing”. In: arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance Journal (2022). Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation”. In: arXiv preprint arXiv:2003.05708 (2022) 1
  • 3. Outline 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions 1
  • 4. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 5. Option Pricing as a High-Dimensional, non-Smooth Numerical Integration Problem Option pricing in finance often correspond to high-dimensional integration problems with non-smooth integrands. ▸ High-dimensional: (i) Time-discretization of an SDE; (ii) A large number of underlying assets; (iii) Path dependence on the whole trajectory of the underlying. ▸ Non-smooth: E.g., call options (ST − K)+ , digital options 1ST >K, . . . Methods like quasi Monte Carlo (QMC) or (adaptive) sparse grids quadrature (ASGQ) critically rely on smoothness of integrand and dimension of the problem. Monte Carlo (MC) does not care (in terms of convergence rates) BUT multilevel Monte Carlo (MLMC) does. Figure 1.1: Integrand of two-dimensional European basket option (Black–Scholes model)
  • 6. Framework Approximate efficiently E[g(X(T))] Given a (smooth) φ ∶ Rd → R, the payoff g ∶ Rd → R has either jumps or kinks: ▸ Hockey-stick functions: g(x) = max(φ(x),0) (put or call payoffs, . . . ). ▸ Indicator functions: g(x) = 1(φ(x)≥0) (digital option, financial Greeks, distribution functions, . . . ). ▸ Dirac Delta functions: g(x) = δ(φ(x)=0) (density estimation, financial Greeks, . . . ). The process X is approximated (via a discretization scheme) by X, E.g., ▸ One/multi-dimensional geometric Brownian motion (GBM) process. ▸ Multi-dimensional stochastic volatility model: E.g., the Heston model dXt = µXtdt + √ vtXtdWX t dvt = κ(θ − vt)dt + ξ √ vtdWv t , (WX t ,Wv t ): correlated Wiener processes with correlation ρ. 3
  • 7. How Does Regularity Affect QMC, ASGQ, and MLMC Performance? 3
  • 8. Randomized QMC (rQMC) A (rank-1) lattice rule (Sloan 1985; Nuyens 2014) with n points Qn(g) ∶= 1 n n−1 ∑ k=0 g ( kz mod n n ), where z = (z1,...,zd) ∈ Nd (the generating vector). A randomly shifted lattice rule Qn,q(g) = 1 q q−1 ∑ i=0 Q(i) n (f) = 1 q q−1 ∑ i=0 ( 1 n n−1 ∑ k=0 g ( kz + ∆(i) mod n n )), where {∆(i) }q i=1: independent random shifts, and MrQMC = q × n. " See previous talk by Bruno Tuffin for further details about QMC. 4
  • 9. How Does Regularity Affect QMC Performance? The analysis in (Niederreiter 1992) shows that the convergence rate of rQMC is O (M − 1 2 −δ rQMC (log MrQMC)d ), where 0 ≤ δ ≤ 1 2 is related to the degree of regularity of the integrand g. (Sloan and Woźniakowski 1998; Dick, Kuo, and Sloan 2013) show that convergence rates of O (M−1 rQMC) can be observed for the lattice rule if the integrand g belongs to Wd,γ equipped with the norm ∣∣g∣∣2 Wd,γ = ∑ α⊆{1∶d} 1 γα ∫ [0,1]∣α∣ (∫ [0,1]d−∣α∣ ∂∣α∣ ∂yα f(y)dy−α) 2 dyα, (1) where yα ∶= (yj)j∈α and y−α ∶= (yj)j∈{1∶d}∖α. Notation Wd,γ: a d-dimensional weighted Sobolev space of functions with square-integrable mixed first derivatives. γ ∶= {γα > 0 ∶ α ⊆ {1,2,...,d}} being a given collection of weights, and d being the dimension of the problem. 5
  • 10. ASGQ (I) Given F ∶ Rd → R and a multi-index β ∈ Nd +. Fβ ∶= Qm(β) [F] a quadrature operator based on a Cartesian quadrature grid (m(βn) points along yn, with m ∶ N → N a strictly increasing function). " Approximating E[F] with Fβ is not an appropriate option due to the well-known curse of dimensionality. Idea: A quadrature estimate of E[F] is MI` [F] = ∑ β∈I` ∆[Fβ], where ▸ The mixed (first-order tensor) difference operators: ∆[Fβ] = ⊗d i=1∆iFβ ▸ The first-order difference operators: ∆iFβ { Fβ − Fβ−ei , if βi > 1 Fβ if βi = 1 with ei denotes the ith d-dimensional unit vector. For instance, when d = 2, then ∆Fβ = ∆2∆1F(β1,β2) = ∆2 (F(β1,β2) − F(β1−1,β2)) = ∆2F(β1,β2) − ∆2F(β1−1,β2) = F(β1,β2) − F(β1,β2−1) − F(β1−1,β2) + F(β1−1,β2−1). " See previous talk by Lorenzo Tamellini for further details about ASGQ in relation to MISC, which is a more general version. 6
  • 11. ASGQ (II) E[F] ≈ MI` [F] = ∑ β∈I` ∆[Fβ], Product approach: I` = {∣∣ β ∣∣∞≤ `; β ∈ Nd +}. Regular sparse grids: I` = {∣∣ β ∣∣1≤ ` + d − 1; β ∈ Nd +} Adaptive sparse grids quadrature (ASGQ) (Gerstner and Griebel 1998): The construction of I` = IASGQ is done a posteriori and adaptively by profit thresholding IASGQ = {β ∈ Nd + ∶ Pβ ≥ T}. ▸ Profit of a hierarchical surplus Pβ = ∣∆Eβ∣ ∆Wβ . ▸ Error contribution: ∆Eβ = ∣MI∪{β} − MI ∣. ▸ Work contribution: ∆Wβ = Work[MI∪{β} ] − Work[MI ]. Figure 1.2: Left are product grids ∆β1 ⊗ ∆β2 for 1 ≤ β1,β2 ≤ 3. Right is the corresponding sparse grids construction. Figure 1.3: Illustration of ASG grid 7
  • 12. How Does Regularity Affect ASGQ Performance? Product approach: EQ(M) = O (M−r/d ) (for functions with bounded total derivatives up to order r). Adaptive sparse grids quadrature (ASGQ): From the analysis in (Chen 2018; Ernst, Sprungk, and Tamellini 2018) EQ(M) = O (M−p/2 ) (where1 p is independent from the problem dimension, and is related to the order up to which the weighted mixed derivatives are bounded). Notation: M: number of quadrature points; EQ: quadrature error. 1 characterizes the relation between the regularity of the integrand and the anisotropic property of the integrand with respect to different dimensions. 8
  • 13. Multilevel Monte Carlo (MLMC) (Heinrich 2001; Kebaier et al. 2005; Giles 2008b) Aim: Improve MC complexity, when estimating E[g(X(T))]. Setting: ▸ A hierarchy of nested meshes of [0,T], indexed by {`}L `=0. ▸ ∆t` = K−` ∆t0: The time steps size for levels ` ≥ 1; K>1, K ∈ N. ▸ X` ∶= X∆t` : The approximate process generated using a step size of ∆t`. MLMC idea E[g(XL(T))] = E[g(X0(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ + L ∑ `=1 E[g(X`(T)) − g(X`−1(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ (2) Var[g(X0(T))] ≫ Var[g(X`(T)) − g(X`−1(T))] ↘ as ` ↗ M0 ≫ M` ↘ as ` ↗ MLMC estimator: ̂ Q ∶= L ∑ `=0 ̂ Q`, ̂ Q0 ∶= 1 M0 M0 ∑ m0=1 g(X0,[m0](T)); ̂ Q` ∶= 1 M` M` ∑ m`=1 (g(X`,[m`](T)) − g(X`−1,[m`](T))), 1 ≤ ` ≤ L 9
  • 14. How Does Regularity Affect MLMC Complexity? Complexity analysis for MLMC MLMC Complexity (Cliffe, Giles, Scheichl, and Teckentrup 2011) O (TOL −2−max(0, γ−β α ) log (TOL)2×1{β=γ} ) (3) i) Weak rate: ∣E[g (X`(T)) − g (X(T))]∣ ≤ c12−α` ii) Variance decay rate: Var[g (X`(T)) − g (X`−1(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∶=V` ≤ c22−β` iii) Work growth rate: W` ≤ c32γ` (W`: expected cost) For forward Euler (γ = 1): If g is Lipschitz ⇒ V` ≃ ∆t` due to strong rate 1/2, that is β = γ. Otherwise β < γ ⇒ worst-case complexity. Higher order schemes, E.g., the Milstein scheme, may lead to better complexities even for non-Lipschitz observables (Giles, Debrabant, and Rößler 2013; Giles 2015). However, ▸ For moderate/high-dimensional dynamics, coupling issues may arise and the scheme becomes computationally expensive. ▸ Deterioration of the robustness of the MLMC estimator because the kurtosis explodes as ∆t` decreases: O (∆t−1 ` ) compared with O (∆t −1/2 ` ) for Euler without smoothing (Giles, Nagapetyan, and Ritter 2015) and O (1) in (Bayer, Ben Hammouda, and Tempone 2022) (see next slides). 10
  • 15. How Does Regularity Affect MLMC Robustness? / For non-lipschitz payoffs: The Kurtosis, κ` ∶= E[(Y`−E[Y`])4 ] (Var[Y`])2 is of O(∆t −1/2 ` ) for Euler scheme and O(∆t−1 ` ) for the Milstein scheme. Large kurtosis problem: discussed previously in (Ben Hammouda, Moraes, and Tempone 2017; Ben Hammouda, Ben Rached, and Tempone 2020) ⇒ / Expensive cost for reliable/robust estimates of sample statistics. Why is large kurtosis bad? σS2(Y`) = Var[Y`] √ M` √ (κ` − 1) + 2 M` − 1 ; " M` ≫ κ`. Why are accurate variance estimates, V` = Var[Y`], important? M∗ ` ∝ √ V`W−1 ` L ∑ `=0 √ V`W`. Notation Y` ∶= g(X`(T)) − g(X`−1(T)) σS2(Y`): Standard deviation of the sample variance of Y`; κ`: the kurtosis; V` = Var[Y`]; M`: number of samples; M∗ ` : Optimal number of samples per level; W`: Cost per sample path. 11
  • 16. Previous Successful Smoothing Techniques for QMC and ASGQ Mollification: E.g., by convolution with Gaussian kernel or manually; (-) Additional error that may scale exponentially with the dimension. Mapping the problem to the frequency space (E.g., (Bayer, Ben Hammouda, Papapantoleon, Samet, and Tempone 2022)): Better regularity compared to the physical space; (-) Applicable when the Fourier transform of the density function is available and cheap to compute. Bias-free mollification (E.g., (Bayer, Siebenmorgen, and Tempone 2018; Bayer, Ben Hammouda, and Tempone 2020)) by taking conditional expectations or exact integration over subset of integration variables; (-) Not always possible. Example 1.1 ((Romano and Touzi 1997)) (St,vt) stochastic. volatility. model, S driven by Bm Z ∶= ρW + √ 1 − ρ2B, v σ(W⋅)–measurable. Then E [g(ST )] = E [g (S0 exp(∫ T 0 √ vtdZt − 1 2 ∫ T 0 vtdt))] = E [E [g(ST ) ∣ σ(W⋅)]] = E ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ CBS ⎛ ⎝ S0 = S0 exp(ρ∫ T 0 √ vtdWt − ρ2 2 ∫ T 0 vtdt),σ2 = (1 − ρ2 )∫ T 0 vtdt ⎞ ⎠ ⎤ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 17. Previous Successful Smoothing Techniques for MLMC Implicit smoothing based on conditional expectation combined with the Milstein scheme: (E.g, (Giles, Debrabant, and Rößler 2013)); (-) Not always possible and drawbacks of Milstein scheme. Parametric smoothing: carefully constructing a regularized version of the observable (E.g, (Giles, Nagapetyan, and Ritter 2015)); (-) Possible deterioration of the strong convergence behavior, and additional bias that may increase exponentially with the dimension (see next slides). Malliavin calculus integration by parts: (E.g, (Altmayer and Neuenkirch 2015)): splitting the payoff function into a smooth component (treated by standard MLMC) and a compactly supported discontinuous part (treated via Malliavin MLMC). Adaptive sampling: (E.g, (Haji-Ali, Spence, and Teckentrup 2021))
  • 18. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 19. Numerical Smoothing Steps Guiding example: g ∶ Rd → R payoff function, X ∆t T (∆t = T N ) Euler discretization of d-dimensional SDE , E.g., dX (i) t = ai(Xt)dt + ∑d j=1 bij(Xt)dW (j) t , where {W(j) }d j=1 are standard Brownian motions. X ∆t T = X ∆t T (∆W (1) 1 ,...,∆W (1) N ,...,∆W (d) 1 ,...,∆W (d) N ) ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∶=∆W . E[g(X ∆t T )] =? 1 Identify hierarchical representation of integration variables: (a) X ∆t T (∆W) ≡ X ∆t T (Z), Z = (Zi)dN i=1 ∼ N(0,IdN ): s.t. “Z1 ∶= (Z (1) 1 ,...,Z (d) 1 ) substantially contributes even for ∆t → 0”. E.g., Brownian bridges of ∆W / Haar wavelet construction of W. " Different from previous techniques which smooth out only at the final step with respect to ∆W ⇒ the smoothing effect vanishes as ∆t → 0. (b) Design of a sub-optimal smoothing direction (A: rotation matrix which is easy to construct and whose structure depends on the payoff g.) Y = AZ1. 14
  • 20. Numerical Smoothing Steps 2 E[g(X(T))] ≈ E[g(X ∆t (T))] = ∫ Rd×N G(z)ρd×N (z)dz (1) 1 ...dz (1) N ...dz (d) 1 ...dz (d) N = ∫ RdN−1 I(y−1,z (1) −1 ,...,z (d) −1 )ρd−1(y−1)dy−1ρdN−d(z (1) −1 ,...,z (d) −1 )dz (1) −1 ...dz (d) −1 = E[I(Y−1,Z (1) −1 ,...,Z (d) −1 )] ≈ E[I(Y−1,Z (1) −1 ,...,Z (d) −1 )], (4) I(y−1,z (1) −1 ,...,z (d) −1 ) = ∫ R G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 = ∫ y∗ 1 −∞ G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 + ∫ +∞ y∗ 1 G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 ≈ I(y−1,z (1) −1 ,...,z (d) −1 ) ∶= Mlag ∑ k=0 ηkG(ζk (y∗ 1),y−1,z (1) −1 ,...,z (d) −1 ), (5) 3 Compute the remaining (dN − 1)-integral in (4) by ASGQ or QMC or MLMC. Notation G ∶= g ○ Φ ○ (ψ(1) ,...,ψ(d) ) maps N × d Gaussian random inputs to g(X ∆t (T)); where ▸ ψ(j) ∶ (Z (j) 1 ,...,Z (j) N ) ↦ (B (j) 1 ,...,B (j) N ) denotes the mapping of the Brownian bridge construction. ▸ Φ ∶ (∆t,B) ↦ X ∆t (T) denotes the mapping of the time-stepping scheme. y∗ 1 (y−1,z (1) −1 ,...,z (d) −1 ): the exact discontinuity location s.t φ(X ∆t (T)) = P(y∗ 1 ;y−1,z (1) −1 ,...,z (d) −1 ) = 0; y∗ 1(y−1,z (1) −1 ,...,z (d) −1 ): the approximated discontinuity location via root finding; MLag: the number of Laguerre quadrature points ζk ∈ R, and corresponding weights ηk; ρd×N (z) = 1 (2π)d×N/2 e−1 2 zT z . 15
  • 21. Extending Numerical Smoothing for Multiple Discontinuities Multiple Discontinuities: Due to the payoff structure/use of Richardson extrapolation. R different ordered multiple roots, e.g., {y∗ i }R i=1, the smoothed integrand is I (y−1,z (1) −1 ,...,z (d) −1 ) = ∫ y∗ 1 −∞ G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1 + ∫ +∞ y∗ R G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1 + R−1 ∑ i=1 ∫ y∗ i+1 y∗ i G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1, and its approximation I is given by I(y−1,z (1) −1 ,...,z (d) −1 ) ∶= MLag,1 ∑ k=0 ηLag k G(ζLag k,1 (y∗ 1),y−1,z (1) −1 ,...,z (d) −1 ) + MLag,R ∑ k=0 ηLag k G(ζLag k,R (y∗ R),y−1,z (1) −1 ,...,z (d) −1 ) + R−1 ∑ i=1 ⎛ ⎝ MLeg,i ∑ k=0 ηLeg k G(ζLeg k,i (y∗ i ,y∗ i+1),y−1,z (1) −1 ,...,z (d) −1 ) ⎞ ⎠ , {y∗ i }R i=1: the approximated discontinuities locations; MLag,1 and MLag,R: the number of Laguerre quadrature points ζLag .,. ∈ R with corresponding weights ηLag . ; {MLeg,i}R−1 i=1 : the number of Legendre quadrature points ζLeg .,. with corresponding weights ηLeg . . I can be approximated further depending on (i) the decay of G × ρ1 in the semi-infinite domains and (ii) how close the roots are to each other.
  • 22. Extending Numerical Smoothing for Density Estimation Goal: Approximate the density ρX at u, for a stochastic process X ρX(u) = E[δ(X − u)], δ is the Dirac delta function. " Without any smoothing techniques (regularization, kernel density,. . . ) MC and MLMC fail due to the infinite variance caused by the singularity of the function δ. Strategy: in (Bayer, Hammouda, and Tempone 2022) 1 Conditioning with respect to the Brownian bridge ρX(u) = 1 √ 2π E[exp(−(Y ∗ 1 (u)) 2 /2)∣ dY ∗ 1 du (u)∣] ≈ 1 √ 2π E ⎡ ⎢ ⎢ ⎢ ⎣ exp(−(Y ∗ 1(u)) 2 /2) R R R R R R R R R R R dY ∗ 1 du (u) R R R R R R R R R R R ⎤ ⎥ ⎥ ⎥ ⎦ , (6) Y ∗ 1 (u;Z−1): the exact discontinuity; Y ∗ 1(u;Z−1): the approximated discontinuity.
  • 23. Why not Kernel Density Techniques in Multiple Dimensions? Similar to approaches based on parametric regularization as in (Giles, Nagapetyan, and Ritter 2015). This class of approaches has a pointwise error that increases exponentially with respect to the dimension of the state vector X. For a d-dimensional problem, a kernel density estimator with a bandwidth matrix, H = diag(h,...,h) MSE ≈ c1M−1 h−d + c2h4 . (7) M is the number of samples, and c1 and c2 are constants. Our approach in high dimension: For u ∈ Rd ρX(u) = E[δ(X − u)] = E[ρd (Y∗ (u))∣det(J(u))∣] ≈ E[ρd (Y ∗ (u))∣det(J(u))∣], (8) where ▸ Y∗ (u;⋅): the exact discontinuity; Y ∗ (u;⋅): the approximated discontinuity. ▸ J is the Jacobian matrix, with Jij = ∂y∗ i ∂uj ; ρd(⋅) is the multivariate Gaussian density. Thanks to the exact conditional expectation with respect to the Brownian bridge ⇒ the smoothing error in our approach is insensitive to the dimension of the problem. 18
  • 24. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 25. Notations The Haar basis functions, ψn,k, of L2 ([0,1]) with support [2−n k,2−n (k + 1)]: ψ−1(t) ∶= 1[0,1](t); ψn,k(t) ∶= 2n/2 ψ (2n t − k), n ∈ N0, k = 0,...,2n − 1, where ψ(⋅) is the Haar mother wavelet: ψ(t) ∶= ⎧ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎩ 1, 0 ≤ t < 1 2 , −1, 1 2 ≤ t < 1, 0, else, Grid Dn ∶= {tn ` ∣ ` = 0,...,2n+1 } with tn ` ∶= ` 2n+1 T. Observe: the Haar basis functions up to level n are piece-wise constants with points of discontinuity given by Dn . For i.i.d. standard normal rdvs Z−1, Zn,k, n ∈ N0, k = 0,...,2n − 1, we define the (truncated) standard Brownian motion WN t ∶= Z−1Ψ−1(t) + N ∑ n=0 2n −1 ∑ k=0 Zn,kΨn,k(t). with Ψ−1(⋅) and Ψn,k(⋅) are the antiderivatives of the Haar basis functions. 19
• 26. Notations
We define the corresponding increments, for any function or process F, as
Δ_ℓ^N F := F(t_{ℓ+1}^N) − F(t_ℓ^N).
The Euler scheme for dX_t = b(X_t) dW_t, X_0 = x ∈ R, along D^N reads
X_{ℓ+1}^N := X_ℓ^N + b(X_ℓ^N) Δ_ℓ^N W,  ℓ = 0, ..., 2^{N+1} − 1,   (9)
with X_0^N := X_0 = x; for convenience, we also define X_T^N := X_{2^{N+1}}^N.
We define the deterministic function H^N : R^{2^{N+1}−1} → R as
H^N(z_N) := E_{Z_{-1}}[g(X_T^N(Z_{-1}, z_N))],   (10)
where Z_N := (Z_{n,k})_{n=0,...,N, k=0,...,2^n−1}.
• 27. Assumptions
Assumption 3.1
There are positive rdvs C_p with finite moments of all orders such that, for all N ∈ N and all ℓ_1, ..., ℓ_p ∈ {0, ..., 2^{N+1} − 1},
|∂^p X_T^N / (∂X_{ℓ_1}^N ⋯ ∂X_{ℓ_p}^N)| ≤ C_p  a.s.,
that is, ∂^p X_T^N / (∂X_{ℓ_1}^N ⋯ ∂X_{ℓ_p}^N) = O_P(1), where for sequences of rdvs F_N and G_N we write F_N = O_P(G_N) if there is a rdv C with finite moments of all orders such that |F_N| ≤ C |G_N| a.s. for all N.
Assumption 3.1 is natural because it is fulfilled whenever the diffusion coefficient b(⋅) is smooth; this is the case for many option pricing models.
• 28. Assumptions
Assumption 3.2
For any p ∈ N, we have (with y := z_{-1})
(∂X_T^N/∂y (Z_{-1}, Z_N))^{-p} = O_P(1).
In (Bayer, Ben Hammouda, and Tempone 2022), we give sufficient conditions under which this assumption holds. For instance, Assumption 3.2 is valid for
▸ one-dimensional models with a linear or constant diffusion;
▸ multivariate models with a linear drift and constant diffusion, including the multivariate lognormal model (see (Bayer, Siebenmorgen, and Tempone 2018)).
There may be cases in which Assumption 3.2 is not fulfilled, e.g., X_T = W_T^2. However, our method still works well in such cases since g(X_T) = G(y_1^2).
• 29. The Smoothness Theorem
Theorem 3.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Assume that X_T^N, defined by (9), satisfies Assumptions 3.1 and 3.2. Then, for any p ∈ N and indices n_1, ..., n_p and k_1, ..., k_p (satisfying 0 ≤ k_j < 2^{n_j}), the function H^N defined in (10) satisfies, with constants increasing in p and N but independent of n_j and k_j,
∂^p H^N / (∂z_{n_1,k_1} ⋯ ∂z_{n_p,k_p}) (z_N) = O_P(2^{−∑_{j=1}^p n_j / 2}).
In particular, H^N is of class C^∞.
• 30. Sketch of the Proof
1 We consider a mollified version g_δ of g and the corresponding function H_δ^N (defined by replacing g with g_δ in (10)).
2 We prove that integration and differentiation can be interchanged, which implies
∂H_δ^N(z_N)/∂z_{n,k} = E[g_δ'(X_T^N(Z_{-1}, z_N)) ∂X_T^N(Z_{-1}, z_N)/∂z_{n,k}].
3 Multiplying and dividing by ∂X_T^N(Z_{-1}, z_N)/∂y and writing the expectation as an integral with respect to the standard normal density, we obtain
∂H_δ^N(z_N)/∂z_{n,k} = ∫_R ∂g_δ(X_T^N(y, z_N))/∂y (∂X_T^N/∂y (y, z_N))^{-1} ∂X_T^N/∂z_{n,k}(y, z_N) (1/√(2π)) e^{−y²/2} dy.   (11)
4 We show that integration by parts is possible; we can then discard the mollification and obtain the smoothness of H^N because
∂H^N(z_N)/∂z_{n,k} = −∫_R g(X_T^N(y, z_N)) ∂/∂y [ (∂X_T^N/∂y (y, z_N))^{-1} ∂X_T^N/∂z_{n,k}(y, z_N) (1/√(2π)) e^{−y²/2} ] dy.
5 Higher-order derivatives follow by successively applying the same technique: dividing by ∂X_T^N/∂y and then integrating by parts.
  • 31. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 32. Overcoming the High Dimensionality for ASGQ and QMC
We combine ASGQ and QMC with hierarchical representations, as in (Bayer, Ben Hammouda, and Tempone 2020):
▸ Brownian bridges as the Wiener path generation method ⇒ reduces the effective dimension of the problem (see the sketch below).
▸ Richardson extrapolation ⇒ faster convergence of the weak error ⇒ fewer time steps to achieve a given error tolerance ⇒ smaller total dimension of the input space.
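A minimal sketch of the Brownian-bridge path construction (standard bisection ordering; the horizon T, the number of steps, and the function name are illustrative assumptions). The first coordinates fix the terminal value and the coarse midpoints, so most of the variance is carried by the leading QMC dimensions:

import numpy as np

def brownian_bridge_path(z, T=1.0):
    # Map N(0,1) inputs z (length 2**m) to W on the grid t_i = i*T/2**m, i = 0..2**m,
    # filling the path by bisection: z[0] fixes W_T, later entries refine midpoints.
    n = len(z)                       # must be a power of two
    m = int(np.log2(n))
    W = np.zeros(n + 1)
    W[-1] = np.sqrt(T) * z[0]        # terminal value first
    idx, h = 1, n
    for level in range(m):
        h_half = h // 2
        for left in range(0, n, h):
            mid, right = left + h_half, left + h
            dt = (right - left) * T / n
            mean = 0.5 * (W[left] + W[right])
            W[mid] = mean + 0.5 * np.sqrt(dt) * z[idx]   # conditional std = sqrt(dt)/2
            idx += 1
        h = h_half
    return W

# Example: an 8-step path from 8 standard normal (or QMC-transformed) inputs
rng = np.random.default_rng(2)
print(brownian_bridge_path(rng.standard_normal(8)))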
• 33. Error Discussion for ASGQ
Let Q_N^{ASGQ} denote the ASGQ estimator. Then
E[g(X(T))] − Q_N^{ASGQ}
  = E[g(X(T))] − E[g(X^{Δt}(T))]   (Error I: bias or weak error, O(Δt))
  + E[I(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})]   (Error II: numerical smoothing error, O(M_{Lag}^{-s}) + O(TOL_{Newton}^{κ+1}))
  + E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − Q_N^{ASGQ}   (Error III: ASGQ error, O(M_{ASGQ}^{-p/2})),   (12)
where
▸ M_{ASGQ}: the number of quadrature points used by the ASGQ estimator, and p > 0;
▸ \bar{y}_1^*: the approximated location of the non-smoothness obtained by Newton iteration, so that |y_1^* − \bar{y}_1^*| = TOL_{Newton}; κ ≥ 0 (κ = 0: Heaviside payoff (digital option); κ = 1: call or put payoffs);
▸ M_{Lag}: the number of points used by the Laguerre quadrature for the one-dimensional pre-integration step;
▸ s > 0: the derivatives of G with respect to y_1 are bounded up to order s.
• 34. Work and Complexity Discussion for ASGQ
The optimal performance of ASGQ is obtained by solving
min over (M_{ASGQ}, M_{Lag}, TOL_{Newton}) of Work_{ASGQ} ∝ M_{ASGQ} × M_{Lag} × Δt^{-1},  subject to  E_{total,ASGQ} = TOL,   (13)
where
E_{total,ASGQ} := E[g(X(T))] − Q_N^{ASGQ} = O(Δt) + O(M_{ASGQ}^{-p/2}) + O(M_{Lag}^{-s}) + O(TOL_{Newton}^{κ+1}).
We show in (Bayer, Ben Hammouda, and Tempone 2022) that, under certain conditions on the regularity parameters s and p (p, s ≫ 1),
▸ Work_{ASGQ} = O(TOL^{-1}) (best case), compared with Work_{MC} = O(TOL^{-3}) (best case of MC).
• 35. Multilevel Monte Carlo with Numerical Smoothing
Recall our QoI:
E[g(X(T))] ≈ E[g(X^{Δt}(T))] = ∫_{R^{d×N}} G(z) ρ_{d×N}(z) dz_1^{(1)} ... dz_N^{(1)} ... dz_1^{(d)} ... dz_N^{(d)}
  = ∫_{R^{dN−1}} I(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{d−1}(y_{-1}) dy_{-1} ρ_{dN−d}(z_{-1}^{(1)}, ..., z_{-1}^{(d)}) dz_{-1}^{(1)} ... dz_{-1}^{(d)}
  = E[I(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] ≈ E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})],
with
I(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) = ∫_R G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1
  = ∫_{−∞}^{y_1^*} G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1 + ∫_{y_1^*}^{+∞} G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1
  ≈ \bar{I}(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) := ∑_{k=0}^{M_{Lag}} η_k G(ζ_k(\bar{y}_1^*), y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}).
Denote by \bar{I}_ℓ := \bar{I}_ℓ(y_{-1}^ℓ, z_{-1}^{(1),ℓ}, ..., z_{-1}^{(d),ℓ}) the level-ℓ Euler approximation of I, computed with step size Δt_ℓ, M_{Lag,ℓ} Laguerre quadrature points, and Newton tolerance TOL_{Newton,ℓ}.
MLMC estimator:
\hat{Q}_{MLMC} := ∑_{ℓ=0}^{L} \hat{Q}_ℓ,   (14)
with \hat{Q}_0 := (1/M_0) ∑_{m_0=1}^{M_0} \bar{I}_{0,[m_0]};  \hat{Q}_ℓ := (1/M_ℓ) ∑_{m_ℓ=1}^{M_ℓ} (\bar{I}_{ℓ,[m_ℓ]} − \bar{I}_{ℓ−1,[m_ℓ]}),  1 ≤ ℓ ≤ L.
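A skeletal Python version of the estimator (14), with the standard choice of the per-level sample sizes from pilot-estimated variances (a sketch under the assumption that a coupled sampler of \bar{I}_ℓ − \bar{I}_{ℓ−1} is available; the function names and the dummy sampler are placeholders, not the paper's code):

import numpy as np

def mlmc(sample_level_difference, cost_per_level, tol, L, M_pilot=100):
    # MLMC estimator (14): Q_hat = sum_l (1/M_l) sum_m (I_bar_l - I_bar_{l-1}), with
    # M_l proportional to sqrt(V_l / C_l) so the statistical error is about tol.
    # sample_level_difference(l, size): i.i.d. samples of I_bar_l - I_bar_{l-1}
    # (I_bar_{-1} := 0), coarse and fine coupled through the same Brownian increments.
    V = np.array([sample_level_difference(l, M_pilot).var(ddof=1) for l in range(L + 1)])
    C = np.asarray(cost_per_level, dtype=float)
    M = np.maximum(1, np.ceil(2.0 / tol ** 2 * np.sqrt(V / C) * np.sum(np.sqrt(V * C)))).astype(int)
    Q = sum(sample_level_difference(l, M[l]).mean() for l in range(L + 1))
    return Q, M

# Dummy sampler (purely illustrative): level differences with geometrically decaying variance.
rng = np.random.default_rng(0)
dummy = lambda l, size: 2.0 ** (-l) * rng.standard_normal(size) + (1.0 if l == 0 else 0.0)
print(mlmc(dummy, cost_per_level=[2 ** l for l in range(4)], tol=1e-2, L=3))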
• 36. Multilevel Monte Carlo with Numerical Smoothing: Variance Decay, Complexity and Robustness
Corollary 4.1 ((Bayer, Ben Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, V_ℓ := Var[\bar{I}_ℓ − \bar{I}_{ℓ−1}] = O(Δt_ℓ), compared with O(Δt_ℓ^{1/2}) for MLMC without smoothing.
Corollary 4.2 ((Bayer, Ben Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, the complexity of MLMC combined with numerical smoothing, using the Euler discretization, is O(TOL^{-2} (log(TOL))^2), compared with O(TOL^{-2.5}) for MLMC without smoothing.
Corollary 4.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Let κ_ℓ be the kurtosis of the random variable Y_ℓ := \bar{I}_ℓ − \bar{I}_{ℓ−1}. Under Assumptions 3.1 and 3.2, κ_ℓ = O(1), compared with O(Δt_ℓ^{-1/2}) for MLMC without smoothing.
• 37. Work Discussion for MLMC
Let \hat{Q}_{MLMC} be the MLMC estimator defined in (14). Then
E[g(X(T))] − \hat{Q}_{MLMC}
  = E[g(X(T))] − E[g(X^{Δt_L}(T))]   (Error I: bias or weak error, O(Δt_L))
  + E[I_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − E[\bar{I}_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})]   (Error II: numerical smoothing error, O(M_{Lag,L}^{-s}) + O(TOL_{Newton,L}^{κ+1}))
  + E[\bar{I}_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − \hat{Q}_{MLMC}   (Error III: MLMC statistical error, O(√(∑_{ℓ=L_0}^{L} √(M_{Lag,ℓ} + log(TOL_{Newton,ℓ}^{-1}))))).   (15)
  • 38. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 39. ASGQ Quadrature Error Convergence
[Plots: relative quadrature error vs. M_ASGQ for ASGQ with and without numerical smoothing.]
Figure 5.1: Digital option under the Heston model: comparison of the relative quadrature error convergence for the ASGQ method with and without numerical smoothing. (a) Without Richardson extrapolation (N = 8); (b) with Richardson extrapolation (N_{fine level} = 8).
• 40. QMC Error Convergence
[Plots: 95% statistical error vs. number of QMC samples. (a) Digital option under Heston: slope −0.52 without smoothing vs. −0.85 with numerical smoothing. (b) Digital option under GBM: slope −0.64 without smoothing vs. −0.92 with numerical smoothing.]
Figure 5.2: Comparison of the 95% statistical error convergence for rQMC with and without numerical smoothing. (a) Digital option under Heston, (b) digital option under GBM.
• 41. Errors in the Numerical Smoothing
[Plots: relative numerical smoothing error (a) vs. M_Lag (observed slope ≈ −4.02) and (b) vs. TOL_Newton.]
Figure 5.3: Call option under GBM with N = 4: the relative numerical smoothing error for a fixed number of ASGQ points M_ASGQ = 10^3, plotted against (a) different values of M_Lag with a fixed Newton tolerance TOL_Newton = 10^{-10}, and (b) different values of TOL_Newton with a fixed number of Laguerre quadrature points M_Lag = 128.
• 42. Comparing ASGQ with MC
[Plot: computational work vs. total relative error for MC and for ASGQ with numerical smoothing, each with and without Richardson extrapolation.]
Figure 5.4: Digital option under Heston: computational work comparison of ASGQ with numerical smoothing and MC, for the different configurations of the level of Richardson extrapolation.
• 43. Numerical Results for ASGQ
We consider digital, call, and basket options under the Heston and discretized GBM models. ASGQ with numerical smoothing is 10-100 times faster than MC in dimensions around 20.
Example                                  | Total relative error | CPU time (ASGQ/MC) in %
Single digital option (GBM)              | 0.4%                 | 0.2%
Single call option (GBM)                 | 0.5%                 | 0.3%
Single digital option (Heston)           | 0.4%                 | 3.2%
Single call option (Heston)              | 0.5%                 | 0.4%
4-dimensional basket call option (GBM)   | 0.8%                 | 7.4%
Table 1: Summary of the relative errors and computational gains achieved by ASGQ with numerical smoothing compared with the MC method, to reach a given error tolerance. The CPU time ratios are computed for the best configuration with Richardson extrapolation for each method.
• 44. Numerical Results for MLMC
Method                                                 | κ_L | α | β   | γ | Numerical complexity
Without smoothing, digital option under GBM            | 709 | 1 | 1/2 | 1 | O(TOL^{-2.5})
With numerical smoothing, digital option under GBM     | 3   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
Without smoothing, digital option under Heston         | 245 | 1 | 1/2 | 1 | O(TOL^{-2.5})
With numerical smoothing, digital option under Heston  | 7   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
With numerical smoothing, GBM density                  | 5   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
With numerical smoothing, Heston density               | 8   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
Table 2: Summary of the MLMC numerical results observed for the different examples. κ_L is the kurtosis at the deepest level of MLMC; (α, β, γ) are the weak-error, variance-decay, and work rates, respectively. TOL is the user-selected MLMC tolerance.
• 45. Digital Option under the Heston Model: MLMC Without Smoothing
[Plots: level-wise weak error, variance, cost, and kurtosis across MLMC levels.]
Figure 5.5: Digital option under Heston: convergence plots for MLMC without smoothing.
• 46. Digital Option under the Heston Model: MLMC With Numerical Smoothing
[Plots: level-wise weak error, variance, cost, and kurtosis across MLMC levels.]
Figure 5.6: Digital option under Heston: convergence plots for MLMC with numerical smoothing.
• 47. Digital Option under the Heston Model: Numerical Complexity Comparison
[Plot: expected work E[W] vs. TOL for (i) standard MLMC (reference slope TOL^{-2.5}) and (ii) MLMC with numerical smoothing (reference slope TOL^{-2} log(TOL)^2).]
Figure 5.7: Digital option under Heston: comparison of the numerical complexity of (i) standard MLMC and (ii) MLMC with numerical smoothing. The numerical complexity improves from O(TOL^{-5/2}) without smoothing to O(TOL^{-2} log(TOL)^2) with numerical smoothing.
  • 48. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 49. Conclusions in the Context of Deterministic Quadrature Methods
1 We introduce a novel numerical smoothing technique, combined with ASGQ/QMC, hierarchical Brownian bridges, and Richardson extrapolation.
2 Our analysis and numerical experiments show that this approach substantially outperforms the MC method in high-dimensional cases and for dynamics where a time discretization is needed.
3 We provide a smoothness analysis of the smoothed integrand in the time-stepping setting, together with a related error discussion of our approach.
4 We illustrate numerically that traditional schemes for the Heston dynamics have low regularity, and we use an alternative smooth scheme that simulates the volatility process as a sum of Ornstein-Uhlenbeck (OU) processes.
5 More details can be found in: Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. "Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing". arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance (2022).
• 50. Conclusions in the Context of MLMC Methods
1 We propose a numerical smoothing approach that can be combined with the MLMC estimator for efficient option pricing and density estimation.
2 Compared with the case without smoothing:
▸ We significantly reduce the kurtosis at the deep levels of MLMC, which improves the robustness of the estimator.
▸ We improve the strong convergence rate ⇒ improvement of the MLMC complexity from O(TOL^{-2.5}) to O(TOL^{-2} log(TOL)^2),
" without the need for higher-order schemes such as the Milstein scheme, as in (Giles 2008a; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan, and Ritter 2015), our numerical smoothing approach
▸ does not deteriorate the strong convergence behavior;
▸ when estimating densities, has a pointwise error that does not increase exponentially with the dimension of the state vector.
4 Our approach extends to (i) many model dynamics and payoff structures, (ii) computing financial Greeks, (iii) computing distribution functions, and (iv) risk estimation.
5 More details can be found in: Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. "Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation". arXiv preprint arXiv:2003.05708 (2022).
• 51. Related References
Thank you for your attention!
[1] C. Bayer, C. Ben Hammouda, R. Tempone. Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing, arXiv:2111.01874 (2021), to appear in Quantitative Finance.
[2] C. Bayer, C. Ben Hammouda, R. Tempone. Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation, arXiv:2003.05708 (2022).
[3] C. Bayer, C. Ben Hammouda, A. Papapantoleon, M. Samet, R. Tempone. Optimal Damping with Hierarchical Adaptive Quadrature for Efficient Fourier Pricing of Multi-Asset Options in Lévy Models, arXiv:2203.08196 (2022).
[4] C. Bayer, C. Ben Hammouda, R. Tempone. Hierarchical adaptive sparse grids and quasi-Monte Carlo for option pricing under the rough Bergomi model, Quantitative Finance, 2020.