Numerical Smoothing and Hierarchical Approximations
for Efficient Option Pricing and Density Estimation
Chiheb Ben Hammouda
Christian Bayer Raúl Tempone
Center for Uncertainty Quantification
Stochastic Numerics and Statistical Learning: Theory and
Applications, KAUST
May 23, 2022
Related Manuscripts to the Talk
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical Smoothing with Hierarchical Adaptive Sparse Grids
and Quasi-Monte Carlo Methods for Efficient Option Pricing”. In:
arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative
Finance Journal (2022).
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Multilevel Monte Carlo combined with numerical smoothing for
robust and efficient option pricing and density estimation”. In:
arXiv preprint arXiv:2003.05708 (2022)
Outline
1 Motivation and Framework
2 The Numerical Smoothing Idea
3 The Smoothness Theorem
4 Combining Numerical Smoothing with ASGQ, QMC and MLMC
5 Numerical Experiments and Results
6 Conclusions
Option Pricing as a High-Dimensional, non-Smooth
Numerical Integration Problem
Option pricing in finance often corresponds to a high-dimensional integration problem with a non-smooth integrand.
▸ High-dimensional: (i) time discretization of an SDE; (ii) a large number of underlying assets; (iii) path dependence on the whole trajectory of the underlying.
▸ Non-smooth: e.g., call options $(S_T - K)^+$, digital options $\mathbf{1}_{S_T > K}$, . . .
Methods like quasi-Monte Carlo (QMC) or (adaptive) sparse grid quadrature (ASGQ) rely critically on the smoothness of the integrand and on the dimension of the problem.
Monte Carlo (MC) does not care (in terms of convergence rates), BUT multilevel Monte Carlo (MLMC) does.
Figure 1.1: Integrand of a two-dimensional European basket option (Black–Scholes model).
Framework
Approximate efficiently $E[g(X(T))]$
Given a (smooth) $\varphi: \mathbb{R}^d \to \mathbb{R}$, the payoff $g: \mathbb{R}^d \to \mathbb{R}$ has either jumps or kinks:
▸ Hockey-stick functions: $g(x) = \max(\varphi(x), 0)$ (put or call payoffs, . . . ).
▸ Indicator functions: $g(x) = \mathbf{1}_{\varphi(x) \ge 0}$ (digital options, financial Greeks, distribution functions, . . . ).
▸ Dirac delta functions: $g(x) = \delta_{\varphi(x) = 0}$ (density estimation, financial Greeks, . . . ).
The process $X$ is approximated (via a discretization scheme) by $\overline{X}$, e.g.,
▸ a one/multi-dimensional geometric Brownian motion (GBM) process;
▸ a multi-dimensional stochastic volatility model, e.g., the Heston model
$$dX_t = \mu X_t\, dt + \sqrt{v_t}\, X_t\, dW^X_t, \qquad dv_t = \kappa(\theta - v_t)\, dt + \xi \sqrt{v_t}\, dW^v_t,$$
where $(W^X_t, W^v_t)$ are correlated Wiener processes with correlation $\rho$.
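To make the framework concrete, here is a minimal Python sketch (my own illustration, not part of the talk) of a full-truncation forward-Euler discretization of the Heston model above, together with a plain-MC estimate of the non-smooth digital payoff $\mathbf{1}_{S_T > K}$; all function names and parameter values are illustrative assumptions.

```python
import numpy as np

def heston_euler_paths(S0, v0, mu, kappa, theta, xi, rho, T, N, M, rng):
    """Full-truncation Euler discretization of the Heston model.
    Returns terminal asset values S_T for M simulated paths (illustrative sketch)."""
    dt = T / N
    S = np.full(M, float(S0))
    v = np.full(M, float(v0))
    for _ in range(N):
        Z1 = rng.standard_normal(M)
        Z2 = rng.standard_normal(M)
        dWX = np.sqrt(dt) * Z1
        dWv = np.sqrt(dt) * (rho * Z1 + np.sqrt(1.0 - rho**2) * Z2)
        v_plus = np.maximum(v, 0.0)          # full truncation keeps the variance usable
        S = S + mu * S * dt + np.sqrt(v_plus) * S * dWX
        v = v + kappa * (theta - v_plus) * dt + xi * np.sqrt(v_plus) * dWv
    return S

rng = np.random.default_rng(0)
S_T = heston_euler_paths(S0=100.0, v0=0.04, mu=0.0, kappa=1.0, theta=0.04,
                         xi=0.5, rho=-0.7, T=1.0, N=64, M=10**5, rng=rng)
K = 100.0
print("digital price (plain MC):", np.mean(S_T > K))   # non-smooth payoff 1_{S_T > K}
```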
How Does Regularity Affect
QMC, ASGQ, and MLMC Performance?
Randomized QMC (rQMC)
A (rank-1) lattice rule (Sloan 1985; Nuyens 2014) with $n$ points:
$$Q_n(g) := \frac{1}{n} \sum_{k=0}^{n-1} g\!\left(\frac{k\mathbf{z} \bmod n}{n}\right),$$
where $\mathbf{z} = (z_1, \dots, z_d) \in \mathbb{N}^d$ is the generating vector.
A randomly shifted lattice rule:
$$Q_{n,q}(g) = \frac{1}{q} \sum_{i=0}^{q-1} Q^{(i)}_n(g) = \frac{1}{q} \sum_{i=0}^{q-1} \left( \frac{1}{n} \sum_{k=0}^{n-1} g\!\left(\frac{(k\mathbf{z} + \Delta^{(i)}) \bmod n}{n}\right) \right),$$
where $\{\Delta^{(i)}\}_{i=1}^q$ are independent random shifts, and $M_{\mathrm{rQMC}} = q \times n$.
Note: see the previous talk by Bruno Tuffin for further details about QMC.
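A minimal Python sketch (my own illustration, not from the talk) of the randomly shifted lattice rule: it uses the standard formulation with the random shift applied modulo 1 in $[0,1)^d$, and the classical two-dimensional Fibonacci generating vector ($n = 233$, $\mathbf{z} = (1, 144)$) purely as an example.

```python
import numpy as np

def shifted_lattice_estimate(g, z, n, q, rng):
    """Randomly shifted rank-1 lattice rule: q independent shifts of an n-point lattice.
    Returns the mean estimate and a 95% confidence half-width over the q shifts."""
    k = np.arange(n)[:, None]                       # (n, 1)
    base = (k * z[None, :] % n) / n                 # rank-1 lattice points in [0,1)^d
    estimates = []
    for _ in range(q):
        shift = rng.random(len(z))
        pts = (base + shift) % 1.0                  # randomly shifted copy of the lattice
        estimates.append(np.mean(g(pts)))
    estimates = np.asarray(estimates)
    return estimates.mean(), 1.96 * estimates.std(ddof=1) / np.sqrt(q)

# Smooth 2D test integrand with known integral 1 over [0,1]^2.
g = lambda x: np.prod(1.0 + (x - 0.5), axis=1)
rng = np.random.default_rng(0)
mean, hw = shifted_lattice_estimate(g, z=np.array([1, 144]), n=233, q=32, rng=rng)
print(mean, "+/-", hw)
```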
How Does Regularity Affect QMC Performance?
The analysis in (Niederreiter 1992) shows that the convergence rate of rQMC is $O\!\left(M_{\mathrm{rQMC}}^{-1/2-\delta}\, (\log M_{\mathrm{rQMC}})^d\right)$, where $0 \le \delta \le \frac{1}{2}$ is related to the degree of regularity of the integrand $g$.
(Sloan and Woźniakowski 1998; Dick, Kuo, and Sloan 2013) show that convergence rates of $O\!\left(M_{\mathrm{rQMC}}^{-1}\right)$ can be observed for the lattice rule if the integrand $g$ belongs to $W_{d,\gamma}$ equipped with the norm
$$\|g\|^2_{W_{d,\gamma}} = \sum_{\alpha \subseteq \{1:d\}} \frac{1}{\gamma_\alpha} \int_{[0,1]^{|\alpha|}} \left( \int_{[0,1]^{d-|\alpha|}} \frac{\partial^{|\alpha|} g}{\partial y_\alpha}(y)\, dy_{-\alpha} \right)^{2} dy_\alpha, \qquad (1)$$
where $y_\alpha := (y_j)_{j \in \alpha}$ and $y_{-\alpha} := (y_j)_{j \in \{1:d\} \setminus \alpha}$.
Notation
$W_{d,\gamma}$: a $d$-dimensional weighted Sobolev space of functions with square-integrable mixed first derivatives.
$\gamma := \{\gamma_\alpha > 0 : \alpha \subseteq \{1,2,\dots,d\}\}$: a given collection of weights; $d$: the dimension of the problem.
ASGQ (I)
Given $F: \mathbb{R}^d \to \mathbb{R}$ and a multi-index $\beta \in \mathbb{N}^d_+$, let $F_\beta := Q^{m(\beta)}[F]$ be a quadrature operator based on a Cartesian quadrature grid ($m(\beta_n)$ points along $y_n$, with $m: \mathbb{N} \to \mathbb{N}$ a strictly increasing function).
Note: approximating $E[F]$ by $F_\beta$ is not an appropriate option due to the well-known curse of dimensionality.
Idea: a quadrature estimate of $E[F]$ is
$$\mathcal{M}^{I_\ell}[F] = \sum_{\beta \in I_\ell} \Delta[F_\beta],$$
where
▸ the mixed (first-order tensor) difference operators: $\Delta[F_\beta] = \otimes_{i=1}^d \Delta_i F_\beta$;
▸ the first-order difference operators: $\Delta_i F_\beta = F_\beta - F_{\beta - e_i}$ if $\beta_i > 1$, and $\Delta_i F_\beta = F_\beta$ if $\beta_i = 1$, where $e_i$ denotes the $i$th $d$-dimensional unit vector.
For instance, when $d = 2$,
$$\Delta F_\beta = \Delta_2 \Delta_1 F_{(\beta_1,\beta_2)} = \Delta_2\!\left(F_{(\beta_1,\beta_2)} - F_{(\beta_1-1,\beta_2)}\right) = F_{(\beta_1,\beta_2)} - F_{(\beta_1,\beta_2-1)} - F_{(\beta_1-1,\beta_2)} + F_{(\beta_1-1,\beta_2-1)}.$$
Note: see the previous talk by Lorenzo Tamellini for further details about ASGQ in relation to MISC, which is a more general version.
ASGQ (II)
$$E[F] \approx \mathcal{M}^{I_\ell}[F] = \sum_{\beta \in I_\ell} \Delta[F_\beta].$$
Product approach: $I_\ell = \{\|\beta\|_\infty \le \ell;\ \beta \in \mathbb{N}^d_+\}$.
Regular sparse grids: $I_\ell = \{\|\beta\|_1 \le \ell + d - 1;\ \beta \in \mathbb{N}^d_+\}$.
Adaptive sparse grids quadrature (ASGQ) (Gerstner and Griebel 1998): the construction of $I_\ell = I^{\mathrm{ASGQ}}$ is done a posteriori and adaptively by profit thresholding, $I^{\mathrm{ASGQ}} = \{\beta \in \mathbb{N}^d_+ : P_\beta \ge T\}$.
▸ Profit of a hierarchical surplus: $P_\beta = \frac{|\Delta E_\beta|}{\Delta W_\beta}$.
▸ Error contribution: $\Delta E_\beta = |\mathcal{M}^{I \cup \{\beta\}} - \mathcal{M}^{I}|$.
▸ Work contribution: $\Delta W_\beta = \mathrm{Work}[\mathcal{M}^{I \cup \{\beta\}}] - \mathrm{Work}[\mathcal{M}^{I}]$.
Figure 1.2: Left: product grids $\Delta_{\beta_1} \otimes \Delta_{\beta_2}$ for $1 \le \beta_1, \beta_2 \le 3$. Right: the corresponding sparse grids construction.
Figure 1.3: Illustration of an ASG grid.
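The following Python sketch (my own simplified illustration under stated assumptions, not the authors' code) puts the pieces above together on $[0,1]^d$: the tensor-product quadrature $F_\beta$, the mixed difference $\Delta[F_\beta]$ via inclusion–exclusion, and a greedy, admissibility-preserving index-set construction. The profit here uses only the error contribution $|\Delta E_\beta|$ and ignores the work term for brevity.

```python
import numpy as np
from itertools import product

def tensor_quad(f, beta, m=lambda b: b):
    """Full tensor-product Gauss-Legendre quadrature of f on [0,1]^d,
    with m(beta_i) points along dimension i."""
    rules = []
    for b in beta:
        x, w = np.polynomial.legendre.leggauss(m(b))
        rules.append(((x + 1.0) / 2.0, w / 2.0))      # map nodes/weights from [-1,1] to [0,1]
    nodes = np.array(list(product(*[r[0] for r in rules])))
    weights = np.prod(np.array(list(product(*[r[1] for r in rules]))), axis=1)
    return float(np.dot(weights, [f(z) for z in nodes]))

def mixed_difference(f, beta, cache):
    """Delta[F_beta]: tensorized first-order differences, evaluated by
    inclusion-exclusion over the backward neighbours of beta."""
    d, total = len(beta), 0.0
    for e in product([0, 1], repeat=d):
        nb = tuple(b - s for b, s in zip(beta, e))
        if min(nb) < 1:
            continue
        if nb not in cache:
            cache[nb] = tensor_quad(f, nb)
        total += (-1) ** sum(e) * cache[nb]
    return total

def asgq(f, d, max_indices=40):
    """Greedy adaptive sparse-grid quadrature (Gerstner-Griebel flavour): repeatedly add the
    admissible multi-index with the largest |Delta E_beta| (work term of the profit ignored)."""
    cache, contrib = {}, {}
    start = (1,) * d
    contrib[start] = mixed_difference(f, start, cache)
    active = {start}
    while active and len(contrib) < max_indices:
        beta = max(active, key=lambda b: abs(contrib[b]))
        active.remove(beta)
        for i in range(d):
            fwd = tuple(b + (1 if j == i else 0) for j, b in enumerate(beta))
            admissible = all(
                tuple(c - (1 if j == k else 0) for j, c in enumerate(fwd)) in contrib
                for k in range(d) if fwd[k] > 1)
            if fwd not in contrib and admissible:     # keep the index set downward closed
                contrib[fwd] = mixed_difference(f, fwd, cache)
                active.add(fwd)
    return sum(contrib.values())

# Smooth test integrand on [0,1]^4 with known integral sin(1)^4 ~ 0.5010
f = lambda x: float(np.prod(np.cos(x)))
print("ASGQ estimate:", asgq(f, d=4), "   exact:", np.sin(1.0) ** 4)
```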
How Does Regularity Affect ASGQ Performance?
Product approach: $E_Q(M) = O(M^{-r/d})$ (for functions with bounded total derivatives up to order $r$).
Adaptive sparse grids quadrature (ASGQ): from the analysis in (Chen 2018; Ernst, Sprungk, and Tamellini 2018),
$E_Q(M) = O(M^{-p/2})$, where¹ $p$ is independent of the problem dimension and is related to the order up to which the weighted mixed derivatives are bounded.
Notation: $M$: number of quadrature points; $E_Q$: quadrature error.
¹ $p$ characterizes the relation between the regularity of the integrand and the anisotropy of the integrand with respect to the different dimensions.
Multilevel Monte Carlo (MLMC)
(Heinrich 2001; Kebaier et al. 2005; Giles 2008b)
Aim: improve the MC complexity when estimating $E[g(X(T))]$.
Setting:
▸ A hierarchy of nested meshes of $[0,T]$, indexed by $\{\ell\}_{\ell=0}^L$.
▸ $\Delta t_\ell = K^{-\ell} \Delta t_0$: the time step size for levels $\ell \ge 1$; $K > 1$, $K \in \mathbb{N}$.
▸ $X_\ell := X_{\Delta t_\ell}$: the approximate process generated using a step size of $\Delta t_\ell$.
MLMC idea
$$E[g(X_L(T))] = E[g(X_0(T))] + \sum_{\ell=1}^{L} E[g(X_\ell(T)) - g(X_{\ell-1}(T))], \qquad (2)$$
$$\mathrm{Var}[g(X_0(T))] \gg \mathrm{Var}[g(X_\ell(T)) - g(X_{\ell-1}(T))] \searrow \text{ as } \ell \nearrow, \qquad M_0 \gg M_\ell \searrow \text{ as } \ell \nearrow.$$
MLMC estimator: $\hat{Q} := \sum_{\ell=0}^{L} \hat{Q}_\ell$, with
$$\hat{Q}_0 := \frac{1}{M_0} \sum_{m_0=1}^{M_0} g\big(X_{0,[m_0]}(T)\big); \qquad \hat{Q}_\ell := \frac{1}{M_\ell} \sum_{m_\ell=1}^{M_\ell} \Big( g\big(X_{\ell,[m_\ell]}(T)\big) - g\big(X_{\ell-1,[m_\ell]}(T)\big) \Big), \quad 1 \le \ell \le L.$$
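A minimal Python sketch (my own illustration, not from the talk) of the MLMC estimator for a Lipschitz (call) payoff under GBM with Euler time stepping and refinement factor $K = 2$; the coarse and fine paths on each level share the same Brownian increments, which is the standard coupling. Parameters are illustrative.

```python
import numpy as np

def mlmc_level(level, M, S0=100.0, K=100.0, mu=0.05, sigma=0.2, T=1.0, N0=2, rng=None):
    """One MLMC level for a call payoff under GBM with Euler time stepping.
    Coarse and fine paths share the same Brownian increments (the standard MLMC coupling)."""
    rng = rng or np.random.default_rng()
    Nf = N0 * 2 ** level
    dtf = T / Nf
    dW = np.sqrt(dtf) * rng.standard_normal((M, Nf))
    Sf = np.full(M, S0)
    for n in range(Nf):                                  # fine path
        Sf = Sf + mu * Sf * dtf + sigma * Sf * dW[:, n]
    payoff_f = np.maximum(Sf - K, 0.0)
    if level == 0:
        return payoff_f
    Sc = np.full(M, S0)
    dtc = 2 * dtf
    dWc = dW[:, ::2] + dW[:, 1::2]                       # coarse increments from the fine ones
    for n in range(Nf // 2):                             # coarse path
        Sc = Sc + mu * Sc * dtc + sigma * Sc * dWc[:, n]
    return payoff_f - np.maximum(Sc - K, 0.0)

rng = np.random.default_rng(1)
L, M = 5, 2 * 10**4
Ys = [mlmc_level(l, M, rng=rng) for l in range(L + 1)]
print("MLMC estimate:", sum(Y.mean() for Y in Ys))
print("level variances:", [float(np.round(Y.var(), 6)) for Y in Ys])  # should decay with level
```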
How Does Regularity Affect MLMC Complexity?
Complexity analysis for MLMC
MLMC complexity (Cliffe, Giles, Scheichl, and Teckentrup 2011):
$$O\!\left(\mathrm{TOL}^{-2 - \max\left(0, \frac{\gamma - \beta}{\alpha}\right)}\, \log(\mathrm{TOL})^{2 \times \mathbf{1}_{\{\beta = \gamma\}}}\right). \qquad (3)$$
i) Weak rate: $\left|E[g(X_\ell(T)) - g(X(T))]\right| \le c_1 2^{-\alpha \ell}$.
ii) Variance decay rate: $V_\ell := \mathrm{Var}[g(X_\ell(T)) - g(X_{\ell-1}(T))] \le c_2 2^{-\beta \ell}$.
iii) Work growth rate: $W_\ell \le c_3 2^{\gamma \ell}$ ($W_\ell$: expected cost).
For forward Euler ($\gamma = 1$): if $g$ is Lipschitz, then $V_\ell \simeq \Delta t_\ell$ due to the strong rate $1/2$, that is, $\beta = \gamma$. Otherwise $\beta < \gamma$ ⇒ worst-case complexity.
Higher-order schemes, e.g., the Milstein scheme, may lead to better complexities even for non-Lipschitz observables (Giles, Debrabant, and Rößler 2013; Giles 2015). However,
▸ for moderate/high-dimensional dynamics, coupling issues may arise and the scheme becomes computationally expensive;
▸ the robustness of the MLMC estimator deteriorates because the kurtosis explodes as $\Delta t_\ell$ decreases: $O(\Delta t_\ell^{-1})$, compared with $O(\Delta t_\ell^{-1/2})$ for Euler without smoothing (Giles, Nagapetyan, and Ritter 2015) and $O(1)$ in (Bayer, Ben Hammouda, and Tempone 2022) (see next slides).
How Does Regularity Affect MLMC Robustness?
For non-Lipschitz payoffs: the kurtosis $\kappa_\ell := \frac{E[(Y_\ell - E[Y_\ell])^4]}{(\mathrm{Var}[Y_\ell])^2}$ is of $O(\Delta t_\ell^{-1/2})$ for the Euler scheme and $O(\Delta t_\ell^{-1})$ for the Milstein scheme.
Large kurtosis problem: discussed previously in (Ben Hammouda, Moraes, and Tempone 2017; Ben Hammouda, Ben Rached, and Tempone 2020) ⇒ expensive cost for reliable/robust estimates of sample statistics.
Why is large kurtosis bad?
$$\sigma_{S^2}(Y_\ell) = \frac{\mathrm{Var}[Y_\ell]}{\sqrt{M_\ell}} \sqrt{(\kappa_\ell - 1) + \frac{2}{M_\ell - 1}}; \qquad \text{Note: reliable estimates require } M_\ell \gg \kappa_\ell.$$
Why are accurate variance estimates $V_\ell = \mathrm{Var}[Y_\ell]$ important?
$$M^*_\ell \propto \sqrt{V_\ell W_\ell^{-1}} \sum_{\ell=0}^{L} \sqrt{V_\ell W_\ell}.$$
Notation
$Y_\ell := g(X_\ell(T)) - g(X_{\ell-1}(T))$;
$\sigma_{S^2}(Y_\ell)$: standard deviation of the sample variance of $Y_\ell$; $\kappa_\ell$: the kurtosis; $V_\ell = \mathrm{Var}[Y_\ell]$; $M_\ell$: number of samples;
$M^*_\ell$: optimal number of samples per level; $W_\ell$: cost per sample path.
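To make the two formulas above concrete, here is a small Python sketch (my own illustration; the factor $2\,\mathrm{TOL}^{-2}$ in the allocation follows the usual Giles-type splitting of the error budget and is an assumption, not from the slides).

```python
import numpy as np

def optimal_samples(V, W, tol):
    """Optimal per-level sample sizes M_l ~ sqrt(V_l / W_l) * sum_l sqrt(V_l W_l),
    scaled by 2/tol^2 (one common choice); the variance estimates V_l feed into this."""
    V, W = np.asarray(V, float), np.asarray(W, float)
    return np.ceil(2.0 * tol**-2 * np.sqrt(V / W) * np.sum(np.sqrt(V * W))).astype(int)

def relative_error_of_variance_estimate(kappa, M):
    """Relative std of the sample variance: sqrt((kappa - 1) + 2/(M - 1)) / sqrt(M)."""
    return np.sqrt((kappa - 1.0) + 2.0 / (M - 1.0)) / np.sqrt(M)

# With kurtosis O(1), a thousand samples pin down V_l to a few percent;
# with kappa ~ 700 (digital payoff, no smoothing) the same M_l gives ~80% relative error.
for kappa in (3.0, 700.0):
    print(kappa, relative_error_of_variance_estimate(kappa, M=1000))
print(optimal_samples(V=[1e-2, 2.5e-3, 6e-4], W=[1, 2, 4], tol=1e-3))
```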
Previous Successful Smoothing Techniques
for QMC and ASGQ
Mollification, e.g., by convolution with a Gaussian kernel or manually;
(-) additional error that may scale exponentially with the dimension.
Mapping the problem to the frequency space (e.g., (Bayer, Ben Hammouda, Papapantoleon, Samet, and Tempone 2022)): better regularity compared to the physical space;
(-) applicable when the Fourier transform of the density function is available and cheap to compute.
Bias-free mollification (e.g., (Bayer, Siebenmorgen, and Tempone 2018; Bayer, Ben Hammouda, and Tempone 2020)) by taking conditional expectations or exact integration over a subset of the integration variables;
(-) not always possible.
Example 1.1 ((Romano and Touzi 1997))
$(S_t, v_t)$ stochastic volatility model, $S$ driven by the Brownian motion $Z := \rho W + \sqrt{1 - \rho^2}\, B$, with $v$ being $\sigma(W_\cdot)$-measurable. Then
$$E[g(S_T)] = E\!\left[g\!\left(S_0 \exp\!\left(\int_0^T \sqrt{v_t}\, dZ_t - \frac{1}{2}\int_0^T v_t\, dt\right)\right)\right] = E\big[E[g(S_T) \mid \sigma(W_\cdot)]\big]$$
$$= E\!\left[C_{\mathrm{BS}}\!\left(S_0 = S_0 \exp\!\left(\rho \int_0^T \sqrt{v_t}\, dW_t - \frac{\rho^2}{2}\int_0^T v_t\, dt\right),\ \sigma^2 = (1 - \rho^2)\int_0^T v_t\, dt\right)\right].$$
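A minimal Python sketch (my own illustration under a zero-rate assumption; function names and parameters are illustrative, not the authors' code) of the Romano–Touzi conditional-expectation smoothing above for a call under Heston: only the variance path (driven by $W$) is simulated, and the inner expectation is the closed-form Black–Scholes price.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S0, K, total_var):
    """Black-Scholes call price with zero rates, written in terms of the total variance sigma^2*T."""
    sd = np.sqrt(total_var)
    d1 = (np.log(S0 / K) + 0.5 * total_var) / sd
    return S0 * norm.cdf(d1) - K * norm.cdf(d1 - sd)

def smoothed_call_heston(S0, K, v0, kappa, theta, xi, rho, T, N, M, rng):
    """Romano-Touzi conditioning: simulate only the variance path (driven by W), then
    evaluate the Black-Scholes formula in closed form instead of sampling the asset."""
    dt = T / N
    v = np.full(M, v0)
    int_v, int_sqrt_v_dW = np.zeros(M), np.zeros(M)
    for _ in range(N):
        dW = np.sqrt(dt) * rng.standard_normal(M)
        v_plus = np.maximum(v, 0.0)
        int_v += v_plus * dt
        int_sqrt_v_dW += np.sqrt(v_plus) * dW
        v = v + kappa * (theta - v_plus) * dt + xi * np.sqrt(v_plus) * dW
    eff_spot = S0 * np.exp(rho * int_sqrt_v_dW - 0.5 * rho**2 * int_v)
    eff_var = (1.0 - rho**2) * int_v
    return bs_call(eff_spot, K, eff_var)

rng = np.random.default_rng(2)
samples = smoothed_call_heston(S0=100.0, K=100.0, v0=0.04, kappa=1.0, theta=0.04,
                               xi=0.3, rho=-0.7, T=1.0, N=64, M=10**5, rng=rng)
print("price:", samples.mean(), " MC std error:", samples.std() / np.sqrt(len(samples)))
```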
Previous Successful Smoothing Techniques
for MLMC
Implicit smoothing based on conditional expectation combined with the Milstein scheme (e.g., (Giles, Debrabant, and Rößler 2013));
(-) not always possible, and it inherits the drawbacks of the Milstein scheme.
Parametric smoothing: carefully constructing a regularized version of the observable (e.g., (Giles, Nagapetyan, and Ritter 2015));
(-) possible deterioration of the strong convergence behavior, and an additional bias that may increase exponentially with the dimension (see next slides).
Malliavin calculus integration by parts (e.g., (Altmayer and Neuenkirch 2015)): splitting the payoff function into a smooth component (treated by standard MLMC) and a compactly supported discontinuous part (treated via Malliavin MLMC).
Adaptive sampling (e.g., (Haji-Ali, Spence, and Teckentrup 2021)).
Numerical Smoothing Steps
Guiding example: $g: \mathbb{R}^d \to \mathbb{R}$ payoff function, $\overline{X}^{\Delta t}_T$ ($\Delta t = \frac{T}{N}$) the Euler discretization of a $d$-dimensional SDE, e.g.,
$$dX^{(i)}_t = a_i(X_t)\, dt + \sum_{j=1}^d b_{ij}(X_t)\, dW^{(j)}_t,$$
where $\{W^{(j)}\}_{j=1}^d$ are standard Brownian motions.
$$\overline{X}^{\Delta t}_T = \overline{X}^{\Delta t}_T\big(\underbrace{\Delta W^{(1)}_1, \dots, \Delta W^{(1)}_N, \dots, \Delta W^{(d)}_1, \dots, \Delta W^{(d)}_N}_{:= \Delta \mathbf{W}}\big). \qquad E\big[g\big(\overline{X}^{\Delta t}_T\big)\big] = ?$$
1 Identify a hierarchical representation of the integration variables:
(a) $\overline{X}^{\Delta t}_T(\Delta \mathbf{W}) \equiv \overline{X}^{\Delta t}_T(\mathbf{Z})$, $\mathbf{Z} = (Z_i)_{i=1}^{dN} \sim \mathcal{N}(0, I_{dN})$,
such that "$\mathbf{Z}_1 := (Z^{(1)}_1, \dots, Z^{(d)}_1)$ substantially contributes even for $\Delta t \to 0$",
e.g., Brownian bridges of $\Delta \mathbf{W}$ / the Haar wavelet construction of $W$ (a small sketch of the bridge construction follows below).
Note: this differs from previous techniques, which smooth out only at the final step with respect to $\Delta \mathbf{W}$, so that the smoothing effect vanishes as $\Delta t \to 0$.
(b) Design a sub-optimal smoothing direction ($A$: a rotation matrix that is easy to construct and whose structure depends on the payoff $g$):
$$\mathbf{Y} = A \mathbf{Z}_1.$$
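A minimal Python sketch (my own illustration) of step 1(a): the Brownian-bridge construction that maps i.i.d. Gaussians $\mathbf{Z}$ to the increments $\Delta W$ hierarchically, so that the first coordinate fixes $W_T$ and keeps an $O(1)$ influence on the path as $\Delta t \to 0$.

```python
import numpy as np

def brownian_bridge_increments(Z, T=1.0):
    """Hierarchical (Brownian-bridge) construction of Brownian increments on N steps.
    Z[0] alone fixes W_T = sqrt(T) * Z[0], so the first coordinate keeps an O(1) contribution
    to the path (and hence to the payoff) no matter how small the time step gets."""
    N = len(Z)
    t = np.linspace(0.0, T, N + 1)
    W = np.zeros(N + 1)
    W[-1] = np.sqrt(T) * Z[0]
    known = [0, N]                       # indices of grid points already fixed
    j = 1
    while len(known) < N + 1:
        new_known = []
        for a, b in zip(known[:-1], known[1:]):
            if b - a > 1:
                m = (a + b) // 2         # midpoint, conditioned on the two fixed endpoints
                mean = W[a] + (t[m] - t[a]) / (t[b] - t[a]) * (W[b] - W[a])
                var = (t[m] - t[a]) * (t[b] - t[m]) / (t[b] - t[a])
                W[m] = mean + np.sqrt(var) * Z[j]
                j += 1
                new_known.append(m)
        known = sorted(known + new_known)
    return np.diff(W)                    # increments Delta W_1, ..., Delta W_N

rng = np.random.default_rng(3)
dW = brownian_bridge_increments(rng.standard_normal(8))
print(dW, dW.sum())                      # the sum is W_T, controlled by Z[0] only
```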
Numerical Smoothing Steps
2 Pre-integrate with respect to the smoothing direction $y_1$:
$$E[g(X(T))] \approx E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big] = \int_{\mathbb{R}^{d \times N}} G(\mathbf{z})\, \rho_{d \times N}(\mathbf{z})\, dz^{(1)}_1 \cdots dz^{(1)}_N \cdots dz^{(d)}_1 \cdots dz^{(d)}_N$$
$$= \int_{\mathbb{R}^{dN-1}} I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{d-1}(y_{-1})\, dy_{-1}\, \rho_{dN-d}\big(z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, dz^{(1)}_{-1} \cdots dz^{(d)}_{-1}$$
$$= E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] \approx E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big], \qquad (4)$$
with
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{\mathbb{R}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$= \int_{-\infty}^{y^*_1} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1 + \int_{y^*_1}^{+\infty} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$\approx \bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag}}} \eta_k\, G\big(\zeta_k(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big). \qquad (5)$$
3 Compute the remaining $(dN-1)$-dimensional integral in (4) by ASGQ, QMC, or MLMC.
Notation
$G := g \circ \Phi \circ (\psi^{(1)}, \dots, \psi^{(d)})$ maps the $N \times d$ Gaussian random inputs to $g(\overline{X}^{\Delta t}(T))$, where
▸ $\psi^{(j)}: (Z^{(j)}_1, \dots, Z^{(j)}_N) \mapsto (B^{(j)}_1, \dots, B^{(j)}_N)$ denotes the mapping of the Brownian bridge construction;
▸ $\Phi: (\Delta t, B) \mapsto \overline{X}^{\Delta t}(T)$ denotes the mapping of the time-stepping scheme.
$y^*_1(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1})$: the exact discontinuity location, i.e., such that $\varphi\big(\overline{X}^{\Delta t}(T)\big) = P\big(y^*_1; y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = 0$;
$\bar{y}^*_1(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1})$: the approximated discontinuity location obtained via root finding;
$M_{\mathrm{Lag}}$: the number of Laguerre quadrature points $\zeta_k \in \mathbb{R}$, with corresponding weights $\eta_k$;
$\rho_{d \times N}(\mathbf{z}) = \frac{1}{(2\pi)^{dN/2}} e^{-\frac{1}{2}\mathbf{z}^T\mathbf{z}}$.
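A one-dimensional Python sketch (my own illustration, not the authors' code) of the pre-integration step (5) in the simplest possible case, a digital payoff under a one-step GBM map: Newton's method locates the discontinuity $\bar{y}^*_1$ and Gauss–Laguerre quadrature handles the remaining smooth semi-infinite integral; the exact value $\Phi(-y^*_1)$ is printed for comparison. Model parameters and tolerances are illustrative.

```python
import numpy as np
from scipy.optimize import newton
from scipy.stats import norm

# One-dimensional illustration of the pre-integration step: for a digital payoff under the
# one-step GBM map S_T(y) = S0*exp((mu - 0.5*sigma^2)*T + sigma*sqrt(T)*y),
# locate the jump location y* by Newton's method, then integrate the smooth tail by Gauss-Laguerre.
S0, K, mu, sigma, T = 100.0, 110.0, 0.0, 0.2, 1.0
S_T = lambda y: S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * y)

# Root of P(y) := log(S_T(y)/K); its derivative is constant here, so Newton converges at once.
P = lambda y: np.log(S_T(y) / K)
y_star = newton(P, x0=0.0, fprime=lambda y: sigma * np.sqrt(T), tol=1e-10)

# I = E[1_{S_T > K}] = int_{y*}^inf phi(y) dy, approximated with M_Lag Gauss-Laguerre points:
# substitute y = y* + u so that the Laguerre weight e^{-u} can be factored out.
M_Lag = 32
u, w = np.polynomial.laguerre.laggauss(M_Lag)
smoothed = np.sum(w * np.exp(u) * norm.pdf(y_star + u))

print("smoothed pre-integration:", smoothed)
print("exact value Phi(-y*):   ", norm.cdf(-y_star))
```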
Extending Numerical Smoothing for
Multiple Discontinuities
Multiple discontinuities arise due to the payoff structure or the use of Richardson extrapolation.
For $R$ different ordered roots, e.g., $\{y^*_i\}_{i=1}^R$, the smoothed integrand is
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{-\infty}^{y^*_1} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1 + \int_{y^*_R}^{+\infty} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1$$
$$+ \sum_{i=1}^{R-1} \int_{y^*_i}^{y^*_{i+1}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_1(y_1)\, dy_1,$$
and its approximation $\bar{I}$ is given by
$$\bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag},1}} \eta^{\mathrm{Lag}}_k\, G\big(\zeta^{\mathrm{Lag}}_{k,1}(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) + \sum_{k=0}^{M_{\mathrm{Lag},R}} \eta^{\mathrm{Lag}}_k\, G\big(\zeta^{\mathrm{Lag}}_{k,R}(\bar{y}^*_R), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)$$
$$+ \sum_{i=1}^{R-1} \left( \sum_{k=0}^{M_{\mathrm{Leg},i}} \eta^{\mathrm{Leg}}_k\, G\big(\zeta^{\mathrm{Leg}}_{k,i}(\bar{y}^*_i, \bar{y}^*_{i+1}), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) \right),$$
where $\{\bar{y}^*_i\}_{i=1}^R$ are the approximated discontinuity locations; $M_{\mathrm{Lag},1}$ and $M_{\mathrm{Lag},R}$ are the numbers of Laguerre quadrature points $\zeta^{\mathrm{Lag}}_{\cdot,\cdot} \in \mathbb{R}$ with corresponding weights $\eta^{\mathrm{Lag}}_\cdot$; $\{M_{\mathrm{Leg},i}\}_{i=1}^{R-1}$ are the numbers of Legendre quadrature points $\zeta^{\mathrm{Leg}}_{\cdot,\cdot}$ with corresponding weights $\eta^{\mathrm{Leg}}_\cdot$.
$\bar{I}$ can be approximated further depending on (i) the decay of $G \times \rho_1$ in the semi-infinite domains and (ii) how close the roots are to each other.
Extending Numerical Smoothing for
Density Estimation
Goal: approximate the density $\rho_X$ at $u$, for a stochastic process $X$:
$$\rho_X(u) = E[\delta(X - u)], \quad \delta \text{ the Dirac delta function.}$$
Note: without any smoothing technique (regularization, kernel density, . . . ), MC and MLMC fail due to the infinite variance caused by the singularity of $\delta$.
Strategy in (Bayer, Hammouda, and Tempone 2022):
1 Conditioning with respect to the Brownian bridge:
$$\rho_X(u) = \frac{1}{\sqrt{2\pi}}\, E\!\left[\exp\!\left(-\big(Y^*_1(u)\big)^2/2\right) \left|\frac{dY^*_1}{du}(u)\right|\right] \approx \frac{1}{\sqrt{2\pi}}\, E\!\left[\exp\!\left(-\big(\bar{Y}^*_1(u)\big)^2/2\right) \left|\frac{d\bar{Y}^*_1}{du}(u)\right|\right], \qquad (6)$$
where $Y^*_1(u; Z_{-1})$ is the exact discontinuity and $\bar{Y}^*_1(u; Z_{-1})$ is the approximated discontinuity.
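A one-dimensional Python sketch (my own illustration, not the authors' code) of the density estimator (6) for the Euler-discretized GBM terminal value: the outer expectation is over the Brownian bridge, the discontinuity $\bar{Y}^*_1(u)$ is found by root finding, and $dY^*_1/du$ comes from the inverse-function theorem (here approximated by a finite difference). The printed reference value is the continuous-time lognormal density, so the two agree only up to the $O(\Delta t)$ discretization bias.

```python
import numpy as np
from scipy.optimize import brentq

def bridge_increments(xi, dt, T):
    """Exact Brownian-bridge increments: take an auxiliary Brownian path built from xi and
    remove its linear (terminal-value) part; the result is independent of W_T."""
    dW_aux = np.sqrt(dt) * xi
    return dW_aux - dW_aux.sum() * (dt / T)

def smoothed_density_gbm(u, M=2000, N=16, mu=0.0, sigma=0.2, T=1.0, S0=1.0, seed=4):
    """Numerical-smoothing density estimator (6) for the Euler-GBM terminal value (1D sketch):
    condition on the Brownian bridge, solve S_T(y1) = u for the coarsest Gaussian y1 by root
    finding, and apply the exact conditional-expectation (inverse-function) formula."""
    rng = np.random.default_rng(seed)
    dt = T / N
    vals = np.empty(M)
    for m in range(M):
        b = bridge_increments(rng.standard_normal(N), dt, T)
        S_T = lambda y1: S0 * np.prod(1.0 + mu * dt + sigma * (b + (dt / np.sqrt(T)) * y1))
        y_star = brentq(lambda y: S_T(y) - u, -12.0, 12.0)        # approximated discontinuity
        h = 1e-5                                                  # dY*/du = 1 / (dS_T/dy1)
        dSdy = (S_T(y_star + h) - S_T(y_star - h)) / (2.0 * h)
        vals[m] = np.exp(-0.5 * y_star**2) / (np.sqrt(2.0 * np.pi) * abs(dSdy))
    return vals.mean(), vals.std() / np.sqrt(M)

est, se = smoothed_density_gbm(u=1.0)
exact = np.exp(-0.5 * (0.5 * 0.2**2 / 0.2) ** 2) / (1.0 * 0.2 * np.sqrt(2.0 * np.pi))
print("smoothed estimate:", est, "+/-", se, "  lognormal (continuous-time) density:", exact)
```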
Why not Kernel Density Techniques
in Multiple Dimensions?
Kernel density techniques are similar to approaches based on parametric regularization as in (Giles, Nagapetyan, and Ritter 2015): this class of approaches has a pointwise error that increases exponentially with respect to the dimension of the state vector $X$.
For a $d$-dimensional problem, a kernel density estimator with bandwidth matrix $H = \mathrm{diag}(h, \dots, h)$ has
$$\mathrm{MSE} \approx c_1 M^{-1} h^{-d} + c_2 h^4, \qquad (7)$$
where $M$ is the number of samples and $c_1$, $c_2$ are constants.
Our approach in high dimension: for $u \in \mathbb{R}^d$,
$$\rho_X(u) = E[\delta(X - u)] = E\big[\rho_d\big(Y^*(u)\big)\, |\det(J(u))|\big] \approx E\big[\rho_d\big(\bar{Y}^*(u)\big)\, |\det(\bar{J}(u))|\big], \qquad (8)$$
where
▸ $Y^*(u; \cdot)$: the exact discontinuity; $\bar{Y}^*(u; \cdot)$: the approximated discontinuity;
▸ $J$ is the Jacobian matrix, with $J_{ij} = \frac{\partial y^*_i}{\partial u_j}$; $\rho_d(\cdot)$ is the multivariate Gaussian density.
Thanks to the exact conditional expectation with respect to the Brownian bridge, the smoothing error in our approach is insensitive to the dimension of the problem.
Notations
The Haar basis functions $\psi_{n,k}$ of $L^2([0,1])$ with support $[2^{-n}k, 2^{-n}(k+1)]$:
$$\psi_{-1}(t) := \mathbf{1}_{[0,1]}(t); \qquad \psi_{n,k}(t) := 2^{n/2}\, \psi(2^n t - k), \quad n \in \mathbb{N}_0,\ k = 0, \dots, 2^n - 1,$$
where $\psi(\cdot)$ is the Haar mother wavelet:
$$\psi(t) := \begin{cases} 1, & 0 \le t < \tfrac{1}{2}, \\ -1, & \tfrac{1}{2} \le t < 1, \\ 0, & \text{else.} \end{cases}$$
Grid $\mathcal{D}^n := \{t^n_\ell \mid \ell = 0, \dots, 2^{n+1}\}$ with $t^n_\ell := \frac{\ell}{2^{n+1}} T$. Observe: the Haar basis functions up to level $n$ are piecewise constant with points of discontinuity given by $\mathcal{D}^n$.
For i.i.d. standard normal random variables $Z_{-1}$, $Z_{n,k}$, $n \in \mathbb{N}_0$, $k = 0, \dots, 2^n - 1$, we define the (truncated) standard Brownian motion
$$W^N_t := Z_{-1}\Psi_{-1}(t) + \sum_{n=0}^{N} \sum_{k=0}^{2^n - 1} Z_{n,k}\Psi_{n,k}(t),$$
where $\Psi_{-1}(\cdot)$ and $\Psi_{n,k}(\cdot)$ are the antiderivatives of the Haar basis functions.
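A small Python sketch (my own illustration, taking $T = 1$) of the truncated Wiener path above: $\Psi_{-1}(t) = t$ and the $\Psi_{n,k}$ are scaled Schauder hat functions, so the endpoint $W^N_1$ is controlled by $Z_{-1}$ alone, which is exactly the coordinate used for the pre-integration.

```python
import numpy as np

def haar_antiderivative(n, k, t):
    """Antiderivative Psi_{n,k} of the Haar function psi_{n,k} (a scaled Schauder hat)."""
    left, mid, right = k * 2.0**-n, (k + 0.5) * 2.0**-n, (k + 1) * 2.0**-n
    up = np.clip(t, left, mid) - left            # integral of +2^{n/2} over [left, mid]
    down = np.clip(t, mid, right) - mid          # integral of -2^{n/2} over [mid, right]
    return 2.0 ** (n / 2.0) * (up - down)

def truncated_brownian_motion(Z_minus1, Z, t):
    """W^N_t = Z_{-1} * t + sum_{n<=N} sum_k Z_{n,k} * Psi_{n,k}(t) on [0,1] (T = 1 assumed).
    Z is a list of arrays, Z[n] holding the 2^n coefficients of level n."""
    W = Z_minus1 * t                             # Psi_{-1}(t) = t
    for n, Zn in enumerate(Z):
        for k, z in enumerate(Zn):
            W = W + z * haar_antiderivative(n, k, t)
    return W

rng = np.random.default_rng(5)
N = 6
Z_minus1 = rng.standard_normal()
Z = [rng.standard_normal(2**n) for n in range(N + 1)]
t = np.linspace(0.0, 1.0, 2**(N + 1) + 1)
W = truncated_brownian_motion(Z_minus1, Z, t)
print("W_1 =", W[-1], " equals Z_{-1} =", Z_minus1)   # only Z_{-1} contributes to the endpoint
```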
Notations
We define the corresponding increments for any function or process $F$ as
$$\Delta^N_\ell F := F(t^N_{\ell+1}) - F(t^N_\ell).$$
The solution of the Euler scheme for $dX_t = b(X_t)\, dW_t$, $X_0 = x \in \mathbb{R}$, along $\mathcal{D}^N$ is
$$X^N_{\ell+1} := X^N_\ell + b(X^N_\ell)\, \Delta^N_\ell W, \quad \ell = 0, \dots, 2^N - 1, \qquad (9)$$
with $X^N_0 := X_0 = x$; for convenience, we also define $X^N_T := X^N_{2^N}$.
We define the deterministic function $H^N: \mathbb{R}^{2^{N+1}-1} \to \mathbb{R}$ as
$$H^N(z^N) := E_{Z_{-1}}\big[g\big(X^N_T(Z_{-1}, z^N)\big)\big], \qquad (10)$$
where $Z^N := (Z_{n,k})_{n=0,\dots,N,\ k=0,\dots,2^n-1}$.
Assumptions
Assumption 3.1
There are positive random variables (rdvs) $C_p$ with finite moments of all orders such that
$$\forall N \in \mathbb{N},\ \forall \ell_1, \dots, \ell_p \in \{0, \dots, 2^N - 1\}: \qquad \left| \frac{\partial^p X^N_T}{\partial X^N_{\ell_1} \cdots \partial X^N_{\ell_p}} \right| \le C_p \quad \text{a.s.};$$
this means thatᵃ $\frac{\partial^p X^N_T}{\partial X^N_{\ell_1} \cdots \partial X^N_{\ell_p}} = O_{\mathrm{P}}(1)$.
ᵃ For sequences of rdvs $F_N$ and $G_N$, we write $F_N = O_{\mathrm{P}}(G_N)$ if there is a rdv $C$ with finite moments of all orders such that for all $N$ we have $|F_N| \le C\,|G_N|$ a.s.
Assumption 3.1 is natural because it is fulfilled if the diffusion coefficient $b(\cdot)$ is smooth; this situation is valid for many option pricing models.
Assumptions
Assumption 3.2
For any $p \in \mathbb{N}$, we haveᵃ
$$\left( \frac{\partial X^N_T}{\partial y}(Z_{-1}, Z^N) \right)^{-p} = O_{\mathrm{P}}(1).$$
ᵃ $y := z_{-1}$.
In (Bayer, Ben Hammouda, and Tempone 2022), we give sufficient conditions under which this assumption is valid. For instance, Assumption 3.2 holds for
▸ one-dimensional models with a linear or constant diffusion;
▸ multivariate models with a linear drift and constant diffusion, including the multivariate lognormal model (see (Bayer, Siebenmorgen, and Tempone 2018)).
There may be some cases in which Assumption 3.2 is not fulfilled, e.g., $X_T = W_T^2$. However, our method works well in such cases since we have $g(X_T) = G(y_1^2)$.
The Smoothness Theorem
Theorem 3.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Assume that $X^N_T$, defined by (9), satisfies Assumptions 3.1 and 3.2. Then, for any $p \in \mathbb{N}$ and indices $n_1, \dots, n_p$ and $k_1, \dots, k_p$ (satisfying $0 \le k_j < 2^{n_j}$), the function $H^N$ defined in (10) satisfies (with constants independent of $n_j, k_j$)ᵃ
$$\frac{\partial^p H^N}{\partial z_{n_1,k_1} \cdots \partial z_{n_p,k_p}}(z^N) = O_{\mathrm{P}}\!\left(2^{-\sum_{j=1}^p n_j / 2}\right).$$
In particular, $H^N$ is of class $C^\infty$.
ᵃ The constants increase in $p$ and $N$.
Sketch of the Proof
1 We consider a mollified version $g_\delta$ of $g$ and the corresponding function $H^N_\delta$ (defined by replacing $g$ with $g_\delta$ in (10)).
2 We prove that we can interchange integration and differentiation, which implies
$$\frac{\partial H^N_\delta(z^N)}{\partial z_{n,k}} = E\!\left[g'_\delta\big(X^N_T(Z_{-1}, z^N)\big)\, \frac{\partial X^N_T(Z_{-1}, z^N)}{\partial z_{n,k}}\right].$$
3 Multiplying and dividing by $\frac{\partial X^N_T(Z_{-1}, z^N)}{\partial y}$ and replacing the expectation by an integral w.r.t. the standard normal density, we obtain
$$\frac{\partial H^N_\delta(z^N)}{\partial z_{n,k}} = \int_{\mathbb{R}} \frac{\partial g_\delta\big(X^N_T(y, z^N)\big)}{\partial y} \left( \frac{\partial X^N_T}{\partial y}(y, z^N) \right)^{-1} \frac{\partial X^N_T}{\partial z_{n,k}}(y, z^N)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\, dy. \qquad (11)$$
4 We show that integration by parts is possible, and then we can discard the mollified version to obtain the smoothness of $H^N$, because
$$\frac{\partial H^N(z^N)}{\partial z_{n,k}} = -\int_{\mathbb{R}} g\big(X^N_T(y, z^N)\big)\, \frac{\partial}{\partial y}\!\left[ \left( \frac{\partial X^N_T}{\partial y}(y, z^N) \right)^{-1} \frac{\partial X^N_T}{\partial z_{n,k}}(y, z^N)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}} \right] dy.$$
5 The proof relies on successively applying the technique above of dividing by $\frac{\partial X^N_T}{\partial y}$ and then integrating by parts.
Overcoming the High Dimensionality
for ASGQ and QMC
We combine ASGQ and QMC with hierarchical representations, as in (Bayer, Ben Hammouda, and Tempone 2020):
Brownian bridges as the Wiener path generation method ⇒ ↘ the effective dimension of the problem.
Richardson extrapolation ⇒ faster convergence of the weak error ⇒ ↘ the number of time steps needed to achieve a certain error tolerance ⇒ a smaller total dimension of the input space.
Error Discussion for ASGQ
$Q^{\mathrm{ASGQ}}_N$: the ASGQ estimator.
$$E[g(X(T))] - Q^{\mathrm{ASGQ}}_N = \underbrace{E[g(X(T))] - E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big]}_{\text{Error I: bias or weak error } O(\Delta t)}$$
$$+ \underbrace{E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big]}_{\text{Error II: numerical smoothing error } O(M^{-s}_{\mathrm{Lag}}) + O(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton}})}$$
$$+ \underbrace{E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - Q^{\mathrm{ASGQ}}_N}_{\text{Error III: ASGQ error } O(M^{-p/2}_{\mathrm{ASGQ}})}. \qquad (12)$$
Notations
$M_{\mathrm{ASGQ}}$: the number of quadrature points used by the ASGQ estimator, and $p > 0$.
$\bar{y}^*_1$: the approximated location of the non-smoothness obtained by Newton iteration ⇒ $|y^*_1 - \bar{y}^*_1| = \mathrm{TOL}_{\mathrm{Newton}}$.
$\kappa \ge 0$ ($\kappa = 0$: Heaviside payoff (digital option); $\kappa = 1$: call or put payoffs).
$M_{\mathrm{Lag}}$: the number of points used by the Laguerre quadrature for the one-dimensional pre-integration step.
$s > 0$: the derivatives of $G$ with respect to $y_1$ are bounded up to order $s$.
Work and Complexity Discussion for ASGQ
An optimal performance of ASGQ is given by
$$\begin{cases} \min\limits_{(M_{\mathrm{ASGQ}},\, M_{\mathrm{Lag}},\, \mathrm{TOL}_{\mathrm{Newton}})} \mathrm{Work}_{\mathrm{ASGQ}} \propto M_{\mathrm{ASGQ}} \times M_{\mathrm{Lag}} \times \Delta t^{-1} \\ \text{s.t. } E_{\mathrm{total,ASGQ}} = \mathrm{TOL}, \end{cases} \qquad (13)$$
where
$$E_{\mathrm{total,ASGQ}} := E[g(X(T))] - Q^{\mathrm{ASGQ}}_N = O(\Delta t) + O\big(M^{-p/2}_{\mathrm{ASGQ}}\big) + O\big(M^{-s}_{\mathrm{Lag}}\big) + O\big(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton}}\big).$$
We show in (Bayer, Ben Hammouda, and Tempone 2022) that, under certain conditions on the regularity parameters $s$ and $p$ ($p, s \gg 1$),
▸ $\mathrm{Work}_{\mathrm{ASGQ}} = O(\mathrm{TOL}^{-1})$ (for the best case), compared with $\mathrm{Work}_{\mathrm{MC}} = O(\mathrm{TOL}^{-3})$ (for the best case of MC).
Multilevel Monte Carlo with Numerical Smoothing
Recall our QoI:
$$E[g(X(T))] \approx E\big[g\big(\overline{X}^{\Delta t}(T)\big)\big] = \int_{\mathbb{R}^{d \times N}} G(\mathbf{z})\, \rho_{d \times N}(\mathbf{z})\, dz^{(1)}_1 \cdots dz^{(1)}_N \cdots dz^{(d)}_1 \cdots dz^{(d)}_N$$
$$= \int_{\mathbb{R}^{dN-1}} I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{d-1}(y_{-1})\, dy_{-1}\, \rho_{dN-d}\big(z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, dz^{(1)}_{-1} \cdots dz^{(d)}_{-1}$$
$$= E\big[I\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] \approx E\big[\bar{I}\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big],$$
$$I\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) = \int_{\mathbb{R}} G\big(y_1, y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big)\, \rho_{y_1}(y_1)\, dy_1$$
$$= \int_{-\infty}^{y^*_1} G\big(y_1, \dots\big)\, \rho_{y_1}(y_1)\, dy_1 + \int_{y^*_1}^{+\infty} G\big(y_1, \dots\big)\, \rho_{y_1}(y_1)\, dy_1 \approx \bar{I}\big(y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big) := \sum_{k=0}^{M_{\mathrm{Lag}}} \eta_k\, G\big(\zeta_k(\bar{y}^*_1), y_{-1}, z^{(1)}_{-1}, \dots, z^{(d)}_{-1}\big).$$
Denote by $\bar{I}_\ell := \bar{I}_\ell\big(y^\ell_{-1}, z^{(1),\ell}_{-1}, \dots, z^{(d),\ell}_{-1}\big)$ the level-$\ell$ Euler approximation of $I$, computed with: step size $\Delta t_\ell$; $M_{\mathrm{Lag},\ell}$ Laguerre quadrature points; $\mathrm{TOL}_{\mathrm{Newton},\ell}$ as the tolerance of the Newton method at level $\ell$.
MLMC estimator
$$\hat{Q}^{\mathrm{MLMC}} := \sum_{\ell=0}^{L} \hat{Q}_\ell, \qquad (14)$$
with
$$\hat{Q}_0 := \frac{1}{M_0} \sum_{m_0=1}^{M_0} \bar{I}_{0,[m_0]}; \qquad \hat{Q}_\ell := \frac{1}{M_\ell} \sum_{m_\ell=1}^{M_\ell} \big( \bar{I}_{\ell,[m_\ell]} - \bar{I}_{\ell-1,[m_\ell]} \big), \quad 1 \le \ell \le L.$$
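To connect the estimator (14) with the earlier MLMC sketch, here is a minimal Python illustration (my own, not the authors' code) of the smoothed level differences $\bar{I}_\ell - \bar{I}_{\ell-1}$ for a digital option under Euler-discretized GBM. Two simplifications are assumed: the root finding uses a bracketing solver instead of Newton iteration, and the one-dimensional pre-integration is exact here ($\Phi(-\bar{y}^*_1)$) because the payoff is an indicator.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def smoothed_digital_level(level, M, S0=100.0, K=100.0, mu=0.0, sigma=0.2, T=1.0, N0=8, seed=0):
    """One MLMC level with numerical smoothing for a digital payoff under Euler-GBM (sketch).
    Conditional on the Brownian bridge, the coarsest Gaussian y1 (which fixes W_T) is
    integrated out exactly: the smoothed payoff is Phi(-y1*), with y1* found by root finding.
    Fine and coarse approximations share the same bridge (pairwise-summed increments)."""
    rng = np.random.default_rng(seed + level)
    Nf = N0 * 2**level
    dtf = T / Nf
    out = np.empty(M)

    def smoothed(b, dt):                 # Phi(-y1*) given the bridge increments b
        S_T = lambda y1: S0 * np.prod(1.0 + mu * dt + sigma * (b + (dt / np.sqrt(T)) * y1))
        return norm.cdf(-brentq(lambda y: S_T(y) - K, -12.0, 12.0))

    for m in range(M):
        aux = np.sqrt(dtf) * rng.standard_normal(Nf)
        bf = aux - aux.sum() * (dtf / T)                  # exact Brownian-bridge increments (fine)
        out[m] = smoothed(bf, dtf)
        if level > 0:
            out[m] -= smoothed(bf[::2] + bf[1::2], 2 * dtf)   # coupled coarse approximation
    return out

for level in range(5):
    Y = smoothed_digital_level(level, M=1000)
    print(f"level {level}:  mean = {Y.mean():+.5f}   variance = {Y.var():.2e}")
# The level variances decay with level and the kurtosis stays bounded, cf. Corollaries 4.1-4.3.
```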
Multilevel Monte Carlo with Numerical Smoothing:
Variance Decay, Complexity and Robustness
Corollary 4.1 ((Bayer, Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, $V_\ell := \mathrm{Var}[\bar{I}_\ell - \bar{I}_{\ell-1}] = O(\Delta t_\ell)$, compared with $O\big(\Delta t_\ell^{1/2}\big)$ for MLMC without smoothing.
Corollary 4.2 ((Bayer, Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, the complexity of MLMC combined with numerical smoothing using the Euler discretization is $O\big(\mathrm{TOL}^{-2}(\log(\mathrm{TOL}))^2\big)$, compared with $O\big(\mathrm{TOL}^{-2.5}\big)$ for MLMC without smoothing.
Corollary 4.3 ((Bayer, Hammouda, and Tempone 2022))
Let $\kappa_\ell$ be the kurtosis of the random variable $Y_\ell := \bar{I}_\ell - \bar{I}_{\ell-1}$; then, under Assumptions 3.1 and 3.2, we obtain $\kappa_\ell = O(1)$, compared with $O\big(\Delta t_\ell^{-1/2}\big)$ for MLMC without smoothing.
Work Discussion for MLMC
$\hat{Q}^{\mathrm{MLMC}}$: the MLMC estimator, as defined in (14).
$$E[g(X(T))] - \hat{Q}^{\mathrm{MLMC}} = \underbrace{E[g(X(T))] - E\big[g\big(\overline{X}^{\Delta t_L}(T)\big)\big]}_{\text{Error I: bias or weak error } O(\Delta t_L)}$$
$$+ \underbrace{E\big[I_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - E\big[\bar{I}_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big]}_{\text{Error II: numerical smoothing error } O(M^{-s}_{\mathrm{Lag},L}) + O(\mathrm{TOL}^{\kappa+1}_{\mathrm{Newton},L})}$$
$$+ \underbrace{E\big[\bar{I}_L\big(Y_{-1}, Z^{(1)}_{-1}, \dots, Z^{(d)}_{-1}\big)\big] - \hat{Q}^{\mathrm{MLMC}}}_{\text{Error III: MLMC statistical error } O\left(\sqrt{\sum_{\ell=L_0}^{L} \sqrt{M_{\mathrm{Lag},\ell}} + \log\left(\mathrm{TOL}^{-1}_{\mathrm{Newton},\ell}\right)}\right)}. \qquad (15)$$
ASGQ Quadrature Error Convergence
Figure 5.1: Digital option under the Heston model: comparison of the relative quadrature error convergence (relative quadrature error versus $M_{\mathrm{ASGQ}}$) for the ASGQ method with and without numerical smoothing. (a) Without Richardson extrapolation ($N = 8$); (b) with Richardson extrapolation ($N_{\text{fine level}} = 8$).
QMC Error Convergence
Figure 5.2: Comparison of the 95% statistical error convergence (error versus number of QMC samples) for rQMC with and without numerical smoothing. (a) Digital option under Heston: fitted slopes −0.52 without smoothing and −0.85 with numerical smoothing; (b) digital option under GBM: fitted slopes −0.64 without smoothing and −0.92 with numerical smoothing.
Errors in the Numerical Smoothing
Figure 5.3: Call option under GBM with $N = 4$: the relative numerical smoothing error for a fixed number of ASGQ points $M_{\mathrm{ASGQ}} = 10^3$, plotted against (a) different values of $M_{\mathrm{Lag}}$ with a fixed Newton tolerance $\mathrm{TOL}_{\mathrm{Newton}} = 10^{-10}$ (observed slope ≈ −4.02), and (b) different values of $\mathrm{TOL}_{\mathrm{Newton}}$ with a fixed number of Laguerre quadrature points $M_{\mathrm{Lag}} = 128$ (error on the order of $10^{-4}$).
Comparing ASGQ with MC
Figure 5.4: Digital option under Heston: computational work versus total relative error for ASGQ with numerical smoothing and for MC, in the different configurations with respect to the level of Richardson extrapolation (MC without smoothing, with and without Richardson extrapolation; ASGQ with smoothing, with and without Richardson extrapolation).
Numerical Results for ASGQ
Consider digital/call/basket options under the Heston or discretized GBM models.
ASGQ with numerical smoothing is 10–100× faster than MC in dimensions around 20.

Example                                    Total relative error    CPU time (ASGQ/MC) in %
Single digital option (GBM)                0.4%                    0.2%
Single call option (GBM)                   0.5%                    0.3%
Single digital option (Heston)             0.4%                    3.2%
Single call option (Heston)                0.5%                    0.4%
4-dimensional basket call option (GBM)     0.8%                    7.4%

Table 1: Summary of the relative errors and computational gains achieved using ASGQ with numerical smoothing compared to the MC method to reach a given error tolerance. The CPU time ratios are computed for the best configuration with Richardson extrapolation for each method.
Numerical Results for MLMC
Method                                              κ_L    α    β     γ    Numerical complexity
Without smoothing, digital under GBM                709    1    1/2   1    O(TOL^-2.5)
With numerical smoothing, digital under GBM           3    1    1     1    O(TOL^-2 (log TOL)^2)
Without smoothing, digital under Heston             245    1    1/2   1    O(TOL^-2.5)
With numerical smoothing, digital under Heston        7    1    1     1    O(TOL^-2 (log TOL)^2)
With numerical smoothing, GBM density                 5    1    1     1    O(TOL^-2 (log TOL)^2)
With numerical smoothing, Heston density              8    1    1     1    O(TOL^-2 (log TOL)^2)

Table 2: Summary of the MLMC numerical results observed for the different examples. κ_L is the kurtosis at the deepest levels of MLMC; (α, β, γ) are the weak, variance decay, and work rates, respectively. TOL is the user-selected MLMC tolerance.
Digital Option under the Heston Model:
MLMC Without Smoothing
Figure 5.5: Digital option under Heston: convergence plots for MLMC without smoothing (per-level weak error, variance, cost, and kurtosis); the per-level kurtosis grows into the hundreds at the deepest levels.
Digital Option under the Heston Model:
MLMC With Numerical Smoothing
Figure 5.6: Digital option under Heston: convergence plots for MLMC with numerical smoothing; the per-level kurtosis stays of order ten across levels.
Digital Option under the Heston Model:
Numerical Complexity Comparison
Figure 5.7: Digital option under Heston: comparison of the numerical complexity ($E[W]$ versus TOL) of (i) standard MLMC and (ii) MLMC with numerical smoothing. The numerical computational cost improves from $O(\mathrm{TOL}^{-5/2})$ without smoothing to $O(\mathrm{TOL}^{-2})$ with numerical smoothing.
Conclusions in the Context of
Deterministic Quadrature Methods
1 We introduce a novel numerical smoothing technique, combined with ASGQ/QMC, the hierarchical Brownian bridge, and Richardson extrapolation.
2 Our analysis and numerical experiments show that the novel approach substantially outperforms the MC method for high-dimensional cases and for dynamics where a time discretization is needed.
3 We provide a smoothness analysis for the smoothed integrand in the time-stepping setting, together with a related error discussion of our approach.
4 We illustrate numerically that traditional schemes for the Heston dynamics have low regularity, and we use an alternative smooth scheme to simulate the volatility process based on a sum of Ornstein–Uhlenbeck (OU) processes.
5 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone.
“Numerical Smoothing with Hierarchical Adaptive Sparse Grids and
Quasi-Monte Carlo Methods for Efficient Option Pricing”. In: arXiv
preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance
Journal (2022)
Conclusions in the Context of
MLMC Methods
1 We propose a numerical smoothing approach that can be combined with the MLMC estimator for efficient option pricing and density estimation.
2 Compared to the case without smoothing,
▸ we significantly reduce the kurtosis at the deep levels of MLMC, which improves the robustness of the estimator;
▸ we improve the strong convergence rate ⇒ improvement of the MLMC complexity from $O(\mathrm{TOL}^{-2.5})$ to $O(\mathrm{TOL}^{-2}\log(\mathrm{TOL})^2)$,
without the need for higher-order schemes such as the Milstein scheme, as in (Giles 2008a; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan, and Ritter 2015), our numerical smoothing approach
▸ does not deteriorate the strong convergence behavior;
▸ when estimating densities, has a pointwise error that does not increase exponentially with respect to the dimension of the state vector.
4 Our approach can be extended (i) to many model dynamics and payoff structures; (ii) to computing financial Greeks; (iii) to computing distribution functions; (iv) to risk estimation.
5 More details can be found in
Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Multilevel
Monte Carlo combined with numerical smoothing for robust and efficient
option pricing and density estimation”. In: arXiv preprint
arXiv:2003.05708 (2022)
Related References
Thank you for your attention!
[1] C. Bayer, C. Ben Hammouda, R. Tempone. Numerical Smoothing with
Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for
Efficient Option Pricing, arXiv:2111.01874 (2021), to appear in
Quantitative Finance.
[2] C. Bayer, C. Ben Hammouda, R. Tempone. Multilevel Monte Carlo
combined with numerical smoothing for robust and efficient option pricing
and density estimation, arXiv:2003.05708 (2022).
[3] C. Bayer, C. Ben Hammouda, A. Papapantoleon, M. Samet, R. Tempone.
Optimal Damping with Hierarchical Adaptive Quadrature for Efficient
Fourier Pricing of Multi-Asset Options in Lévy Models, arXiv:2203.08196
(2022).
[4] C. Bayer, C. Ben Hammouda, R. Tempone. Hierarchical adaptive sparse
grids and quasi-Monte Carlo for option pricing under the rough Bergomi
model, Quantitative Finance, 2020.
More Related Content

PDF
ICCF_2022_talk.pdf
PDF
MCQMC_talk_Chiheb_Ben_hammouda.pdf
PDF
Talk_HU_Berlin_Chiheb_benhammouda.pdf
PDF
Presentation.pdf
PDF
talk_NASPDE.pdf
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
PDF
Talk iccf 19_ben_hammouda
PDF
Numerical smoothing and hierarchical approximations for efficient option pric...
ICCF_2022_talk.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdf
Presentation.pdf
talk_NASPDE.pdf
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Talk iccf 19_ben_hammouda
Numerical smoothing and hierarchical approximations for efficient option pric...

Similar to Numerical Smoothing and Hierarchical Approximations for E cient Option Pricing and Density Estimation (20)

PDF
International journal of engineering and mathematical modelling vol2 no1_2015_1
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
PDF
Sparse data formats and efficient numerical methods for uncertainties in nume...
PDF
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
PDF
CDT 22 slides.pdf
PDF
Automatic bayesian cubature
PDF
PhD defense talk slides
PDF
Presentation.pdf
PDF
Fourier_Pricing_ICCF_2022.pdf
PDF
A kernel-free particle method: Smile Problem Resolved
PDF
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
Triggering patterns of topology changes in dynamic attributed graphs
PPT
Jörg Stelzer
PDF
Random Matrix Theory and Machine Learning - Part 4
PDF
KAUST_talk_short.pdf
PDF
NCE, GANs & VAEs (and maybe BAC)
PDF
The Sample Average Approximation Method for Stochastic Programs with Integer ...
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
International journal of engineering and mathematical modelling vol2 no1_2015_1
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Sparse data formats and efficient numerical methods for uncertainties in nume...
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
CDT 22 slides.pdf
Automatic bayesian cubature
PhD defense talk slides
Presentation.pdf
Fourier_Pricing_ICCF_2022.pdf
A kernel-free particle method: Smile Problem Resolved
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Triggering patterns of topology changes in dynamic attributed graphs
Jörg Stelzer
Random Matrix Theory and Machine Learning - Part 4
KAUST_talk_short.pdf
NCE, GANs & VAEs (and maybe BAC)
The Sample Average Approximation Method for Stochastic Programs with Integer ...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Ad

Recently uploaded (20)

PPTX
The Effect of Human Resource Management Practice on Organizational Performanc...
PPTX
_ISO_Presentation_ISO 9001 and 45001.pptx
PPTX
Sustainable Forest Management ..SFM.pptx
PPTX
Tour Presentation Educational Activity.pptx
PDF
COLEAD A2F approach and Theory of Change
PDF
oil_refinery_presentation_v1 sllfmfls.pdf
PPTX
water for all cao bang - a charity project
PPTX
fundraisepro pitch deck elegant and modern
DOCX
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
PPTX
An Unlikely Response 08 10 2025.pptx
PPTX
Anesthesia and it's stage with mnemonic and images
PPTX
Hydrogel Based delivery Cancer Treatment
PDF
Presentation1 [Autosaved].pdf diagnosiss
PPTX
Tablets And Capsule Preformulation Of Paracetamol
DOC
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
PPTX
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
PPT
First Aid Training Presentation Slides.ppt
PPTX
Emphasizing It's Not The End 08 06 2025.pptx
PPTX
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
PPTX
MERISTEMATIC TISSUES (MERISTEMS) PPT PUBLIC
The Effect of Human Resource Management Practice on Organizational Performanc...
_ISO_Presentation_ISO 9001 and 45001.pptx
Sustainable Forest Management ..SFM.pptx
Tour Presentation Educational Activity.pptx
COLEAD A2F approach and Theory of Change
oil_refinery_presentation_v1 sllfmfls.pdf
water for all cao bang - a charity project
fundraisepro pitch deck elegant and modern
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
An Unlikely Response 08 10 2025.pptx
Anesthesia and it's stage with mnemonic and images
Hydrogel Based delivery Cancer Treatment
Presentation1 [Autosaved].pdf diagnosiss
Tablets And Capsule Preformulation Of Paracetamol
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
First Aid Training Presentation Slides.ppt
Emphasizing It's Not The End 08 06 2025.pptx
AcademyNaturalLanguageProcessing-EN-ILT-M02-Introduction.pptx
MERISTEMATIC TISSUES (MERISTEMS) PPT PUBLIC
Ad

Numerical Smoothing and Hierarchical Approximations for E cient Option Pricing and Density Estimation

  • 1. Numerical Smoothing and Hierarchical Approximations for Efficient Option Pricing and Density Estimation Chiheb Ben Hammouda Christian Bayer Raúl Tempone Center for Uncertainty Quantification Center for Un Quantification nter for Uncertainty Quantification Logo Lock-up Stochastic Numerics and Statistical Learning: Theory and Applications, KAUST May 23, 2022
  • 2. Related Manuscripts to the Talk Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing”. In: arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance Journal (2022). Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. “Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation”. In: arXiv preprint arXiv:2003.05708 (2022) 1
  • 3. Outline 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions 1
  • 4. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 5. Option Pricing as a High-Dimensional, non-Smooth Numerical Integration Problem Option pricing in finance often correspond to high-dimensional integration problems with non-smooth integrands. ▸ High-dimensional: (i) Time-discretization of an SDE; (ii) A large number of underlying assets; (iii) Path dependence on the whole trajectory of the underlying. ▸ Non-smooth: E.g., call options (ST − K)+ , digital options 1ST >K, . . . Methods like quasi Monte Carlo (QMC) or (adaptive) sparse grids quadrature (ASGQ) critically rely on smoothness of integrand and dimension of the problem. Monte Carlo (MC) does not care (in terms of convergence rates) BUT multilevel Monte Carlo (MLMC) does. Figure 1.1: Integrand of two-dimensional European basket option (Black–Scholes model)
  • 6. Framework Approximate efficiently E[g(X(T))] Given a (smooth) φ ∶ Rd → R, the payoff g ∶ Rd → R has either jumps or kinks: ▸ Hockey-stick functions: g(x) = max(φ(x),0) (put or call payoffs, . . . ). ▸ Indicator functions: g(x) = 1(φ(x)≥0) (digital option, financial Greeks, distribution functions, . . . ). ▸ Dirac Delta functions: g(x) = δ(φ(x)=0) (density estimation, financial Greeks, . . . ). The process X is approximated (via a discretization scheme) by X, E.g., ▸ One/multi-dimensional geometric Brownian motion (GBM) process. ▸ Multi-dimensional stochastic volatility model: E.g., the Heston model dXt = µXtdt + √ vtXtdWX t dvt = κ(θ − vt)dt + ξ √ vtdWv t , (WX t ,Wv t ): correlated Wiener processes with correlation ρ. 3
  • 7. How Does Regularity Affect QMC, ASGQ, and MLMC Performance? 3
  • 8. Randomized QMC (rQMC) A (rank-1) lattice rule (Sloan 1985; Nuyens 2014) with n points Qn(g) ∶= 1 n n−1 ∑ k=0 g ( kz mod n n ), where z = (z1,...,zd) ∈ Nd (the generating vector). A randomly shifted lattice rule Qn,q(g) = 1 q q−1 ∑ i=0 Q(i) n (f) = 1 q q−1 ∑ i=0 ( 1 n n−1 ∑ k=0 g ( kz + ∆(i) mod n n )), where {∆(i) }q i=1: independent random shifts, and MrQMC = q × n. " See previous talk by Bruno Tuffin for further details about QMC. 4
  • 9. How Does Regularity Affect QMC Performance? The analysis in (Niederreiter 1992) shows that the convergence rate of rQMC is O (M − 1 2 −δ rQMC (log MrQMC)d ), where 0 ≤ δ ≤ 1 2 is related to the degree of regularity of the integrand g. (Sloan and Woźniakowski 1998; Dick, Kuo, and Sloan 2013) show that convergence rates of O (M−1 rQMC) can be observed for the lattice rule if the integrand g belongs to Wd,γ equipped with the norm ∣∣g∣∣2 Wd,γ = ∑ α⊆{1∶d} 1 γα ∫ [0,1]∣α∣ (∫ [0,1]d−∣α∣ ∂∣α∣ ∂yα f(y)dy−α) 2 dyα, (1) where yα ∶= (yj)j∈α and y−α ∶= (yj)j∈{1∶d}∖α. Notation Wd,γ: a d-dimensional weighted Sobolev space of functions with square-integrable mixed first derivatives. γ ∶= {γα > 0 ∶ α ⊆ {1,2,...,d}} being a given collection of weights, and d being the dimension of the problem. 5
  • 10. ASGQ (I) Given F ∶ Rd → R and a multi-index β ∈ Nd +. Fβ ∶= Qm(β) [F] a quadrature operator based on a Cartesian quadrature grid (m(βn) points along yn, with m ∶ N → N a strictly increasing function). " Approximating E[F] with Fβ is not an appropriate option due to the well-known curse of dimensionality. Idea: A quadrature estimate of E[F] is MI` [F] = ∑ β∈I` ∆[Fβ], where ▸ The mixed (first-order tensor) difference operators: ∆[Fβ] = ⊗d i=1∆iFβ ▸ The first-order difference operators: ∆iFβ { Fβ − Fβ−ei , if βi > 1 Fβ if βi = 1 with ei denotes the ith d-dimensional unit vector. For instance, when d = 2, then ∆Fβ = ∆2∆1F(β1,β2) = ∆2 (F(β1,β2) − F(β1−1,β2)) = ∆2F(β1,β2) − ∆2F(β1−1,β2) = F(β1,β2) − F(β1,β2−1) − F(β1−1,β2) + F(β1−1,β2−1). " See previous talk by Lorenzo Tamellini for further details about ASGQ in relation to MISC, which is a more general version. 6
  • 11. ASGQ (II) E[F] ≈ MI` [F] = ∑ β∈I` ∆[Fβ], Product approach: I` = {∣∣ β ∣∣∞≤ `; β ∈ Nd +}. Regular sparse grids: I` = {∣∣ β ∣∣1≤ ` + d − 1; β ∈ Nd +} Adaptive sparse grids quadrature (ASGQ) (Gerstner and Griebel 1998): The construction of I` = IASGQ is done a posteriori and adaptively by profit thresholding IASGQ = {β ∈ Nd + ∶ Pβ ≥ T}. ▸ Profit of a hierarchical surplus Pβ = ∣∆Eβ∣ ∆Wβ . ▸ Error contribution: ∆Eβ = ∣MI∪{β} − MI ∣. ▸ Work contribution: ∆Wβ = Work[MI∪{β} ] − Work[MI ]. Figure 1.2: Left are product grids ∆β1 ⊗ ∆β2 for 1 ≤ β1,β2 ≤ 3. Right is the corresponding sparse grids construction. Figure 1.3: Illustration of ASG grid 7
  • 12. How Does Regularity Affect ASGQ Performance? Product approach: EQ(M) = O (M−r/d ) (for functions with bounded total derivatives up to order r). Adaptive sparse grids quadrature (ASGQ): From the analysis in (Chen 2018; Ernst, Sprungk, and Tamellini 2018) EQ(M) = O (M−p/2 ) (where1 p is independent from the problem dimension, and is related to the order up to which the weighted mixed derivatives are bounded). Notation: M: number of quadrature points; EQ: quadrature error. 1 characterizes the relation between the regularity of the integrand and the anisotropic property of the integrand with respect to different dimensions. 8
  • 13. Multilevel Monte Carlo (MLMC) (Heinrich 2001; Kebaier et al. 2005; Giles 2008b) Aim: Improve MC complexity, when estimating E[g(X(T))]. Setting: ▸ A hierarchy of nested meshes of [0,T], indexed by {`}L `=0. ▸ ∆t` = K−` ∆t0: The time steps size for levels ` ≥ 1; K>1, K ∈ N. ▸ X` ∶= X∆t` : The approximate process generated using a step size of ∆t`. MLMC idea E[g(XL(T))] = E[g(X0(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ + L ∑ `=1 E[g(X`(T)) − g(X`−1(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ (2) Var[g(X0(T))] ≫ Var[g(X`(T)) − g(X`−1(T))] ↘ as ` ↗ M0 ≫ M` ↘ as ` ↗ MLMC estimator: ̂ Q ∶= L ∑ `=0 ̂ Q`, ̂ Q0 ∶= 1 M0 M0 ∑ m0=1 g(X0,[m0](T)); ̂ Q` ∶= 1 M` M` ∑ m`=1 (g(X`,[m`](T)) − g(X`−1,[m`](T))), 1 ≤ ` ≤ L 9
  • 14. How Does Regularity Affect MLMC Complexity? Complexity analysis for MLMC MLMC Complexity (Cliffe, Giles, Scheichl, and Teckentrup 2011) O (TOL −2−max(0, γ−β α ) log (TOL)2×1{β=γ} ) (3) i) Weak rate: ∣E[g (X`(T)) − g (X(T))]∣ ≤ c12−α` ii) Variance decay rate: Var[g (X`(T)) − g (X`−1(T))] ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∶=V` ≤ c22−β` iii) Work growth rate: W` ≤ c32γ` (W`: expected cost) For forward Euler (γ = 1): If g is Lipschitz ⇒ V` ≃ ∆t` due to strong rate 1/2, that is β = γ. Otherwise β < γ ⇒ worst-case complexity. Higher order schemes, E.g., the Milstein scheme, may lead to better complexities even for non-Lipschitz observables (Giles, Debrabant, and Rößler 2013; Giles 2015). However, ▸ For moderate/high-dimensional dynamics, coupling issues may arise and the scheme becomes computationally expensive. ▸ Deterioration of the robustness of the MLMC estimator because the kurtosis explodes as ∆t` decreases: O (∆t−1 ` ) compared with O (∆t −1/2 ` ) for Euler without smoothing (Giles, Nagapetyan, and Ritter 2015) and O (1) in (Bayer, Ben Hammouda, and Tempone 2022) (see next slides). 10
  • 15. How Does Regularity Affect MLMC Robustness? / For non-lipschitz payoffs: The Kurtosis, κ` ∶= E[(Y`−E[Y`])4 ] (Var[Y`])2 is of O(∆t −1/2 ` ) for Euler scheme and O(∆t−1 ` ) for the Milstein scheme. Large kurtosis problem: discussed previously in (Ben Hammouda, Moraes, and Tempone 2017; Ben Hammouda, Ben Rached, and Tempone 2020) ⇒ / Expensive cost for reliable/robust estimates of sample statistics. Why is large kurtosis bad? σS2(Y`) = Var[Y`] √ M` √ (κ` − 1) + 2 M` − 1 ; " M` ≫ κ`. Why are accurate variance estimates, V` = Var[Y`], important? M∗ ` ∝ √ V`W−1 ` L ∑ `=0 √ V`W`. Notation Y` ∶= g(X`(T)) − g(X`−1(T)) σS2(Y`): Standard deviation of the sample variance of Y`; κ`: the kurtosis; V` = Var[Y`]; M`: number of samples; M∗ ` : Optimal number of samples per level; W`: Cost per sample path. 11
  • 16. Previous Successful Smoothing Techniques for QMC and ASGQ Mollification: E.g., by convolution with Gaussian kernel or manually; (-) Additional error that may scale exponentially with the dimension. Mapping the problem to the frequency space (E.g., (Bayer, Ben Hammouda, Papapantoleon, Samet, and Tempone 2022)): Better regularity compared to the physical space; (-) Applicable when the Fourier transform of the density function is available and cheap to compute. Bias-free mollification (E.g., (Bayer, Siebenmorgen, and Tempone 2018; Bayer, Ben Hammouda, and Tempone 2020)) by taking conditional expectations or exact integration over subset of integration variables; (-) Not always possible. Example 1.1 ((Romano and Touzi 1997)) (St,vt) stochastic. volatility. model, S driven by Bm Z ∶= ρW + √ 1 − ρ2B, v σ(W⋅)–measurable. Then E [g(ST )] = E [g (S0 exp(∫ T 0 √ vtdZt − 1 2 ∫ T 0 vtdt))] = E [E [g(ST ) ∣ σ(W⋅)]] = E ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ CBS ⎛ ⎝ S0 = S0 exp(ρ∫ T 0 √ vtdWt − ρ2 2 ∫ T 0 vtdt),σ2 = (1 − ρ2 )∫ T 0 vtdt ⎞ ⎠ ⎤ ⎥ ⎥ ⎥ ⎥ ⎦ .
  • 17. Previous Successful Smoothing Techniques for MLMC Implicit smoothing based on conditional expectation combined with the Milstein scheme: (E.g, (Giles, Debrabant, and Rößler 2013)); (-) Not always possible and drawbacks of Milstein scheme. Parametric smoothing: carefully constructing a regularized version of the observable (E.g, (Giles, Nagapetyan, and Ritter 2015)); (-) Possible deterioration of the strong convergence behavior, and additional bias that may increase exponentially with the dimension (see next slides). Malliavin calculus integration by parts: (E.g, (Altmayer and Neuenkirch 2015)): splitting the payoff function into a smooth component (treated by standard MLMC) and a compactly supported discontinuous part (treated via Malliavin MLMC). Adaptive sampling: (E.g, (Haji-Ali, Spence, and Teckentrup 2021))
  • 18. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 19. Numerical Smoothing Steps Guiding example: g ∶ Rd → R payoff function, X ∆t T (∆t = T N ) Euler discretization of d-dimensional SDE , E.g., dX (i) t = ai(Xt)dt + ∑d j=1 bij(Xt)dW (j) t , where {W(j) }d j=1 are standard Brownian motions. X ∆t T = X ∆t T (∆W (1) 1 ,...,∆W (1) N ,...,∆W (d) 1 ,...,∆W (d) N ) ´¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¸¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¹¶ ∶=∆W . E[g(X ∆t T )] =? 1 Identify hierarchical representation of integration variables: (a) X ∆t T (∆W) ≡ X ∆t T (Z), Z = (Zi)dN i=1 ∼ N(0,IdN ): s.t. “Z1 ∶= (Z (1) 1 ,...,Z (d) 1 ) substantially contributes even for ∆t → 0”. E.g., Brownian bridges of ∆W / Haar wavelet construction of W. " Different from previous techniques which smooth out only at the final step with respect to ∆W ⇒ the smoothing effect vanishes as ∆t → 0. (b) Design of a sub-optimal smoothing direction (A: rotation matrix which is easy to construct and whose structure depends on the payoff g.) Y = AZ1. 14
  • 20. Numerical Smoothing Steps 2 E[g(X(T))] ≈ E[g(X ∆t (T))] = ∫ Rd×N G(z)ρd×N (z)dz (1) 1 ...dz (1) N ...dz (d) 1 ...dz (d) N = ∫ RdN−1 I(y−1,z (1) −1 ,...,z (d) −1 )ρd−1(y−1)dy−1ρdN−d(z (1) −1 ,...,z (d) −1 )dz (1) −1 ...dz (d) −1 = E[I(Y−1,Z (1) −1 ,...,Z (d) −1 )] ≈ E[I(Y−1,Z (1) −1 ,...,Z (d) −1 )], (4) I(y−1,z (1) −1 ,...,z (d) −1 ) = ∫ R G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 = ∫ y∗ 1 −∞ G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 + ∫ +∞ y∗ 1 G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρy1 (y1)dy1 ≈ I(y−1,z (1) −1 ,...,z (d) −1 ) ∶= Mlag ∑ k=0 ηkG(ζk (y∗ 1),y−1,z (1) −1 ,...,z (d) −1 ), (5) 3 Compute the remaining (dN − 1)-integral in (4) by ASGQ or QMC or MLMC. Notation G ∶= g ○ Φ ○ (ψ(1) ,...,ψ(d) ) maps N × d Gaussian random inputs to g(X ∆t (T)); where ▸ ψ(j) ∶ (Z (j) 1 ,...,Z (j) N ) ↦ (B (j) 1 ,...,B (j) N ) denotes the mapping of the Brownian bridge construction. ▸ Φ ∶ (∆t,B) ↦ X ∆t (T) denotes the mapping of the time-stepping scheme. y∗ 1 (y−1,z (1) −1 ,...,z (d) −1 ): the exact discontinuity location s.t φ(X ∆t (T)) = P(y∗ 1 ;y−1,z (1) −1 ,...,z (d) −1 ) = 0; y∗ 1(y−1,z (1) −1 ,...,z (d) −1 ): the approximated discontinuity location via root finding; MLag: the number of Laguerre quadrature points ζk ∈ R, and corresponding weights ηk; ρd×N (z) = 1 (2π)d×N/2 e−1 2 zT z . 15
  • 21. Extending Numerical Smoothing for Multiple Discontinuities Multiple Discontinuities: Due to the payoff structure/use of Richardson extrapolation. R different ordered multiple roots, e.g., {y∗ i }R i=1, the smoothed integrand is I (y−1,z (1) −1 ,...,z (d) −1 ) = ∫ y∗ 1 −∞ G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1 + ∫ +∞ y∗ R G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1 + R−1 ∑ i=1 ∫ y∗ i+1 y∗ i G(y1,y−1,z (1) −1 ,...,z (d) −1 )ρ1(y1)dy1, and its approximation I is given by I(y−1,z (1) −1 ,...,z (d) −1 ) ∶= MLag,1 ∑ k=0 ηLag k G(ζLag k,1 (y∗ 1),y−1,z (1) −1 ,...,z (d) −1 ) + MLag,R ∑ k=0 ηLag k G(ζLag k,R (y∗ R),y−1,z (1) −1 ,...,z (d) −1 ) + R−1 ∑ i=1 ⎛ ⎝ MLeg,i ∑ k=0 ηLeg k G(ζLeg k,i (y∗ i ,y∗ i+1),y−1,z (1) −1 ,...,z (d) −1 ) ⎞ ⎠ , {y∗ i }R i=1: the approximated discontinuities locations; MLag,1 and MLag,R: the number of Laguerre quadrature points ζLag .,. ∈ R with corresponding weights ηLag . ; {MLeg,i}R−1 i=1 : the number of Legendre quadrature points ζLeg .,. with corresponding weights ηLeg . . I can be approximated further depending on (i) the decay of G × ρ1 in the semi-infinite domains and (ii) how close the roots are to each other.
  • 22. Extending Numerical Smoothing for Density Estimation Goal: Approximate the density ρX at u, for a stochastic process X ρX(u) = E[δ(X − u)], δ is the Dirac delta function. " Without any smoothing techniques (regularization, kernel density,. . . ) MC and MLMC fail due to the infinite variance caused by the singularity of the function δ. Strategy: in (Bayer, Hammouda, and Tempone 2022) 1 Conditioning with respect to the Brownian bridge ρX(u) = 1 √ 2π E[exp(−(Y ∗ 1 (u)) 2 /2)∣ dY ∗ 1 du (u)∣] ≈ 1 √ 2π E ⎡ ⎢ ⎢ ⎢ ⎣ exp(−(Y ∗ 1(u)) 2 /2) R R R R R R R R R R R dY ∗ 1 du (u) R R R R R R R R R R R ⎤ ⎥ ⎥ ⎥ ⎦ , (6) Y ∗ 1 (u;Z−1): the exact discontinuity; Y ∗ 1(u;Z−1): the approximated discontinuity.
  • 23. Why not Kernel Density Techniques in Multiple Dimensions? Similar to approaches based on parametric regularization as in (Giles, Nagapetyan, and Ritter 2015). This class of approaches has a pointwise error that increases exponentially with respect to the dimension of the state vector X. For a d-dimensional problem, a kernel density estimator with a bandwidth matrix, H = diag(h,...,h) MSE ≈ c1M−1 h−d + c2h4 . (7) M is the number of samples, and c1 and c2 are constants. Our approach in high dimension: For u ∈ Rd ρX(u) = E[δ(X − u)] = E[ρd (Y∗ (u))∣det(J(u))∣] ≈ E[ρd (Y ∗ (u))∣det(J(u))∣], (8) where ▸ Y∗ (u;⋅): the exact discontinuity; Y ∗ (u;⋅): the approximated discontinuity. ▸ J is the Jacobian matrix, with Jij = ∂y∗ i ∂uj ; ρd(⋅) is the multivariate Gaussian density. Thanks to the exact conditional expectation with respect to the Brownian bridge ⇒ the smoothing error in our approach is insensitive to the dimension of the problem. 18
  • 24. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
  • 25. Notations The Haar basis functions, ψn,k, of L2 ([0,1]) with support [2−n k,2−n (k + 1)]: ψ−1(t) ∶= 1[0,1](t); ψn,k(t) ∶= 2n/2 ψ (2n t − k), n ∈ N0, k = 0,...,2n − 1, where ψ(⋅) is the Haar mother wavelet: ψ(t) ∶= ⎧ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎩ 1, 0 ≤ t < 1 2 , −1, 1 2 ≤ t < 1, 0, else, Grid Dn ∶= {tn ` ∣ ` = 0,...,2n+1 } with tn ` ∶= ` 2n+1 T. Observe: the Haar basis functions up to level n are piece-wise constants with points of discontinuity given by Dn . For i.i.d. standard normal rdvs Z−1, Zn,k, n ∈ N0, k = 0,...,2n − 1, we define the (truncated) standard Brownian motion WN t ∶= Z−1Ψ−1(t) + N ∑ n=0 2n −1 ∑ k=0 Zn,kΨn,k(t). with Ψ−1(⋅) and Ψn,k(⋅) are the antiderivatives of the Haar basis functions. 19
• 26. Notations
We define the corresponding increments, for any function or process F, as
Δ_ℓ^N F := F(t_{ℓ+1}^N) − F(t_ℓ^N).
The Euler scheme for dX_t = b(X_t) dW_t, X_0 = x ∈ R, along D^N reads
X_{ℓ+1}^N := X_ℓ^N + b(X_ℓ^N) Δ_ℓ^N W,  ℓ = 0, ..., 2^{N+1} − 1,   (9)
with X_0^N := X_0 = x; for convenience, we also define X_T^N := X_{2^{N+1}}^N.
We define the deterministic function H^N : R^{2^{N+1}−1} → R as
H^N(z_N) := E_{Z_{-1}}[g(X_T^N(Z_{-1}, z_N))],   (10)
where Z_N := (Z_{n,k})_{n=0,...,N, k=0,...,2^n−1}.
• 27. Assumptions
Assumption 3.1
There are positive rdvs C_p with finite moments of all orders such that, for all N ∈ N and all ℓ_1, ..., ℓ_p ∈ {0, ..., 2^{N+1} − 1},
|∂^p X_T^N / (∂X_{ℓ_1}^N ⋯ ∂X_{ℓ_p}^N)| ≤ C_p  a.s.,
that is, ∂^p X_T^N / (∂X_{ℓ_1}^N ⋯ ∂X_{ℓ_p}^N) = O_P(1), where for sequences of rdvs F_N and G_N we write F_N = O_P(G_N) if there is a rdv C with finite moments of all orders such that |F_N| ≤ C |G_N| a.s. for all N.
Assumption 3.1 is natural because it is fulfilled whenever the diffusion coefficient b(⋅) is smooth; this is the case for many option pricing models.
• 28. Assumptions
Assumption 3.2
For any p ∈ N, we have (with y := z_{-1})
(∂X_T^N/∂y (Z_{-1}, Z_N))^{-p} = O_P(1).
In (Bayer, Ben Hammouda, and Tempone 2022), we give sufficient conditions under which this assumption holds. For instance, Assumption 3.2 is valid for
▸ one-dimensional models with a linear or constant diffusion;
▸ multivariate models with a linear drift and constant diffusion, including the multivariate lognormal model (see (Bayer, Siebenmorgen, and Tempone 2018)).
There may be cases in which Assumption 3.2 is not fulfilled, e.g., X_T = W_T^2. However, our method still works well in such cases since g(X_T) = G(y_1^2).
• 29. The Smoothness Theorem
Theorem 3.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Assume that X_T^N, defined by (9), satisfies Assumptions 3.1 and 3.2. Then, for any p ∈ N and indices n_1, ..., n_p and k_1, ..., k_p (satisfying 0 ≤ k_j < 2^{n_j}), the function H^N defined in (10) satisfies, with constants increasing in p and N but independent of n_j and k_j,
∂^p H^N / (∂z_{n_1,k_1} ⋯ ∂z_{n_p,k_p}) (z_N) = O_P(2^{−∑_{j=1}^p n_j / 2}).
In particular, H^N is of class C^∞.
• 30. Sketch of the Proof
1 We consider a mollified version g_δ of g and the corresponding function H_δ^N (defined by replacing g with g_δ in (10)).
2 We prove that integration and differentiation can be interchanged, which implies
∂H_δ^N(z_N)/∂z_{n,k} = E[g_δ'(X_T^N(Z_{-1}, z_N)) ∂X_T^N(Z_{-1}, z_N)/∂z_{n,k}].
3 Multiplying and dividing by ∂X_T^N(Z_{-1}, z_N)/∂y and writing the expectation as an integral with respect to the standard normal density, we obtain
∂H_δ^N(z_N)/∂z_{n,k} = ∫_R ∂g_δ(X_T^N(y, z_N))/∂y (∂X_T^N/∂y (y, z_N))^{-1} ∂X_T^N/∂z_{n,k}(y, z_N) (1/√(2π)) e^{−y²/2} dy.   (11)
4 We show that integration by parts is possible; we can then discard the mollification and obtain the smoothness of H^N because
∂H^N(z_N)/∂z_{n,k} = −∫_R g(X_T^N(y, z_N)) ∂/∂y [ (∂X_T^N/∂y (y, z_N))^{-1} ∂X_T^N/∂z_{n,k}(y, z_N) (1/√(2π)) e^{−y²/2} ] dy.
5 Higher-order derivatives follow by successively applying the same technique: dividing by ∂X_T^N/∂y and then integrating by parts.
  • 31. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 32. Overcoming the High Dimensionality for ASGQ and QMC
We combine ASGQ and QMC with hierarchical representations, as in (Bayer, Ben Hammouda, and Tempone 2020):
▸ Brownian bridges as the Wiener path generation method ⇒ reduces the effective dimension of the problem (see the sketch below).
▸ Richardson extrapolation ⇒ faster convergence of the weak error ⇒ fewer time steps to achieve a given error tolerance ⇒ smaller total dimension of the input space.
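A minimal sketch of the Brownian-bridge path construction (standard bisection ordering; the horizon T, the number of steps, and the function name are illustrative assumptions). The first coordinates fix the terminal value and the coarse midpoints, so most of the variance is carried by the leading QMC dimensions:

import numpy as np

def brownian_bridge_path(z, T=1.0):
    # Map N(0,1) inputs z (length 2**m) to W on the grid t_i = i*T/2**m, i = 0..2**m,
    # filling the path by bisection: z[0] fixes W_T, later entries refine midpoints.
    n = len(z)                       # must be a power of two
    m = int(np.log2(n))
    W = np.zeros(n + 1)
    W[-1] = np.sqrt(T) * z[0]        # terminal value first
    idx, h = 1, n
    for level in range(m):
        h_half = h // 2
        for left in range(0, n, h):
            mid, right = left + h_half, left + h
            dt = (right - left) * T / n
            mean = 0.5 * (W[left] + W[right])
            W[mid] = mean + 0.5 * np.sqrt(dt) * z[idx]   # conditional std = sqrt(dt)/2
            idx += 1
        h = h_half
    return W

# Example: an 8-step path from 8 standard normal (or QMC-transformed) inputs
rng = np.random.default_rng(2)
print(brownian_bridge_path(rng.standard_normal(8)))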
• 33. Error Discussion for ASGQ
Let Q_N^{ASGQ} denote the ASGQ estimator. Then
E[g(X(T))] − Q_N^{ASGQ}
  = E[g(X(T))] − E[g(X^{Δt}(T))]   (Error I: bias or weak error, O(Δt))
  + E[I(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})]   (Error II: numerical smoothing error, O(M_{Lag}^{-s}) + O(TOL_{Newton}^{κ+1}))
  + E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − Q_N^{ASGQ}   (Error III: ASGQ error, O(M_{ASGQ}^{-p/2})),   (12)
where
▸ M_{ASGQ}: the number of quadrature points used by the ASGQ estimator, and p > 0;
▸ \bar{y}_1^*: the approximated location of the non-smoothness obtained by Newton iteration, so that |y_1^* − \bar{y}_1^*| = TOL_{Newton}; κ ≥ 0 (κ = 0: Heaviside payoff (digital option); κ = 1: call or put payoffs);
▸ M_{Lag}: the number of points used by the Laguerre quadrature for the one-dimensional pre-integration step;
▸ s > 0: the derivatives of G with respect to y_1 are bounded up to order s.
• 34. Work and Complexity Discussion for ASGQ
The optimal performance of ASGQ is obtained by solving
min over (M_{ASGQ}, M_{Lag}, TOL_{Newton}) of Work_{ASGQ} ∝ M_{ASGQ} × M_{Lag} × Δt^{-1},  subject to  E_{total,ASGQ} = TOL,   (13)
where
E_{total,ASGQ} := E[g(X(T))] − Q_N^{ASGQ} = O(Δt) + O(M_{ASGQ}^{-p/2}) + O(M_{Lag}^{-s}) + O(TOL_{Newton}^{κ+1}).
We show in (Bayer, Ben Hammouda, and Tempone 2022) that, under certain conditions on the regularity parameters s and p (p, s ≫ 1),
▸ Work_{ASGQ} = O(TOL^{-1}) (best case), compared with Work_{MC} = O(TOL^{-3}) (best case of MC).
• 35. Multilevel Monte Carlo with Numerical Smoothing
Recall our QoI:
E[g(X(T))] ≈ E[g(X^{Δt}(T))] = ∫_{R^{d×N}} G(z) ρ_{d×N}(z) dz_1^{(1)} ... dz_N^{(1)} ... dz_1^{(d)} ... dz_N^{(d)}
  = ∫_{R^{dN−1}} I(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{d−1}(y_{-1}) dy_{-1} ρ_{dN−d}(z_{-1}^{(1)}, ..., z_{-1}^{(d)}) dz_{-1}^{(1)} ... dz_{-1}^{(d)}
  = E[I(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] ≈ E[\bar{I}(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})],
with
I(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) = ∫_R G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1
  = ∫_{−∞}^{y_1^*} G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1 + ∫_{y_1^*}^{+∞} G(y_1, y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) ρ_{y_1}(y_1) dy_1
  ≈ \bar{I}(y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}) := ∑_{k=0}^{M_{Lag}} η_k G(ζ_k(\bar{y}_1^*), y_{-1}, z_{-1}^{(1)}, ..., z_{-1}^{(d)}).
Denote by \bar{I}_ℓ := \bar{I}_ℓ(y_{-1}^ℓ, z_{-1}^{(1),ℓ}, ..., z_{-1}^{(d),ℓ}) the level-ℓ Euler approximation of I, computed with step size Δt_ℓ, M_{Lag,ℓ} Laguerre quadrature points, and Newton tolerance TOL_{Newton,ℓ}.
MLMC estimator:
\hat{Q}_{MLMC} := ∑_{ℓ=0}^{L} \hat{Q}_ℓ,   (14)
with \hat{Q}_0 := (1/M_0) ∑_{m_0=1}^{M_0} \bar{I}_{0,[m_0]};  \hat{Q}_ℓ := (1/M_ℓ) ∑_{m_ℓ=1}^{M_ℓ} (\bar{I}_{ℓ,[m_ℓ]} − \bar{I}_{ℓ−1,[m_ℓ]}),  1 ≤ ℓ ≤ L.
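A skeletal Python version of the estimator (14), with the standard choice of the per-level sample sizes from pilot-estimated variances (a sketch under the assumption that a coupled sampler of \bar{I}_ℓ − \bar{I}_{ℓ−1} is available; the function names and the dummy sampler are placeholders, not the paper's code):

import numpy as np

def mlmc(sample_level_difference, cost_per_level, tol, L, M_pilot=100):
    # MLMC estimator (14): Q_hat = sum_l (1/M_l) sum_m (I_bar_l - I_bar_{l-1}), with
    # M_l proportional to sqrt(V_l / C_l) so the statistical error is about tol.
    # sample_level_difference(l, size): i.i.d. samples of I_bar_l - I_bar_{l-1}
    # (I_bar_{-1} := 0), coarse and fine coupled through the same Brownian increments.
    V = np.array([sample_level_difference(l, M_pilot).var(ddof=1) for l in range(L + 1)])
    C = np.asarray(cost_per_level, dtype=float)
    M = np.maximum(1, np.ceil(2.0 / tol ** 2 * np.sqrt(V / C) * np.sum(np.sqrt(V * C)))).astype(int)
    Q = sum(sample_level_difference(l, M[l]).mean() for l in range(L + 1))
    return Q, M

# Dummy sampler (purely illustrative): level differences with geometrically decaying variance.
rng = np.random.default_rng(0)
dummy = lambda l, size: 2.0 ** (-l) * rng.standard_normal(size) + (1.0 if l == 0 else 0.0)
print(mlmc(dummy, cost_per_level=[2 ** l for l in range(4)], tol=1e-2, L=3))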
• 36. Multilevel Monte Carlo with Numerical Smoothing: Variance Decay, Complexity and Robustness
Corollary 4.1 ((Bayer, Ben Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, V_ℓ := Var[\bar{I}_ℓ − \bar{I}_{ℓ−1}] = O(Δt_ℓ), compared with O(Δt_ℓ^{1/2}) for MLMC without smoothing.
Corollary 4.2 ((Bayer, Ben Hammouda, and Tempone 2022))
Under Assumptions 3.1 and 3.2, the complexity of MLMC combined with numerical smoothing, using the Euler discretization, is O(TOL^{-2} (log(TOL))^2), compared with O(TOL^{-2.5}) for MLMC without smoothing.
Corollary 4.3 ((Bayer, Ben Hammouda, and Tempone 2022))
Let κ_ℓ be the kurtosis of the random variable Y_ℓ := \bar{I}_ℓ − \bar{I}_{ℓ−1}. Under Assumptions 3.1 and 3.2, κ_ℓ = O(1), compared with O(Δt_ℓ^{-1/2}) for MLMC without smoothing.
• 37. Work Discussion for MLMC
Let \hat{Q}_{MLMC} be the MLMC estimator defined in (14). Then
E[g(X(T))] − \hat{Q}_{MLMC}
  = E[g(X(T))] − E[g(X^{Δt_L}(T))]   (Error I: bias or weak error, O(Δt_L))
  + E[I_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − E[\bar{I}_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})]   (Error II: numerical smoothing error, O(M_{Lag,L}^{-s}) + O(TOL_{Newton,L}^{κ+1}))
  + E[\bar{I}_L(Y_{-1}, Z_{-1}^{(1)}, ..., Z_{-1}^{(d)})] − \hat{Q}_{MLMC}   (Error III: MLMC statistical error, O(√(∑_{ℓ=L_0}^{L} √(M_{Lag,ℓ} + log(TOL_{Newton,ℓ}^{-1}))))).   (15)
  • 38. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 39. ASGQ Quadrature Error Convergence
[Plots: relative quadrature error vs. M_ASGQ for ASGQ with and without numerical smoothing.]
Figure 5.1: Digital option under the Heston model: comparison of the relative quadrature error convergence for the ASGQ method with and without numerical smoothing. (a) Without Richardson extrapolation (N = 8); (b) with Richardson extrapolation (N_{fine level} = 8).
• 40. QMC Error Convergence
[Plots: 95% statistical error vs. number of QMC samples. (a) Digital option under Heston: slope −0.52 without smoothing vs. −0.85 with numerical smoothing. (b) Digital option under GBM: slope −0.64 without smoothing vs. −0.92 with numerical smoothing.]
Figure 5.2: Comparison of the 95% statistical error convergence for rQMC with and without numerical smoothing. (a) Digital option under Heston, (b) digital option under GBM.
• 41. Errors in the Numerical Smoothing
[Plots: relative numerical smoothing error (a) vs. M_Lag (observed slope ≈ −4.02) and (b) vs. TOL_Newton.]
Figure 5.3: Call option under GBM with N = 4: the relative numerical smoothing error for a fixed number of ASGQ points M_ASGQ = 10^3, plotted against (a) different values of M_Lag with a fixed Newton tolerance TOL_Newton = 10^{-10}, and (b) different values of TOL_Newton with a fixed number of Laguerre quadrature points M_Lag = 128.
• 42. Comparing ASGQ with MC
[Plot: computational work vs. total relative error for MC and for ASGQ with numerical smoothing, each with and without Richardson extrapolation.]
Figure 5.4: Digital option under Heston: computational work comparison of ASGQ with numerical smoothing and MC, for the different configurations of the level of Richardson extrapolation.
• 43. Numerical Results for ASGQ
We consider digital, call, and basket options under the Heston and discretized GBM models. ASGQ with numerical smoothing is 10-100 times faster than MC in dimensions around 20.
Example                                  | Total relative error | CPU time (ASGQ/MC) in %
Single digital option (GBM)              | 0.4%                 | 0.2%
Single call option (GBM)                 | 0.5%                 | 0.3%
Single digital option (Heston)           | 0.4%                 | 3.2%
Single call option (Heston)              | 0.5%                 | 0.4%
4-dimensional basket call option (GBM)   | 0.8%                 | 7.4%
Table 1: Summary of the relative errors and computational gains achieved by ASGQ with numerical smoothing compared with the MC method, to reach a given error tolerance. The CPU time ratios are computed for the best configuration with Richardson extrapolation for each method.
• 44. Numerical Results for MLMC
Method                                                 | κ_L | α | β   | γ | Numerical complexity
Without smoothing, digital option under GBM            | 709 | 1 | 1/2 | 1 | O(TOL^{-2.5})
With numerical smoothing, digital option under GBM     | 3   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
Without smoothing, digital option under Heston         | 245 | 1 | 1/2 | 1 | O(TOL^{-2.5})
With numerical smoothing, digital option under Heston  | 7   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
With numerical smoothing, GBM density                  | 5   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
With numerical smoothing, Heston density               | 8   | 1 | 1   | 1 | O(TOL^{-2} (log(TOL))^2)
Table 2: Summary of the MLMC numerical results observed for the different examples. κ_L is the kurtosis at the deepest level of MLMC; (α, β, γ) are the weak-error, variance-decay, and work rates, respectively. TOL is the user-selected MLMC tolerance.
• 45. Digital Option under the Heston Model: MLMC Without Smoothing
[Plots: level-wise weak error, variance, cost, and kurtosis across MLMC levels.]
Figure 5.5: Digital option under Heston: convergence plots for MLMC without smoothing.
• 46. Digital Option under the Heston Model: MLMC With Numerical Smoothing
[Plots: level-wise weak error, variance, cost, and kurtosis across MLMC levels.]
Figure 5.6: Digital option under Heston: convergence plots for MLMC with numerical smoothing.
• 47. Digital Option under the Heston Model: Numerical Complexity Comparison
[Plot: expected work E[W] vs. TOL for (i) standard MLMC (reference slope TOL^{-2.5}) and (ii) MLMC with numerical smoothing (reference slope TOL^{-2} log(TOL)^2).]
Figure 5.7: Digital option under Heston: comparison of the numerical complexity of (i) standard MLMC and (ii) MLMC with numerical smoothing. The numerical complexity improves from O(TOL^{-5/2}) without smoothing to O(TOL^{-2} log(TOL)^2) with numerical smoothing.
  • 48. 1 Motivation and Framework 2 The Numerical Smoothing Idea 3 The Smoothness Theorem 4 Combining Numerical Smoothing with ASGQ, QMC and MLMC 5 Numerical Experiments and Results 6 Conclusions
• 49. Conclusions in the Context of Deterministic Quadrature Methods
1 We introduce a novel numerical smoothing technique, combined with ASGQ/QMC, hierarchical Brownian bridges, and Richardson extrapolation.
2 Our analysis and numerical experiments show that this approach substantially outperforms the MC method in high-dimensional cases and for dynamics where a time discretization is needed.
3 We provide a smoothness analysis of the smoothed integrand in the time-stepping setting, together with a related error discussion of our approach.
4 We illustrate numerically that traditional schemes for the Heston dynamics have low regularity, and we use an alternative smooth scheme that simulates the volatility process as a sum of Ornstein-Uhlenbeck (OU) processes.
5 More details can be found in: Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. "Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing". arXiv preprint arXiv:2111.01874 (2021), to appear in Quantitative Finance (2022).
• 50. Conclusions in the Context of MLMC Methods
1 We propose a numerical smoothing approach that can be combined with the MLMC estimator for efficient option pricing and density estimation.
2 Compared with the case without smoothing:
▸ We significantly reduce the kurtosis at the deep levels of MLMC, which improves the robustness of the estimator.
▸ We improve the strong convergence rate ⇒ improvement of the MLMC complexity from O(TOL^{-2.5}) to O(TOL^{-2} log(TOL)^2),
" without the need for higher-order schemes such as the Milstein scheme, as in (Giles 2008a; Giles, Debrabant, and Rößler 2013).
3 Contrary to the smoothing strategy used in (Giles, Nagapetyan, and Ritter 2015), our numerical smoothing approach
▸ does not deteriorate the strong convergence behavior;
▸ when estimating densities, has a pointwise error that does not increase exponentially with the dimension of the state vector.
4 Our approach extends to (i) many model dynamics and payoff structures, (ii) computing financial Greeks, (iii) computing distribution functions, and (iv) risk estimation.
5 More details can be found in: Christian Bayer, Chiheb Ben Hammouda, and Raúl Tempone. "Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation". arXiv preprint arXiv:2003.05708 (2022).
• 51. Related References
Thank you for your attention!
[1] C. Bayer, C. Ben Hammouda, R. Tempone. Numerical Smoothing with Hierarchical Adaptive Sparse Grids and Quasi-Monte Carlo Methods for Efficient Option Pricing, arXiv:2111.01874 (2021), to appear in Quantitative Finance.
[2] C. Bayer, C. Ben Hammouda, R. Tempone. Multilevel Monte Carlo combined with numerical smoothing for robust and efficient option pricing and density estimation, arXiv:2003.05708 (2022).
[3] C. Bayer, C. Ben Hammouda, A. Papapantoleon, M. Samet, R. Tempone. Optimal Damping with Hierarchical Adaptive Quadrature for Efficient Fourier Pricing of Multi-Asset Options in Lévy Models, arXiv:2203.08196 (2022).
[4] C. Bayer, C. Ben Hammouda, R. Tempone. Hierarchical adaptive sparse grids and quasi-Monte Carlo for option pricing under the rough Bergomi model, Quantitative Finance, 2020.