Using Transformations of Variables to Improve Inference
A. Charpentier (UQAM),
joint work with J.-D. Fermanian (CREST) E. Flachaire (AMSE) G. Geenens
(UNSW), A. Oulidi (UIO), D. Paindaveine (ULB), O. Scaillet (UNIGE)
Montréal, UQAM, November 2018
Side Note
Most of the content here is based on old results (revised, and continued).
Work on genealogical trees, Étude de la démographie française du XIXe siècle à
partir de données collaboratives de généalogie and Internal Migrations in France in
the Nineteenth Century with E. Gallic.
[Figures: two panels (Children, Grandchildren, Great-grandchildren), with density scales 0–0.503 and 0–0.116.]
Motivation
2005, The Estimation of Copulas: Theory and Practice, with J.-D. Fermanian and O. Scaillet: a survey on non-parametric techniques (kernel based) to visualize the estimator of a copula density.
[Figure: three perspective plots of copula density estimates on $[0,1]^2$.]
Idea: beta kernels and transformed kernels. More recently, mix those techniques in the univariate case.
Motivation
Consider some n-i.i.d. sample {(X_i, Y_i)} with cumulative distribution function $F_{XY}$ and joint density $f_{XY}$. Let $F_X$ and $F_Y$ denote the marginal distributions, and $C$ the copula,

$$F_{XY}(x, y) = C(F_X(x), F_Y(y))$$

so that

$$f_{XY}(x, y) = f_X(x)\, f_Y(y)\, c(F_X(x), F_Y(y)).$$

We want a nonparametric estimate of $c$ on $[0, 1]^2$.

[Figure: scatterplot of the raw sample, log-log scales.]
Notations
Define the uniformized n-i.i.d. sample {(U_i, V_i)},

$$U_i = F_X(X_i) \quad\text{and}\quad V_i = F_Y(Y_i),$$

or the uniformized n-i.i.d. pseudo-sample $\{(\hat U_i, \hat V_i)\}$,

$$\hat U_i = \frac{n}{n+1}\,\hat F_{X,n}(X_i) \quad\text{and}\quad \hat V_i = \frac{n}{n+1}\,\hat F_{Y,n}(Y_i),$$

where $\hat F_{X,n}$ and $\hat F_{Y,n}$ denote the empirical c.d.f.
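A minimal R sketch of these pseudo-observations (assuming no ties; x and y are the observed vectors, the names are ours):

n <- length(x)
U.hat <- rank(x) / (n + 1)   # equals n/(n+1) * Fhat_{X,n}(X_i) when there are no ties
V.hat <- rank(y) / (n + 1)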
Standard Kernel Estimate
The standard kernel estimator for $c$, say $\hat c^*$, at $(u, v) \in I$ would be (see Wand & Jones (1995))

$$\hat c^*(u, v) = \frac{1}{n\,|H_{UV}|^{1/2}} \sum_{i=1}^n K\!\left(H_{UV}^{-1/2}\begin{pmatrix} u - U_i \\ v - V_i \end{pmatrix}\right), \qquad (1)$$

where $K : \mathbb{R}^2 \to \mathbb{R}$ is a kernel function and $H_{UV}$ is a bandwidth matrix.
Standard Kernel Estimate
However, this estimator is not consistent along the boundaries of $[0, 1]^2$:

$$E(\hat c^*(u, v)) = \tfrac{1}{4}\, c(u, v) + O(h) \text{ at corners}, \qquad E(\hat c^*(u, v)) = \tfrac{1}{2}\, c(u, v) + O(h) \text{ on the borders},$$

if $K$ is symmetric and $H_{UV}$ symmetric.

Corrections have been proposed, e.g. mirror reflection, Gijbels & Mielniczuk (1990), or the use of boundary kernels, Chen (2007), but with mixed results.

Remark: the graph on the bottom is $\hat c^*$ on the (first) diagonal.
Mirror Kernel Estimate
Use an enlarged sample: instead of only $\{(\hat U_i, \hat V_i)\}$, add $\{(-\hat U_i, \hat V_i)\}$, $\{(\hat U_i, -\hat V_i)\}$, $\{(-\hat U_i, -\hat V_i)\}$, $\{(\hat U_i, 2 - \hat V_i)\}$, $\{(2 - \hat U_i, \hat V_i)\}$, $\{(-\hat U_i, 2 - \hat V_i)\}$, $\{(2 - \hat U_i, -\hat V_i)\}$ and $\{(2 - \hat U_i, 2 - \hat V_i)\}$.

See Gijbels & Mielniczuk (1990). That estimator will be used as a benchmark in the simulation study.
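A sketch of the mirror-reflection construction in R: nine reflected copies, then an ordinary kernel estimate normalised by the original n (not 9n), so the mass falling back into the unit square is accounted for:

U.mir <- c(U.hat, -U.hat, U.hat, -U.hat, U.hat, 2 - U.hat, -U.hat, 2 - U.hat, 2 - U.hat)
V.mir <- c(V.hat, V.hat, -V.hat, -V.hat, 2 - V.hat, V.hat, 2 - V.hat, -V.hat, 2 - V.hat)
c.mirror <- function(u, v, h) {
  sum(dnorm((u - U.mir) / h) * dnorm((v - V.mir) / h)) / (length(U.hat) * h^2)
}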
Using Beta Kernels
Use a kernel which is a product of beta kernels,

$$K_{x_i}(u) \propto x_{1,i}^{\,u_1/b}\,[1 - x_{1,i}]^{(1-u_1)/b} \cdot x_{2,i}^{\,u_2/b}\,[1 - x_{2,i}]^{(1-u_2)/b}$$

for some $b > 0$, see Chen (1999); shown here for some observation $x_i$ in the lower left corner.
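A sketch of the resulting estimator in R: the estimate at (u, v) averages products of beta densities evaluated at the pseudo-observations (the "+ 1" in the shape parameters follows the usual beta-kernel convention and is our assumption):

c.beta <- function(u, v, U, V, b) {
  mean(dbeta(U, u / b + 1, (1 - u) / b + 1) * dbeta(V, v / b + 1, (1 - v) / b + 1))
}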
Using Beta Kernels

[Figure: beta-kernel copula density estimates.]
Probit Transformation
See Devroye & Györfi (1985) and Marron & Ruppert (1994).

Define the normalized n-i.i.d. sample $\{(S_i, T_i)\}$,

$$S_i = \Phi^{-1}(U_i) \quad\text{and}\quad T_i = \Phi^{-1}(V_i),$$

or the normalized n-i.i.d. pseudo-sample $\{(\hat S_i, \hat T_i)\}$,

$$\hat S_i = \Phi^{-1}(\hat U_i) \quad\text{and}\quad \hat T_i = \Phi^{-1}(\hat V_i),$$

where $\Phi^{-1}$ is the quantile function of $N(0, 1)$ (probit transformation).
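In R the normal scores are one line each (qnorm is $\Phi^{-1}$):

S.hat <- qnorm(U.hat)
T.hat <- qnorm(V.hat)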
Probit Transformation
$$F_{ST}(x, y) = C(\Phi(x), \Phi(y))$$

so that

$$f_{ST}(x, y) = \phi(x)\,\phi(y)\, c(\Phi(x), \Phi(y)).$$

Thus

$$c(u, v) = \frac{f_{ST}(\Phi^{-1}(u), \Phi^{-1}(v))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}.$$

So use

$$\hat c^{(\tau)}(u, v) = \frac{\hat f_{ST}(\Phi^{-1}(u), \Phi^{-1}(v))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}.$$
The naive estimator
Since we cannot use

$$\hat f^*_{ST}(s, t) = \frac{1}{n\,|H_{ST}|^{1/2}} \sum_{i=1}^n K\!\left(H_{ST}^{-1/2}\begin{pmatrix} s - S_i \\ t - T_i \end{pmatrix}\right),$$

where $K$ is a kernel function and $H_{ST}$ is a bandwidth matrix, use

$$\hat f_{ST}(s, t) = \frac{1}{n\,|H_{ST}|^{1/2}} \sum_{i=1}^n K\!\left(H_{ST}^{-1/2}\begin{pmatrix} s - \hat S_i \\ t - \hat T_i \end{pmatrix}\right),$$

and the copula density is

$$\hat c^{(\tau)}(u, v) = \frac{\hat f_{ST}(\Phi^{-1}(u), \Phi^{-1}(v))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}.$$
The naive estimator
$$\hat c^{(\tau)}(u, v) = \frac{1}{n\,|H_{ST}|^{1/2}\,\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))} \sum_{i=1}^n K\!\left(H_{ST}^{-1/2}\begin{pmatrix} \Phi^{-1}(u) - \Phi^{-1}(\hat U_i) \\ \Phi^{-1}(v) - \Phi^{-1}(\hat V_i) \end{pmatrix}\right)$$

as suggested in C., Fermanian & Scaillet (2007) and Lopez-Paz et al. (2013). Note that Omelka et al. (2009) obtained theoretical properties on the convergence of $\hat C^{(\tau)}(u, v)$ (not $c$).

In Probit transformation for nonparametric kernel estimation of the copula density, with G. Geenens and D. Paindaveine, we extended that estimator.

See also the kdecopula R package by T. Nagler.
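A short usage sketch of that package (the method codes are as we recall its documentation — "MR" for mirror reflection, "beta" for beta kernels, "TLL2" for the transformation local-likelihood estimator discussed below — so treat them as assumptions):

library(kdecopula)
udata <- cbind(U.hat, V.hat)           # pseudo-observations in [0,1]^2
fit <- kdecop(udata, method = "TLL2")  # transformation + local log-quadratic fit
dkdecop(c(0.5, 0.5), fit)              # estimated copula density at (0.5, 0.5)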
Improved probit-transformation copula density estimators
When estimating a density from a pseudo-sample, Loader (1996) and Hjort & Jones (1996) define a local likelihood estimator.

Around $(s, t) \in \mathbb{R}^2$, use a polynomial approximation of order $p$ for $\log f_{ST}$:

$$\log f_{ST}(\check s, \check t) \approx a_{1,0}(s,t) + a_{1,1}(s,t)(\check s - s) + a_{1,2}(s,t)(\check t - t) \doteq P_{a_1}(\check s - s, \check t - t)$$

$$\log f_{ST}(\check s, \check t) \approx a_{2,0}(s,t) + a_{2,1}(s,t)(\check s - s) + a_{2,2}(s,t)(\check t - t) + a_{2,3}(s,t)(\check s - s)^2 + a_{2,4}(s,t)(\check t - t)^2 + a_{2,5}(s,t)(\check s - s)(\check t - t) \doteq P_{a_2}(\check s - s, \check t - t).$$
Improved probit-transformation copula density estimators
Remark: vectors $a_1(s,t) = (a_{1,0}(s,t), a_{1,1}(s,t), a_{1,2}(s,t))$ and $a_2(s,t) \doteq (a_{2,0}(s,t), \ldots, a_{2,5}(s,t))$ are then estimated by solving a weighted maximum likelihood problem,

$$\tilde a_p(s,t) = \arg\max_{a_p}\left\{ \sum_{i=1}^n K\!\left(H_{ST}^{-1/2}\begin{pmatrix} s - \hat S_i \\ t - \hat T_i \end{pmatrix}\right) P_{a_p}(\hat S_i - s, \hat T_i - t) - n \int_{\mathbb{R}^2} K\!\left(H_{ST}^{-1/2}\begin{pmatrix} s - \check s \\ t - \check t \end{pmatrix}\right) \exp\!\left(P_{a_p}(\check s - s, \check t - t)\right) d\check s\, d\check t \right\}.$$

The estimate of $f_{ST}$ at $(s,t)$ is then $\tilde f^{(p)}_{ST}(s,t) = \exp(\tilde a_{p,0}(s,t))$, for $p = 1, 2$. The improved probit-transformation kernel copula density estimators are

$$\tilde c^{(\tau,p)}(u,v) = \frac{\tilde f^{(p)}_{ST}(\Phi^{-1}(u), \Phi^{-1}(v))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}.$$
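Local likelihood density fitting in this sense is implemented in the locfit R package (Loader); a sketch on the normal scores, with deg = 2 for the log-quadratic fit (the lp() arguments and predict call are as we recall the package's API, so treat them as assumptions):

library(locfit)
fit <- locfit(~ lp(S.hat, T.hat, deg = 2, h = 0.5))  # local log-quadratic density fit
f.st <- predict(fit, newdata = cbind(qnorm(0.5), qnorm(0.5)))
f.st / (dnorm(qnorm(0.5)) * dnorm(qnorm(0.5)))       # back to the copula density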
Improved probit-transformation copula density estimators

For the local log-linear (p = 1) approximation,

$$\tilde c^{(\tau,1)}(u, v) = \frac{\exp(\tilde a_{1,0}(\Phi^{-1}(u), \Phi^{-1}(v)))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}$$
Improved probit-transformation copula density estimators

For the local log-quadratic (p = 2) approximation,

$$\tilde c^{(\tau,2)}(u, v) = \frac{\exp(\tilde a_{2,0}(\Phi^{-1}(u), \Phi^{-1}(v)))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}$$
Asymptotic properties
A1. The sample $\{(X_i, Y_i)\}$ is an n-i.i.d. sample from the joint distribution $F_{XY}$, an absolutely continuous distribution with marginals $F_X$ and $F_Y$ strictly increasing on their support (this guarantees uniqueness of the copula).
Asymptotic properties
A2. The copula $C$ of $F_{XY}$ is such that $(\partial C/\partial u)(u,v)$ and $(\partial^2 C/\partial u^2)(u,v)$ exist and are continuous on $\{(u,v) : u \in (0,1), v \in [0,1]\}$, and $(\partial C/\partial v)(u,v)$ and $(\partial^2 C/\partial v^2)(u,v)$ exist and are continuous on $\{(u,v) : u \in [0,1], v \in (0,1)\}$. In addition, there are constants $K_1$ and $K_2$ such that

$$\left|\frac{\partial^2 C}{\partial u^2}(u,v)\right| \le \frac{K_1}{u(1-u)} \text{ for } (u,v) \in (0,1)\times[0,1]; \qquad \left|\frac{\partial^2 C}{\partial v^2}(u,v)\right| \le \frac{K_2}{v(1-v)} \text{ for } (u,v) \in [0,1]\times(0,1).$$

A3. The density $c$ of $C$ exists, is positive and admits continuous second-order partial derivatives on the interior of the unit square $I$. In addition, there is a constant $K_{00}$ such that

$$c(u,v) \le K_{00}\min\!\left(\frac{1}{u(1-u)},\, \frac{1}{v(1-v)}\right) \quad \forall (u,v) \in (0,1)^2.$$

See Segers (2012).
Asymptotic properties
Assume that $K(z_1, z_2) = \phi(z_1)\phi(z_2)$ and $H_{ST} = h^2 I$ with $h \sim n^{-a}$ for some $a \in (0, 1/4)$. Under Assumptions A1–A3, the 'naive' probit-transformation kernel copula density estimator at any $(u, v) \in (0, 1)^2$ is such that

$$\sqrt{nh^2}\left( \hat c^{(\tau)}(u,v) - c(u,v) - h^2\, \frac{b(u,v)}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))} \right) \xrightarrow{\,L\,} N\!\left(0, \sigma^2(u,v)\right),$$

where

$$b(u,v) = \frac12\left[ \frac{\partial^2 c}{\partial u^2}(u,v)\,\phi^2(\Phi^{-1}(u)) + \frac{\partial^2 c}{\partial v^2}(u,v)\,\phi^2(\Phi^{-1}(v)) \right] - 3\left[ \frac{\partial c}{\partial u}(u,v)\,\Phi^{-1}(u)\,\phi(\Phi^{-1}(u)) + \frac{\partial c}{\partial v}(u,v)\,\Phi^{-1}(v)\,\phi(\Phi^{-1}(v)) \right] + c(u,v)\left[ \{\Phi^{-1}(u)\}^2 + \{\Phi^{-1}(v)\}^2 - 2 \right] \qquad (2)$$

and

$$\sigma^2(u,v) = \frac{c(u,v)}{4\pi\,\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}.$$
The Amended version
The last, unbounded, term in $b$ can easily be adjusted:

$$\hat c^{(\tau am)}(u,v) = \frac{\hat f_{ST}(\Phi^{-1}(u), \Phi^{-1}(v))}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))} \times \frac{1}{1 + \frac12 h^2\left( \{\Phi^{-1}(u)\}^2 + \{\Phi^{-1}(v)\}^2 - 2 \right)}.$$

The asymptotic bias becomes proportional to

$$b^{(am)}(u,v) = \frac12\left[ \frac{\partial^2 c}{\partial u^2}(u,v)\,\phi^2(\Phi^{-1}(u)) + \frac{\partial^2 c}{\partial v^2}(u,v)\,\phi^2(\Phi^{-1}(v)) \right] - 3\left[ \frac{\partial c}{\partial u}(u,v)\,\Phi^{-1}(u)\,\phi(\Phi^{-1}(u)) + \frac{\partial c}{\partial v}(u,v)\,\Phi^{-1}(v)\,\phi(\Phi^{-1}(v)) \right].$$
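The amendment is a pointwise multiplicative correction, so it bolts onto any implementation of the naive estimator (a sketch; c.naive is assumed to compute $\hat c^{(\tau)}$ as above):

c.amended <- function(u, v, h) {
  s <- qnorm(u); t <- qnorm(v)
  c.naive(u, v, h) / (1 + 0.5 * h^2 * (s^2 + t^2 - 2))
}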
A local log-linear probit-transformation kernel estimator
$$\tilde c^{*(\tau,1)}(u, v) = \tilde f^{*(1)}_{ST}(\Phi^{-1}(u), \Phi^{-1}(v)) \,/\, \left[\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))\right]$$

Then

$$\sqrt{nh^2}\left( \tilde c^{*(\tau,1)}(u,v) - c(u,v) - h^2\, \frac{b^{(1)}(u,v)}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))} \right) \xrightarrow{\,L\,} N\!\left(0, \sigma^{(1)\,2}(u,v)\right),$$

where

$$b^{(1)}(u,v) = \frac12\left[ \frac{\partial^2 c}{\partial u^2}(u,v)\,\phi^2(\Phi^{-1}(u)) + \frac{\partial^2 c}{\partial v^2}(u,v)\,\phi^2(\Phi^{-1}(v)) \right] - \frac{1}{c(u,v)}\left[ \left(\frac{\partial c}{\partial u}(u,v)\right)^{\!2} \phi^2(\Phi^{-1}(u)) + \left(\frac{\partial c}{\partial v}(u,v)\right)^{\!2} \phi^2(\Phi^{-1}(v)) \right] - \left[ \frac{\partial c}{\partial u}(u,v)\,\Phi^{-1}(u)\,\phi(\Phi^{-1}(u)) + \frac{\partial c}{\partial v}(u,v)\,\Phi^{-1}(v)\,\phi(\Phi^{-1}(v)) - 2c(u,v) \right]$$
Using a higher order polynomial approximation
Locally fitting a polynomial of a higher degree is known to reduce the asymptotic bias of the estimator, here from order $O(h^2)$ to order $O(h^4)$, see Loader (1996) or Hjort & Jones (1996), under sufficient smoothness conditions.

If $f_{ST}$ admits continuous fourth-order partial derivatives and is positive at $(s, t)$, then

$$\sqrt{nh^2}\left( \tilde f^{*(2)}_{ST}(s,t) - f_{ST}(s,t) - h^4\, b^{(2)}_{ST}(s,t) \right) \xrightarrow{\,L\,} N\!\left(0, \sigma^{(2)\,2}_{ST}(s,t)\right),$$

where $\sigma^{(2)\,2}_{ST}(s,t) = \frac{5}{2}\cdot\frac{f_{ST}(s,t)}{4\pi}$ and

$$b^{(2)}_{ST}(s,t) = -\frac18 f_{ST}(s,t) \times \left[ \frac{\partial^4 g}{\partial s^4} + \frac{\partial^4 g}{\partial t^4} + 4\left( \frac{\partial^3 g}{\partial s^3}\frac{\partial g}{\partial s} + \frac{\partial^3 g}{\partial t^3}\frac{\partial g}{\partial t} + \frac{\partial^3 g}{\partial s^2 \partial t}\frac{\partial g}{\partial t} + \frac{\partial^3 g}{\partial s \partial t^2}\frac{\partial g}{\partial s} \right) + 2\,\frac{\partial^4 g}{\partial s^2 \partial t^2} \right](s,t),$$

with $g(s,t) = \log f_{ST}(s,t)$.
Using a higher order polynomial approximation
A4. The copula density $c(u,v) = (\partial^2 C/\partial u \partial v)(u,v)$ admits continuous fourth-order partial derivatives on the interior of the unit square $[0,1]^2$.

Then

$$\sqrt{nh^2}\left( \tilde c^{*(\tau,2)}(u,v) - c(u,v) - h^4\, \frac{b^{(2)}(u,v)}{\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))} \right) \xrightarrow{\,L\,} N\!\left(0, \sigma^{(2)\,2}(u,v)\right)$$

where $\sigma^{(2)\,2}(u,v) = \frac{5}{2}\cdot\frac{c(u,v)}{4\pi\,\phi(\Phi^{-1}(u))\,\phi(\Phi^{-1}(v))}$.
Improving Bandwidth choice
Consider the principal components decomposition of the $(n \times 2)$ matrix $[\hat S, \hat T] = M$. Let $W_1 = (W_{11}, W_{12})^T$ and $W_2 = (W_{21}, W_{22})^T$ be the eigenvectors of $M^T M$. Set

$$\begin{pmatrix} Q \\ R \end{pmatrix} = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}\begin{pmatrix} S \\ T \end{pmatrix} = W \begin{pmatrix} S \\ T \end{pmatrix},$$

which is only a linear reparametrization of $\mathbb{R}^2$, so an estimate of $f_{ST}$ can be readily obtained from an estimate of the density of $(Q, R)$.

Since $\{\hat Q_i\}$ and $\{\hat R_i\}$ are empirically uncorrelated, consider a diagonal bandwidth matrix $H_{QR} = \mathrm{diag}(h^2_Q, h^2_R)$.
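A sketch of this rotation in R, via the eigendecomposition of $M^T M$ as on the slide (prcomp would give the same rotation, up to signs, after centering):

M <- cbind(S.hat, T.hat)
W <- eigen(crossprod(M))$vectors     # eigenvectors of M'M
QR <- M %*% W                        # rotated scores
Q.hat <- QR[, 1]; R.hat <- QR[, 2]
cor(Q.hat, R.hat)                    # near 0 (exactly 0 if M is column-centered)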
Improving Bandwidth choice
Use univariate procedures to select $h_Q$ and $h_R$ independently.

Denote $\tilde f^{(p)}_Q$ and $\tilde f^{(p)}_R$ ($p = 1, 2$) the local log-polynomial estimators of the densities. $h_Q$ can be selected via cross-validation (see Section 5.3.3 in Loader (1999)),

$$h_Q = \arg\min_{h>0}\left\{ \int_{-\infty}^{\infty} \left(\tilde f^{(p)}_Q(q)\right)^2 dq - \frac{2}{n}\sum_{i=1}^n \tilde f^{(p)}_{Q(-i)}(\hat Q_i) \right\},$$

where $\tilde f^{(p)}_{Q(-i)}$ is the 'leave-one-out' version of $\tilde f^{(p)}_Q$.
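A generic R sketch of this criterion for a univariate estimator (fhat(q, data, h) stands in for $\tilde f^{(p)}_Q$ and must accept a vector of evaluation points; the names are ours):

cv.score <- function(h, data, fhat) {
  term1 <- integrate(function(q) fhat(q, data, h)^2, -Inf, Inf)$value
  loo   <- sapply(seq_along(data), function(i) fhat(data[i], data[-i], h))
  term1 - 2 * mean(loo)                      # the CV objective to minimise over h
}
# h.Q <- optimize(cv.score, interval = c(0.05, 2), data = Q.hat, fhat = fhat)$minimum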
Graphical Comparison (loss ALAE dataset)
[Figure: contour plots of copula density estimates on the loss–ALAE dataset — panels $\tilde c^{(\tau,2)}$, $\hat c^{(\beta)}$, $\hat c^{(b)}$ and $\hat c^{(p)}$, with Loss (X) and ALAE (Y) on the axes and contour levels from 0.25 to 4.]
Simulation Study
M = 1,000 independent random samples $\{(U_i, V_i)\}_{i=1}^n$ of sizes n = 200, n = 500 and n = 1000 were generated from each of the following copulas:
· the independence copula (i.e., $U_i$'s and $V_i$'s drawn independently);
· the Gaussian copula, with parameters ρ = 0.31, ρ = 0.59 and ρ = 0.81;
· the Student t-copula with 4 degrees of freedom, with parameters ρ = 0.31, ρ = 0.59 and ρ = 0.81;
· the Frank copula, with parameters θ = 1.86, θ = 4.16 and θ = 7.93;
· the Gumbel copula, with parameters θ = 1.25, θ = 1.67 and θ = 2.5;
· the Clayton copula, with parameters θ = 0.5, θ = 1.67 and θ = 2.5.
(Approximated) MISE relative to the MISE of the mirror-reflection estimator (last column), n = 1000. Bold values show the minimum MISE for the corresponding copula (non-significantly different values are highlighted as well).
n = 1000   $\hat c^{(\tau)}$  $\hat c^{(\tau am)}$  $\tilde c^{(\tau,1)}$  $\tilde c^{(\tau,2)}$  $\hat c^{(\beta)}_1$  $\hat c^{(\beta)}_2$  $\hat c^{(B)}_1$  $\hat c^{(B)}_2$  $\hat c^{(p)}_1$  $\hat c^{(p)}_2$  $\hat c^{(p)}_3$
Indep 3.57 2.80 2.89 1.40 7.96 11.65 1.69 3.43 1.62 0.50 0.14
Gauss2 2.03 1.52 1.60 0.76 4.63 6.06 1.10 1.82 0.98 0.66 0.89
Gauss4 0.63 0.49 0.44 0.21 1.72 1.60 0.75 0.58 0.62 0.99 2.93
Gauss6 0.21 0.20 0.11 0.05 0.74 0.33 0.77 0.37 0.72 1.21 2.83
Std(4)2 0.61 0.56 0.50 0.40 1.57 1.80 0.78 0.67 0.75 1.01 1.88
Std(4)4 0.21 0.27 0.17 0.15 0.88 0.51 0.75 0.42 0.75 1.12 2.07
Std(4)6 0.09 0.17 0.08 0.09 0.70 0.19 0.82 0.47 0.90 1.17 1.90
Frank2 3.31 2.42 2.57 1.35 7.16 9.63 1.70 2.95 1.31 0.45 0.49
Frank4 2.35 1.45 1.51 0.99 4.42 4.89 1.49 1.65 0.60 0.72 6.14
Frank6 0.96 0.52 0.45 0.44 1.51 1.19 1.35 0.76 0.65 1.58 7.25
Gumbel2 0.65 0.62 0.56 0.43 1.77 1.97 0.82 0.75 0.83 1.03 1.52
Gumbel4 0.18 0.28 0.16 0.19 0.89 0.41 0.78 0.47 0.81 1.10 1.78
Gumbel6 0.09 0.21 0.10 0.15 0.78 0.29 0.85 0.58 0.94 1.12 1.63
Clayton2 0.63 0.60 0.51 0.34 1.78 1.99 0.78 0.70 0.79 1.04 1.79
Clayton4 0.11 0.26 0.10 0.15 0.79 0.27 0.83 0.56 0.90 1.10 1.50
Clayton6 0.11 0.28 0.08 0.15 0.82 0.35 0.88 0.67 0.96 1.09 1.36
Probit Transform in the Univariate Case : the log transform
See Log-Transform Kernel Density Estimation of Income Distribution with E. Flachaire.

The Gaussian kernel estimator of a density is

$$\hat f_Z(z) = \frac{1}{n}\sum_{i=1}^n \varphi(z;\, z_i, h)$$

where $\varphi(\cdot;\, \mu, \sigma)$ is the density of the normal distribution. Use a Gaussian kernel estimation of the density based on a logarithmic transformation of the data $x_i$,

$$\hat f_X(x) = \frac{\hat f_Z(\log x)}{x} = \frac{1}{n}\sum_{i=1}^n \ell(x;\, \log x_i, h)$$

where $\ell(\cdot;\, \mu, \sigma)$ is the density of the lognormal distribution.
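A sketch in R: the lognormal kernel is dlnorm, since dlnorm(x, meanlog, sdlog) equals $\varphi(\log x;\, \text{meanlog}, \text{sdlog})/x$ pointwise:

f.logKDE <- function(x, data, h) {
  sapply(x, function(xi) mean(dlnorm(xi, meanlog = log(data), sdlog = h)))
}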
Probit Transform in the Univariate Case : the log transform
Recall that, classically,

$$\mathrm{bias}[\hat f_Z(z)] \sim \frac{h^2}{2} f_Z''(z) \quad\text{and}\quad \mathrm{Var}[\hat f_Z(z)] \sim \frac{0.2821}{nh} f_Z(z).$$

Here, in the neighborhood of 0,

$$\mathrm{bias}[\hat f_X(x)] \sim \frac{h^2}{2}\left[ f_X(x) + 3x\, f_X'(x) + x^2 f_X''(x) \right],$$

which is positive if $f_X(0) > 0$, while

$$\mathrm{Var}[\hat f_X(x)] \sim \frac{0.2821}{nhx} f_X(x).$$

The log-transform kernel may perform poorly when $f_X(0) > 0$, see Silverman (1986).
Back on the Transformed Kernel
See Devroye & Györfi (1985), and Devroye & Lugosi (2001)
... use the transformed kernel the other way, R → [0, 1] → R
Back on the Transformed Kernel
Interesting point: the optimal $T$ should be $F$; thus, $T$ can be taken as some parametric $F_\theta$.
Heavy Tailed distribution
Let $X$ denote a (heavy-tailed) random variable with tail index $\alpha \in (0, \infty)$, i.e.

$$P(X > x) = x^{-\alpha} L_1(x)$$

where $L_1$ is some slowly varying function.

Let $T$ denote an $\mathbb{R} \to [0, 1]$ function, such that $1 - T$ is regularly varying at infinity, with tail index $\beta \in (0, \infty)$. Define $Q(x) = T^{-1}(1 - x^{-1})$, the associated tail quantile function; then $Q(x) = x^{1/\beta} L_2(1/x)$, where $L_2$ is some slowly varying function (the de Bruijn conjugate of the slowly varying function associated with $T$). Assume here that $Q(x) = b\, x^{1/\beta}$.

Let $U = T(X)$. Then, as $u \to 1$,

$$P(U > u) \sim (1 - u)^{\alpha/\beta}.$$
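A sketch of the transformed-kernel estimator with a Pareto-type transformation $T_\beta(x) = 1 - (1+x)^{-\beta}$ (our illustrative choice; any c.d.f. with the right tail index would do):

T.beta  <- function(x, beta) 1 - (1 + x)^(-beta)
Tp.beta <- function(x, beta) beta * (1 + x)^(-beta - 1)   # derivative t = T'
f.TK <- function(x, beta, g.hat) {
  # g.hat: a density estimator on [0,1], fitted to U_i = T.beta(X_i, beta)
  g.hat(T.beta(x, beta)) * Tp.beta(x, beta)
}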
Heavy Tailed distribution
see C. & Oulidi (2010): true index $\alpha = 0.75^{-1}$, transformations $T_{0.75^{-1}}$, $T_{0.65^{-1}}$ (lighter tail), $T_{0.85^{-1}}$ (heavier tail) and $T_{\hat\alpha}$

[Figure: for each transformation, the histogram and estimated density of the transformed sample on [0, 1].]
Heavy Tailed distribution
see C. & Oulidi (2007), Beta kernel quantile estimators of heavy-tailed loss distributions: impact on quantile estimation?

[Figure: densities of the quantile estimators.]
Heavy Tailed distribution
see C. & Oulidi (2007): impact on quantile estimation? bias? m.s.e.?
Bimodal distribution
Let $X$ denote a bimodal distribution, obtained from a mixture,

$$X \sim F_\Theta = \begin{cases} F_0 & \text{if } \Theta = 0 \text{ (probability } p_0) \\ F_1 & \text{if } \Theta = 1 \text{ (probability } p_1) \end{cases}$$

Idea: $T(X)$ can be obtained as a transformation of two distributions on $[0, 1]$,

$$T(X) \sim G_\Theta = \begin{cases} G_0 & \text{if } \Theta = 0 \text{ (probability } p_0) \\ G_1 & \text{if } \Theta = 1 \text{ (probability } p_1) \end{cases}$$

→ standard for income observations...
Which transformation ?
GB2: $t(y;\, a, b, p, q) = \dfrac{|a|\, y^{ap-1}}{b^{ap}\, B(p, q)\,[1 + (y/b)^a]^{p+q}}$, for $y > 0$.

Nested and limiting cases (the classical GB2 family tree): GB2 gives the generalized gamma GG ($q \to \infty$), Beta2 ($a = 1$), Singh-Maddala SM ($p = 1$) and Dagum ($q = 1$); GG gives the Lognormal ($a \to 0$), the Gamma ($a = 1$) and the Weibull ($p = 1$); Beta2 gives the Gamma ($q \to \infty$); SM gives the Weibull ($q \to \infty$) and the Champernowne ($q = 1$); Dagum gives the Champernowne ($p = 1$).
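The GB2 density from the slide is easy to code directly, so any member of the tree can serve as the transformation $T$ (a sketch):

dGB2 <- function(y, a, b, p, q) {
  abs(a) * y^(a * p - 1) / (b^(a * p) * beta(p, q) * (1 + (y / b)^a)^(p + q))
}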
Example, X ∼ Pareto
Pareto plot ($\log(1 - F(x))$ vs. $\log x$), and histogram of $\{U_1, \ldots, U_n\}$, $U_i = T_{\hat\theta}(X_i)$

[Figure: Pareto plot of the sample and histogram of the transformed values $u = H(x)$.]
Example, X ∼ Pareto
Estimation of the density $g$ of $U = T_\theta(X)$, and estimated c.d.f. of $X$,

$$F_n(x) = \int_0^x f_n(y)\, dy \quad\text{where}\quad f_n(y) = g_n(T_{\hat\theta}(y)) \cdot t_{\hat\theta}(y)$$

[Figure: adaptive kernel estimate of $g(u)$ and the resulting Pareto plot.]
Beta kernel
$$\hat g(u) = \sum_{i=1}^n \frac{1}{n} \cdot b\!\left(u;\, \frac{U_i}{h},\, \frac{1 - U_i}{h}\right), \quad u \in [0, 1],$$

with some possible boundary correction, as suggested in Chen (1999),

$$\frac{u}{h} \to \rho(u, h) = 2h^2 + 2.5 - \left(4h^4 + 6h^2 + 2.25 - u^2 - u/h\right)^{1/2}.$$

Problem: choice of the bandwidth $h$? Standard loss function

$$L(h) = \int [\hat g_n(u) - g(u)]^2\, du = \underbrace{\int [\hat g_n(u)]^2\, du - 2\int \hat g_n(u)\, g(u)\, du}_{\text{estimated by } CV(h)} + \int [g(u)]^2\, du,$$

where

$$CV(h) = \int [\hat g_n(u)]^2\, du - \frac{2}{n}\sum_{i=1}^n \hat g_{(-i)}(U_i).$$
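A sketch of this estimator in R, as a mixture of beta densities with one component per (pseudo-)observation; the bandwidth h can then be chosen with the cross-validation criterion above (e.g. via cv.score earlier, integrating over [0, 1]):

g.betamix <- function(u, U, h) {
  sapply(u, function(ui) mean(dbeta(ui, U / h, (1 - U) / h)))
}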
Beta kernel
[Figure: beta-kernel density estimates for two simulated samples.]
Beta kernel
[Figure: estimated density of $u = H(x)$ for beta mixtures with h = 0.1, 0.05, 0.02, 0.01, and the corresponding Pareto plots.]
Mixture of Beta distributions
$$\hat g(u) = \sum_{j=1}^k \pi_j \cdot b(u;\, \alpha_j, \beta_j), \quad u \in [0, 1].$$

Problem: choice of the number of components $k$ (and estimation...). Use of a stochastic EM algorithm (or some variant of it), see Celeux & Diebolt (1985).
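Evaluating a fitted k-component beta mixture is straightforward (a sketch; the weights $\pi_j$ and shapes $\alpha_j, \beta_j$ are assumed to come from the stochastic-EM fit):

g.mix <- function(u, w, alpha, beta) {
  sapply(u, function(ui) sum(w * dbeta(ui, alpha, beta)))
}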
Mixture of Beta distributions
[Figure: beta-mixture density estimates for two simulated samples.]
Mixture of Beta distributions
[Figure: beta-mixture estimate of the density of $u = H(x)$ and the corresponding Pareto plot.]
Bernstein approximation
$$\hat g(u) = \sum_{k=1}^m [m\,\omega_k] \cdot b(u;\, k, m - k), \quad u \in [0, 1],$$

where $\omega_k = G\!\left(\dfrac{k}{m}\right) - G\!\left(\dfrac{k-1}{m}\right)$.
Bernstein approximation
[Figure: Bernstein density estimates for two simulated samples.]
Bernstein approximation
[Figure: Bernstein estimates of the density of $u = H(x)$ for m = 100, 60, 40, 20, 10, 5, and the corresponding Pareto plots.]
Quantities of interest
Standard statistical quantities
• miae, $\int_0^\infty |f_n(x) - f(x)|\, dx$
• mise, $\int_0^\infty [f_n(x) - f(x)]^2\, dx$

Inequality indices and risk measures, based on $F(x) = \int_0^x f(t)\, dt$:
• Gini, $\frac{1}{\mu}\int_0^\infty F(t)[1 - F(t)]\, dt$
• Theil, $\int_0^\infty \frac{t}{\mu}\log\!\left(\frac{t}{\mu}\right) f(t)\, dt$
• VaR-quantile, $x$ such that $F(x) = P(X \le x) = \alpha$, i.e. $F^{-1}(\alpha)$
• TVaR-expected shortfall, $E[X \mid X > F^{-1}(\alpha)]$

where $\mu = \int_0^\infty [1 - F(x)]\, dx$.
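Once a density estimate $f_n$ is available on a grid, all of these are simple numerical integrals; a sketch (x is a fine, equally spaced grid on $(0, x_{\max})$ and f the vector of density values on it; names are ours):

quantities <- function(x, f, alpha = 0.95) {
  dx <- x[2] - x[1]
  Fx <- cumsum(f) * dx                          # F_n on the grid
  mu <- sum((1 - Fx) * dx)                      # mean via the survival function
  gini  <- sum(Fx * (1 - Fx) * dx) / mu
  theil <- sum((x / mu) * log(x / mu) * f * dx)
  VaR  <- x[which(Fx >= alpha)[1]]              # F_n^{-1}(alpha)
  tail <- x > VaR
  TVaR <- sum(x[tail] * f[tail] * dx) / (1 - alpha)
  c(mu = mu, Gini = gini, Theil = theil, VaR = VaR, TVaR = TVaR)
}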