Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Big Data for Economics # 3*
A. Charpentier (Universit´e de Rennes 1)
https://github.com/freakonometrics/ub
UB School of Economics Summer School, 2018.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 1
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
#3 Loss Functions : from OLS to Quantile Regression*
@freakonometrics freakonometrics freakonometrics.hypotheses.org 2
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
References
Motivation
Machado & Mata (2005). Counterfactual decomposition of changes in wage
distributions using quantile regression, JAE.
References
Givord & d’Haultfœuille (2013) La régression quantile en pratique, INSEE
Koenker & Bassett (1978) Regression Quantiles, Econometrica.
Koenker (2005). Quantile Regression. Cambridge University Press.
Newey & Powell (1987) Asymmetric Least Squares Estimation and Testing,
Econometrica.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 3
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles
Let Y denote a random variable with cumulative distribution function F,
F(y) = P[Y ≤ y]. The quantile of level u is
Q(u) = inf{x ∈ R : F(x) ≥ u}.
[Figure: box plots of FRCAR, INCAR, INSYS, PAPUL, PVENT and REPUL, split by outcome (DECES / SURVIE).]
Box plot, from Tukey (1977, Exploratory Data Analysis).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 4
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Defining halfspace depth
Given y ∈ R^d and a direction u ∈ R^d, define the closed half space
H_{y,u} = {x ∈ R^d : u^T x ≤ u^T y}
and define the depth at point y by
depth(y) = inf_{u ≠ 0} P(H_{y,u}),
i.e. the smallest probability of a closed half space containing y.
The empirical version is (see Tukey (1975))
depth(y) = min_{u ≠ 0} (1/n) Σ_{i=1}^n 1(x_i ∈ H_{y,u}).
For α > 0.5, define the depth set as
D_α = {y ∈ R^d : depth(y) ≥ 1 − α}.
The empirical version can be related to the bagplot, Rousseeuw et al. (1999).
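As a sketch in R (not the algorithm used for the figures below), the empirical depth of a point can be approximated by scanning many random directions u:

halfspace_depth <- function(y, X, n_dir = 2000) {
  depths <- replicate(n_dir, {
    u <- rnorm(ncol(X))               # random direction
    mean(X %*% u <= sum(u * y))       # empirical probability of H_{y,u}
  })
  min(depths)                         # smallest proportion over the sampled directions
}
set.seed(1)
X <- cbind(rnorm(200), rnorm(200))
halfspace_depth(c(0, 0), X)           # close to 1/2 near the center
halfspace_depth(c(2, 2), X)           # small in the tails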
@freakonometrics freakonometrics freakonometrics.hypotheses.org 5
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Empirical sets are extremely sensitive to the algorithm
[Figure: the same sample of points, with two different empirical depth sets (in blue), obtained with two different algorithms.]
where the blue set is the empirical estimation for Dα, α = 0.5.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 6
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
The bagplot tool
The depth function introduced here is the multivariate extension of standard univariate depth measures, e.g.
depth(x) = min{F(x), 1 − F(x−)}
which satisfies depth(Q_α) = min{α, 1 − α}. But one can also consider
depth(x) = 2 · F(x) · [1 − F(x−)]  or  depth(x) = 1 − |1/2 − F(x)|.
Possible extensions to the functional bagplot. Consider a set of functions f_i(x), i = 1, · · · , n, such that
f_i(x) = µ(x) + Σ_{k=1}^{n−1} z_{i,k} ϕ_k(x)
(i.e. a principal component decomposition) where the ϕ_k(·) are the eigenfunctions. Rousseeuw et al. (1999) considered the bivariate depth of the first two scores, x_i = (z_{i,1}, z_{i,2}). See Ferraty & Vieu (2006).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 7
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and Quantile Regressions
Quantiles are important quantities in many
areas (inequalities, risk, health, sports, etc).
Quantiles of the N(0, 1) distribution.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 8
[Figure: density of the N(0, 1) distribution, with the 5% quantile −1.645 highlighted.]
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
A First Model for Conditional Quantiles
Consider a location model, y = β0 + x^T β + ε, i.e. E[Y |X = x] = β0 + x^T β.
Then one can consider
Q(τ |X = x) = β0 + Qε(τ) + x^T β
where Qε(·) is the quantile function of the residuals.
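A minimal sketch of this location-model quantile in R (on the stock cars dataset, not the data behind the figure that follows): fit OLS, then shift the fitted line by the empirical τ-quantile of the residuals.

tau <- 0.9
ols <- lm(dist ~ speed, data = cars)
q_eps <- quantile(residuals(ols), tau)     # Q_eps(tau)
q_hat <- fitted(ols) + q_eps               # Q(tau | X = x) under the location model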
@freakonometrics freakonometrics freakonometrics.hypotheses.org 9
[Figure: the cars dataset, stopping distance (dist) against speed.]
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
OLS Regression, ℓ2 norm and Expected Value
Let y ∈ R^d,
ȳ = argmin_{m∈R} { Σ_{i=1}^n (1/n) (y_i − m)² }.
It is the empirical version of
E[Y] = argmin_{m∈R} { ∫ (y − m)² dF(y) } = argmin_{m∈R} { E[(Y − m)²] }
where Y is a random variable.
Thus,
argmin_{m(·):R^k→R} { Σ_{i=1}^n (1/n) (y_i − m(x_i))² }
is the empirical version of E[Y |X = x].
See Legendre (1805) Nouvelles méthodes pour la détermination des orbites des comètes and Gauß (1809) Theoria motus corporum coelestium in sectionibus conicis solem ambientium.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 10
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
OLS Regression, ℓ2 norm and Expected Value
Sketch of proof: (1) Let h(x) = Σ_{i=1}^n (x − y_i)², then
h′(x) = Σ_{i=1}^n 2(x − y_i),
and the FOC yields x = (1/n) Σ_{i=1}^n y_i = ȳ.
(2) If Y is continuous, let h(x) = ∫_R (x − y)² f(y) dy and
h′(x) = ∂/∂x ∫_R (x − y)² f(y) dy = ∫_R ∂/∂x (x − y)² f(y) dy,
and the FOC yields ∫_R x f(y) dy = ∫_R y f(y) dy, i.e. x = E[Y].
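A quick numerical check in R (a sketch, not part of the original slides): the minimizer of the empirical ℓ2 risk is indeed the sample mean.

set.seed(1)
y <- rlnorm(101)
h <- function(m) sum((y - m)^2)
optimize(h, interval = range(y))$minimum   # numerically equal to mean(y)
mean(y)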
@freakonometrics freakonometrics freakonometrics.hypotheses.org 11
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Median Regression, ℓ1 norm and Median
Let y ∈ R^d,
median[y] ∈ argmin_{m∈R} { Σ_{i=1}^n (1/n) |y_i − m| }.
It is the empirical version of
median[Y] ∈ argmin_{m∈R} { ∫ |y − m| dF(y) } = argmin_{m∈R} { E[|Y − m|] }
where Y is a random variable such that P[Y ≤ median[Y]] ≥ 1/2 and P[Y ≥ median[Y]] ≥ 1/2.
argmin_{m(·):R^k→R} { Σ_{i=1}^n (1/n) |y_i − m(x_i)| }
is the empirical version of median[Y |X = x].
See Boscovich (1757) De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani and Laplace (1793) Sur quelques points du système du monde.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 12
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Median Regression, ℓ1 norm and Median
Sketch of proof: (1) Let h(x) = Σ_{i=1}^n |x − y_i|.
(2) If F is absolutely continuous, dF(x) = f(x)dx, and the median m is the solution of
∫_{−∞}^m f(x) dx = 1/2.
Set h(y) = ∫_{−∞}^{+∞} |x − y| f(x) dx = ∫_{−∞}^{y} (y − x) f(x) dx + ∫_{y}^{+∞} (x − y) f(x) dx.
Then h′(y) = ∫_{−∞}^{y} f(x) dx − ∫_{y}^{+∞} f(x) dx, and the FOC yields
∫_{−∞}^{y} f(x) dx = ∫_{y}^{+∞} f(x) dx = 1 − ∫_{−∞}^{y} f(x) dx = 1/2.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 13
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
In statistics, consider some loss function ℓ(θ̂, θ), seen as a distance between θ̂ and θ, e.g.
squared loss: ℓ2(θ̂, θ) = (θ̂ − θ)²
absolute loss: ℓ1(θ̂, θ) = |θ̂ − θ|
zero/one loss: ℓ0/1(θ̂, θ) = 1(|θ̂ − θ| > ε), for some ε > 0.
Define the risk as the expected loss,
R(θ̂, θ) = ∫ ℓ(θ̂(y), θ) f(y|θ) dy
(where the average is over the sample space, and f(y|θ) = L(θ, y) is the likelihood).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 14
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
Bayes’ rule: minimize the average (integrated) risk
R(θ̂) = ∫ R(θ̂, θ) π(θ) dθ
and set θ̂ = argmin R(θ̂). Hence
θ̂ = argmin ∫_Θ ∫_{R^n} ℓ(θ̂(y), θ) f(y|θ) dy π(θ) dθ = argmin ∫_{R^n} ∫_Θ ℓ(θ̂(y), θ) π(θ|y) dθ f(y) dy.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 15
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
If ℓ = ℓ2, θ̂ is the posterior mean, θ̂ = E[θ|y]: to solve
argmin ∫_Θ (θ̂(y) − θ)² π(θ|y) dθ,
consider the first order condition
∫ 2 (θ̂(y) − θ) π(θ|y) dθ = 0, i.e. θ̂ ∫ π(θ|y) dθ = ∫ θ π(θ|y) dθ.
If ℓ = ℓ1, θ̂ is the posterior median, θ̂ = median[θ|y].
If ℓ = ℓ0/1, θ̂ is the posterior mode.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 16
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
Application: heads/tails, Bernoulli with a uniform prior,
f(y) = θ^y (1 − θ)^{1−y}, where y ∈ {0, 1},
and π(θ) = 1(θ ∈ [0, 1]). Then the likelihood is
L(θ, y) = f(y|θ) = θ^{Σ y_i} (1 − θ)^{n − Σ y_i}.
From Bayes’ theorem,
π(θ|y) ∝ π(θ) · f(y|θ) ∝ θ^{Σ y_i} (1 − θ)^{n − Σ y_i},
which is a Beta distribution. Recall that the B(α, β) density is
g(u|α, β) = u^{α−1}(1 − u)^{β−1} / B(α, β) on [0, 1]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 17
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
with E[U] = α/(α + β), Var[U] = αβ/[(α + β)²(α + β + 1)], while
median[U] ≈ (3α − 1)/(3α + 3β − 2) and mode[U] = (α − 1)/(α + β − 2) if α, β > 1.
Here, the posterior distribution is B(nȳ + 1, n(1 − ȳ) + 1).
With ℓ2, θ̂ = E[θ|y] = (nȳ + 1)/(n + 2), while with ℓ0/1, θ̂ = mode[θ|y] = ȳ (which is also the maximum likelihood estimator).
With a Beta B(a, b) prior, the posterior distribution is B(nȳ + a, n(1 − ȳ) + b).
With ℓ2, θ̂ = E[θ|y] = (nȳ + a)/(n + a + b), while with ℓ0/1, θ̂ = mode[θ|y] = (nȳ + a − 1)/(n + a + b − 2) (which is no longer the maximum likelihood estimator).
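A short sketch of this heads/tails example in R, computing the three Bayes estimators (posterior mean, median and mode) from the Beta posterior:

set.seed(1)
n <- 20
y <- rbinom(n, size = 1, prob = .3)
a <- b <- 1                                        # uniform prior = Beta(1, 1)
a_post <- sum(y) + a
b_post <- n - sum(y) + b
c(mean   = a_post / (a_post + b_post),             # l2 loss
  median = qbeta(.5, a_post, b_post),              # l1 loss
  mode   = (a_post - 1) / (a_post + b_post - 2))   # 0/1 loss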
@freakonometrics freakonometrics freakonometrics.hypotheses.org 18
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Bayesian Statistics and Loss Functions
Example: Gaussian case (with known variance), Y_i ∼ N(µ, σ²₀). Consider a Gaussian prior, µ ∼ N(m, s²); we are uncertain here about the value of µ. Then the posterior distribution of µ is Gaussian, µ|y ∼ N(m_y, s²_y), where
m_y = s²_y (m/s² + nȳ/σ²₀)  and  s²_y = (1/s² + n/σ²₀)^{−1}.
Here m_y is the Bayes estimator of µ under the loss functions ℓ1, ℓ2 and ℓ0/1.
When σ²₀ is no longer known, but is just a nuisance parameter, the natural approach is to consider a joint posterior density for θ = (µ, σ²) and then marginalize by integrating out the nuisance parameter.
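A sketch of the conjugate update in R (the prior values m, s and the known σ0 below are arbitrary, chosen only for illustration):

set.seed(1)
sigma0 <- 2; m <- 0; s <- 10                # prior: mu ~ N(m, s^2)
y <- rnorm(50, mean = 1, sd = sigma0)
n <- length(y)
s2_y <- 1 / (1 / s^2 + n / sigma0^2)
m_y  <- s2_y * (m / s^2 + n * mean(y) / sigma0^2)
c(posterior_mean = m_y, posterior_sd = sqrt(s2_y))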
@freakonometrics freakonometrics freakonometrics.hypotheses.org 19
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
OLS vs. Median Regression (Least Absolute Deviation)
Consider some linear model, y_i = β0 + x_i^T β + ε_i, and define
(β̂0^ols, β̂^ols) = argmin { Σ_{i=1}^n (y_i − β0 − x_i^T β)² }
(β̂0^lad, β̂^lad) = argmin { Σ_{i=1}^n |y_i − β0 − x_i^T β| }
Assume that ε|X has a symmetric distribution, E[ε|X] = median[ε|X] = 0; then (β̂0^ols, β̂^ols) and (β̂0^lad, β̂^lad) are consistent estimators of (β0, β).
Assume that ε|X does not have a symmetric distribution, but E[ε|X] = 0; then β̂^ols and β̂^lad are consistent estimators of the slopes β.
If median[ε|X] = γ, then β̂0^lad converges to β0 + γ.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 20
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
OLS vs. Median Regression
Median regression is stable under monotonic transformations. If
log[y_i] = β0 + x_i^T β + ε_i with median[ε|X] = 0,
then
median[Y |X = x] = exp(median[log(Y)|X = x]) = exp(β0 + x^T β),
while
E[Y |X = x] = exp(E[log(Y)|X = x]) · E[exp(ε)|X = x] ≠ exp(E[log(Y)|X = x]).
> ols <- lm(y ~ x, data = df)
> library(quantreg)
> lad <- rq(y ~ x, data = df, tau = .5)
@freakonometrics freakonometrics freakonometrics.hypotheses.org 21
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Notations
Cumulative distribution function: F_Y(y) = P[Y ≤ y].
Quantile function: Q_Y(u) = inf{y ∈ R : F_Y(y) ≥ u}, also noted Q_Y(u) = F_Y^{−1}(u).
One can also consider Q_Y(u) = sup{y ∈ R : F_Y(y) < u}.
For any increasing transformation t, Q_{t(Y)}(τ) = t(Q_Y(τ)).
F(y|x) = P[Y ≤ y|X = x], and Q_{Y|x}(u) = F^{−1}(u|x).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 22
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Empirical Quantile
@freakonometrics freakonometrics freakonometrics.hypotheses.org 23
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and optimization : numerical aspects
Consider the case of the median, with a sample {y1, · · · , yn}.
To compute the median, solve
min_µ Σ_{i=1}^n |y_i − µ|,
which can be solved using linear programming techniques.
More precisely, this problem is equivalent to
min_{µ,a,b} Σ_{i=1}^n (a_i + b_i)  with a_i, b_i ≥ 0 and y_i − µ = a_i − b_i, ∀i = 1, · · · , n.
Consider a sample obtained from a lognormal distribution
n = 101
set.seed(1)
y = rlnorm(n)
median(y)
[1] 1.077415
@freakonometrics freakonometrics freakonometrics.hypotheses.org 24
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and optimization : numerical aspects
Here, one can use a standard optimization routine
md = Vectorize(function(m) sum(abs(y - m)))
optim(mean(y), md)
$par
[1] 1.077416
or a linear programming technique: use the matrix form, with 3n constraints and 2n + 1 parameters,
library(lpSolve)
A1 = cbind(diag(2*n), 0)
A2 = cbind(diag(n), -diag(n), 1)
r = lp("min", c(rep(1, 2*n), 0),
  rbind(A1, A2), c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
tail(r$solution, 1)
[1] 1.077415
@freakonometrics freakonometrics freakonometrics.hypotheses.org 25
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and optimization : numerical aspects
More generally, consider here some quantile,
tau = .3
quantile(y, tau)
      30%
0.6741586
The linear program is now
min_{µ,a,b} Σ_{i=1}^n τ a_i + (1 − τ) b_i  with a_i, b_i ≥ 0 and y_i − µ = a_i − b_i, ∀i = 1, · · · , n.
A1 = cbind(diag(2*n), 0)
A2 = cbind(diag(n), -diag(n), 1)
r = lp("min", c(rep(tau, n), rep(1-tau, n), 0),
  rbind(A1, A2), c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
tail(r$solution, 1)
[1] 0.6741586
@freakonometrics freakonometrics freakonometrics.hypotheses.org 26
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile regression ?
In OLS regression, we try to evaluate E[Y |X = x] = ∫_R y dF_{Y|X=x}(y).
In quantile regression, we try to evaluate
Q_u(Y |X = x) = inf{y : F_{Y|X=x}(y) ≥ u},
as introduced in Newey & Powell (1987) Asymmetric Least Squares Estimation and Testing.
Li & Racine (2007) Nonparametric Econometrics: Theory and Practice suggested
Q̂_u(Y |X = x) = inf{y : F̂_{Y|X=x}(y) ≥ u}
where F̂_{Y|X=x}(y) can be some kernel-based estimator.
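A rough sketch of this idea in R (not Li & Racine's estimator, simply an illustration on the cars dataset): estimate F(y|X = x) by a Nadaraya-Watson average of 1(y_i ≤ y), then invert it.

cond_quantile <- function(x0, u, x, y, h) {
  w <- dnorm((x - x0) / h); w <- w / sum(w)        # kernel weights
  Fhat <- function(t) sum(w * (y <= t))            # smoothed conditional cdf
  ys <- sort(y)
  ys[which(sapply(ys, Fhat) >= u)[1]]              # smallest y with Fhat >= u
}
cond_quantile(x0 = 15, u = .9, x = cars$speed, y = cars$dist, h = 2)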
@freakonometrics freakonometrics freakonometrics.hypotheses.org 27
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and Expectiles
Consider the following risk functions
R^q_τ(u) = u · (τ − 1(u < 0)), τ ∈ [0, 1],
with R^q_{1/2}(u) ∝ |u| = ‖u‖_1, and
R^e_τ(u) = u² · |τ − 1(u < 0)|, τ ∈ [0, 1],
with R^e_{1/2}(u) ∝ u² = ‖u‖²_2.
Q_Y(τ) = argmin_m { E[R^q_τ(Y − m)] }, which is the median when τ = 1/2, and
E_Y(τ) = argmin_m { E[R^e_τ(Y − m)] }, which is the expected value when τ = 1/2.
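A sketch in R, recovering a quantile and an expectile of a sample by direct minimization of the two risks:

set.seed(1)
y <- rlnorm(500); tau <- .3
Rq <- function(m) sum((y - m) * (tau - (y - m < 0)))        # check loss
Re <- function(m) sum((y - m)^2 * abs(tau - (y - m < 0)))   # asymmetric squared loss
c(quantile  = optimize(Rq, range(y))$minimum,               # ~ quantile(y, tau)
  expectile = optimize(Re, range(y))$minimum)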
@freakonometrics freakonometrics freakonometrics.hypotheses.org 28
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and Expectiles
One can also write
quantile: argmin { Σ_{i=1}^n ω^q_τ(ε_i) |y_i − q_i| }, where ω^q_τ(ε) = 1 − τ if ε ≤ 0, and τ if ε > 0;
expectile: argmin { Σ_{i=1}^n ω^e_τ(ε_i) (y_i − q_i)² }, where ω^e_τ(ε) = 1 − τ if ε ≤ 0, and τ if ε > 0.
Expectiles are always unique; quantiles need not be.
Quantiles satisfy E[sign(Y − Q_Y(τ))] = 0.
Expectiles satisfy τ E[(Y − e_Y(τ))_+] = (1 − τ) E[(Y − e_Y(τ))_−]
(those are actually the first order conditions of the optimization problems).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 29
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles and M-Estimators
There are connections with M-estimators, as introduced in Serfling (1980) Approximation Theorems of Mathematical Statistics, chapter 7.
For any function h(·, ·), the M-functional is the solution β of ∫ h(y, β) dF_Y(y) = 0, and the M-estimator is the solution of
∫ h(y, β) dF_n(y) = (1/n) Σ_{i=1}^n h(y_i, β) = 0.
Hence, if h(y, β) = y − β, then β = E[Y] and β̂ = ȳ.
And if h(y, β) = 1(y < β) − τ, with τ ∈ (0, 1), then β = F_Y^{−1}(τ).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 30
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantiles, Maximal Correlation and Hardy-Littlewood-Pólya
If x_1 ≤ · · · ≤ x_n and y_1 ≤ · · · ≤ y_n, then Σ_{i=1}^n x_i y_i ≥ Σ_{i=1}^n x_i y_{σ(i)}, ∀σ ∈ S_n, and x and y are said to be comonotonic.
The continuous version is that X and Y are comonotonic if
E[XY] ≥ E[X Ỹ], where Ỹ =_L Y.
One can prove that
Y = Q_Y(F_X(X)) = argmax_{Ỹ ∼ F_Y} E[X Ỹ].
@freakonometrics freakonometrics freakonometrics.hypotheses.org 31
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectiles as Quantiles
For every Y ∈ L¹, τ → e_Y(τ) is continuous, and strictly increasing.
If Y is absolutely continuous,
∂e_Y(τ)/∂τ = E[|Y − e_Y(τ)|] / [(1 − τ)F_Y(e_Y(τ)) + τ(1 − F_Y(e_Y(τ)))].
If X ≤ Y, then e_X(τ) ≤ e_Y(τ), ∀τ ∈ (0, 1).
“Expectiles have properties that are similar to quantiles”, Newey & Powell (1987) Asymmetric Least Squares Estimation and Testing. The reason is that the expectiles of a distribution F are the quantiles of a distribution G related to F, see Jones (1994) Expectiles and M-quantiles are quantiles: let
G(t) = [P(t) − tF(t)] / (2[P(t) − tF(t)] + t − µ), where P(s) = ∫_{−∞}^s y dF(y).
The expectiles of F are the quantiles of G.
> x <- rnorm(99)
> library(expectreg)
> e <- expectile(x, probs = seq(0, 1, 0.1))
@freakonometrics freakonometrics freakonometrics.hypotheses.org 32
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectiles as Quantiles
@freakonometrics freakonometrics freakonometrics.hypotheses.org 33
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Elicitable Measures
“Elicitable” means “being a minimizer of a suitable expected score”.
T is an elicitable functional if there exists a scoring function S : R × R → [0, ∞) such that
T(Y) = argmin_{x∈R} ∫_R S(x, y) dF(y) = argmin_{x∈R} E[S(x, Y)], where Y ∼ F,
see Gneiting (2011) Making and evaluating point forecasts.
Example: the mean, T(Y) = E[Y], is elicited by S(x, y) = (x − y)².
Example: the median, T(Y) = median[Y], is elicited by S(x, y) = |x − y|.
Example: the quantile, T(Y) = Q_Y(τ), is elicited by S(x, y) = τ(y − x)_+ + (1 − τ)(y − x)_−.
Example: the expectile, T(Y) = E_Y(τ), is elicited by S(x, y) = τ(y − x)²_+ + (1 − τ)(y − x)²_−.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 34
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Elicitable Measures
Remark: not all functionals are elicitable, see Osband (1985) Providing incentives for better cost forecasting. The variance, for instance, is not elicitable.
The elicitability property implies convexity of the level sets with respect to mixtures (also called the betweenness property): if two lotteries F and G are equivalent, then any mixture of the two lotteries is also equivalent to F and G.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 35
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Empirical Quantiles
Consider some i.i.d. sample {y1, · · · , yn} with distribution F. Set
Q_τ = argmin_q E[R^q_τ(Y − q)], where Y ∼ F, and Q̂_τ ∈ argmin_q Σ_{i=1}^n R^q_τ(y_i − q).
Then, as n → ∞,
√n (Q̂_τ − Q_τ) →_L N(0, τ(1 − τ)/f²(Q_τ)).
Sketch of the proof: write y_i = Q_τ + ε_i and set h_n(q) = (1/n) Σ_{i=1}^n [1(y_i < q) − τ], which is a non-decreasing function, with
E[h_n(Q_τ + u/√n)] = F_Y(Q_τ + u/√n) − τ ∼ f_Y(Q_τ) · u/√n,
Var[h_n(Q_τ + u/√n)] ∼ F_Y(Q_τ)[1 − F_Y(Q_τ)]/n = τ(1 − τ)/n.
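A small Monte Carlo check of this asymptotic variance, as a sketch, for the empirical quantile of a standard normal sample:

set.seed(1)
tau <- .3; n <- 1000
qn <- replicate(2000, quantile(rnorm(n), tau))
c(empirical   = n * var(qn),
  theoretical = tau * (1 - tau) / dnorm(qnorm(tau))^2)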
@freakonometrics freakonometrics freakonometrics.hypotheses.org 36
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Empirical Expectiles
Consider some i.i.d. sample {y1, · · · , yn} with distribution F. Set
µ_τ = argmin_m E[R^e_τ(Y − m)], where Y ∼ F, and µ̂_τ = argmin_m Σ_{i=1}^n R^e_τ(y_i − m).
Then, as n → ∞,
√n (µ̂_τ − µ_τ) →_L N(0, s²)
for some s², if Var[Y] < ∞. Define the identification function
I_τ(x, y) = τ(y − x)_+ + (1 − τ)(y − x)_−,
so that µ_τ is the solution of E[I_τ(µ_τ, Y)] = 0. Then
s² = E[I_τ(µ_τ, Y)²] / (τ[1 − F(µ_τ)] + [1 − τ]F(µ_τ))².
@freakonometrics freakonometrics freakonometrics.hypotheses.org 37
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression
We want to solve, here,
min_β Σ_{i=1}^n R^q_τ(y_i − x_i^T β)
in the model y_i = x_i^T β + ε_i, so that Q_{y|x}(τ) = x^T β + F_ε^{−1}(τ).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 38
[Figure: quantile regression lines at the 10% and 90% levels on the cars dataset (dist against speed), and the slope of the quantile regression as a function of the probability level (%).]
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Geometric Properties of the Quantile Regression
Observe that the median regression will always have two supporting observations.
Start with some regression line, y_i = β̂0 + β̂1 x_i.
Consider small translations, y_i = (β̂0 ± ε) + β̂1 x_i.
We minimize Σ_{i=1}^n |y_i − (β0 + β1 x_i)|.
Starting from the blue line, a shift up decreases the sum by ε until we meet a point on the left; an additional shift up will then increase the sum.
We will necessarily pass through one point (observe that the sum is piecewise linear in ε).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 39
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Geometric Properties of the Quantile Regression
Consider now rotations of the line around the support point.
If we rotate up, we increase the sum of absolute differences (with a large impact of the point on the right).
If we rotate down, we decrease the sum, until we reach the point on the right.
Thus, the median regression will always have two supporting observations.
> library(quantreg)
> fit <- rq(dist ~ speed, data = cars, tau = .5)
> which(predict(fit) == cars$dist)
 1 21 46
 1 21 46
@freakonometrics freakonometrics freakonometrics.hypotheses.org 40
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Numerical Aspects
To illustrate numerical computations, use
base = read.table("http://freakonometrics.free.fr/rent98_00.txt", header = TRUE)
The linear program for the quantile regression is now
min_{µ,a,b} Σ_{i=1}^n τ a_i + (1 − τ) b_i
with a_i, b_i ≥ 0 and y_i − [β^τ_0 + β^τ_1 x_i] = a_i − b_i, ∀i = 1, · · · , n.
require(lpSolve)
tau = .3
n = nrow(base)
X = cbind(1, base$area)
y = base$rent_euro
A1 = cbind(diag(2*n), 0, 0)
A2 = cbind(diag(n), -diag(n), X)
@freakonometrics freakonometrics freakonometrics.hypotheses.org 41
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Numerical Aspects
r = lp("min",
  c(rep(tau, n), rep(1-tau, n), 0, 0), rbind(A1, A2),
  c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
tail(r$solution, 2)
[1] 148.946864   3.289674
see
library(quantreg)
rq(rent_euro ~ area, tau = tau, data = base)
Coefficients:
(Intercept)        area
 148.946864    3.289674
@freakonometrics freakonometrics freakonometrics.hypotheses.org 42
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Numerical Aspects
plot(base$area, base$rent_euro, xlab = expression(paste("surface (", m^2, ")")),
  ylab = "rent (euros/month)", col = rgb(0, 0, 1, .4), cex = .5)
sf = 0:250
yr = r$solution[2*n+1] + r$solution[2*n+2]*sf
lines(sf, yr, lwd = 2, col = "blue")
tau = .9
r = lp("min", c(rep(tau, n), rep(1-tau, n), 0, 0),
  rbind(A1, A2), c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
yr = r$solution[2*n+1] + r$solution[2*n+2]*sf
lines(sf, yr, lwd = 2, col = "blue")
@freakonometrics freakonometrics freakonometrics.hypotheses.org 43
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Numerical Aspects
For multiple regression, we should consider some trick (the lp routine assumes all decision variables are nonnegative):
tau = 0.3
n = nrow(base)
X = cbind(1, base$area, base$yearc)
y = base$rent_euro
r = lp("min",
  c(rep(tau, n), rep(1 - tau, n), rep(0, 2 * 3)),
  cbind(diag(n), -diag(n), X, -X),
  rep("=", n),
  y)
beta = tail(r$solution, 6)
beta = beta[1:3] - beta[3 + 1:3]
beta
[1] -5542.503252     3.978135     2.887234
@freakonometrics freakonometrics freakonometrics.hypotheses.org 44
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Numerical Aspects
which is consistent with
library(quantreg)
rq(rent_euro ~ area + yearc, tau = tau, data = base)
Coefficients:
 (Intercept)         area        yearc
-5542.503252     3.978135     2.887234
@freakonometrics freakonometrics freakonometrics.hypotheses.org 45
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Distributional Aspects
OLS is equivalent to MLE when Y − m(x) ∼ N(0, σ²), with density
g(ε) = (1/(σ√(2π))) exp(−ε²/(2σ²)).
Quantile regression is equivalent to maximum likelihood estimation when Y − m(x) has an asymmetric Laplace distribution,
g(ε) = (√2/σ) · κ/(1 + κ²) · exp(−(√2 κ^{1(ε>0)}/(σ κ^{1(ε<0)})) |ε|).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 46
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression and Iterative Least Squares
• start with some β̂^(0), e.g. β̂^ols;
• at stage k: let ε_i^(k) = y_i − x_i^T β̂^(k−1), define weights ω_i^(k) = R_τ(ε_i^(k)), and compute weighted least squares to estimate β̂^(k).
One can also consider a smooth approximation of R^q_τ(·), and then use Newton-Raphson.
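A minimal sketch of such an iterative scheme on the cars dataset, using the usual rewriting R^q_τ(ε) = ω ε² with ω = |τ − 1(ε < 0)|/|ε| (a small constant avoids dividing by zero); it gets close to rq(dist ~ speed, tau = .5):

irls_rq <- function(y, X, tau = .5, n_iter = 50, eps = 1e-6) {
  beta <- coef(lm.fit(X, y))                       # start from OLS
  for (k in 1:n_iter) {
    e <- y - X %*% beta
    w <- abs(tau - (e < 0)) / pmax(abs(e), eps)    # weights from the check loss
    beta <- coef(lm.wfit(X, y, w = as.vector(w)))
  }
  beta
}
irls_rq(cars$dist, cbind(1, cars$speed), tau = .5)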
@freakonometrics freakonometrics freakonometrics.hypotheses.org 47
[Figure: cars dataset, stopping distance (dist) against speed.]
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Optimization Algorithm
The primal problem is
min_{β,u,v} { τ 1^T u + (1 − τ) 1^T v }  s.t. y = Xβ + u − v, with u, v ∈ R^n_+,
and the dual version is
max_d { y^T d }  s.t. X^T d = (1 − τ) X^T 1, with d ∈ [0, 1]^n.
Koenker & D’Orey (1994) A Remark on Algorithm AS 229: Computing Dual Regression Quantiles and Regression Rank Scores suggest using the simplex method (the default method in R).
Portnoy & Koenker (1997) The Gaussian hare and the Laplacian tortoise suggest using the interior point method.*
@freakonometrics freakonometrics freakonometrics.hypotheses.org 48
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Interior Point Method
See Vanderbei et al. (1986) A modification of Karmarkar’s linear programming algorithm for a presentation of the algorithm, Potra & Wright (2000) Interior-point methods for a general survey, and Meketon (1986) Least absolute value regression for an application of the algorithm in the context of median regression.
The running time is of order n^{1+δ} k³ for some δ > 0 and k = dim(β) (it is (n + k)k² for OLS, see wikipedia).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 49
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression Estimators
The OLS estimator β̂^ols is the solution of
β̂^ols = argmin_β { E[(E[Y |X = x] − x^T β)²] },
and Angrist, Chernozhukov & Fernandez-Val (2006) Quantile Regression under Misspecification proved that
β_τ = argmin_β { E[ω_τ(β)(Q_τ[Y |X = x] − x^T β)²] }
(under weak conditions), where
ω_τ(β) = ∫_0^1 (1 − u) f_{y|x}(u x^T β + (1 − u) Q_τ[Y |X = x]) du.
β_τ is the best weighted mean square approximation of the true quantile function, where the weights depend on an average of the conditional density of Y between x^T β and the true quantile regression function.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 50
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Assumptions to get Consistency of Quantile Regression Estimators
As always, we need some assumptions to have consistency of the estimators.
• the observations (Y_i, X_i) must be (conditionally) i.i.d.;
• the regressors must have a bounded second moment, E[‖X_i‖²] < ∞;
• the error terms ε are continuously distributed given X_i, and centered in the sense that their median is 0, ∫_{−∞}^0 f_ε(ε) dε = 1/2;
• a “local identification” property: f_ε(0)XX^T is positive definite.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 51
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression Estimators
Under those weak conditions, β̂_τ is asymptotically normal:
√n (β̂_τ − β_τ) →_L N(0, τ(1 − τ) D_τ^{−1} Ω_x D_τ^{−1}),
where D_τ = E[f_ε(0) XX^T] and Ω_x = E[X^T X].
Hence, the asymptotic variance of β̂_τ is
Var[β̂_τ] = (τ(1 − τ) / [f̂_ε(0)]²) · [ (1/n) Σ_{i=1}^n x_i^T x_i ]^{−1},
where f̂_ε(0) is estimated using (e.g.) a histogram, as suggested in Powell (1991) Estimation of monotonic regression models under quantile restrictions, since
D_τ = lim_{h↓0} E[ 1(|ε| ≤ h)/(2h) · XX^T ] ∼ (1/(2nh)) Σ_{i=1}^n 1(|ε̂_i| ≤ h) x_i x_i^T = D̂_τ.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 52
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression Estimators
There is no first order condition of the form ∂V_n(β, τ)/∂β = 0, where
V_n(β, τ) = Σ_{i=1}^n R^q_τ(y_i − x_i^T β).
There is, however, an asymptotic first order condition,
(1/√n) Σ_{i=1}^n x_i ψ_τ(y_i − x_i^T β̂) = O(1), as n → ∞,
where ψ_τ(·) = 1(· < 0) − τ, see Huber (1967) The behavior of maximum likelihood estimates under nonstandard conditions.
One can also define a Wald test, a likelihood ratio test, etc.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 53
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression Estimators
The confidence interval of level 1 − α is then
β̂_τ ± z_{1−α/2} √(Var[β̂_τ]).
An alternative is to use a bootstrap strategy (see #2):
• generate a sample (y_i^(b), x_i^(b)) by resampling from (y_i, x_i);
• estimate β_τ^(b) by β̂_τ^(b) = argmin_β Σ R^q_τ(y_i^(b) − x_i^(b)T β);
• set Var[β̂_τ] = (1/B) Σ_{b=1}^B (β̂_τ^(b) − β̂_τ)².
For confidence intervals, we can either use Gaussian-type confidence intervals, or empirical quantiles of the bootstrap estimates.
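A sketch of this (pairs) bootstrap on the cars dataset:

library(quantreg)
set.seed(1)
B <- 500; tau <- .5
beta_b <- replicate(B, {
  id <- sample(nrow(cars), replace = TRUE)
  coef(rq(dist ~ speed, tau = tau, data = cars[id, ]))
})
apply(beta_b, 1, sd)      # bootstrap standard errors of (intercept, slope)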
@freakonometrics freakonometrics freakonometrics.hypotheses.org 54
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression Estimators
If τ = (τ_1, · · · , τ_m), one can prove that
√n (β̂_τ − β_τ) →_L N(0, Σ_τ),
where Σ_τ is a block matrix, with
Σ_{τ_i,τ_j} = (min{τ_i, τ_j} − τ_i τ_j) D_{τ_i}^{−1} Ω_x D_{τ_j}^{−1},
see Kocherginsky et al. (2005) Practical Confidence Intervals for Regression Quantiles for more details.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 55
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression: Transformations
Scale equivariance: for any a > 0 and τ ∈ [0, 1],
β̂_τ(aY, X) = a β̂_τ(Y, X) and β̂_τ(−aY, X) = −a β̂_{1−τ}(Y, X).
Equivariance to reparameterization of design: let A be any p × p nonsingular matrix and τ ∈ [0, 1], then
β̂_τ(Y, XA) = A^{−1} β̂_τ(Y, X).
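A quick sketch checking the scale equivariance numerically with rq on the cars dataset:

library(quantreg)
tau <- .8; a <- 10
b1 <- coef(rq(dist ~ speed, tau = tau, data = cars))
b2 <- coef(rq(a * dist ~ speed, tau = tau, data = cars))
rbind(a * b1, b2)         # the two rows coincide (up to numerical precision)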
@freakonometrics freakonometrics freakonometrics.hypotheses.org 56
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Visualization, τ → βτ
See Abrevaya (2001) The effects of demographics and maternal behavior...
> base = read.table("http://freakonometrics.free.fr/natality2005.txt")
[Figure: estimated coefficient of AGE across probability levels (%), and birth weight (in g.) against the age of the mother, with quantile curves at the 1%, 5%, 10%, 25%, 50%, 75%, 90% and 95% levels.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 57
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Visualization, τ → βτ
> base = read.table("http://freakonometrics.free.fr/natality2005.txt", header = TRUE, sep = ";")
> u = seq(.05, .95, by = .01)
> library(quantreg)
> coefstd = function(u) summary(rq(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE + BLACKM + BLACKF + COLLEGE, data = sbase, tau = u))$coefficients[, 2]
> coefest = function(u) summary(rq(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE + BLACKM + BLACKF + COLLEGE, data = sbase, tau = u))$coefficients[, 1]
CS = Vectorize(coefstd)(u)
CE = Vectorize(coefest)(u)
@freakonometrics freakonometrics freakonometrics.hypotheses.org 58
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Visualization, τ → βτ
See Abrevaya (2001) The effects of demographics and maternal behavior on the distribution of birth outcomes.
[Figure: estimated coefficients, as functions of the probability level (%), for AGE, SEXM, SMOKERTRUE, WEIGHTGAIN and COLLEGETRUE.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 59
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Visualization, τ → βτ
See Abrevaya (2001) The effects of demographics and maternal behavior...
> base = read.table("http://freakonometrics.free.fr/BWeight.csv")
[Figure: estimated coefficients, as functions of the probability level (%), for mom_age, boy, smoke, black and ed.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 60
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression, with Non-Linear Effects
Rents in Munich, as a function of the area, from Fahrmeir et al. (2013) Regression: Models, Methods and Applications
> base = read.table("http://freakonometrics.free.fr/rent98_00.txt")
[Figure: rent (euros) against area (m²), with the 10%, 25%, 50%, 75% and 90% quantile curves (two panels).]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 61
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression, with Non-Linear Effects
Rents in Munich, as a function of the year of construction, from Fahrmeir et al. (2013) Regression: Models, Methods and Applications
[Figure: rent (euros) against year of construction, with the 10%, 25%, 50%, 75% and 90% quantile curves (two panels).]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 62
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression, with Non-Linear Effects
BMI as a function of the age, in New Zealand, from Yee (2015) Vector Generalized Linear and Additive Models, for women and men
> library(VGAMdata); data(xs.nz)
[Figure: BMI against age (European ethnicity), women and men, with the 5%, 25%, 50%, 75% and 95% quantile curves.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 63
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression, with Non-Linear Effects
BMI as a function of the age, in New Zealand, from Yee (2015) Vector Generalized Linear and Additive Models, for women and men
[Figure: BMI against age, women and men, comparing Maori and European, with the 50% and 95% quantile curves.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 64
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression, with Non-Linear Effects
One can consider some local polynomial quantile regression, e.g.
min_{β0,β1} Σ_{i=1}^n ω_i(x) R^q_τ(y_i − β0 − (x_i − x)^T β1)
for some weights ω_i(x) = H^{−1} K(H^{−1}(x_i − x)), see Fan, Hu & Truong (1994) Robust Non-Parametric Function Estimation.
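A sketch of a local linear quantile regression using the weights argument of rq, on the cars dataset (the Gaussian kernel and the bandwidth h are illustrative choices, not from the original slides):

library(quantreg)
local_rq <- function(x0, tau, h, data = cars) {
  w <- dnorm((data$speed - x0) / h)                       # kernel weights omega_i(x0)
  fit <- rq(dist ~ I(speed - x0), tau = tau, data = data, weights = w)
  coef(fit)[1]                                            # local estimate of Q_tau(Y | speed = x0)
}
sapply(c(10, 15, 20), local_rq, tau = .9, h = 3)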
@freakonometrics freakonometrics freakonometrics.hypotheses.org 65
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Asymmetric Maximum Likelihood Estimation
Introduced by Efron (1991) Regression percentiles using asymmetric squared error loss. Consider a linear model, y_i = x_i^T β + ε_i. Let
S(β) = Σ_{i=1}^n Q_ω(y_i − x_i^T β), where Q_ω(ε) = ε² if ε ≤ 0, and Q_ω(ε) = w ε² if ε > 0, with w = ω/(1 − ω).
One might consider ω_α = 1 + z_α/(ϕ(z_α) + (1 − α)z_α), where z_α = Φ^{−1}(α).
Efron (1992) Poisson overdispersion estimates based on the method of asymmetric maximum likelihood introduced asymmetric maximum likelihood (AML) estimation, considering
S(β) = Σ_{i=1}^n Q_ω(y_i − x_i^T β), where Q_ω(ε) = D(y_i, x_i^T β) if y_i ≤ x_i^T β, and Q_ω(ε) = w D(y_i, x_i^T β) if y_i > x_i^T β,
where D(·, ·) is the deviance. Estimation is based on Newton-Raphson (gradient descent).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 66
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Noncrossing Solutions
See Bondell et al. (2010) Non-crossing quantile regression curve estimation.
Consider probabilities τ = (τ_1, · · · , τ_q) with 0 < τ_1 < · · · < τ_q < 1.
Use parallelism: add constraints to the optimization problem, such that
x_i^T β_{τ_j} ≥ x_i^T β_{τ_{j−1}}, ∀i ∈ {1, · · · , n}, j ∈ {2, · · · , q}.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 67
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression on Panel Data
In the context of panel data, consider some fixed effect α_i, so that
y_{i,t} = x_{i,t}^T β_τ + α_i + ε_{i,t}, where Q_τ(ε_{i,t}|X_i) = 0.
Canay (2011) A simple approach to quantile regression for panel data suggests an estimator in two steps (see the sketch below):
• use a standard OLS fixed-effect model y_{i,t} = x_{i,t}^T β + α_i + u_{i,t}, i.e. consider a within transformation, and derive the fixed-effect estimate β̂,
(y_{i,t} − ȳ_i) = (x_{i,t} − x̄_i)^T β + (u_{i,t} − ū_i);
• estimate the fixed effects as α̂_i = (1/T) Σ_{t=1}^T (y_{i,t} − x_{i,t}^T β̂);
• finally, run a standard quantile regression of y_{i,t} − α̂_i on the x_{i,t}'s.
See the rqpd package.
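A sketch of the two-step estimator with plain R and quantreg, on a hypothetical balanced panel stored in a data frame pdata with columns y, x and id (the variable names are assumptions, not from the original slides):

library(quantreg)
two_step_rq <- function(pdata, tau = .5) {
  # step 1: within (fixed-effect) OLS, then individual effects alpha_i
  within <- lm(I(y - ave(y, id)) ~ 0 + I(x - ave(x, id)), data = pdata)
  beta_ols <- coef(within)
  alpha <- with(pdata, ave(y - x * beta_ols, id))   # alpha_i = mean_t (y_it - x_it beta)
  # step 2: standard quantile regression of y - alpha_i on x
  rq(I(y - alpha) ~ x, tau = tau, data = pdata)
}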
@freakonometrics freakonometrics freakonometrics.hypotheses.org 68
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression with Fixed Effects (QRFE)
In a panel linear regression model, y_{i,t} = x_{i,t}^T β + u_i + ε_{i,t},
where u is an unobserved individual specific effect.
In a fixed effects model, u is treated as a parameter, and the quantile regression is
min_{β,u} { Σ_{i,t} R^q_α(y_{i,t} − [x_{i,t}^T β + u_i]) }.
Consider a penalized QRFE, as in Koenker & Bilias (2001) Quantile regression for duration data,
min_{β_1,··· ,β_κ,u} { Σ_{k,i,t} ω_k R^q_{α_k}(y_{i,t} − [x_{i,t}^T β_k + u_i]) + λ Σ_i |u_i| },
where ω_k is a relative weight associated with the quantile of level α_k.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 69
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression with Random Effects (QRRE)
Assume here that y_{i,t} = x_{i,t}^T β + u_i + ε_{i,t} = x_{i,t}^T β + η_{i,t}.
Quantile regression with random effects (QRRE) yields solving
min_β { Σ_{i,t} R^q_α(y_{i,t} − x_{i,t}^T β) },
which is a weighted asymmetric least square deviation estimator.
Let Σ = [σ_{s,t}(α)] denote the matrix with
σ_{ts}(α) = α(1 − α) if t = s, and σ_{ts}(α) = E[1{ε_{it}(α) < 0, ε_{is}(α) < 0}] − α² if t ≠ s.
If (nT)^{−1} X^T {I_n ⊗ Σ_{T×T}(α)} X → D_0 as n → ∞ and (nT)^{−1} X^T Ω_f X = D_1, then
√(nT) (β̂^Q(α) − β^Q(α)) →_L N(0, D_1^{−1} D_0 D_1^{−1}).
@freakonometrics freakonometrics freakonometrics.hypotheses.org 70
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Treatment Effects
Doksum (1974) Empirical Probability Plots and Statistical Inference for Nonlinear Models introduced the QTE (Quantile Treatment Effect): when a person might have two Y's, either Y_0 (without treatment, D = 0) or Y_1 (with treatment, D = 1),
δ_τ = Q_{Y_1}(τ) − Q_{Y_0}(τ),
which can be studied in the context of covariates.
Run a quantile regression of y on (d, x),
y = β_0 + δd + x^T β + ε : shifting effect,
y = β_0 + x^T β + δd + ε : scaling effect.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 71
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression for Time Series
Consider some GARCH(1,1) financial time series,
y_t = σ_t ε_t where σ_t = α_0 + α_1 · |y_{t−1}| + β_1 σ_{t−1}.
The quantile function conditional on the past, F_{t−1} = Y_{t−1}, is
Q_{y|F_{t−1}}(τ) = α_0 F_ε^{−1}(τ) + α_1 F_ε^{−1}(τ) · |y_{t−1}| + β_1 Q_{y|F_{t−2}}(τ) = ˜α_0 + ˜α_1 · |y_{t−1}| + β_1 Q_{y|F_{t−2}}(τ),
with ˜α_0 = α_0 F_ε^{−1}(τ) and ˜α_1 = α_1 F_ε^{−1}(τ),
i.e. the conditional quantile has a GARCH(1,1) form; see Conditional Autoregressive Value-at-Risk, Manganelli & Engle (2004) CAViaR: Conditional Autoregressive Value at Risk by Regression Quantiles.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 72
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Quantile Regression for Spatial Data
> library(McSpatial)
> data(cookdata)
> fit <- qregcpar(LNFAR ~ DCBD, nonpar = ~ LATITUDE + LONGITUDE, taumat = c(.10, .90),
+   kern = "bisq", window = .30, distance = "LATLONG", data = cookdata)
[Figure: maps of the 10% quantiles, the 90% quantiles, and the difference between the .10 and .90 quantiles.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 73
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression
Quantile regression vs. expectile regression, on the same dataset (cars).
[Figure: slope of the quantile regression and slope of the expectile regression, as functions of the probability level (%).]
See Koenker (2014) Living Beyond our Means for a comparison of quantiles and expectiles.
@freakonometrics freakonometrics freakonometrics.hypotheses.org 74
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression
Solve here
min_β Σ_{i=1}^n R^e_τ(y_i − x_i^T β), where R^e_τ(u) = u² · |τ − 1(u < 0)|.
“This estimator can be interpreted as a maximum likelihood estimator when the disturbances arise from a normal distribution with unequal weight placed on positive and negative disturbances”, Aigner, Amemiya & Poirier (1976) Formulation and Estimation of Stochastic Frontier Production Function Models.
See Holzmann & Klar (2016) Expectile Asymptotics for statistical properties.
Expectiles can (also) be related to Breckling & Chambers (1988) M-Quantiles.
For a comparison of quantile regression and expectile regression, see Schulze-Waltrup et al. (2014) Expectile and quantile regression - David and Goliath?
@freakonometrics freakonometrics freakonometrics.hypotheses.org 75
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression, with Linear Effects
Zhang (1994) Nonparametric regression expectiles
[Figure: rent (euros) against area (m²), with the 10%, 25%, 50%, 75% and 90% curves: quantile regressions (left) and expectile regressions (right).]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 76
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression, with Non-Linear Effects
See Zhang (1994) Nonparametric regression expectiles
[Figure: rent (euros) against area (m²), with the 10%, 25%, 50%, 75% and 90% curves: quantile regressions (left) and expectile regressions (right), non-linear effects.]
@freakonometrics freakonometrics freakonometrics.hypotheses.org 77
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression, with Linear Effects
> library(expectreg)
> coefstd = function(u) summary(expectreg.ls(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE + BLACKM + BLACKF + COLLEGE, data = sbase, expectiles = u, ci = TRUE))[, 2]
> coefest = function(u) summary(expectreg.ls(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE + BLACKM + BLACKF + COLLEGE, data = sbase, expectiles = u, ci = TRUE))[, 1]
> CS = Vectorize(coefstd)(u)
> CE = Vectorize(coefest)(u)
@freakonometrics freakonometrics freakonometrics.hypotheses.org 78
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression, with Random Effects (ERRE)
Expectile regression with random effects (ERRE) yields solving
min_β { Σ_{i,t} R^e_α(y_{i,t} − x_{i,t}^T β) }.
One can prove that (see the sketch below)
β̂^e(τ) = ( Σ_{i=1}^n Σ_{t=1}^T ω_{i,t}(τ) x_{it} x_{it}^T )^{−1} ( Σ_{i=1}^n Σ_{t=1}^T ω_{i,t}(τ) x_{it} y_{it} ),
where ω_{it}(τ) = |τ − 1(y_{it} < x_{it}^T β̂^e(τ))|.
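The fixed-point form above suggests an iterative weighted least squares computation; a sketch in R (illustrated on the cars dataset, pooled rather than panel data):

expectile_reg <- function(y, X, tau = .75, n_iter = 100) {
  beta <- coef(lm.fit(X, y))                         # start from OLS (tau = 1/2)
  for (k in 1:n_iter) {
    w <- abs(tau - ((y - X %*% beta) < 0))           # omega_it(tau)
    beta <- coef(lm.wfit(X, y, w = as.vector(w)))
  }
  beta
}
expectile_reg(cars$dist, cbind(1, cars$speed), tau = .75)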
@freakonometrics freakonometrics freakonometrics.hypotheses.org 79
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Expectile Regression with Random Effects (ERRE)
If W = diag(ω_{11}(τ), . . . , ω_{nT}(τ)), set
W̄ = E(W), H = X^T W̄ X and Σ = X^T E(W εε^T W) X,
and then
√(nT) (β̂^e(τ) − β^e(τ)) →_L N(0, H^{−1} Σ H^{−1}),
see Barry et al. (2016) Quantile and Expectile Regression for random effects model.
See, for expectile regressions with R,
> library(expectreg)
> fit <- expectreg.ls(rent_euro ~ area, data = munich, expectiles = .75)
> fit <- expectreg.ls(rent_euro ~ rb(area, "pspline"), data = munich, expectiles = .75)
@freakonometrics freakonometrics freakonometrics.hypotheses.org 80
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Application to Real Data
@freakonometrics freakonometrics freakonometrics.hypotheses.org 81
Arthur Charpentier, UB School of Economics Summer School 2018 - Big Data for Economics
Extensions
The mean of Y is ν(F_Y) = ∫_{−∞}^{+∞} y dF_Y(y).
The quantile of level τ for Y is ν_τ(F_Y) = F_Y^{−1}(τ).
More generally, consider some functional ν(F) (Gini or Theil index, entropy, etc.), see Foresi & Peracchi (1995) The Conditional Distribution of Excess Returns.
Can we estimate ν(F_{Y|x})?
Firpo et al. (2009) Unconditional Quantile Regressions suggested to use influence function regression (see the sketch below).
Machado & Mata (2005) Counterfactual decomposition of changes in wage distributions and Chernozhukov et al. (2013) Inference on counterfactual distributions suggested indirect distribution function approaches.
The influence function of the index ν(F) at y is
IF(y, ν, F) = lim_{ε↓0} [ν((1 − ε)F + ε δ_y) − ν(F)] / ε.
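As a sketch of the influence-function approach of Firpo et al. (2009) for the τ-quantile, illustrated on the cars dataset (with a kernel density estimate of f at the quantile): regress the recentered influence function on the covariates.

tau <- .9
q_tau <- quantile(cars$dist, tau)
dens  <- density(cars$dist)
f_hat <- approx(dens$x, dens$y, xout = q_tau)$y          # estimate of f(q_tau)
rif   <- q_tau + (tau - (cars$dist <= q_tau)) / f_hat    # recentered influence function
lm(rif ~ speed, data = cars)                             # unconditional quantile partial effect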
@freakonometrics freakonometrics freakonometrics.hypotheses.org 82

More Related Content

PDF
Slides ub-1
PDF
Slides ub-2
PDF
Slides ub-7
PDF
Slides econometrics-2018-graduate-1
PDF
Lausanne 2019 #1
PDF
Varese italie seminar
PDF
Mutualisation et Segmentation
PDF
Side 2019 #6
Slides ub-1
Slides ub-2
Slides ub-7
Slides econometrics-2018-graduate-1
Lausanne 2019 #1
Varese italie seminar
Mutualisation et Segmentation
Side 2019 #6

What's hot (20)

PDF
Slides econometrics-2018-graduate-4
PDF
Side 2019 #10
PDF
Machine Learning for Actuaries
PDF
Side 2019 #8
PDF
Varese italie #2
PDF
Rencontres Mutualistes
PDF
Hands-On Algorithms for Predictive Modeling
PDF
Side 2019 #4
PDF
Slides networks-2017-2
PDF
Econ. Seminar Uqam
PDF
Actuarial Pricing Game
PDF
Slides econometrics-2017-graduate-2
PDF
Slides econ-lm
PDF
Slides lln-risques
PDF
Slides simplexe
PDF
Pareto Models, Slides EQUINEQ
PDF
Classification
PDF
Slides barcelona Machine Learning
PDF
Slides econ-lm
PDF
Slides ineq-4
Slides econometrics-2018-graduate-4
Side 2019 #10
Machine Learning for Actuaries
Side 2019 #8
Varese italie #2
Rencontres Mutualistes
Hands-On Algorithms for Predictive Modeling
Side 2019 #4
Slides networks-2017-2
Econ. Seminar Uqam
Actuarial Pricing Game
Slides econometrics-2017-graduate-2
Slides econ-lm
Slides lln-risques
Slides simplexe
Pareto Models, Slides EQUINEQ
Classification
Slides barcelona Machine Learning
Slides econ-lm
Slides ineq-4
Ad

Similar to Slides ub-3 (20)

PDF
Econometrics 2017-graduate-3
PDF
Side 2019, part 2
PDF
Slides econometrics-2018-graduate-2
PDF
Slides ACTINFO 2016
PDF
Varese italie #2
PDF
Side 2019 #9
PDF
Side 2019, part 1
PDF
Slides Bank England
PDF
Graduate Econometrics Course, part 4, 2017
PDF
Slides ensae-2016-9
PDF
Slides erasmus
PDF
Slides econometrics-2018-graduate-3
PDF
Slides ensae 9
PDF
Side 2019 #3
PDF
Reinforcement Learning in Economics and Finance
PDF
Side 2019 #5
PDF
20130928 automated theorem_proving_harrison
PDF
  • 8. Quantiles and Quantile Regressions
Quantiles are important quantities in many areas (inequalities, risk, health, sports, etc.).
[Figure: quantiles of the N(0,1) distribution; the 5% quantile is −1.645.]
  • 9. A First Model for Conditional Quantiles
Consider a location model, y = β0 + xᵀβ + ε, i.e. E[Y|X = x] = β0 + xᵀβ. One can then consider
Q(τ|X = x) = β0 + Qε(τ) + xᵀβ,
where Qε(·) is the quantile function of the error term ε.
[Figure: scatterplot of dist against speed (cars dataset).]
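To illustrate the location-model claim above, here is a minimal R sketch on simulated data (the data-generating process, sample size and quantile level are illustrative choices, not from the slides): the fitted 90% quantile regression line should have the same slope as the mean regression, with the intercept shifted by Qε(0.9).

    # location model: y = 1 + 2x + eps, eps ~ N(0, 2), independent of x
    library(quantreg)
    set.seed(123)
    n <- 1000
    x <- runif(n, 0, 10)
    y <- 1 + 2 * x + rnorm(n, sd = 2)
    fit <- rq(y ~ x, tau = 0.9, data = data.frame(x, y))
    coef(fit)                 # slope close to 2
    1 + qnorm(0.9, sd = 2)    # theoretical intercept of the 90% quantile line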
  • 10. OLS Regression, ℓ2 norm and Expected Value
Let y ∈ Rⁿ; then
ȳ = argmin_{m ∈ R} { (1/n) Σᵢ (yi − m)² }.
It is the empirical version of
E[Y] = argmin_{m ∈ R} { ∫ (y − m)² dF(y) } = argmin_{m ∈ R} { E[(Y − m)²] },
where Y is a random variable. Thus,
argmin_{m(·) : Rᵏ → R} { (1/n) Σᵢ (yi − m(xi))² }
is the empirical version of E[Y|X = x].
See Legendre (1805) Nouvelles méthodes pour la détermination des orbites des comètes and Gauss (1809) Theoria motus corporum coelestium in sectionibus conicis solem ambientium.
  • 11. OLS Regression, ℓ2 norm and Expected Value
Sketch of proof:
(1) Let h(x) = Σᵢ (x − yi)². Then h′(x) = Σᵢ 2(x − yi), and the first-order condition yields x = (1/n) Σᵢ yi = ȳ.
(2) If Y is continuous, let h(x) = ∫ (x − y)² f(y) dy, so that
h′(x) = ∂/∂x ∫ (x − y)² f(y) dy = ∫ ∂/∂x (x − y)² f(y) dy = 2 ∫ (x − y) f(y) dy.
The first-order condition gives x ∫ f(y) dy = ∫ y f(y) dy, i.e. x = E[Y].
  • 12. Median Regression, ℓ1 norm and Median
Let y ∈ Rⁿ; then
median[y] ∈ argmin_{m ∈ R} { (1/n) Σᵢ |yi − m| }.
It is the empirical version of
median[Y] ∈ argmin_{m ∈ R} { ∫ |y − m| dF(y) } = argmin_{m ∈ R} { E[|Y − m|] },
where Y is a random variable with P[Y ≤ median[Y]] ≥ 1/2 and P[Y ≥ median[Y]] ≥ 1/2. Similarly,
argmin_{m(·) : Rᵏ → R} { (1/n) Σᵢ |yi − m(xi)| }
is the empirical version of median[Y|X = x].
See Boscovich (1757) De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani and Laplace (1793) Sur quelques points du système du monde.
  • 13. Median Regression, ℓ1 norm and Median
Sketch of proof:
(1) Let h(x) = Σᵢ |x − yi|, which is piecewise linear in x; its slope is negative as long as more than half of the observations lie above x and positive once more than half lie below, so any minimizer leaves at least half of the sample on each side.
(2) If F is absolutely continuous, dF(x) = f(x)dx, the median m solves ∫_{−∞}^{m} f(x) dx = 1/2. Set
h(y) = ∫ |x − y| f(x) dx = ∫_{−∞}^{y} (y − x) f(x) dx + ∫_{y}^{+∞} (x − y) f(x) dx.
Then h′(y) = ∫_{−∞}^{y} f(x) dx − ∫_{y}^{+∞} f(x) dx, and the first-order condition yields
∫_{−∞}^{y} f(x) dx = ∫_{y}^{+∞} f(x) dx = 1 − ∫_{−∞}^{y} f(x) dx, i.e. ∫_{−∞}^{y} f(x) dx = 1/2.
  • 14. Bayesian Statistics and Loss Functions
In statistics, consider a loss function ℓ(θ̂, θ), seen as a distance between θ̂ and θ, e.g.
squared loss: ℓ2(θ̂, θ) = (θ̂ − θ)²
absolute loss: ℓ1(θ̂, θ) = |θ̂ − θ|
zero/one loss: ℓ0/1(θ̂, θ) = 1(|θ̂ − θ| > ε), for some ε > 0.
Define the risk as the expected loss,
R(θ̂, θ) = ∫ ℓ(θ̂(y), θ) f(y|θ) dy,
where f(y|θ) is the likelihood L(θ, y) and the average is taken over the sample space.
  • 15. Bayesian Statistics and Loss Functions
Bayes' rule: minimize the average risk R(θ̂) = ∫ R(θ̂, θ) π(θ) dθ and set θ̂ = argmin R(θ̂). Hence
θ̂ = argmin ∫_Θ ∫_{Rⁿ} ℓ(θ̂(y), θ) f(y|θ) dy π(θ) dθ
  = argmin ∫_{Rⁿ} [ ∫_Θ ℓ(θ̂(y), θ) π(θ|y) dθ ] f(y) dy,
so, for each y, θ̂(y) minimizes the posterior expected loss ∫_Θ ℓ(θ̂, θ) π(θ|y) dθ.
  • 16. Bayesian Statistics and Loss Functions
If ℓ = ℓ2, θ̂ is the posterior mean, θ̂ = E[θ|y]: to solve argmin ∫_Θ (θ̂(y) − θ)² π(θ|y) dθ, consider the first-order condition
2 ∫ (θ̂(y) − θ) π(θ|y) dθ = 0, i.e. θ̂(y) ∫ π(θ|y) dθ = ∫ θ π(θ|y) dθ.
If ℓ = ℓ1, θ̂ is the posterior median, θ̂ = median[θ|y].
If ℓ = ℓ0/1, θ̂ is the posterior mode (as ε → 0).
  • 17. Bayesian Statistics and Loss Functions
Application: heads/tails, Bernoulli with a uniform prior,
f(y|θ) = θ^y (1 − θ)^{1−y}, where y ∈ {0, 1}, and π(θ) = 1(θ ∈ [0, 1]).
Then the likelihood is
L(θ, y) = f(y|θ) = θ^{Σ yi} (1 − θ)^{n − Σ yi}.
From Bayes' theorem,
π(θ|y) ∝ π(θ) · f(y|θ) ∝ θ^{Σ yi} (1 − θ)^{n − Σ yi},
which is a Beta distribution B(α, β), with density g(u|α, β) = u^{α−1}(1 − u)^{β−1} / B(α, β) on [0, 1].
  • 18. Bayesian Statistics and Loss Functions
Recall that E[U] = α / (α + β), Var[U] = αβ / [(α + β)²(α + β + 1)], median[U] ≈ (3α − 1) / (3α + 3β − 2) and mode[U] = (α − 1) / (α + β − 2) if α, β > 1.
Here, the posterior distribution is B(nȳ + 1, n(1 − ȳ) + 1). With ℓ2, θ̂ = E[θ|y] = (nȳ + 1) / (n + 2), while with ℓ0/1, θ̂ = mode[θ|y] = ȳ (which is also the maximum likelihood estimator).
With a Beta B(a, b) prior, the posterior distribution is B(nȳ + a, n(1 − ȳ) + b). With ℓ2, θ̂ = E[θ|y] = (nȳ + a) / (n + a + b), while with ℓ0/1, θ̂ = mode[θ|y] = (nȳ + a − 1) / (n + a + b − 2) (which is no longer the maximum likelihood estimator).
  • 19. Bayesian Statistics and Loss Functions
Example: Gaussian case (with known variance), Yi ∼ N(µ, σ0²). Consider a Gaussian prior, µ ∼ N(m, s²): we are uncertain here about the value of µ. Then the posterior distribution of µ is Gaussian, µ|y ∼ N(m_y, s_y²), where
m_y = s_y² ( m / s² + nȳ / σ0² ) and s_y² = ( 1 / s² + n / σ0² )^{−1}.
Here m_y is the Bayes estimator of µ under the loss functions ℓ1, ℓ2 and ℓ0/1.
When σ0² is no longer known, but is just a nuisance parameter, the natural approach is to consider a joint posterior density for θ = (µ, σ²) and then marginalize by integrating out the nuisance parameter.
  • 20. OLS vs. Median Regression (Least Absolute Deviation)
Consider some linear model, yi = β0 + xiᵀβ + εi, and define
(β̂0^ols, β̂^ols) = argmin Σᵢ (yi − β0 − xiᵀβ)²
(β̂0^lad, β̂^lad) = argmin Σᵢ |yi − β0 − xiᵀβ|.
Assume that ε|X has a symmetric distribution with E[ε|X] = median[ε|X] = 0; then (β̂0^ols, β̂^ols) and (β̂0^lad, β̂^lad) are consistent estimators of (β0, β).
Assume that ε|X does not have a symmetric distribution, but E[ε|X] = 0; then β̂^ols and β̂^lad are consistent estimators of the slopes β. If median[ε|X] = γ, then β̂0^lad converges to β0 + γ.
  • 21. OLS vs. Median Regression
Median regression is stable under monotone transformations. If log(yi) = β0 + xiᵀβ + εi with median[ε|X] = 0, then
median[Y|X = x] = exp( median[log(Y)|X = x] ) = exp( β0 + xᵀβ ),
while E[Y|X = x] ≠ exp( E[log(Y)|X = x] ): indeed E[Y|X = x] = exp(β0 + xᵀβ) · E[exp(ε)|X = x], and by Jensen's inequality the last factor is at least 1 whenever E[ε|X = x] = 0.

    ols <- lm(y ~ x, data = df)
    library(quantreg)
    lad <- rq(y ~ x, data = df, tau = .5)
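A minimal R sketch of this equivariance on simulated data (the log-linear data-generating process is an illustrative assumption, not from the slides): back-transforming the median regression of log(y) with exp() recovers the conditional median of y, not its conditional mean.

    library(quantreg)
    set.seed(1)
    n <- 500
    x <- runif(n, 0, 2)
    df <- data.frame(x = x, y = exp(1 + x + rnorm(n)))   # median[log Y | x] = 1 + x
    ols <- lm(log(y) ~ x, data = df)
    lad <- rq(log(y) ~ x, data = df, tau = .5)
    newd <- data.frame(x = 1)
    exp(predict(lad, newdata = newd))   # estimates median[Y | x = 1] = exp(2)
    exp(predict(ols, newdata = newd))   # also targets exp(2), not E[Y | x = 1] = exp(2.5)
    c(exp(2), exp(2.5))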
  • 22. Notations
Cumulative distribution function: FY(y) = P[Y ≤ y].
Quantile function: QY(u) = inf{ y ∈ R : FY(y) ≥ u }, also written QY(u) = FY^{-1}(u). One can also consider QY(u) = sup{ y ∈ R : FY(y) < u }.
For any increasing transformation t, Q_{t(Y)}(τ) = t( QY(τ) ).
Conditionally, F(y|x) = P[Y ≤ y|X = x] and Q_{Y|x}(u) = F^{-1}(u|x).
  • 23. Empirical Quantile
  • 24. Quantiles and optimization: numerical aspects
Consider the case of the median, for a sample {y1, ..., yn}. To compute the median, solve
min_{µ} Σᵢ |yi − µ|,
which can be solved using linear programming techniques. More precisely, this problem is equivalent to
min_{µ,a,b} Σᵢ (ai + bi), with ai, bi ≥ 0 and yi − µ = ai − bi, for all i = 1, ..., n.
Consider a sample obtained from a lognormal distribution:

    n = 101
    set.seed(1)
    y = rlnorm(n)
    median(y)
    [1] 1.077415
  • 25. Quantiles and optimization: numerical aspects
Here, one can use a standard optimization routine

    md = Vectorize(function(m) sum(abs(y - m)))
    optim(mean(y), md)$par
    [1] 1.077416

or a linear programming technique: use the matrix form, with 3n constraints and 2n + 1 parameters,

    library(lpSolve)
    A1 = cbind(diag(2*n), 0)
    A2 = cbind(diag(n), -diag(n), 1)
    r = lp("min", c(rep(1, 2*n), 0),
           rbind(A1, A2), c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
    tail(r$solution, 1)
    [1] 1.077415
  • 26. Quantiles and optimization: numerical aspects
More generally, consider some quantile of level τ,

    tau = .3
    quantile(y, tau)
          30%
    0.6741586

The linear program is now
min_{µ,a,b} Σᵢ [ τ ai + (1 − τ) bi ], with ai, bi ≥ 0 and yi − µ = ai − bi, for all i = 1, ..., n.

    A1 = cbind(diag(2*n), 0)
    A2 = cbind(diag(n), -diag(n), 1)
    r = lp("min", c(rep(tau, n), rep(1-tau, n), 0),
           rbind(A1, A2), c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
    tail(r$solution, 1)
    [1] 0.6741586
  • 27. Quantile regression?
In OLS regression, we try to evaluate E[Y|X = x] = ∫ y dF_{Y|X=x}(y). In quantile regression, we try to evaluate
Qu(Y|X = x) = inf{ y : F_{Y|X=x}(y) ≥ u },
as introduced in Koenker & Bassett (1978) Regression Quantiles.
Li & Racine (2007) Nonparametric Econometrics: Theory and Practice suggested
Q̂u(Y|X = x) = inf{ y : F̂_{Y|X=x}(y) ≥ u },
where F̂_{Y|X=x}(y) can be some kernel-based estimator.
  • 28. Quantiles and Expectiles
Consider the following risk functions
R^q_τ(u) = u · ( τ − 1(u < 0) ), τ ∈ [0, 1], with R^q_{1/2}(u) ∝ |u|,
R^e_τ(u) = u² · | τ − 1(u < 0) |, τ ∈ [0, 1], with R^e_{1/2}(u) ∝ u².
Then QY(τ) = argmin_m E[ R^q_τ(Y − m) ], which is the median when τ = 1/2, and EY(τ) = argmin_m E[ R^e_τ(Y − m) ], which is the expected value when τ = 1/2.
[Figure: the quantile (check) and expectile (asymmetric quadratic) loss functions.]
  • 29. Quantiles and Expectiles
One can also write
quantile: argmin { Σᵢ ω^q_τ(εi) |yi − qi| }, where ω^q_τ(ε) = 1 − τ if ε ≤ 0 and τ if ε > 0,
expectile: argmin { Σᵢ ω^e_τ(εi) (yi − qi)² }, where ω^e_τ(ε) = 1 − τ if ε ≤ 0 and τ if ε > 0.
Expectiles are unique, quantiles need not be.
Quantiles satisfy E[ τ − 1(Y < QY(τ)) ] = 0 (for the median, E[sign(Y − QY(1/2))] = 0).
Expectiles satisfy τ E[ (Y − eY(τ))₊ ] = (1 − τ) E[ (Y − eY(τ))₋ ].
(These are the first-order conditions of the two optimization problems.)
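A short R sketch (simulated lognormal data, an illustrative choice): the sample expectile can be obtained by minimizing the asymmetric squared loss directly, and the first-order condition above can be checked numerically.

    set.seed(1)
    y   <- rlnorm(1000)
    tau <- 0.3
    Re  <- function(m) sum(abs(tau - (y < m)) * (y - m)^2)   # asymmetric squared loss
    e   <- optimize(Re, range(y))$minimum                    # sample 30% expectile
    # first-order condition: tau E[(Y - e)_+] = (1 - tau) E[(e - Y)_+]
    c(tau * mean(pmax(y - e, 0)), (1 - tau) * mean(pmax(e - y, 0)))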
  • 30. Quantiles and M-Estimators
There are connections with M-estimators, as introduced in Serfling (1980) Approximation Theorems of Mathematical Statistics, chapter 7. For any function h(·, ·), the M-functional is the solution β of ∫ h(y, β) dFY(y) = 0, and the M-estimator is the solution of
∫ h(y, β) dFn(y) = (1/n) Σᵢ h(yi, β) = 0.
Hence, if h(y, β) = y − β, then β = E[Y] and β̂ = ȳ. And if h(y, β) = 1(y < β) − τ, with τ ∈ (0, 1), then β = FY^{-1}(τ).
  • 31. Quantiles, Maximal Correlation and Hardy-Littlewood-Pólya
If x1 ≤ ... ≤ xn and y1 ≤ ... ≤ yn, then Σᵢ xi yi ≥ Σᵢ xi y_{σ(i)} for every permutation σ ∈ Sn, and x and y are said to be comonotonic. The continuous version is that X and Y are comonotonic if E[XY] ≥ E[X Ỹ] for every Ỹ with the same distribution as Y. One can prove that
Y = QY(FX(X)) = argmax over Ỹ ∼ FY of E[X Ỹ].
  • 32. Expectiles as Quantiles
For every Y ∈ L¹, τ ↦ eY(τ) is continuous, and strictly increasing if Y is absolutely continuous, with
∂eY(τ)/∂τ = E[ |Y − eY(τ)| ] / [ (1 − τ) FY(eY(τ)) + τ (1 − FY(eY(τ))) ].
If X ≤ Y, then eX(τ) ≤ eY(τ) for all τ ∈ (0, 1).
"Expectiles have properties that are similar to quantiles", Newey & Powell (1987) Asymmetric Least Squares Estimation and Testing. The reason is that the expectiles of a distribution F are the quantiles of a distribution G related to F, see Jones (1994) Expectiles and M-quantiles are quantiles: let
G(t) = [ P(t) − t F(t) ] / [ 2(P(t) − t F(t)) + t − µ ], where P(s) = ∫_{−∞}^{s} y dF(y).
The expectiles of F are the quantiles of G.

    x <- rnorm(99)
    library(expectreg)
    e <- expectile(x, probs = seq(0, 1, 0.1))
  • 33. Expectiles as Quantiles
  • 34. Elicitable Measures
"Elicitable" means "being a minimizer of a suitable expected score". T is an elicitable functional if there exists a scoring function S : R × R → [0, ∞) such that
T(Y) = argmin_{x ∈ R} ∫ S(x, y) dF(y) = argmin_{x ∈ R} E[ S(x, Y) ], where Y ∼ F,
see Gneiting (2011) Making and evaluating point forecasts.
Example: the mean, T(Y) = E[Y], is elicited by S(x, y) = (x − y)².
Example: the median, T(Y) = median[Y], is elicited by S(x, y) = |x − y|.
Example: the quantile, T(Y) = QY(τ), is elicited by S(x, y) = τ(y − x)₊ + (1 − τ)(y − x)₋.
Example: the expectile, T(Y) = EY(τ), is elicited by S(x, y) = τ((y − x)₊)² + (1 − τ)((y − x)₋)².
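A small R check of elicitability for the quantile (simulated Gamma data and the 75% level are illustrative choices): minimizing the empirical expected score S(x, y) = τ(y − x)₊ + (1 − τ)(x − y)₊ over x recovers the empirical quantile.

    set.seed(1)
    y   <- rgamma(10000, shape = 2)
    tau <- 0.75
    S   <- function(x) mean(tau * pmax(y - x, 0) + (1 - tau) * pmax(x - y, 0))
    optimize(S, range(y))$minimum   # minimizer of the expected pinball score
    quantile(y, tau)                # empirical 75% quantile, should be close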
  • 35. Elicitable Measures
Remark: not all functionals are elicitable, see Osband (1985) Providing incentives for better cost forecasting; the variance, for instance, is not elicitable.
Elicitability implies convexity of the level sets with respect to mixtures (also called the betweenness property): if two lotteries F and G are equivalent, then any mixture of the two lotteries is also equivalent to F and G.
  • 36. Empirical Quantiles
Consider some i.i.d. sample {y1, ..., yn} with distribution F. Set
Qτ = argmin_q E[ R^q_τ(Y − q) ], where Y ∼ F, and Q̂τ ∈ argmin_q Σᵢ R^q_τ(yi − q).
Then, as n → ∞,
√n ( Q̂τ − Qτ ) →d N( 0, τ(1 − τ) / f²(Qτ) ).
Sketch of the proof: write yi = Qτ + εi and set hn(q) = (1/n) Σᵢ [ 1(yi < q) − τ ], a non-decreasing function, with
E[ hn(Qτ + u/√n) ] = FY(Qτ + u/√n) − τ ∼ fY(Qτ) u/√n,
Var[ hn(Qτ + u/√n) ] ∼ FY(Qτ)[1 − FY(Qτ)] / n = τ(1 − τ) / n.
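A quick Monte Carlo check of this limit in R (exponential data and τ = 0.3 are illustrative choices): the variance of √n (Q̂τ − Qτ) should be close to τ(1 − τ)/f²(Qτ).

    set.seed(1)
    tau <- 0.3; n <- 1000
    Qhat <- replicate(2000, quantile(rexp(n), tau))   # 2000 simulated samples
    var(sqrt(n) * Qhat)                               # Monte Carlo variance
    Qtau <- qexp(tau)
    tau * (1 - tau) / dexp(Qtau)^2                    # theoretical asymptotic variance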
  • 37. Empirical Expectiles
Consider some i.i.d. sample {y1, ..., yn} with distribution F. Set
µτ = argmin_m E[ R^e_τ(Y − m) ], where Y ∼ F, and µ̂τ = argmin_m Σᵢ R^e_τ(yi − m).
Then, as n → ∞, √n ( µ̂τ − µτ ) →d N(0, s²) for some s², provided Var[Y] < ∞. Define the identification function
Iτ(x, y) = τ (y − x)₊ + (1 − τ)(y − x)₋
(formally the quantile scoring function, but with the signed negative part), so that µτ is the solution of E[ Iτ(µτ, Y) ] = 0. Then
s² = E[ Iτ(µτ, Y)² ] / ( τ[1 − F(µτ)] + [1 − τ]F(µτ) )².
  • 38. Quantile Regression
We want to solve, here,
min_β Σᵢ R^q_τ( yi − xiᵀβ ),
with yi = xiᵀβ + εi, so that Q_{y|x}(τ) = xᵀβ + Fε^{-1}(τ).
[Figures: 10% and 90% quantile regression lines for dist against speed (cars data); slope of the quantile regression as a function of the probability level.]
  • 39. Geometric Properties of the Quantile Regression
Observe that the median regression will always have two supporting observations.
Start with some regression line, yi = β̂0 + β̂1 xi, and consider small vertical translations, yi = (β̂0 ± ε) + β̂1 xi. We minimize
Σᵢ | yi − (β0 + β1 xi) |.
Starting from the blue line, a shift up decreases the sum until we meet the point on the left; an additional shift up will then increase the sum. The optimal line will therefore necessarily pass through one point (observe that the sum is piecewise linear in ε).
  • 40. Geometric Properties of the Quantile Regression
Consider now rotations of the line around the supporting point. If we rotate up, we increase the sum of absolute differences (large impact of the point on the right); if we rotate down, we decrease the sum, until we reach the point on the right. Thus, the median regression will always have two supporting observations.

    library(quantreg)
    fit <- rq(dist ~ speed, data = cars, tau = .5)
    which(predict(fit) == cars$dist)
     1 21 46
     1 21 46
  • 41. Numerical Aspects
To illustrate numerical computations, use

    base = read.table("http://freakonometrics.free.fr/rent98_00.txt", header = TRUE)

The linear program for the quantile regression is now
min_{β,a,b} Σᵢ [ τ ai + (1 − τ) bi ], with ai, bi ≥ 0 and yi − [β0^τ + β1^τ xi] = ai − bi, for all i = 1, ..., n.

    require(lpSolve)
    tau = .3
    n = nrow(base)
    X = cbind(1, base$area)
    y = base$rent_euro
    A1 = cbind(diag(2*n), 0, 0)
    A2 = cbind(diag(n), -diag(n), X)
  • 42. Numerical Aspects

    r = lp("min",
           c(rep(tau, n), rep(1-tau, n), 0, 0), rbind(A1, A2),
           c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
    tail(r$solution, 2)
    [1] 148.946864   3.289674

which can be compared with

    library(quantreg)
    rq(rent_euro ~ area, tau = tau, data = base)
    Coefficients:
    (Intercept)        area
     148.946864    3.289674
  • 43. Numerical Aspects

    plot(base$area, base$rent_euro, xlab = expression(paste("surface (", m^2, ")")),
         ylab = "rent (euros/month)", col = rgb(0, 0, 1, .4), cex = .5)
    sf = 0:250
    yr = r$solution[2*n+1] + r$solution[2*n+2] * sf
    lines(sf, yr, lwd = 2, col = "blue")
    tau = .9
    r = lp("min", c(rep(tau, n), rep(1-tau, n), 0, 0), rbind(A1, A2),
           c(rep(">=", 2*n), rep("=", n)), c(rep(0, 2*n), y))
    yr = r$solution[2*n+1] + r$solution[2*n+2] * sf
    lines(sf, yr, lwd = 2, col = "blue")
  • 44. Numerical Aspects
For multiple regression, we should consider some trick (the lp() formulation assumes all decision variables are nonnegative):

    tau = 0.3
    n = nrow(base)
    X = cbind(1, base$area, base$yearc)
    y = base$rent_euro
    r = lp("min",
           c(rep(tau, n), rep(1 - tau, n), rep(0, 2 * 3)),
           cbind(diag(n), -diag(n), X, -X),
           rep("=", n),
           y)
    beta = tail(r$solution, 6)
    beta = beta[1:3] - beta[3 + 1:3]
    beta
    [1] -5542.503252     3.978135     2.887234
  • 45. Numerical Aspects
which is consistent with

    library(quantreg)
    rq(rent_euro ~ area + yearc, tau = tau, data = base)
    Coefficients:
     (Intercept)         area        yearc
    -5542.503252     3.978135     2.887234
  • 46. Distributional Aspects
OLS is equivalent to maximum likelihood estimation when Y − m(x) ∼ N(0, σ²), with density
g(ε) = (1 / (σ √(2π))) exp( −ε² / (2σ²) ).
Quantile regression is equivalent to maximum likelihood estimation when Y − m(x) has an asymmetric Laplace distribution,
g(ε) = (√2 / σ) · κ / (1 + κ²) · exp( −(√2 κ^{1(ε>0)} / (σ κ^{1(ε<0)})) |ε| ).
  • 47. Quantile Regression and Iterative Least Squares
Start with some β^(0), e.g. β̂_ols. At stage k:
- let εi^(k) = yi − xiᵀβ^(k−1)
- define weights ωi^(k) = R^q_τ(εi^(k)) / (εi^(k))², so that ωi^(k) (εi^(k))² = R^q_τ(εi^(k))
- compute weighted least squares to estimate β^(k).
One can also consider a smooth approximation of R^q_τ(·), and then use Newton-Raphson. (A short sketch of the iteration follows.)
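A minimal R sketch of this iteration on the cars data (τ = 0.5; the floor 1e-6 on |ε| is a numerical safeguard I add to avoid dividing by residuals that are exactly zero):

    library(quantreg)
    data(cars)
    tau  <- 0.5
    X    <- cbind(1, cars$speed)
    beta <- coef(lm(dist ~ speed, data = cars))                      # starting value: OLS
    for (k in 1:100) {
      eps  <- cars$dist - drop(X %*% beta)
      w    <- ifelse(eps > 0, tau, 1 - tau) / pmax(abs(eps), 1e-6)   # R_tau(eps) / eps^2
      beta <- coef(lm(dist ~ speed, data = cars, weights = w))
    }
    rbind(irls = beta, rq = coef(rq(dist ~ speed, data = cars, tau = tau)))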
  • 48. Optimization Algorithm
The primal problem is
min_{β,u,v} τ 1ᵀu + (1 − τ) 1ᵀv, subject to y = Xβ + u − v, with u, v ∈ R₊ⁿ,
and the dual version is
max_d yᵀd, subject to Xᵀd = (1 − τ) Xᵀ1, with d ∈ [0, 1]ⁿ.
Koenker & d'Orey (1994) A Remark on Algorithm AS 229: Computing Dual Regression Quantiles and Regression Rank Scores suggest using the simplex method (the default method in R).
Portnoy & Koenker (1997) The Gaussian hare and the Laplacian tortoise suggest using the interior point method.
  • 49. Interior Point Method
See Vanderbei et al. (1986) A modification of Karmarkar's linear programming algorithm for a presentation of the algorithm, Potra & Wright (2000) Interior-point methods for a general survey, and Meketon (1986) Least absolute value regression for an application of the algorithm in the context of median regression.
The running time is of order n^{1+δ} k³ for some δ > 0 and k = dim(β) (it is (n + k)k² for OLS, see Wikipedia).
  • 50. Quantile Regression Estimators
The OLS estimator β̂_ols is the solution of
β_ols = argmin_β E[ ( E[Y|X = x] − xᵀβ )² ],
and Angrist, Chernozhukov & Fernandez-Val (2006) Quantile Regression under Misspecification proved that (under weak conditions)
βτ = argmin_β E[ ωτ(β) ( Qτ[Y|X = x] − xᵀβ )² ],
where
ωτ(β) = ∫₀¹ (1 − u) f_{y|x}( u xᵀβ + (1 − u) Qτ[Y|X = x] ) du.
βτ is the best weighted mean-square approximation of the true quantile function, where the weights depend on an average of the conditional density of Y between xᵀβ and the true conditional quantile.
  • 51. Assumptions to get Consistency of Quantile Regression Estimators
As always, we need some assumptions to obtain consistency of the estimators:
• observations (Yi, Xi) must be (conditionally) i.i.d.
• regressors must have a bounded second moment, E[ ‖Xi‖² ] < ∞
• error terms ε are continuously distributed given Xi, and centered in the sense that their median is 0, i.e. ∫_{−∞}^{0} fε(ε) dε = 1/2
• a "local identification" property: fε(0) E[XXᵀ] is positive definite.
  • 52. Quantile Regression Estimators
Under those weak conditions, β̂τ is asymptotically normal:
√n ( β̂τ − βτ ) →d N( 0, τ(1 − τ) Dτ^{-1} Ωx Dτ^{-1} ),
where Dτ = E[ fε(0) XXᵀ ] and Ωx = E[ XXᵀ ]. Hence, the asymptotic variance of β̂τ can be estimated by
Var[β̂τ] = [ τ(1 − τ) / f̂ε(0)² ] · ( (1/n) Σᵢ xi xiᵀ )^{-1},
where f̂ε(0) is estimated using (e.g.) a histogram, as suggested in Powell (1991) Estimation of monotonic regression models under quantile restrictions, since
Dτ = lim_{h↓0} E[ 1(|ε| ≤ h) / (2h) · XXᵀ ] ≈ (1 / (2nh)) Σᵢ 1(|ε̂i| ≤ h) xi xiᵀ = D̂τ.
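A minimal R sketch of this sandwich variance on the cars data (the rule-of-thumb bandwidth is my own illustrative choice; quantreg's summary(..., se = "nid") provides a comparable built-in estimate):

    library(quantreg)
    data(cars)
    tau <- 0.5
    fit <- rq(dist ~ speed, data = cars, tau = tau)
    X   <- cbind(1, cars$speed)
    e   <- residuals(fit)
    n   <- nrow(X)
    h   <- 1.06 * sd(e) * n^(-1/5)                          # crude bandwidth choice
    D   <- crossprod(X, X * (abs(e) <= h)) / (2 * n * h)    # Powell-type estimate of D_tau
    V   <- tau * (1 - tau) * solve(D) %*% (crossprod(X) / n) %*% solve(D) / n
    sqrt(diag(V))                                           # sandwich standard errors
    summary(fit, se = "nid")                                # quantreg's own sandwich estimate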
  • 53. Quantile Regression Estimators
There is no first-order condition in the sense ∂Vn(β, τ)/∂β = 0, where Vn(β, τ) = Σᵢ R^q_τ(yi − xiᵀβ) (the objective is piecewise linear). There is an asymptotic first-order condition,
(1/√n) Σᵢ xi ψτ( yi − xiᵀβ̂ ) = O(1), as n → ∞, where ψτ(·) = 1(· < 0) − τ,
see Huber (1967) The behavior of maximum likelihood estimates under nonstandard conditions.
One can also define a Wald test, a likelihood ratio test, etc.
  • 54. Quantile Regression Estimators
The confidence interval of level 1 − α is then β̂τ ± z_{1−α/2} √( Var[β̂τ] ).
An alternative is to use a bootstrap strategy (see #2):
• generate a sample (yi^(b), xi^(b)) by resampling from (yi, xi)
• estimate βτ^(b) by β̂τ^(b) = argmin Σᵢ R^q_τ( yi^(b) − xi^(b)ᵀβ )
• set Var[β̂τ] = (1/B) Σ_b ( β̂τ^(b) − β̄τ )².
For confidence intervals, we can either use Gaussian-type confidence intervals, or empirical quantiles from the bootstrap estimates (a short bootstrap sketch in R follows).
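A minimal pairs-bootstrap sketch on the cars data (B = 500 replications is an arbitrary choice; quantreg's summary(..., se = "boot") gives a built-in version):

    library(quantreg)
    data(cars)
    tau <- 0.5
    B   <- 500
    bet <- replicate(B, {
      id <- sample(nrow(cars), replace = TRUE)          # resample observations
      coef(rq(dist ~ speed, data = cars[id, ], tau = tau))
    })
    apply(bet, 1, sd)                                   # bootstrap standard errors
    summary(rq(dist ~ speed, data = cars, tau = tau), se = "boot")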
  • 55. Quantile Regression Estimators
If τ = (τ1, ..., τm), one can prove that
√n ( β̂τ − βτ ) →d N(0, Στ),
where Στ is a block matrix, with blocks
Σ_{τi,τj} = ( min{τi, τj} − τi τj ) D_{τi}^{-1} Ωx D_{τj}^{-1},
see Kocherginsky et al. (2005) Practical Confidence Intervals for Regression Quantiles for more details.
  • 56. Quantile Regression: Transformations
Scale equivariance: for any a > 0 and τ ∈ [0, 1],
β̂τ(aY, X) = a β̂τ(Y, X) and β̂τ(−aY, X) = −a β̂_{1−τ}(Y, X).
Equivariance to reparameterization of design: let A be any p × p nonsingular matrix and τ ∈ [0, 1]; then
β̂τ(Y, XA) = A^{-1} β̂τ(Y, X).
  • 57. Visualization, τ ↦ β̂τ
See Abrevaya (2001) The effects of demographics and maternal behavior on the distribution of birth outcomes.

    base = read.table("http://freakonometrics.free.fr/natality2005.txt")

[Figures: the AGE coefficient as a function of the probability level; birth weight (in g.) against the age of the mother, with quantile curves at levels 1%, 5%, 10%, 25%, 50%, 75%, 90% and 95%.]
  • 58. Visualization, τ ↦ β̂τ

    base = read.table("http://freakonometrics.free.fr/natality2005.txt", header = TRUE, sep = ";")
    u = seq(.05, .95, by = .01)
    library(quantreg)
    coefstd = function(u) summary(rq(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE +
        BLACKM + BLACKF + COLLEGE, data = sbase, tau = u))$coefficients[, 2]
    coefest = function(u) summary(rq(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD + AGE +
        BLACKM + BLACKF + COLLEGE, data = sbase, tau = u))$coefficients[, 1]
    CS = Vectorize(coefstd)(u)
    CE = Vectorize(coefest)(u)
  • 59. Visualization, τ ↦ β̂τ
See Abrevaya (2001) The effects of demographics and maternal behavior on the distribution of birth outcomes.
[Figures: coefficients as functions of the probability level, for AGE, SEXM, SMOKERTRUE, WEIGHTGAIN and COLLEGETRUE.]
  • 60. Visualization, τ ↦ β̂τ
See Abrevaya (2001) The effects of demographics and maternal behavior on the distribution of birth outcomes.

    base = read.table("http://freakonometrics.free.fr/BWeight.csv")

[Figures: coefficients as functions of the probability level, for mom_age, boy, smoke, black and ed.]
  • 61. Quantile Regression, with Non-Linear Effects
Rents in Munich, as a function of the area, from Fahrmeir et al. (2013) Regression: Models, Methods and Applications.

    base = read.table("http://freakonometrics.free.fr/rent98_00.txt")

[Figures: rent (euros) against area (m²), with quantile curves at levels 10%, 25%, 50%, 75% and 90%.]
  • 62. Quantile Regression, with Non-Linear Effects
Rents in Munich, as a function of the year of construction, from Fahrmeir et al. (2013) Regression: Models, Methods and Applications.
[Figures: rent (euros) against the year of construction, with quantile curves at levels 10%, 25%, 50%, 75% and 90%.]
  • 63. Quantile Regression, with Non-Linear Effects
BMI as a function of the age, in New Zealand, from Yee (2015) Vector Generalized Linear and Additive Models, for women and men.

    library(VGAMdata); data(xs.nz)

[Figures: BMI against age (women and men, ethnicity = European), with quantile curves at levels 5%, 25%, 50%, 75% and 95%.]
  • 64. Quantile Regression, with Non-Linear Effects
BMI as a function of the age, in New Zealand, from Yee (2015) Vector Generalized Linear and Additive Models, for women and men.
[Figures: BMI against age, Maori versus European, with 50% and 95% quantile curves.]
  • 65. Quantile Regression, with Non-Linear Effects
One can consider some local polynomial quantile regression, e.g.
min over (β0, β1) of Σᵢ ωi(x) R^q_τ( yi − β0 − (xi − x)ᵀβ1 ),
for some weights ωi(x) = H^{-1} K( H^{-1}(xi − x) ), see Fan, Hu & Truong (1994) Robust Non-Parametric Function Estimation.
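A minimal R sketch of such a local linear quantile fit on the cars data (the Gaussian kernel, the bandwidth h = 3 and the evaluation grid are illustrative choices): rq() accepts observation weights, so the locally weighted problem can be solved at each point of a grid.

    library(quantreg)
    data(cars)
    tau <- 0.5; h <- 3
    x0  <- seq(5, 25, by = 1)
    qhat <- sapply(x0, function(x) {
      w <- dnorm((cars$speed - x) / h)                       # kernel weights
      coef(rq(dist ~ I(speed - x), data = cars, tau = tau, weights = w))[1]
    })
    plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
    lines(x0, qhat, lwd = 2)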
  • 66. Asymmetric Maximum Likelihood Estimation
Introduced by Efron (1991) Regression percentiles using asymmetric squared error loss. Consider a linear model, yi = xiᵀβ + εi, and let
S(β) = Σᵢ Qω( yi − xiᵀβ ), where Qω(ε) = ε² if ε ≤ 0 and w ε² if ε > 0, with w = ω / (1 − ω).
One might consider ωα = 1 + zα / ( ϕ(zα) + (1 − α) zα ), where zα = Φ^{-1}(α).
Efron (1992) Poisson overdispersion estimates based on the method of asymmetric maximum likelihood introduced asymmetric maximum likelihood (AML) estimation, considering
S(β) = Σᵢ Qω( yi − xiᵀβ ), where Qω = D(yi, xiᵀβ) if yi ≤ xiᵀβ and w D(yi, xiᵀβ) if yi > xiᵀβ,
where D(·, ·) is the deviance. Estimation is based on Newton-Raphson (gradient descent).
  • 67. Noncrossing Solutions
See Bondell et al. (2010) Non-crossing quantile regression curve estimation. Consider probability levels τ = (τ1, ..., τq) with 0 < τ1 < ... < τq < 1. Use parallelism: add constraints to the optimization problem, such that
xiᵀβ_{τj} ≥ xiᵀβ_{τj−1}, for all i ∈ {1, ..., n} and j ∈ {2, ..., q}.
  • 68. Quantile Regression on Panel Data
In the context of panel data, consider some fixed effect αi, so that
yi,t = xi,tᵀβτ + αi + εi,t, where Qτ(εi,t|Xi) = 0.
Canay (2011) A simple approach to quantile regression for panel data suggests an estimator in two steps (a sketch on simulated data is given below):
• use a standard OLS fixed-effects model yi,t = xi,tᵀβ + αi + ui,t, i.e. consider a within transformation, and derive the fixed-effect estimate β̂ from (yi,t − ȳi) = (xi,t − x̄i)ᵀβ + (ui,t − ūi)
• estimate the fixed effects as α̂i = (1/T) Σ_t ( yi,t − xi,tᵀβ̂ )
• finally, run a standard quantile regression of yi,t − α̂i on the xi,t's.
See the rqpd package.
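A minimal R sketch of these two steps on simulated panel data (the data-generating process, dimensions and the 75% level are illustrative assumptions; the rqpd package offers a more complete implementation):

    library(quantreg)
    set.seed(1)
    N <- 100; Tp <- 20
    id <- rep(1:N, each = Tp)
    x  <- rnorm(N * Tp)
    a  <- rep(rnorm(N), each = Tp)                       # individual fixed effects
    y  <- 1 + 2 * x + a + rt(N * Tp, df = 5)
    # step 1: within (fixed-effect) OLS, then alpha_i = mean_t(y_it - x_it' beta)
    b_w   <- coef(lm(I(y - ave(y, id)) ~ I(x - ave(x, id)) - 1))
    alpha <- ave(y - x * b_w, id)
    # step 2: standard quantile regression on the outcome net of the fixed effects
    rq(I(y - alpha) ~ x, tau = 0.75)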
  • 69. Quantile Regression with Fixed Effects (QRFE)
In a panel linear regression model, yi,t = xi,tᵀβ + ui + εi,t, where u is an unobserved individual specific effect. In a fixed-effects model, u is treated as a parameter. Quantile regression solves
min_{β,u} Σ_{i,t} R^q_α( yi,t − [xi,tᵀβ + ui] ).
Consider a penalized QRFE, as in Koenker & Bilias (2001) Quantile regression for duration data,
min over (β1, ..., βκ, u) of Σ_{k,i,t} ωk R^q_{αk}( yi,t − [xi,tᵀβk + ui] ) + λ Σᵢ |ui|,
where ωk is a relative weight associated with the quantile of level αk.
  • 70. Quantile Regression with Random Effects (QRRE)
Assume here that yi,t = xi,tᵀβ + ηi,t, with ηi,t = ui + εi,t. Quantile regression with random effects (QRRE) yields solving
min_β Σ_{i,t} R^q_α( yi,t − xi,tᵀβ ),
which is a weighted asymmetric least absolute deviation estimator. Let Σ = [σ_{t,s}(α)] denote the matrix with
σ_{t,s}(α) = α(1 − α) if t = s, and E[ 1{εit(α) < 0, εis(α) < 0} ] − α² if t ≠ s.
If (nT)^{-1} Xᵀ{ In ⊗ Σ_{T×T}(α) }X → D0 as n → ∞ and (nT)^{-1} Xᵀ Ωf X = D1, then
√(nT) ( β̂^Q(α) − β^Q(α) ) →d N( 0, D1^{-1} D0 D1^{-1} ).
  • 71. Quantile Treatment Effects
Doksum (1974) Empirical Probability Plots and Statistical Inference for Nonlinear Models introduced the QTE (quantile treatment effect): a person might have two potential outcomes, either Y0 (without treatment, D = 0) or Y1 (with treatment, D = 1), and
δτ = Q_{Y1}(τ) − Q_{Y0}(τ),
which can be studied in the context of covariates. Run a quantile regression of y on (d, x),
y = β0 + δ d + xiᵀβ + εi : shifting effect
y = β0 + xiᵀβ + δ d + εi : scaling effect.
  • 72. Quantile Regression for Time Series
Consider some GARCH(1,1) financial time series, yt = σt εt, where σt = α0 + α1 |yt−1| + β1 σt−1. The quantile function conditional on the past, Ft−1 = σ(yt−1, yt−2, ...), is
Q_{y|Ft−1}(τ) = α0 Fε^{-1}(τ) + α1 Fε^{-1}(τ) |yt−1| + β1 Q_{y|Ft−2}(τ) = α̃0 + α̃1 |yt−1| + β1 Q_{y|Ft−2}(τ),
i.e. the conditional quantile has a GARCH(1,1) form; see the Conditional Autoregressive Value-at-Risk model of Manganelli & Engle (2004) CAViaR: Conditional Autoregressive Value at Risk by Regression Quantiles.
  • 73. Quantile Regression for Spatial Data

    library(McSpatial)
    data(cookdata)
    fit <- qregcpar(LNFAR ~ DCBD, nonpar = ~ LATITUDE + LONGITUDE, taumat = c(.10, .90),
                    kern = "bisq", window = .30, distance = "LATLONG", data = cookdata)

[Figures: maps of the 10% and 90% conditional quantiles, and of the difference between the .10 and .90 quantiles.]
  • 74. Expectile Regression
Quantile regression vs. expectile regression, on the same dataset (cars).
[Figures: slope of the quantile regression and slope of the expectile regression, as functions of the probability level.]
See Koenker (2014) Living Beyond our Means for a comparison of quantiles and expectiles.
  • 75. Expectile Regression
Solve here
min_β Σᵢ R^e_τ( yi − xiᵀβ ), where R^e_τ(u) = u² · | τ − 1(u < 0) |.
"This estimator can be interpreted as a maximum likelihood estimator when the disturbances arise from a normal distribution with unequal weight placed on positive and negative disturbances", Aigner, Amemiya & Poirier (1976) Formulation and Estimation of Stochastic Frontier Production Function Models.
See Holzmann & Klar (2016) Expectile Asymptotics for statistical properties.
Expectiles can (also) be related to Breckling & Chambers (1988) M-Quantiles.
For a comparison of quantile regression and expectile regression, see Schulze-Waltrup et al. (2014) Expectile and quantile regression - David and Goliath?
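A minimal R sketch of expectile regression by iteratively reweighted least squares on the cars data (τ = 0.75 is an illustrative choice; the weights are τ or 1 − τ according to the sign of the residual, and the expectreg call in the comment is only for comparison):

    data(cars)
    tau  <- 0.75
    X    <- cbind(1, cars$speed)
    beta <- coef(lm(dist ~ speed, data = cars))          # starting value: OLS
    for (k in 1:100) {
      eps  <- cars$dist - drop(X %*% beta)
      w    <- ifelse(eps > 0, tau, 1 - tau)              # asymmetric quadratic weights
      beta <- coef(lm(dist ~ speed, data = cars, weights = w))
    }
    beta
    # comparison: library(expectreg); expectreg.ls(dist ~ speed, data = cars, expectiles = tau)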
  • 76. Expectile Regression, with Linear Effects
Zhang (1994) Nonparametric regression expectiles.
[Figures: rent (euros) against area (m²), linear quantile regressions and linear expectile regressions, at levels 10%, 25%, 50%, 75% and 90%.]
  • 77. Expectile Regression, with Non-Linear Effects
See Zhang (1994) Nonparametric regression expectiles.
[Figures: rent (euros) against area (m²), non-linear quantile regressions and non-linear expectile regressions, at levels 10%, 25%, 50%, 75% and 90%.]
  • 78. Expectile Regression, with Linear Effects

    library(expectreg)
    coefstd = function(u) summary(expectreg.ls(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD +
        AGE + BLACKM + BLACKF + COLLEGE, data = sbase, expectiles = u, ci = TRUE))[, 2]
    coefest = function(u) summary(expectreg.ls(WEIGHT ~ SEX + SMOKER + WEIGHTGAIN + BIRTHRECORD +
        AGE + BLACKM + BLACKF + COLLEGE, data = sbase, expectiles = u, ci = TRUE))[, 1]
    CS = Vectorize(coefstd)(u)
    CE = Vectorize(coefest)(u)
  • 79. Expectile Regression, with Random Effects (ERRE)
Expectile regression with random effects (ERRE) yields solving
min_β Σ_{i,t} R^e_α( yi,t − xi,tᵀβ ).
One can prove that
β̂^e(τ) = [ Σ_{i,t} ωi,t(τ) xit xitᵀ ]^{-1} Σ_{i,t} ωi,t(τ) xit yit,
where ωit(τ) = | τ − 1( yit < xitᵀβ̂^e(τ) ) |.
  • 80. Expectile Regression with Random Effects (ERRE)
If W = diag( ω11(τ), ..., ωnT(τ) ), set W̄ = E[W], H = Xᵀ W̄ X and Σ = Xᵀ E[ W ε εᵀ W ] X. Then
√(nT) ( β̂^e(τ) − β^e(τ) ) →d N( 0, H^{-1} Σ H^{-1} ),
see Barry et al. (2016) Quantile and Expectile Regression for random effects model.
For expectile regressions, with R,

    library(expectreg)
    fit <- expectreg.ls(rent_euro ~ area, data = munich, expectiles = .75)
    fit <- expectreg.ls(rent_euro ~ rb(area, "pspline"), data = munich, expectiles = .75)
  • 81. Application to Real Data
  • 82. Extensions
The mean of Y is ν(FY) = ∫ y dFY(y), and the quantile of level τ for Y is ντ(FY) = FY^{-1}(τ). More generally, consider some functional ν(F) (Gini or Theil index, entropy, etc.), see Foresi & Peracchi (1995) The Conditional Distribution of Excess Returns. Can we estimate ν(F_{Y|x})?
Firpo et al. (2009) Unconditional Quantile Regressions suggested to use influence function regression; Machado & Mata (2005) Counterfactual decomposition of changes in wage distributions and Chernozhukov et al. (2013) Inference on counterfactual distributions suggested indirect distribution-function approaches.
The influence function of the index ν(F) at y is
IF(y, ν, F) = lim_{ε↓0} [ ν( (1 − ε)F + ε δy ) − ν(F) ] / ε.
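A minimal R sketch of the Firpo et al. (2009) idea on simulated data (the data-generating process and the 90% level are illustrative assumptions): the influence function of the τ-quantile, RIF(y) = Qτ + (τ − 1(y ≤ Qτ)) / f(Qτ), is computed and then regressed on the covariate.

    set.seed(1)
    n <- 5000
    x <- rnorm(n)
    y <- exp(1 + 0.5 * x + rnorm(n))            # simulated positive outcome
    tau  <- 0.9
    Qtau <- quantile(y, tau)
    d    <- density(y)
    fQ   <- approx(d$x, d$y, xout = Qtau)$y     # kernel density estimate at Q_tau
    rif  <- Qtau + (tau - (y <= Qtau)) / fQ     # influence function of the tau-quantile
    coef(lm(rif ~ x))                           # unconditional (RIF) quantile regression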