Slides ineq-3b

Arthur CHARPENTIER - Welfare, Inequality and Poverty
Arthur Charpentier
charpentier.arthur@gmail.com
http://freakonometrics.hypotheses.org/
Université de Rennes 1, January 2015
Welfare, Inequality & Poverty, # 3

Inequality Comparisons (2-person Economy)
There is not much to say: any measure of dispersion is appropriate, e.g.
– the income gap $x_2 - x_1$
– the proportional gap $x_2 / x_1$
– any functional of the distance $|x_2 - x_1|$
Graphs are from Amiel & Cowell (1999, ebooks.cambridge.org).

Consider any 3-person economy, with incomes $x = \{x_1, x_2, x_3\}$. This point can be visualized in a Kolm triangle.

kolm <- function(p = c(200, 300, 500)) {
  p1 <- p / sum(p)                     # income shares
  y0 <- p1[2]                          # coordinates of the point to plot
  x0 <- (2 * p1[1] + y0) / sqrt(3)
  plot(0:1, 0:1, col = "white", xlab = "", ylab = "",
       axes = FALSE, ylim = c(0, 1))   # empty canvas
  polygon(c(0, .5, 1, 0), c(0, .5 * sqrt(3), 0, 0))   # the triangle
  points(x0, y0, pch = 19, col = "red") }

Inequality Comparisons (n-person Economy)
In an n-person economy, comparisons are clearly more difficult.

Why not look at inequality per subgroup?
If we focus on the top of the distribution (the same holds for the bottom),
→ rising inequality
If we focus on the middle of the distribution,
→ falling inequality

To measure inequality, we usually
– define ‘equality’ based on some reference point / distribution
– define a distance to the reference point / distribution
– aggregate individual distances
We want to visualize the distribution of incomes
> income <- read.csv("http://www.vcharite.univ-mrs.fr/pp/lubrano/cours/fes96.csv", sep = ";", header = FALSE)$V1
$$F(x) = P(X \le x) = \int_0^x f(t)\,dt$$

Densities are usually difficult to compare,
> hist(income,
+      breaks = seq(min(income) - 1, max(income) + 50, by = 50),
+      probability = TRUE)
> lines(density(income), col = "red", lwd = 2)
[Figure: histogram of income (x-axis: income, 0 to 3000; y-axis: density), with the kernel density estimate in red]

It is more convenient to compare cumulative distribution functions of income, wealth, consumption, grades, etc.
> plot(ecdf(income))
[Figure: ecdf(income); x-axis: x, y-axis: Fn(x)]

The Parade of Dwarfs
An alternative is to use Pen’s parade, also called the parade of dwarfs (and a few giants), “parade van dwergen en een enkele reus”.
The height of each person is stretched in proportion to his or her income. Everyone is lined up in order of height, the shortest (poorest) on the left and the tallest (richest) on the right, and they walk past for some time, like a procession.

c.d.f., quantiles and Lorenz
> Pen(income)
[Figure: Pen's Parade; x-axis: i/n, y-axis: x(i) divided by the mean income]

This parade of the dwarfs function is just the quantile function.
> q <- function(u) quantile(income, u)
see also
> n <- length(income)
> u <- seq(1 / (2 * n), 1 - 1 / (2 * n), length = n)
> plot(u, sort(income), type = "l")
[Figure: sort(income) plotted against u]

To get the Lorenz curve, we replace incomes on the y-axis by the cumulative proportion of income.
> library(ineq)
> Lc(income)
> # Lc()$L has n + 1 entries and starts at L(0) = 0, hence the + 1
> L <- function(u) Lc(income)$L[round(u * length(income)) + 1]
[Figure: Lorenz curve; x-axis: p, y-axis: L(p)]

                          x-axis                     y-axis
c.d.f.                    income                     proportion of population
Pen’s parade (quantile)   proportion of population   income
Lorenz curve              proportion of population   proportion of income

Standard statistical measure of dispersion
The variance for a sample $X = \{x_1, \cdots, x_n\}$ is
$$\text{Var}(X) = \frac{1}{n}\sum_{i=1}^n [x_i - \overline{x}]^2$$
where the baseline (reference) is $\overline{x} = \frac{1}{n}\sum_{i=1}^n x_i$.
> var(income)
[1] 34178.43
Problem: it is a quadratic function, $\text{Var}(\alpha X) = \alpha^2\,\text{Var}(X)$.

An alternative is the coefficient of variation,
$$\text{cv}(X) = \frac{\sqrt{\text{Var}(X)}}{\overline{x}}$$
But it is not a good measure of overall inequality, since it is very sensitive to very high incomes.
> cv <- function(x) sd(x) / mean(x)
> cv(income)
[1] 0.6154011

An alternative is to use a logarithmic transformation, and to consider the logarithmic variance
$$\text{Var}_{\log}(X) = \frac{1}{n}\sum_{i=1}^n [\log(x_i) - \log(\overline{x})]^2$$
> # note: var(log(x)) centres on mean(log(x)) rather than on log(mean(x))
> var_log <- function(x) var(log(x))
> var_log(income)
[1] 0.2921022
Those measures are distances on the x-axis.

Other inequality measures can be derived from Pen’s parade of the dwarfs, where measures are based on distances on the y-axis, i.e. distances between quantiles,
$$Q_p = F^{-1}(p) \quad\text{i.e.}\quad F(Q_p) = p$$
e.g. the median is the quantile when p = 50%, the first quartile is the quantile when p = 25%, the first quintile is the quantile when p = 20%, the first decile is the quantile when p = 10%, the first percentile is the quantile when p = 1%
> quantile(income, c(.1, .5, .9, .99))
     10%      50%      90%      99%
137.6294 253.9090 519.6887 933.9211

Define the quantile ratio as
$$R_p = \frac{Q_{1-p}}{Q_p}$$
In case of perfect equality, $R_p = 1$.
The most popular one is probably the 90/10 ratio.
> R_p <- function(x, p) quantile(x, 1 - p) / quantile(x, p)
> R_p(income, .1)
  90%
3.776
[Figure: R_p as a function of the probability p]
This index measures the gap between the rich and the poor.

E.g. $R_{0.1} = 10$ means that incomes in the top 10% are at least 10 times higher than incomes in the bottom 10%.
It ignores the distribution (apart from the two points) and violates the transfer principle.
An alternative measure might be the Kuznets ratio, defined from the Lorenz curve as the ratio of the share of income earned by the poorest p share of the population to the share earned by the richest r share of the population,
$$I(p, r) = \frac{L(p)}{1 - L(1 - r)}$$
But here again, it ignores the distribution between the cutoffs and therefore violates the transfer principle.
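The Kuznets ratio can be sketched in a few lines of base R. This is only an illustration: `lorenz_at()` and `kuznets()` are hypothetical helper names, the incomes are made up, and the empirical Lorenz ordinate simply counts the `round(p * n)` poorest individuals.

```r
# Empirical Lorenz ordinate L(p): share of total income held by the
# poorest fraction p (hypothetical helper, round(p * n) individuals)
lorenz_at <- function(x, p) {
  x <- sort(x)
  k <- round(p * length(x))
  sum(x[seq_len(k)]) / sum(x)
}

# Kuznets ratio I(p, r) = L(p) / (1 - L(1 - r))
kuznets <- function(x, p, r) lorenz_at(x, p) / (1 - lorenz_at(x, 1 - r))

x <- c(10, 20, 30, 40, 100)   # illustrative incomes, not the slides' data
kuznets(x, 0.4, 0.2)          # poorest 40% against richest 20%
```

Here the poorest 40% hold 30 out of 200 and the richest 20% hold 100 out of 200, so the ratio is 0.15 / 0.5 = 0.3. Moving income between the three middle individuals would leave it unchanged, which is the weakness pointed out above.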

An alternative measure can be the IQR, the interquantile ratio,
$$IQR_p = \frac{Q_{1-p} - Q_p}{Q_{0.5}}$$
> IQR_p <- function(x, p) (quantile(x, 1 - p) - quantile(x, p)) / quantile(x, .5)
> IQR_p(income, .1)
     90%
1.504709
[Figure: IQR_p as a function of the probability p, for p between 0 and 0.5]
Problem: it only focuses on the top (1 − p)-th and bottom p-th proportions. It does not care about what happens between those quantiles.

Pen’s parade suggests measuring the green area, for some $p \in (0, 1)$, $M_p$,
> M_p <- function(x, p) {
+   a <- seq(0, p, length = 251)
+   b <- seq(p, 1, length = 251)
+   ya <- quantile(x, p) - quantile(x, a)                     # gap below Q_p
+   a1 <- sum((ya[1:250] + ya[2:251]) / 2 * p / 250)          # trapezoidal rule
+   yb <- quantile(x, b) - quantile(x, p)                     # gap above Q_p
+   a2 <- sum((yb[1:250] + yb[2:251]) / 2 * (1 - p) / 250)    # trapezoidal rule
+   return(a1 + a2) }

Use also the relative mean deviation
$$M(X) = \frac{1}{n}\sum_{i=1}^n \left|\frac{x_i}{\overline{x}} - 1\right|$$
> M <- function(x) mean(abs(x / mean(x) - 1))
> M(income)
[1] 0.429433
In case of perfect equality, M = 0.

Finally, why not use the Lorenz curve? The Gini index can be defined using order statistics as
$$G = \frac{2}{n(n-1)\overline{x}}\sum_{i=1}^n i \cdot x_{i:n} - \frac{n+1}{n-1}$$
> n <- length(income)
> mu <- mean(income)
> 2 * sum((1:n) * sort(income)) / (mu * n * (n - 1)) - (n + 1) / (n - 1)
[1] 0.2976282
The Gini index is defined from the area A between the first diagonal and the Lorenz curve, as G = A/(A + B) = 2A, where B is the area below the Lorenz curve.
[Figure: Lorenz curve L(p) against p, with area A between the diagonal and the curve and area B below the curve]

$$G(X) = \frac{1}{2n^2\overline{x}}\sum_{i,j=1}^n |x_i - x_j|$$
Perfect equality is obtained when G = 0.
Remark: the Gini index can be related to the variance or the coefficient of variation, since
$$\text{Var}(X) = \frac{1}{n}\sum_{i=1}^n [x_i - \overline{x}]^2 = \frac{1}{2n^2}\sum_{i,j=1}^n (x_i - x_j)^2$$
Here,
$$G(X) = \frac{\Delta(X)}{2\overline{x}} \quad\text{with}\quad \Delta(X) = \frac{1}{n^2}\sum_{i,j=1}^n |x_i - x_j|$$
> ineq(income, "Gini")
[1] 0.2975789
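As a quick numerical check of these expressions, here is a minimal base R sketch on a toy vector (the incomes are illustrative and the variable names are not from the slides). It compares the mean-difference form of the Gini index with an order-statistics form using the same $n^2$ normalisation, and checks the pairwise expression of the variance, which carries a factor $1/(2n^2)$.

```r
x <- c(1, 2, 3, 4)             # toy incomes (illustrative only)
n <- length(x); mu <- mean(x)

# Mean-difference form: G = sum_{i,j} |x_i - x_j| / (2 n^2 mean)
gini_md <- sum(abs(outer(x, x, "-"))) / (2 * n^2 * mu)

# Order-statistics form with the same n^2 normalisation
gini_os <- 2 * sum((1:n) * sort(x)) / (n^2 * mu) - (n + 1) / n

# Variance: (1/n) sum (x_i - mean)^2 = sum_{i,j} (x_i - x_j)^2 / (2 n^2)
var_direct <- mean((x - mu)^2)
var_pairs  <- sum(outer(x, x, "-")^2) / (2 * n^2)

c(gini_md = gini_md, gini_os = gini_os)
c(var_direct = var_direct, var_pairs = var_pairs)
```

Both Gini values equal 0.25 and both variances equal 1.25 here. The order-statistics formula with the $n(n-1)$ normalisation is $n/(n-1)$ times this value, which would explain the small gap between the two outputs 0.2976282 and 0.2975789 reported for the income data.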

Axiomatic Approach for Inequality Indices
We need some rules to say whether a principle used to divide a cake of fixed size amongst a fixed number of people is fair or not.
A standard one is the Anonymity Principle. Let $X = \{x_1, \cdots, x_n\}$, then
$$I(x_1, x_2, \cdots, x_n) = I(x_2, x_1, \cdots, x_n)$$
also called the Symmetry Principle.
The Transfer Principle
For any given income distribution, if you take a small amount of income from one person and give it to a richer person, then income inequality must increase.
Pigou (1912) and Dalton (1920): a transfer from a richer to a poorer person will decrease inequality. Let $X = \{x_1, \cdots, x_n\}$ with $x_1 \le \cdots \le x_n$, then, for a small $\delta > 0$,
$$I(x_1, \cdots, x_i, \cdots, x_j, \cdots, x_n) \ge I(x_1, \cdots, x_i + \delta, \cdots, x_j - \delta, \cdots, x_n)$$
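The transfer principle can be illustrated numerically. The sketch below uses a small base R `gini()` helper (defined here for illustration, based on the mean absolute difference) and made-up incomes; it is a check on one example, not a proof.

```r
# Gini index from the mean absolute difference (illustrative helper)
gini <- function(x) sum(abs(outer(x, x, "-"))) / (2 * length(x)^2 * mean(x))

x <- c(10, 20, 30, 40, 100)   # illustrative incomes
delta <- 5

# Progressive transfer: the richest gives delta to the poorest
progressive <- x
progressive[1] <- x[1] + delta
progressive[5] <- x[5] - delta

# Regressive transfer: the poorest gives delta to the richest
regressive <- x
regressive[1] <- x[1] - delta
regressive[5] <- x[5] + delta

c(progressive = gini(progressive), original = gini(x), regressive = gini(regressive))
```

Total income is the same in the three vectors, and the three Gini values come out in increasing order, as the principle requires.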

Nevertheless, distributions are not always easy to compare: compare e.g. Monday and Tuesday.
An important underlying concept is the idea of a mean-preserving spread: those ±δ transfers preserve the total wealth.
The Scale Independence Principle
What if we double everyone’s income? If standards of living are determined by real income and there is inflation, inequality is unchanged.

Let $X = \{x_1, \cdots, x_n\}$, then
$$I(\lambda x_1, \cdots, \lambda x_n) = I(x_1, \cdots, x_n)$$
also called the Zero-Degree Homogeneity property.
The Population Principle
Consider clones of the economy
$$I(\underbrace{x_1, \cdots, x_1}_{k \text{ times}}, \cdots, \underbrace{x_n, \cdots, x_n}_{k \text{ times}}) = I(x_1, \cdots, x_n)$$
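Both principles are easy to check on an example. The following base R sketch assumes a `gini()` helper written for this illustration and made-up incomes; the variance is included as a counter-example for scale independence.

```r
# Gini index from the mean absolute difference (illustrative helper)
gini <- function(x) sum(abs(outer(x, x, "-"))) / (2 * length(x)^2 * mean(x))

x <- c(10, 20, 30, 40, 100)          # illustrative incomes

# Scale independence: multiplying all incomes by lambda = 3
scale_ok <- isTRUE(all.equal(gini(3 * x), gini(x)))

# Population principle: every individual is cloned k = 4 times
clone_ok <- isTRUE(all.equal(gini(rep(x, each = 4)), gini(x)))

# The variance is not scale independent: Var(3 X) = 9 Var(X)
var_ratio <- var(3 * x) / var(x)

c(scale_ok = scale_ok, clone_ok = clone_ok, var_ratio = var_ratio)
```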

Is it really that simple?
The Decomposability Principle
Assume that we can decompose inequality by subgroups (based on gender, race, countries, etc.)
According to this principle, if inequality increases in a subgroup, it increases in the whole population, ceteris paribus
$$I(x_1, \cdots, x_n, y_1, \cdots, y_n) \le I(x'_1, \cdots, x'_n, y_1, \cdots, y_n)$$
as long as $I(x_1, \cdots, x_n) \le I(x'_1, \cdots, x'_n)$.
Consider two groups, $X$ and $X'$. Then add the same subgroup $Y$ to both $X$ and $X'$.

Axiomatic Approach for Inequality Indices
Any inequality measure that simultaneously satisfies the principle of transfers, scale independence, the population principle and decomposability must be expressible in the form
$$E_\xi = \frac{1}{\xi^2 - \xi}\left[\frac{1}{n}\sum_{i=1}^n \left(\frac{x_i}{\overline{x}}\right)^\xi - 1\right]$$
for some $\xi \in \mathbb{R}$. This is the generalized entropy measure.
> entropy(income, 0)
[1] 0.1456604
> entropy(income, .5)
[1] 0.1446105
> entropy(income, 1)
[1] 0.1506973
> entropy(income, 2)
[1] 0.1893279
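For readers without the package at hand, the generalized entropy formula can be sketched in base R. The function name `gen_entropy()` and the incomes are assumptions made for this illustration; the two limiting cases ξ = 0 and ξ = 1 are handled separately because the general expression is 0/0 there.

```r
# Generalized entropy E_xi (illustrative helper, base R only)
gen_entropy <- function(x, xi) {
  r <- x / mean(x)                          # incomes relative to the mean
  if (xi == 0) return(-mean(log(r)))        # limit xi -> 0
  if (xi == 1) return(mean(r * log(r)))     # limit xi -> 1
  (mean(r^xi) - 1) / (xi^2 - xi)
}

x <- c(10, 20, 30, 40, 100)                 # illustrative incomes
sapply(c(0, 0.5, 1, 2), function(xi) gen_entropy(x, xi))

# For xi = 2, the index is half the squared coefficient of variation
# (with the variance computed with a 1/n factor)
cv2 <- mean((x - mean(x))^2) / mean(x)^2
c(E2 = gen_entropy(x, 2), half_cv2 = cv2 / 2)
```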

The higher ξ, the more sensitive the index is to high incomes.
Remark: as a rule of thumb, take ξ ∈ [−1, +2].
When ξ = 0, we get the mean logarithmic deviation (MLD),
$$MLD = E_0 = -\frac{1}{n}\sum_{i=1}^n \log\left(\frac{x_i}{\overline{x}}\right)$$
When ξ = 1, we get the Theil index
$$T = E_1 = \frac{1}{n}\sum_{i=1}^n \frac{x_i}{\overline{x}}\log\left(\frac{x_i}{\overline{x}}\right)$$
> Theil(income)
[1] 0.1506973
When ξ = 2, the index can be related to the coefficient of variation
$$E_2 = \frac{[\text{coefficient of variation}]^2}{2}$$

In a 3-person economy, it is possible to visualize the curves of iso-indices.
A related index is the Atkinson inequality index,
$$A_\varepsilon = 1 - \left[\frac{1}{n}\sum_{i=1}^n \left(\frac{x_i}{\overline{x}}\right)^{1-\varepsilon}\right]^{\frac{1}{1-\varepsilon}}$$
with $\varepsilon \ge 0$.
> Atkinson(income, 0.5)
[1] 0.07099824
> Atkinson(income, 1)
[1] 0.1355487
In the case where ε → 1, we obtain
$$A_1 = 1 - \prod_{i=1}^n \left(\frac{x_i}{\overline{x}}\right)^{\frac{1}{n}}$$
ε is usually interpreted as an aversion to inequality index.
Observe that
$$A_\varepsilon = 1 - \left[(\varepsilon^2 - \varepsilon) E_{1-\varepsilon} + 1\right]^{\frac{1}{1-\varepsilon}}$$
and the limiting case is $A_1 = 1 - \exp[-E_0]$.
Thus, the Atkinson index is ordinally equivalent to the GE index, since they produce the same ranking of different distributions.
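The relation between the Atkinson and generalized entropy indices can be verified on an example. The sketch below is base R only; `gen_entropy()` and `atkinson()` are helper names chosen for this illustration and the incomes are made up.

```r
# Generalized entropy E_xi (illustrative helper)
gen_entropy <- function(x, xi) {
  r <- x / mean(x)
  if (xi == 0) return(-mean(log(r)))
  if (xi == 1) return(mean(r * log(r)))
  (mean(r^xi) - 1) / (xi^2 - xi)
}

# Atkinson index A_eps (illustrative helper)
atkinson <- function(x, eps) {
  r <- x / mean(x)
  if (eps == 1) return(1 - exp(mean(log(r))))   # geometric over arithmetic mean
  1 - mean(r^(1 - eps))^(1 / (1 - eps))
}

x   <- c(10, 20, 30, 40, 100)   # illustrative incomes
eps <- 0.5

lhs <- atkinson(x, eps)
rhs <- 1 - ((eps^2 - eps) * gen_entropy(x, 1 - eps) + 1)^(1 / (1 - eps))
c(atkinson = lhs, from_GE = rhs)

# Limiting case: A_1 = 1 - exp(-E_0)
c(A1 = atkinson(x, 1), from_E0 = 1 - exp(-gen_entropy(x, 0)))
```

With the outputs reported on the slides, 1 − exp(−0.1456604) ≈ 0.13555, which agrees with Atkinson(income, 1).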

Consider the indices obtained when X is drawn from a lognormal $LN(0, \sigma^2)$ distribution and from a Pareto $\mathcal{P}(\alpha)$ distribution.
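For the Gini index, both families have well-known closed forms: $2\Phi(\sigma/\sqrt{2}) - 1$ for the lognormal and $1/(2\alpha - 1)$ for the Pareto distribution with $\alpha > 1$. The sketch below evaluates them and compares the lognormal value with a simulated sample; function names, the seed and the sample size are choices made for this illustration, not taken from the slides.

```r
# Closed-form Gini indices (standard results)
gini_lognormal <- function(sigma) 2 * pnorm(sigma / sqrt(2)) - 1
gini_pareto    <- function(alpha) 1 / (2 * alpha - 1)   # requires alpha > 1

# Sample Gini index through order statistics (illustrative helper)
gini_sample <- function(x) {
  n <- length(x)
  2 * sum((1:n) * sort(x)) / (n^2 * mean(x)) - (n + 1) / n
}

sapply(c(0.5, 1, 2), gini_lognormal)   # increasing in sigma
sapply(c(1.5, 2, 3), gini_pareto)      # decreasing in alpha

set.seed(1)
x_ln <- rlnorm(5000, meanlog = 0, sdlog = 1)
c(theory = gini_lognormal(1), simulated = gini_sample(x_ln))
```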

Changing the Axioms
Is there an agreement about the axioms?
For instance, there is no unanimous agreement on the scale independence axiom.
Why not a translation independence axiom?
Translation Independence Principle: if all incomes are increased by the same amount, the inequality measure is unchanged.
Given $X = (x_1, \cdots, x_n)$,
$$I(x_1, \cdots, x_n) = I(x_1 + h, \cdots, x_n + h)$$
If we replace the scale independence principle by this translation independence principle, we get other indices.

Changing the Axioms
Kolm indices satisfy the principle of transfers, translation independence, the population principle and decomposability
$$K_\theta = \log\left(\frac{1}{n}\sum_{i=1}^n e^{\theta[x_i - \overline{x}]}\right)$$
> Kolm(income, 1)
[1] 291.5878
> Kolm(income, .5)
[1] 283.9989

From Measuring to Ordering
Comparisons can be made over time, between countries, before/after tax, etc.
X is said to be Lorenz-dominated by Y if $L_X \le L_Y$. In that case Y is more equal, or less unequal.
In such a case, X can be reached from Y by a sequence of poorer-to-richer pairwise income transfers.
In that case, any inequality measure satisfying the population principle, scale independence, anonymity and principle of transfers axioms is consistent with Lorenz dominance (namely Theil, Gini, MLD, Generalized Entropy and Atkinson).
Remark: a regressive transfer will move the Lorenz curve further away from the diagonal, so Lorenz dominance satisfies the transfer principle. It also satisfies the scale invariance property.
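A minimal way to check Lorenz dominance between two samples is sketched below in base R. It assumes two samples of the same size, so that the empirical Lorenz ordinates sit on the same grid i/n; the helper names and the data are illustrative.

```r
# Empirical Lorenz ordinates at p = 1/n, ..., 1 (illustrative helper)
lorenz <- function(x) cumsum(sort(x)) / sum(x)

# Gini index from the mean absolute difference (illustrative helper)
gini <- function(x) sum(abs(outer(x, x, "-"))) / (2 * length(x)^2 * mean(x))

x <- c(1, 2, 3, 10)   # illustrative incomes
y <- c(3, 4, 4, 5)    # same total, more evenly spread

rbind(L_x = lorenz(x), L_y = lorenz(y))

# X is Lorenz-dominated by Y when L_X <= L_Y everywhere
dominated <- all(lorenz(x) <= lorenz(y))
c(dominated = dominated, gini_x = gini(x), gini_y = gini(y))
```

Here the curve of x lies below the curve of y at every point, and the Gini index ranks the two samples accordingly (0.4375 against 0.09375). With samples of different sizes, the two curves would first have to be evaluated on a common grid, e.g. by interpolation.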

Example: if $X_i \sim \mathcal{P}(\alpha_i, x_i)$,
$$L_{X_1} \le L_{X_2} \Longleftrightarrow \alpha_1 \le \alpha_2$$
and if $X_i \sim LN(\mu_i, \sigma_i^2)$,
$$L_{X_1} \le L_{X_2} \Longleftrightarrow \sigma_1^2 \ge \sigma_2^2$$
Lorenz dominance is a relation that is incomplete: when Lorenz curves cross, the criterion cannot decide between the two distributions.
→ the ranking is considered unambiguous only when the curves do not cross.
Further, one should take into account possible random noise.
Consider some sample $\{x_1, \cdots, x_n\}$ from a $LN(0, 1)$ distribution, with n = 100. The 95% confidence interval is

[Figure: two panels, each showing a Lorenz curve, L(p) against p]
Consider some sample $\{x_1, \cdots, x_n\}$ from a $LN(0, 1)$ distribution, with n = 1,000. The 95% confidence interval is

[Figure: two panels, each showing a Lorenz curve, L(p) against p]

Looking for Conﬁdence
See e.g. http://myweb.uiowa.edu/fsolt/swiid/ for the estimation of the Gini index over time and over several countries.
[Figure: SWIID Gini index, net income, by year (1980 to 2010), for the United States, Canada, France and Germany, shown separately and in pairs. Note: Solid lines indicate mean estimates; shaded regions indicate the associated 95% confidence intervals. Source: Standardized World Income Inequality Database v5.0 (Solt 2014).]

To get confidence intervals for indices, use bootstrap techniques (see last week). The code is simply
> IC <- function(x, f, n = 1000, alpha = .95) {
+   F <- rep(NA, n)                      # bootstrap replicates of the index
+   for (i in 1:n) {
+     F[i] <- f(sample(x, size = length(x), replace = TRUE)) }
+   return(quantile(F, c((1 - alpha) / 2, 1 - (1 - alpha) / 2))) }
For instance,
> IC(income, Gini)
     2.5%     97.5%
0.2915897 0.3039454
(the sample is rather large, n = 6,043).

> IC(income, Gini)
     2.5%     97.5%
0.2915897 0.3039454
> IC(income, Theil)
     2.5%     97.5%
0.1421775 0.1595012
> IC(income, entropy)
     2.5%     97.5%
0.1377267 0.1517201

Back on Gini Index
We’ve seen the Gini index as an area,
$$G = 2\int_0^1 [p - L(p)]\,dp = 1 - 2\int_0^1 L(p)\,dp$$
Using integration by parts, with $u' = 1$ and $v = L(p)$,
$$G = -1 + 2\int_0^1 p L'(p)\,dp = \frac{2}{\mu}\left[\int_0^\infty y F(y) f(y)\,dy - \frac{\mu}{2}\right]$$
using a change of variables, $p = F(y)$, and because $L'(p) = F^{-1}(p)/\mu = y/\mu$.
Thus
$$G = \frac{2}{\mu}\,\text{cov}(y, F(y))$$
→ The Gini index is proportional to the covariance between the income and its rank.
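The covariance form has a direct empirical counterpart. In the sketch below (base R, illustrative data and names), the c.d.f. is replaced by rank/n and the covariance is computed with a 1/n factor; with distinct incomes this reproduces the mean-difference Gini index exactly.

```r
x <- c(1, 2, 3, 4)                 # toy incomes, all distinct
n <- length(x)

Fx <- rank(x) / n                  # empirical c.d.f. evaluated at each income

# Covariance with a 1/n factor (R's cov() divides by n - 1 instead)
pcov <- function(a, b) mean(a * b) - mean(a) * mean(b)

gini_cov <- 2 * pcov(x, Fx) / mean(x)

# Mean-difference form, for comparison
gini_md <- sum(abs(outer(x, x, "-"))) / (2 * n^2 * mean(x))

c(gini_cov = gini_cov, gini_md = gini_md)
```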

Back on Gini Index
Using integration by parts, one can then write
$$G = \frac{1}{\mu}\int_0^\infty F(x)[1 - F(x)]\,dx = 1 - \frac{1}{\mu}\int_0^\infty [1 - F(x)]^2\,dx$$
which can also be written
$$G = \frac{1}{2\mu}\int_{\mathbb{R}_+^2} |x - y|\,dF(x)\,dF(y)$$
(see the previous discussion on connections between the Gini index and the variance)

Decomposition(s)
When studying inequalities, it might be interesting to discuss possible decompositions, either by subgroups or by sources,
– subgroup decomposition, e.g. Male/Female, Rural/Urban, see FAO (2006, fao.org)
– source decomposition, e.g. earnings/government benefits/investment/pension, etc., see slide 41 #1 and FAO (2006, fao.org)
For the variance, the decomposition per group is related to ANOVA,
$$\text{Var}(Y) = \underbrace{E[\text{Var}(Y|X)]}_{\text{within}} + \underbrace{\text{Var}(E[Y|X])}_{\text{between}}$$
Hence, if $X \in \{x_1, \cdots, x_k\}$ (k subgroups),
$$\text{Var}(Y) = \underbrace{\sum_k p_k \text{Var}(Y \mid \text{group } k)}_{\text{within}} + \underbrace{\text{Var}(E[Y|X])}_{\text{between}}$$
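The within/between decomposition of the variance can be checked numerically. This base R sketch uses made-up incomes and two hypothetical groups, and works with the population variance (1/n factor), for which the identity is exact.

```r
y <- c(10, 20, 30, 40, 100, 60)            # illustrative incomes
g <- c("a", "a", "a", "b", "b", "b")       # hypothetical subgroups

pvar <- function(v) mean((v - mean(v))^2)  # population variance (1/n factor)

p_k <- as.numeric(table(g)) / length(y)    # group proportions
m_k <- as.numeric(tapply(y, g, mean))      # group means
v_k <- as.numeric(tapply(y, g, pvar))      # group variances

within  <- sum(p_k * v_k)                  # E[Var(Y | X)]
between <- sum(p_k * (m_k - mean(y))^2)    # Var(E[Y | X])

c(within = within, between = between, total = pvar(y))
```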

Decomposition(s)
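For the Gini index, it is possible to write
$$G(Y) = \underbrace{\sum_k \omega_k G(Y \mid \text{group } k)}_{\text{within}} + \underbrace{G(\overline{Y})}_{\text{between}} + \text{residual}$$
for some weights ω, where the between term is the Gini index between subgroup means. But the decomposition is not perfect.
More generally, for Generalized Entropy indices,
$$E_\xi(Y) = \underbrace{\sum_k \omega_k E_\xi(Y \mid \text{group } k)}_{\text{within}} + \underbrace{E_\xi(\overline{Y})}_{\text{between}}$$
where $E_\xi(\overline{Y})$ is the entropy on the subgroup means, and
$$\omega_k = \left(\frac{Y_k}{Y}\right)^\xi (p_k)^{1-\xi}$$
with $Y_k / Y$ the share of total income held by group k and $p_k$ its population share.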
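The generalized entropy decomposition can be verified on a small example. In this base R sketch, `gen_entropy()` is an illustrative helper, the data and groups are made up, and the weights are computed as (income share)^ξ × (population share)^(1−ξ); the between term is the index of the distribution in which everyone receives the mean of their group.

```r
# Generalized entropy E_xi (illustrative helper)
gen_entropy <- function(x, xi) {
  r <- x / mean(x)
  if (xi == 0) return(-mean(log(r)))
  if (xi == 1) return(mean(r * log(r)))
  (mean(r^xi) - 1) / (xi^2 - xi)
}

y  <- c(10, 20, 30, 40, 100, 60)          # illustrative incomes
g  <- c("a", "a", "a", "b", "b", "b")     # hypothetical subgroups
xi <- 2

p_k <- as.numeric(table(g)) / length(y)          # population shares
s_k <- as.numeric(tapply(y, g, sum)) / sum(y)    # income shares
w_k <- s_k^xi * p_k^(1 - xi)                     # weights omega_k

E_k <- as.numeric(tapply(y, g, gen_entropy, xi = xi))

within  <- sum(w_k * E_k)
between <- gen_entropy(ave(y, g), xi)     # ave() replaces incomes by group means

c(within = within, between = between, total = gen_entropy(y, xi))
```

Unlike the Gini case, there is no residual term: within and between add up to the total index.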

Decomposition(s)
Now, consider a decomposition per source, i.e. $Y_i = Y_{1,i} + \cdots + Y_{k,i} + \cdots$, among sources.
For the Gini index, a natural decomposition was suggested by Lerman & Yitzhaki (1985, jstor.org)
$$G(Y) = \frac{2}{\overline{Y}}\,\text{cov}(Y, F(Y)) = \sum_k \underbrace{\frac{2}{\overline{Y}}\,\text{cov}(Y_k, F(Y))}_{k\text{-th contribution}}$$
thus, it is based on the covariance between the k-th source and the ranks based on cumulated incomes.
Similarly for the Theil index,
$$T(Y) = \sum_k \underbrace{\frac{1}{n}\sum_i \frac{Y_{k,i}}{\overline{Y}}\log\left(\frac{Y_i}{\overline{Y}}\right)}_{k\text{-th contribution}}$$
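Because the covariance is linear in its first argument, the source contributions add up to the overall index. The base R sketch below illustrates this for the Gini index with three hypothetical income sources (names and figures are made up); ranks are taken on total income, which has no ties here.

```r
# Hypothetical income sources for 5 individuals (illustrative only)
Y <- cbind(earnings   = c(10, 20, 30, 40, 100),
           benefits   = c( 8,  5,  0,  2,   0),
           investment = c( 0,  1,  4, 10,  30))

total <- rowSums(Y)                       # 18, 26, 34, 52, 130
Fn    <- rank(total) / length(total)      # empirical c.d.f. of total income

# Covariance with a 1/n factor
pcov <- function(a, b) mean(a * b) - mean(a) * mean(b)

# k-th contribution: 2 cov(Y_k, F(Y)) / mean(Y)
contrib <- apply(Y, 2, function(yk) 2 * pcov(yk, Fn) / mean(total))

# Gini index of total income, mean-difference form
gini_total <- sum(abs(outer(total, total, "-"))) / (2 * length(total)^2 * mean(total))

contrib
c(sum_of_contributions = sum(contrib), gini = gini_total)
```

A negative contribution, as for the benefits column in this made-up example, indicates a source that is concentrated among low total incomes.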

Decomposition(s)
It is possible to use the Shapley value for the decomposition of indices I(·). Consider m groups, $N = \{1, \cdots, m\}$, and define $I(S) = I(x_S)$ where $S \subset N$. Then the Shapley value yields
$$\varphi_k(I) = \sum_{S \subseteq N \setminus \{k\}} \frac{|S|!\,(m - |S| - 1)!}{m!}\,\big(I(S \cup \{k\}) - I(S)\big)$$
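A direct implementation of this formula is sketched below in base R, under assumptions that are not in the slides: the m groups are taken to be income sources (columns of a matrix Y), the value I(S) is the Gini index of the income obtained by adding up the sources in S, and the empty coalition is given value 0. All function names are hypothetical.

```r
# Gini index from the mean absolute difference (illustrative helper)
gini <- function(x) sum(abs(outer(x, x, "-"))) / (2 * length(x)^2 * mean(x))

# Assumed value function: inequality of the income made of the sources in S;
# the empty coalition gets value 0 by convention
value <- function(S, Y) {
  if (length(S) == 0) return(0)
  gini(rowSums(Y[, S, drop = FALSE]))
}

shapley <- function(Y) {
  m   <- ncol(Y)
  phi <- numeric(m)
  for (k in 1:m) {
    others <- setdiff(1:m, k)
    # enumerate every subset S of N \ {k} through the binary digits of mask
    for (mask in 0:(2^length(others) - 1)) {
      S <- others[(mask %/% 2^(seq_along(others) - 1)) %% 2 == 1]
      w <- factorial(length(S)) * factorial(m - length(S) - 1) / factorial(m)
      phi[k] <- phi[k] + w * (value(c(S, k), Y) - value(S, Y))
    }
  }
  phi
}

# Hypothetical sources for 4 individuals (illustrative only)
Y <- cbind(c(10, 20, 30, 40), c(5, 0, 20, 0), c(0, 3, 0, 50))
phi <- shapley(Y)
c(phi, total = sum(phi), gini = gini(rowSums(Y)))
```

By the efficiency property of the Shapley value, the contributions sum to I(N) − I(∅), i.e. to the Gini index of total income under the convention chosen here. The enumeration costs 2^(m−1) evaluations per source, so this brute-force version is only practical for a small number of groups.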
