Regression
models and
panel data
R. Metulini
Introduction
Benefits from
using panel
data
A gentle
introduction
to panel data
models
Useful
notation
References
Regression models and panel data
Part I: Introduction
Rodolfo Metulini
rmetulini@unisa.it
Department of Economics and Statistics (DISES) - University of Salerno
Outline
1 Introduction
2 Benefits from using panel data
3 A gentle introduction to panel data models
4 Useful notation
5 References
Introduction
• Students' introductions
• Instructor's introduction
• Overview of the short course
• Syllabus, textbook and materials
• Examination method
Panel data
Definition
Panel data: the pooling of observations on a cross-section of
households, countries, firms, etc. over several time periods. This can
be achieved by surveying a number of firms, households or individuals
and following them over time (Baltagi, 2008)
Big data
Big data are characterized by their:
1 volume,
2 velocity,
3 variety,
4 veracity and
5 value.
Panel data & Big data
• Panel data mainly address the issue of data heterogeneity
across space and time
• Heterogeneity is connected to several of the V's because:
1 heterogeneity emerges with a high volume of data,
2 uncertainty (the veracity issue) increases when both the space and the time
index are considered,
3 producing the panel dataset may require extracting
data from many sources (variety),
4 value increases, as heterogeneity means that more
information can be extracted from the data
What we do and do not do in
this course
We investigate the (causal) relation among variables in a panel,
assuming:
• that the true relation between the variables is linear
• that the dependent variable (Y) is quantitative, so that we can
adopt normal distributions
We do not study the models adopted when:
• we have to assume a nonlinear relation between variables
• the dependent variable is a count, so that we have to
model it with a Poisson or a binomial distribution
Many books address models for nonlinear relations and/or count
data, but here we limit our attention to linear relations among
(possibly) normal variables
Cross sections and time series (i)
• In statistics and econometrics, a cross-sectional dataset is a
collection of one or more variables observed for a sample of
different individuals in a single period of time (the same for all
individuals), so that the time dimension is disregarded.
• In statistics and econometrics, a time-series dataset is a
collection of one or more variables observed for one individual over
an ordered sequence of time periods.
• With time series we can highlight variation over time; with cross
sections, variation between individuals.
• Examples of time series: i) real GDP by quarter, from
Q1 2007 to Q4 2020; ii) the daily price variation of Unicredit shares
on the stock exchange.
• Examples of cross sections: i) the average hourly labour cost per capita
for a sample of Chinese rice-producing firms; ii) the
GDP growth of NUTS2 regions over the period 01-01-2020 to
31-12-2020.
Cross sections and time series (ii)
Time series highlight variation over time; cross sections display
variation between individuals (heterogeneity).
Panel data
Definition
Panel data: the pooling of observations on a cross-section of
households, countries, firms, etc. over several time periods. This can
be achieved by surveying a number of firms, households or individuals
and following them over time (Baltagi, 2008)
• Panel data allow us to account for both time and individual
variability
• They may concern macro or micro phenomena, so individuals may be,
for example, firms or regions
• A panel may be balanced (the same individuals are observed in every
period) or unbalanced (the sample of individuals changes over
time)
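As a minimal sketch of the balanced/unbalanced distinction, the Python snippet below stores a panel in long format (one row per individual-period pair) and checks whether every individual is observed in every period. The records, the labels "A" and "B" and the helper name `is_balanced` are hypothetical, chosen only for illustration.

```python
# Hypothetical long-format panel: one (individual, period, value) row per observation.
balanced = [
    ("A", 2019, 1.0), ("A", 2020, 1.2),
    ("B", 2019, 0.7), ("B", 2020, 0.9),
]
# Same panel, but individual "B" is not observed in 2020.
unbalanced = balanced[:-1]


def is_balanced(rows):
    """Return True if every individual is observed in every period."""
    individuals = {n for n, _, _ in rows}
    periods = {t for _, t, _ in rows}
    observed = {(n, t) for n, t, _ in rows}
    # A balanced panel contains all N x T individual-period combinations.
    return len(observed) == len(individuals) * len(periods)


print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False
```

The long format is only one possible layout; it is used here because it makes the N x T count of a balanced panel easy to verify.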
Panel data structure: an example
Panel data structure: time and
individual heterogeneity
• With a panel we have more than one observation per year and more
than one observation per state
• By computing averages by state and by year we can highlight,
respectively, inter-state heterogeneity and
inter-year heterogeneity
• Each state has a different level,
• and each year has a different level as well.
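The averages by state and by year can be sketched in a few lines of Python. The data below are invented for illustration (they are not the dataset shown on the slides), and `group_means` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical balanced panel: (state, year, value of the variable of interest).
panel = [
    ("S1", 1982, 2.0), ("S1", 1983, 2.2), ("S1", 1984, 2.4),
    ("S2", 1982, 3.0), ("S2", 1983, 3.2), ("S2", 1984, 3.4),
    ("S3", 1982, 1.0), ("S3", 1983, 1.2), ("S3", 1984, 1.4),
]


def group_means(rows, key_index):
    """Average the value (third field) within groups defined by field key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


state_means = group_means(panel, 0)  # differences here signal inter-state heterogeneity
year_means = group_means(panel, 1)   # differences here signal inter-year heterogeneity

print(state_means)
print(year_means)
```

In this toy panel the state means differ much more than the year means, which is the pattern one would read as strong individual heterogeneity.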
Role of individual heterogeneity
• The main aim when using regression models is to
correctly study the causal relation between Y (the
dependent variable) and X (the independent variable). In doing so, it is
important to correctly account for individual heterogeneity in Y.
• Stock and Watson (2012) offer an example:
• The research question is whether taxing alcohol can reduce
deaths due to road accidents
• $\mathrm{frate}_n = \alpha + \beta\,\mathrm{beertax}_n + e_n$, estimated on the year 1982 only, returns a
positive β (!). $\mathrm{frate}_{nt} = \alpha + \beta\,\mathrm{beertax}_{nt} + e_{nt}$, estimated on
the full panel, returns a positive β as well!
• The model $\mathrm{frate}_{nt} = \alpha_n + \beta\,\mathrm{beertax}_{nt} + e_{nt}$ accounts for
individual heterogeneity via $\alpha_n$ (a state-level intercept). This
model returns a negative β
Role of individual heterogeneity
(discussion)
• Why? Unobserved state-level characteristics (not included in
the first two models) are correlated with the local beer tax.
• If something relevant is omitted from the model, and it is
correlated with X, OLS is biased and inconsistent.
• Can I include state-level characteristics in the cross-section model?
• No, because it would involve estimating n parameters (the
n state intercepts) on n observations (no degrees of freedom left)
• I might have a measure of some state-level characteristics, but they
are generally unobserved
• TAKE-HOME MESSAGE: by using cross-section data, one may
obtain erroneous results (about the causal effect of a regressor
on the dependent variable).
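The sign reversal can be reproduced on simulated data. The sketch below is not the Stock and Watson dataset: it assumes three hypothetical states whose intercepts rise with their average tax level while the true within-state effect is β = −1. The helper names `ols_slope` and `demean_by_group` are made up for this example.

```python
def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract from each value the mean of its group."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Simulated (hypothetical) data: 3 states, 4 years, true beta = -1, no noise.
# The state intercept alpha_n rises with the state's average tax level,
# so the omitted state effect is positively correlated with the regressor.
true_beta = -1.0
state, tax, frate = [], [], []
for n in (1, 2, 3):
    alpha_n = 3.0 * n
    for t in range(4):
        x = n + 0.1 * t
        state.append(n)
        tax.append(x)
        frate.append(alpha_n + true_beta * x)

pooled = ols_slope(tax, frate)  # ignores state heterogeneity
within = ols_slope(demean_by_group(tax, state), demean_by_group(frate, state))

print(round(pooled, 3))  # positive: driven by the omitted state effects
print(round(within, 3))  # -1.0: the true beta
```

Demeaning by state is equivalent to giving each state its own intercept, so the second slope uses only the variation over time within each state.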
Application
Application in R: example 1.1 (Croissant, Millo)
(solution 2 only)
Benefits from using panel data
1 Controlling for individual heterogeneity;
2 more informative data, more variability, less collinearity among
the variables, more degrees of freedom and more efficiency;
3 better able to study the dynamics of adjustment;
4 better able to identify and measure effects that are simply not
detectable in pure cross-section or pure time-series data;
5 allow us to construct and test more complicated behavioral models
than purely cross-section or time-series data.
Controlling for individual
heterogeneity
• Panel data suggest that individuals, firms, states or countries
are heterogeneous.
• Time-series and cross-section studies that do not control for this
heterogeneity run the risk of obtaining biased and inconsistent
results.
• Baltagi and Levin (1992) consider cigarette demand across 46
American states for the years 1963–88 (T = 26); schematically:
$\mathrm{cons}_{nt} = \mathrm{cons}_{n,t-1} + \mathrm{price}_{nt} + \mathrm{income}_{nt} + e_{nt}$.
• The model does not consider unobservable time-invariant variables $Z_n$
(e.g., religion, education) or state-invariant variables $W_t$ (e.g., advertising
on TV).
• The authors show that, when $Z_n$ and/or $W_t$ are omitted, results may be
biased.
• Panel data can control for these unobserved variables by
including individual- and time-specific effects
More informative data, more
variability, less collinearity among
the variables, more degrees of
freedom and more efficiency
• Time series are plagued by multicollinearity; for example, in
the case of the demand for cigarettes there is high collinearity
(recall the linear regression assumptions) between the price of
cigarettes and income (considering aggregate US data)
• In panel data collinearity is less likely, because the variation in the
data can be decomposed into within-state and between-state variation
(the latter usually being larger)
• With panel data, thanks to the larger samples, it is possible to
estimate more complex models (with more parameters)
• For example, it may be possible to estimate a model with state-varying
slopes, $y_{nt} = \alpha + \beta_n x_{nt} + e_{nt}$
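The within/between decomposition mentioned above can be checked numerically: the total sum of squares of a panel variable equals the within part plus the between part. The sketch below uses invented numbers for three hypothetical states observed over three periods.

```python
# Hypothetical panel variable: state -> values over time.
data = {
    "S1": [1.0, 1.2, 1.1],
    "S2": [3.0, 3.1, 3.2],
    "S3": [5.0, 4.9, 5.3],
}

all_values = [v for vals in data.values() for v in vals]
grand_mean = sum(all_values) / len(all_values)
state_mean = {s: sum(vals) / len(vals) for s, vals in data.items()}

# Total variation: deviations from the overall mean.
total_ss = sum((v - grand_mean) ** 2 for v in all_values)
# Within variation: deviations from each state's own mean.
within_ss = sum((v - state_mean[s]) ** 2 for s, vals in data.items() for v in vals)
# Between variation: deviations of state means from the overall mean,
# counted once per observation.
between_ss = sum(len(vals) * (state_mean[s] - grand_mean) ** 2 for s, vals in data.items())

print(round(total_ss, 4), round(within_ss, 4), round(between_ss, 4))
```

In this toy example the between component dominates, in line with the remark that between-state variation is usually the larger of the two.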
Better able to study the dynamics
of adjustment
• Unemployment, job turnover, poverty, growth, etc. (which
follow cyclical patterns of a certain duration) are better
studied with panels.
• If these panels are long enough, they can shed light on the
speed of adjustment to economic policy changes (e.g., the elasticity
of the price of a cup of coffee with respect to the price of inputs after an
increase in taxation).
• For example, unlike cross sections and time series,
panel data:
1 can estimate what proportion of those who are unemployed
in one period remain unemployed in another period;
2 make it possible to determine to what extent a country's
employment rate at time t benefits from a government
policy at t − 1;
3 make it possible to determine which pharmaceutical firms
benefit from an increase in EU research funds.
Better able to identify and measure
effects that are simply not
detectable in pure cross-section or
pure time-series data
• Example: suppose that we have a cross-section of women with a
50% average yearly labour force participation rate.
• This might be due to:
1 each woman having a 50% chance of being in the labour
force in any given year;
2 50% of the women working all the time and 50% not at all.
• Case 1 has high job turnover, while case 2 has no
turnover: only panel data can discriminate between these cases
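The two cases can be simulated to see why a single cross-section cannot tell them apart. The sketch below assumes a hypothetical sample of 1000 women followed for 10 years; the function names and the definition of turnover (share of year-to-year status changes) are choices made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
N, T = 1000, 10  # hypothetical number of women and of years

# Case 1: each woman works with probability 0.5 in any given year.
case1 = [[random.random() < 0.5 for _ in range(T)] for _ in range(N)]
# Case 2: half of the women always work, the other half never do.
case2 = [[n < N // 2] * T for n in range(N)]


def participation_rate(panel):
    """Share of woman-year observations in the labour force."""
    return sum(sum(row) for row in panel) / (len(panel) * len(panel[0]))


def turnover_rate(panel):
    """Share of year-to-year transitions in which the status changes."""
    changes = sum(a != b for row in panel for a, b in zip(row, row[1:]))
    return changes / (len(panel) * (len(panel[0]) - 1))


for name, panel in (("case 1", case1), ("case 2", case2)):
    print(name, round(participation_rate(panel), 2), round(turnover_rate(panel), 2))
```

Both cases show a participation rate close to 50%, but the turnover rate, which requires following the same women over time, is about one half in case 1 and exactly zero in case 2.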
Allow us to construct and test more
complicated behavioral models
than purely cross-section or
time-series data
• With cross sections we are forced to treat all individuals as
having the same behaviour (e.g., the productivity of all firms reacts to
EU research funds with the same elasticity)
• With panel data we can treat individuals as having different
behaviours, since we are able to specify a firm-varying
coefficient model (so that the elasticity to funds differs across
firms)
• Moreover, a firm's productivity at time t may depend on its
productivity at time t − 1 (dynamic models),
• or it may depend on the productivity of its neighbours (firms
located close by) (spatial models)
Limitations of using panel data
More precisely: problems related to data collection
• Design and data collection problems;
• distortions due to measurement errors;
• selectivity problems:
1 self-selectivity;
2 nonresponse;
3 attrition;
• short time-series dimension;
• cross-section dependence.
Design and data collection
problems
• These issues include:
1 problems of coverage (incomplete account of the
population of interest)
2 nonresponse (due to lack of cooperation of the respondent
or because of interviewer errors)
3 recall (respondent not remembering correctly)
4 frequency of interviewing and interview spacing (reference
period)
• For an extensive discussion of problems that arise in designing
panel surveys as well as data collection and data management
issues see Kalton et al. (1989)
Distortions of measurement errors
• Measurement errors may arise because of:
1 faulty responses due to unclear questions
2 memory errors
3 deliberate distortion of responses (e.g., prestige bias)
4 misrecording of responses
5 interviewer effects
• Panel advantage: cross-section data users have little choice
but to believe the information reported in the survey (unless
they have external information), while users of panel data can
check for inconsistencies in the responses across different interviews.
Selectivity problems
• Self-selectivity: an example on wages: some people
choose not to work because their reservation wage is higher than
the offered wage.
We observe the characteristics of these individuals but not
their wage, so the sample is censored (only the wage is missing,
and we just know it is under a threshold); if no data at all were
observed for these individuals, the sample would be truncated
• Nonresponse: refusal to participate, nobody at home, untraced
sample unit, etc.
Partial nonresponse occurs when one or more questions are left
unanswered.
Complete nonresponse occurs when no information is available
from the sampled individual.
• Attrition: nonresponse is a more pronounced issue in panels
than in cross sections.
Subsequent waves of the survey are still subject to nonresponse
because respondents may die, or move, or find that the cost of
responding is high.
Short time series dimensions
• Typical micro panels involve data covering a short time span for
each individual.
• This means that asymptotic results (e.g., consistency) rely on the number
of individuals tending to infinity.
Cross sectional dependence
• Analyses of macro panels on countries or regions with long time series that
do not account for cross-country dependence (the observations of Y are
not iid) may lead to misleading inference.
• Accounting for cross-section dependence turns out to be
important and affects inference.
• Panel unit root tests that account for this
dependence have been suggested (Baltagi, 2008, ch. 12)
Unobserved heterogeneity
• From a statistical point of view, panel data address the issue
of unobserved heterogeneity (i.e., controlling for it to avoid
biased estimates)
• Consider the model $y = \alpha + \beta x + \gamma z + e$, where x is an
observable regressor and z is unobservable.
• The "feasible" model is $y = \alpha + \beta x + e$ and may suffer from
omitted variable bias: the OLS estimator $\hat\beta$ is consistent and unbiased only if z is
uncorrelated with x (or if z has no effect on y, i.e. γ = 0)
• Example: agricultural production function (Mundlak, 1961).
Output (y) depends on labour (x) and soil quality (z). Soil
quality is correlated with effort (labour), hence $\hat\beta_{OLS}$ will be
inconsistent for β.
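The size of the bias can be made explicit with the standard omitted-variable formula. As a sketch, assuming e is uncorrelated with x and writing Cov and Var for population moments (notation not used on the slides), regressing y on x alone gives:

```latex
\operatorname{plim}\hat\beta_{OLS}
  = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
  = \frac{\operatorname{Cov}(x,\ \alpha + \beta x + \gamma z + e)}{\operatorname{Var}(x)}
  = \beta + \gamma\,\frac{\operatorname{Cov}(x, z)}{\operatorname{Var}(x)}
```

The extra term vanishes only when γ = 0 or Cov(x, z) = 0. In the Mundlak example γ > 0 is natural (better soil, more output), so the sign of the bias follows the sign of the correlation between soil quality and labour.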
Panel data solution
• The panel data model aims to avoid this problem:
$y_{nt} = \alpha + \beta^T x_{nt} + (\mu_n + \upsilon_{nt})$ (1)
where $\mu_n$ is time-invariant (it represents the unobservable
characteristics of the individuals)
• The objective is to eliminate (wipe out) $\mu_n$ from the model.
Eliminating the unobservables
• If the unobservable characteristics (Z) are time-invariant, $\mu_n$ is a good
proxy for them, and it is possible to rewrite the model in eq. (1) in
terms of observables only.
1 Differencing method (followed by OLS, which is
consistent, since Z is removed):
$\Delta y_{nt} = \beta^T \Delta x_{nt} + \Delta\mu_n + \Delta\upsilon_{nt}$ (2)
where typical elements are $\Delta y_{nt} = y_{nt} - y_{n,t-1}$, $t = 2, \dots, T$,
and $\Delta\mu_n = 0$ because $\mu_n$ is time-invariant
2 Fixed effects method (within transformation), which removes Z
3 Least squares dummy variables (LSDV) method, which accounts
for Z: time-invariant individual effects are included
via individual intercepts (N dummy
variables with coefficients $\mu_n$, n = 1, ..., N).
- Degrees of freedom are NT − N − K (so estimation is infeasible
or imprecise when T is small).
- $\hat\beta$ is consistent as NT → ∞, while the estimates of $\mu_n$ are consistent only as T → ∞.
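The differencing method can be sketched in Python for the case of a single regressor. The data are hypothetical and generated without noise from y = µn + βx with β = 2, so that the estimate can be checked by eye; the helper name `ols_slope_no_intercept` is made up for this example.

```python
def ols_slope_no_intercept(x, y):
    """OLS slope through the origin: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical panel with one regressor: individual -> list of (x, y) over time.
# Data are generated as y = mu_n + beta * x with beta = 2 and no noise,
# where mu_n is an unobserved, time-invariant individual effect.
beta = 2.0
mu = {"a": 5.0, "b": -3.0, "c": 10.0}
x_by_ind = {"a": [1.0, 2.0, 4.0], "b": [0.5, 1.5, 1.0], "c": [3.0, 3.5, 5.0]}
panel = {n: [(x, mu[n] + beta * x) for x in xs] for n, xs in x_by_ind.items()}

dx, dy = [], []
for n, obs in panel.items():
    # First differences within each individual, t = 2, ..., T.
    for (x_prev, y_prev), (x_curr, y_curr) in zip(obs, obs[1:]):
        dx.append(x_curr - x_prev)  # delta x_nt
        dy.append(y_curr - y_prev)  # delta y_nt; mu_n cancels out

beta_fd = ols_slope_no_intercept(dx, dy)
print(round(beta_fd, 6))  # 2.0
```

The regression on differences is run without an intercept because the constant term cancels in the differencing together with the individual effect.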
Fixed effects method
• LSDV consistency depends on N and T, and LSDV may be computationally
burdensome when N is large (N dummy variables)
• An equivalent formulation is to transform the data by subtracting
the individual means from every variable:
$y_{nt} - \bar{y}_{n.} = \beta^T (x_{nt} - \bar{x}_{n.}) + (\upsilon_{nt} - \bar{\upsilon}_{n.})$ (3)
where $\bar{y}_{n.}$ and $\bar{x}_{n.}$ are the individual means of y and x
• α and $\mu_n$ disappear because they are time-invariant
• The fixed effects estimator is equivalent to LSDV, but in
LSDV the $\mu_n$ are estimated directly, while in fixed effects they are not
(although they can be recovered afterwards).
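A minimal Python sketch of the within transformation for one regressor follows, including the recovery of the individual intercepts after estimation. The data are hypothetical and noise-free (β = 0.5), and the recovery rule, individual mean of y minus β̂ times individual mean of x, is the usual one for this setting; note that it returns the combined intercept α + µn of each individual.

```python
# Hypothetical panel: individual -> list of (x, y) over time, generated
# without noise from y = mu_n + beta * x with beta = 0.5.
mu_true = {"a": 1.0, "b": 4.0}
x_by_ind = {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 9.0]}
panel = {n: [(x, mu_true[n] + 0.5 * x) for x in xs] for n, xs in x_by_ind.items()}

# Individual means of x and y.
x_bar = {n: sum(x for x, _ in obs) / len(obs) for n, obs in panel.items()}
y_bar = {n: sum(y for _, y in obs) / len(obs) for n, obs in panel.items()}

# Within transformation: deviations from individual means.
x_w = [x - x_bar[n] for n, obs in panel.items() for x, _ in obs]
y_w = [y - y_bar[n] for n, obs in panel.items() for _, y in obs]

# OLS on the transformed data (no intercept: demeaned variables have mean zero).
beta_fe = sum(a * b for a, b in zip(x_w, y_w)) / sum(a * a for a in x_w)

# Recover the individual intercepts from the individual means.
mu_hat = {n: y_bar[n] - beta_fe * x_bar[n] for n in panel}

print(round(beta_fe, 6))
print({n: round(v, 6) for n, v in mu_hat.items()})
```

With real data the recovered intercepts are only as reliable as the number of periods per individual allows, which is the T-consistency point made for LSDV.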
Application
Application in R: example 1.1 (all solutions) and 1.2
(Croissant, Millo)
General notation
• P: probability, E: expected value, V : variance
• tr: trace of a matrix (sum of principal diagonal)
• cor: correlation, σ: standard deviation
• q: quadratic form
• I: identity matrix
• P: $P = X(X^T X)^{-1} X^T$ (projection matrix), M: $M = I - P$
• C: Cholesky decomposition matrix, such that $C A C^T = I$
• lnL: the log-likelihood, i.e. the objective function of maximum likelihood
• LR and LM are, respectively, the likelihood ratio and the
Lagrange multiplier
Index
• A panel is composed of N individuals, indexed by n
• Each individual is observed over T time periods, indexed by
t
• The sample size is O, with O = NT
• The K covariates are indexed by k (if we also have the column
of ones for the intercept, we have K + 1 columns)
Two way error component model
(i)
$y_{nt} = \alpha + \beta^T x_{nt} + e_{nt} = \gamma^T z_{nt} + e_{nt}$
with $e_{nt} = \mu_n + \lambda_t + \upsilon_{nt}$
• y is the response, α the intercept, x the vector of K covariates
with associated coefficients β; z is defined by $z_{nt}^T = (1, x_{nt}^T)$ and γ by
$\gamma^T = (\alpha, \beta^T)$
• e is the sum of the time-invariant individual effect µ, the
individual-invariant time effect λ and the iid residual error υ
• Variances are denoted by $\sigma^2$, so $\sigma^2_\mu$, $\sigma^2_\lambda$, $\sigma^2_\upsilon$ and $\sigma^2_e$ are the variances of the
four terms
• Estimated parameters are denoted with a hat ($\hat\beta$, $\hat\sigma^2$, etc.)
Two way error component model
(ii)
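• In matrix form:
$y = \alpha j + X\beta + e = Z\gamma + e$
$e = D_\mu \mu + D_\lambda \lambda + \upsilon$
where j is a vector of ones, X and Z are the covariate matrices, µ the
vector of N individual effects, λ the vector of T time effects, υ
the vector of O residual errors
• D denotes a matrix of dummy variables
• With observations ordered by individual and then by time, and $j_T$ a vector of T ones, we have
$D_\mu = I_N \otimes j_T$ (of dimension NT × N)
$D_\lambda = j_N \otimes I_T$ (of dimension NT × T)
• Denoting by $J = jj^T$ a square matrix of ones, $D_\mu D_\mu^T = I_N \otimes J_T$ and $D_\lambda D_\lambda^T = J_N \otimes I_T$
• The covariance matrix (of e) is therefore
$\Omega_e = \sigma^2_\upsilon I_{NT} + \sigma^2_\mu (I_N \otimes J_T) + \sigma^2_\lambda (J_N \otimes I_T)$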
One way (error) component model
- Transformation (i)
In this model the time-invariant terms disappear
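• $S = I_N \otimes J_T$ is the matrix that, when post-multiplied by a variable,
returns a vector of length O containing the individual sums of
the variable (each one repeated T times)
• $\bar{I} = I - \bar{J}$, with $\bar{J} = J_{NT}/NT$, is the matrix that, when post-multiplied by a variable,
returns the variable in deviation from its overall mean
• $B = \frac{1}{T} S$ and $W = I_{NT} - B$ are, respectively, the between and
the within matrix
• $\Omega_e = \sigma^2_\upsilon \left(W + \frac{\sigma^2_1}{\sigma^2_\upsilon} B\right) = \sigma^2_\upsilon \left(W + \frac{1}{\phi^2} B\right)$, where $\sigma^2_1 = \sigma^2_\upsilon + T\sigma^2_\mu$ and $\phi = \sqrt{\sigma^2_\upsilon / \sigma^2_1} = \sigma_\upsilon / \sigma_1$
• $\theta = 1 - \phi$ is the fraction of the individual mean that is
subtracted in the generalized least squares (GLS) transformation.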
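The action of S, B and W on a variable can be verified on a tiny example. The sketch below builds the matrices with a hand-written Kronecker product for a hypothetical panel with N = 2 and T = 3, with observations stacked by individual and then by time; the helper names (`kron`, `identity`, `ones`, `matvec`) are made up for this illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def ones(n):
    """n x n matrix of ones (J_n)."""
    return [[1.0] * n for _ in range(n)]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


N, T = 2, 3  # hypothetical panel dimensions
# Observations are stacked by individual, then by time.
x = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

S = kron(identity(N), ones(T))                 # I_N (x) J_T
B = [[s_ij / T for s_ij in row] for row in S]  # between matrix
I_NT = identity(N * T)
W = [[i_ij - b_ij for i_ij, b_ij in zip(ri, rb)] for ri, rb in zip(I_NT, B)]  # within matrix

print(matvec(S, x))  # individual sums, each repeated T times
print(matvec(B, x))  # individual means, each repeated T times
print(matvec(W, x))  # deviations from individual means
```

In practice these NT × NT matrices are never formed explicitly; they are a notational device, and the same results are obtained by computing group sums and means directly.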
Two way error component model -
Transformation (i)
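In this model we have two different between matrices:
$B_\mu = (I_N \otimes J_T)/T$; $B_\lambda = (J_N \otimes I_T)/N$
• The matrix $\bar{J} = J_{NT}/NT$, when post-multiplied by a variable,
returns the vector of the overall mean repeated NT times. The within matrix is
$W = I - B_\mu - B_\lambda + \bar{J}$
• $\Omega_e = \sigma^2_\upsilon \left(W + \frac{1}{\phi^2_\mu}\bar{B}_\mu + \frac{1}{\phi^2_\lambda}\bar{B}_\lambda + \frac{1}{\phi^2_2}\bar{J}\right)$, where $\bar{B}_\mu = B_\mu - \bar{J}$,
$\bar{B}_\lambda = B_\lambda - \bar{J}$, $\phi_\mu = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + T\sigma^2_\mu}}$, $\phi_\lambda = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + N\sigma^2_\lambda}}$,
$\phi_2 = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + T\sigma^2_\mu + N\sigma^2_\lambda}}$
• $\theta_i = 1 - \phi_i$, $i = \mu, \lambda, 2$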
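Applied to a variable, the two-way within matrix subtracts the individual mean and the time mean and adds back the overall mean. The Python sketch below does this element by element for a hypothetical balanced panel (rows are individuals, columns are periods) instead of building the NT × NT matrices.

```python
# Hypothetical balanced panel stored as a matrix: rows = individuals, columns = periods.
x = [
    [1.0, 2.0, 4.0],
    [3.0, 5.0, 8.0],
]
N, T = len(x), len(x[0])

ind_mean = [sum(row) / T for row in x]                              # individual means
time_mean = [sum(x[n][t] for n in range(N)) / N for t in range(T)]  # time means
overall_mean = sum(sum(row) for row in x) / (N * T)                 # overall mean

# Two-way within transformation:
# x_nt - (individual mean) - (time mean) + (overall mean).
x_within = [
    [x[n][t] - ind_mean[n] - time_mean[t] + overall_mean for t in range(T)]
    for n in range(N)
]

# The transformed variable sums to zero for every individual and every period,
# so any additive individual effect or time effect would be wiped out.
row_sums = [sum(row) for row in x_within]
col_sums = [sum(x_within[n][t] for n in range(N)) for t in range(T)]
print(row_sums, col_sums)
```

The element-wise formula is valid for balanced panels only; with unbalanced data the two-way transformation no longer reduces to simple means.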
References
• Mundlak, Y. (1961). Empirical production function free of
management bias. Journal of Farm Economics, 43(1), 44-56.
• Baltagi, B. (2008). Econometric analysis of panel data. John
Wiley & Sons.
• Baltagi, B. H., & Levin, D. (1992). Cigarette taxation: Raising
revenues and reducing consumption. Structural Change and
Economic Dynamics, 3(2), 321-335.
• Greene, W. H. (2000). Econometric analysis 4th edition.
International edition, New Jersey: Prentice Hall, 201-215.
• Kalton, G., Kasprzyk, D., & McMillen, D. (1989). Panel
surveys.
• Stock, J. H., & Watson, M. W. (2012). Introduction to
econometrics (Vol. 3). New York: Pearson.
• Wooldridge, J. M. (2010). Econometric analysis of cross section
and panel data. MIT press.

More Related Content

PDF
Logistic Regression Analysis
PPTX
Logistic regression
PPTX
STATA - Time Series Analysis
PPTX
Lesson 2 stationary_time_series
PPTX
Logistic regression
PPTX
Time Series Analysis.pptx
PDF
Model Templates for PROCESS
PPTX
A Presentation on IS-LM Model
Logistic Regression Analysis
Logistic regression
STATA - Time Series Analysis
Lesson 2 stationary_time_series
Logistic regression
Time Series Analysis.pptx
Model Templates for PROCESS
A Presentation on IS-LM Model

What's hot (20)

PPTX
Panel data analysis
PDF
PPTX
time series analysis
PPT
ECONOMETRICS
PPT
Basic econometrics lectues_1
PPTX
ders 6 Panel data analysis.pptx
PPTX
Chapter 06 - Heteroskedasticity.pptx
DOCX
Autocorrelation
PDF
Arch & Garch Processes
PPTX
Multicollinearity PPT
PPTX
Autocorrelation
PPTX
Multicolinearity
PDF
PPTX
Heteroscedasticity
PPTX
Dummy variables
PDF
Panel slides
PPT
Econometrics lecture 1st
PDF
Time Series, Moving Average
PPTX
Logit and Probit and Tobit model: Basic Introduction
PPTX
Heteroscedasticity
Panel data analysis
time series analysis
ECONOMETRICS
Basic econometrics lectues_1
ders 6 Panel data analysis.pptx
Chapter 06 - Heteroskedasticity.pptx
Autocorrelation
Arch & Garch Processes
Multicollinearity PPT
Autocorrelation
Multicolinearity
Heteroscedasticity
Dummy variables
Panel slides
Econometrics lecture 1st
Time Series, Moving Average
Logit and Probit and Tobit model: Basic Introduction
Heteroscedasticity
Ad

Similar to Regression models for panel data (20)

PDF
Lecture 7B Panel Econometrics I 2011
PDF
Panel data content
PDF
02. predicting financial distress logit mode jones
PDF
Application of panel data to the effect of five (5) world development indicat...
PDF
Application of panel data to the effect of five (5) world development indicat...
PDF
Lecture 6_Panel Data Models.pdf
PDF
M.E.Bontempi-Panel data: Models, estimation,and the role of attrition and Mea...
PDF
QUANTITATIVE METHODS NOTES.pdf
PPT
Using statistical process control to compare reconviction rates across local ...
PDF
Fiscal Policy And Trade Openness On Unemployment Essay
PPTX
Chapter five Application to Crosssectional analysis.pptx
PDF
Levels of Measurement.pdf
PPTX
Factors affecting the usage of ChatGPT: Advancing an information technology a...
PDF
Investigations of certain estimators for modeling panel data under violations...
PPTX
Lu2 introduction to statistics
PPTX
Panel Data Regression Notes Part-1 of 5.pptx
PPT
Unit 8 presenting data in charts, graphs and tables
PDF
Are we really including all relevant evidence
PDF
Notes1
PDF
Reinvestigating sources of movements in real exchange rate
Lecture 7B Panel Econometrics I 2011
Panel data content
02. predicting financial distress logit mode jones
Application of panel data to the effect of five (5) world development indicat...
Application of panel data to the effect of five (5) world development indicat...
Lecture 6_Panel Data Models.pdf
M.E.Bontempi-Panel data: Models, estimation,and the role of attrition and Mea...
QUANTITATIVE METHODS NOTES.pdf
Using statistical process control to compare reconviction rates across local ...
Fiscal Policy And Trade Openness On Unemployment Essay
Chapter five Application to Crosssectional analysis.pptx
Levels of Measurement.pdf
Factors affecting the usage of ChatGPT: Advancing an information technology a...
Investigations of certain estimators for modeling panel data under violations...
Lu2 introduction to statistics
Panel Data Regression Notes Part-1 of 5.pptx
Unit 8 presenting data in charts, graphs and tables
Are we really including all relevant evidence
Notes1
Reinvestigating sources of movements in real exchange rate
Ad

More from University of Salerno (20)

PDF
Modelling traffic flows with gravity models and mobile phone large data
PDF
Carpita metulini 111220_dssr_bari_version2
PDF
A strategy for the matching of mobile phone signals with census data
PDF
Detecting and classifying moments in basketball matches using sensor tracked ...
PDF
BASKETBALL SPATIAL PERFORMANCE INDICATORS
PDF
Human activity spatio-temporal indicators using mobile phone data
PDF
Poster venezia
PDF
Metulini280818 iasi
PDF
Players Movements and Team Performance
PDF
Big Data Analytics for Smart Cities
PDF
Meeting progetto ode_sm_rm
PDF
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...
PDF
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...
PDF
Metulini1503
PDF
A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...
PPT
The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...
PPT
The Global Virtual Water Network
PDF
The Worldwide Network of Virtual Water with Kriskogram
PDF
Ad b 1702_metu_v2
PDF
Statistics lab 1
Modelling traffic flows with gravity models and mobile phone large data
Carpita metulini 111220_dssr_bari_version2
A strategy for the matching of mobile phone signals with census data
Detecting and classifying moments in basketball matches using sensor tracked ...
BASKETBALL SPATIAL PERFORMANCE INDICATORS
Human activity spatio-temporal indicators using mobile phone data
Poster venezia
Metulini280818 iasi
Players Movements and Team Performance
Big Data Analytics for Smart Cities
Meeting progetto ode_sm_rm
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...
Metulini1503
A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...
The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...
The Global Virtual Water Network
The Worldwide Network of Virtual Water with Kriskogram
Ad b 1702_metu_v2
Statistics lab 1

Recently uploaded (20)

PPTX
EABDM Slides for Indifference curve.pptx
PPTX

Regression models for panel data

  • 1. Regression models and panel data R. Metulini Introduction Benefits from using panel data A gentle introduction to panel data models Useful notation References Regression models and panel data Part I: Introduction Rodolfo Metulini B rmetulini@unisa.it Department of Economics and Statistics (DISES) - University of Salerno
  • 2. Outline
      1 Introduction
      2 Benefits from using panel data
      3 A gentle introduction to panel data models
      4 Useful notation
      5 References
  • 3. Introduction
      • students' presentations
      • presentation of myself
      • presentation of the short course
      • syllabus, textbook and material
      • presentation of the examination method
  • 4. Panel data
      Definition. Panel data: the pooling of observations on a cross-section of households, countries, firms, etc. over several time periods. This can be achieved by surveying a number of firms, households or individuals and following them over time (Baltagi, 2008).
  • 5. Big data
      Big data are characterized by their:
      1 volume,
      2 velocity,
      3 variety,
      4 veracity and
      5 value.
  • 6. Panel data & Big data
      • Panel data mainly address the issue of data heterogeneity along space and time.
      • Heterogeneity is connected to many of the V's because:
      1 heterogeneity emerges with a high volume of data,
      2 veracity (uncertainty) increases when both the space and the time index are considered,
      3 producing the panel dataset may require extracting data from many sources (variety),
      4 the value is increased, as heterogeneity means that more information can be extracted from the data.
  • 7. What we do and what we do not do in this course
      We investigate the (causal) relation among variables in a panel, assuming:
      • that the true relation between the variables is linear;
      • that the dependent variable (Y) is quantitative, so that we can adopt normal distributions.
      We do not study models adopted when:
      • we have to assume a nonlinear relation between variables;
      • the dependent variable is a count, so that we have to model it with a Poisson or a Binomial distribution.
      Many books address models for nonlinear relations and/or count data, but here we limit our attention to linear relations among (possibly) normal variables.
  • 8. Cross sections and time series (i)
      • In statistics and econometrics, a cross-sectional dataset is a collection of one or more variables for a sample of the population, observing different individuals in a single period of time (the same for all individuals), so that time plays no role.
      • In statistics and econometrics, a time-series dataset is a collection of one or more variables for one individual along a collection of ordered periods of time.
      • With time series we can highlight variation over time; with cross sections, variation between individuals.
      • Examples of time series: i) real GDP by quarter, from Q1 2007 to Q4 2020; ii) the daily variation of the Unicredit share price on the stock exchange.
      • Examples of cross sections: i) the average labour cost per capita per hour for a sample of Chinese firms producing rice; ii) the GDP growth of NUTS2 regions over the period 01-01-2020 to 31-12-2020.
  • 9. Cross sections and time series (ii)
      Time series highlight variation over time; cross sections display variation between individuals (heterogeneity).
  • 10. Panel data
      Definition. Panel data: the pooling of observations on a cross-section of households, countries, firms, etc. over several time periods. This can be achieved by surveying a number of firms, households or individuals and following them over time (Baltagi, 2008).
      • Panel data make it possible to account for both time and individual variability.
      • They may concern macro or micro phenomena, so individuals may be, for example, firms or regions.
      • A panel may be balanced (the same individuals are observed in every period) or unbalanced (the sample of individuals changes over time).
  • 11. Panel data structure: an example
  • 12. Panel data structure: time and individual heterogeneity
      • With a panel we have more than one observation per year and more than one observation per state.
      • Computing averages by state and by year highlights, respectively, inter-state heterogeneity and inter-year heterogeneity.
      • Each state presents a different level.
      • Each year presents a different level as well.
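The averages by state and by year mentioned on this slide can be sketched in a few lines of Python. The state codes, years and values below are invented for illustration and do not come from the course dataset; `group_means` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical long-format panel: one row per (state, year, value).
# All numbers are invented for illustration.
rows = [
    ("AL", 1982, 2.0), ("AL", 1983, 2.4),
    ("TX", 1982, 3.0), ("TX", 1983, 3.6),
]

def group_means(rows, key_index):
    """Average the value column within each level of the chosen key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

state_means = group_means(rows, 0)  # inter-state heterogeneity
year_means = group_means(rows, 1)   # inter-year heterogeneity
print(state_means)
print(year_means)
```

Different state means point to individual heterogeneity, different year means to time heterogeneity.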
  • 13. Role of individual heterogeneity
      • The main goal of a regression model is to study correctly the causal relation between Y (the dependent variable) and X (the independent variable). To do so, it is important to account properly for individual heterogeneity in Y.
      • Stock and Watson (2012) offer an example:
      • The research question is whether taxing alcohol can reduce deaths from road accidents.
      • $\mathit{frate}_n = \alpha + \beta\,\mathit{beertax}_n + e_n$, estimated on the year 1982, returns a positive $\beta$ (!). $\mathit{frate}_{nt} = \alpha + \beta\,\mathit{beertax}_{nt} + e_{nt}$, estimated on the full panel, returns a positive $\beta$ as well.
      • The model $\mathit{frate}_{nt} = \alpha_n + \beta\,\mathit{beertax}_{nt} + e_{nt}$ accounts for individual heterogeneity via $\alpha_n$ (a state-level parameter). This model returns a negative $\beta$.
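The sign reversal can be reproduced on a toy panel. The sketch below uses invented numbers (not the Stock and Watson data): three hypothetical states whose fatality rate falls as their own tax rises, while high-tax states have higher fatality levels overall. `ols_slope` is an illustrative helper for a one-regressor OLS with intercept.

```python
# Synthetic (tax, fatality rate) pairs per hypothetical state, three years each.
panel = {
    "A": [(0.2, 1.2), (0.4, 1.1), (0.6, 1.0)],
    "B": [(1.0, 2.2), (1.2, 2.1), (1.4, 2.0)],
    "C": [(1.8, 3.2), (2.0, 3.1), (2.2, 3.0)],
}

def ols_slope(pairs):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

# Pooled OLS: one common intercept for all states.
pooled = [obs for series in panel.values() for obs in series]
beta_pooled = ols_slope(pooled)

# State-specific intercepts: demean x and y within each state first.
demeaned = []
for series in panel.values():
    mean_x = sum(x for x, _ in series) / len(series)
    mean_y = sum(y for _, y in series) / len(series)
    demeaned.extend((x - mean_x, y - mean_y) for x, y in series)
beta_within = ols_slope(demeaned)

print(beta_pooled, beta_within)
```

In this constructed example the pooled slope is positive because it mixes differences in levels between states with the response inside each state, while the slope with state-specific intercepts is negative.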
  • 14. Role of individual heterogeneity (discussion)
      • Why? Unobserved state-level characteristics (not included in the first two models) are correlated with the local beer tax.
      • If a relevant variable is omitted from the model and it is correlated with X, OLS is biased and inconsistent.
      • Can I include state-level characteristics in a cross-section model?
      • No, because it would involve estimating n parameters (the n intercepts) from n observations, leaving no degrees of freedom.
      • Some state-level characteristics may be measurable, but in general they are unobserved.
      • TAKE-HOME MESSAGE: with cross-section data one may obtain erroneous results about the causal effect of a regressor on the dependent variable.
  • 15. Application
      Application in R: example 1.1 (Croissant, Millo) (solution 2 only)
  • 16. Benefits from using panel data
      1 Controlling for individual heterogeneity;
      2 more informative data, more variability, less collinearity among the variables, more degrees of freedom and more efficiency;
      3 being better able to study the dynamics of adjustment;
      4 being better able to identify and measure effects that are simply not detectable in pure cross-section or pure time-series data;
      5 making it possible to construct and test more complicated behavioral models than purely cross-section or time-series data allow.
  • 17. Controlling for individual heterogeneity
      • Panel data suggest that individuals, firms, states or countries are heterogeneous.
      • Time-series and cross-section studies that do not control for this heterogeneity run the risk of obtaining biased and inconsistent results.
      • Baltagi and Levin (1992) consider cigarette demand across 46 American states for the years 1963–88 (T = 26): $\mathit{cons}_{nt} = \mathit{cons}_{n,t-1} + \mathit{price}_{nt} + \mathit{income}_{nt} + e_{nt}$ (written schematically, with coefficients omitted).
      • The model does not consider unobservable time-invariant variables $Z_n$ (e.g., religion, education) or state-invariant variables $W_t$ (e.g., advertising on TV).
      • The authors show that, omitting $Z_n$ and/or $W_t$, results may be biased.
      • Panel data can control for these unobserved variables by including individual- and time-specific effects.
  • 18. More informative data, more variability, less collinearity among the variables, more degrees of freedom and more efficiency
      • Time series are plagued by multicollinearity; for example, in the case of the demand for cigarettes there is high collinearity (recall the linear regression assumptions) between cigarette price and income (considering US aggregated data).
      • In panel data collinearity is less likely, because the variation in the data can be decomposed into within-state and between-state variation (the latter usually being larger).
      • With panel data, having larger samples, it is possible to estimate more complex models (with more parameters).
      • For example, it may be possible to estimate a model with state-varying parameters, $y_{nt} = \alpha + \beta_n x_{nt} + e_{nt}$.
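The within/between decomposition of the variation can be checked numerically. The following is a minimal sketch on an invented balanced panel of three hypothetical states; the variable names are illustrative only.

```python
# Invented balanced panel: one variable, three states, three periods each.
panel = {
    "s1": [1.0, 2.0, 3.0],
    "s2": [11.0, 12.0, 13.0],
    "s3": [21.0, 22.0, 23.0],
}

def mean(values):
    return sum(values) / len(values)

all_values = [v for series in panel.values() for v in series]
grand_mean = mean(all_values)

# Total variation around the overall mean.
total_ss = sum((v - grand_mean) ** 2 for v in all_values)
# Within variation: deviations from each state's own mean.
within_ss = sum((v - mean(series)) ** 2
                for series in panel.values() for v in series)
# Between variation: state means around the overall mean, weighted by T.
between_ss = sum(len(series) * (mean(series) - grand_mean) ** 2
                 for series in panel.values())

print(total_ss, within_ss, between_ss)
```

Here the total sum of squares splits exactly into the within and between parts, and the between part dominates, as the slide suggests is usually the case.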
  • 19. Better able to study the dynamics of adjustment
      • Unemployment, job turnover, poverty, growth, etc. (which follow cyclical trends) are better studied with panels.
      • If these panels are long enough, they can shed light on the speed of adjustment to economic policy changes (e.g., how the price of a cup of coffee responds to the price of inputs after an increase in taxation).
      • For example, unlike cross sections and time series, panel data:
      1 can estimate what proportion of those who are unemployed in one period remain unemployed in another period;
      2 make it possible to determine to what extent a country's employment rate at time t benefits from a government policy at t − 1;
      3 make it possible to determine which pharmaceutical firms benefit from an increase in EU research funds.
  • 20. Better able to identify and measure effects that are simply not detectable in pure cross-section or pure time-series data
      • Example: suppose that we have a cross-section of women with a 50% average yearly labour force participation rate.
      • This might be due to:
      1 each woman having a 50% chance of being in the labour force in any given year;
      2 50% of the women working all the time and 50% not at all.
      • Case 1 has high turnover, while case 2 has no turnover: only panel data can discriminate between these cases.
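A small constructed example makes the point concrete. The two hypothetical panels below (four women, four years, 1 = in the labour force) are invented; the first is a stylized, deterministic version of case 1 in which every woman alternates between working and not working.

```python
# Rows are women, columns are years; 1 = in the labour force.
case_turnover = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
case_no_turnover = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

def yearly_rates(panel):
    """Participation rate in each year: all that a cross-section can see."""
    n = len(panel)
    return [sum(column) / n for column in zip(*panel)]

def transition_share(panel):
    """Share of year-to-year pairs in which a woman changes status."""
    changes = sum(a != b for row in panel for a, b in zip(row, row[1:]))
    possible = sum(len(row) - 1 for row in panel)
    return changes / possible

print(yearly_rates(case_turnover), yearly_rates(case_no_turnover))
print(transition_share(case_turnover), transition_share(case_no_turnover))
```

Every yearly cross-section shows the same 50% participation rate in both panels; only following the same women over time reveals the difference in turnover.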
  • 21. Constructing and testing more complicated behavioral models than purely cross-section or time-series data allow
      • With cross sections we are forced to treat all individuals as having the same behaviour (e.g., the productivity of every firm reacts to EU research funds with the same elasticity).
      • With panel data we can treat individuals as having different behaviours, since we can specify a firm-varying coefficient model (so that the elasticity to funds differs across firms).
      • Moreover, a firm's productivity at time t may depend on its productivity at time t − 1 (dynamic models),
      • or it may depend on the productivity of its neighbours, i.e., firms located close by (spatial models).
  • 22. Limitations from using panel data
      These are better described as problems with data collection:
      • design and data collection problems;
      • distortions of measurement errors;
      • selectivity problems:
      1 self-selectivity;
      2 nonresponse;
      3 attrition;
      • short time-series dimension;
      • cross-section dependence.
  • 23. Design and data collection problems
      • These issues include:
      1 problems of coverage (incomplete account of the population of interest);
      2 nonresponse (due to lack of cooperation of the respondent or because of interviewer errors);
      3 recall (respondent not remembering correctly);
      4 frequency of interviewing and interview spacing (reference period).
      • For an extensive discussion of the problems that arise in designing panel surveys, as well as data collection and data management issues, see Kalton et al. (1989).
  • 24. Distortions of measurement errors
      • Measurement errors may arise because of:
      1 faulty responses due to unclear questions;
      2 memory errors;
      3 deliberate distortion of responses (e.g., prestige bias);
      4 misrecording of responses;
      5 interviewer effects.
      • Panel advantage: users of cross-section data have little choice but to believe the information reported in the survey (unless they have external information), while users of panel data can check for inconsistencies in the responses across different interviews.
  • 25. Selectivity problems
      • Self-selectivity: consider people's wages. Some people choose not to work because their reservation wage is higher than the offered wage. We observe the characteristics of these individuals but not their wage: the sample is truncated (when the data are missing) or censored (when we only know that the wage is under a threshold).
      • Nonresponse: refusal to participate, nobody at home, untraced sample unit, etc. Partial nonresponse occurs when one or more questions are left unanswered. Complete nonresponse occurs when no information is available from the sampled individual.
      • Attrition: nonresponse is a more pronounced issue in panels than in cross sections. Subsequent waves of the survey are still subject to nonresponse, because respondents may die, move, or find that the cost of responding is high.
  • 26. Short time-series dimension
      • Typical micro panels involve data covering a short time span for each individual.
      • This means that asymptotic consistency relies on the number of individuals tending to infinity.
  • 27. Cross-sectional dependence
      • Analyses of macro panels on countries or regions with long time series that do not account for cross-country dependence (the variable Y is not iid) may lead to misleading inference.
      • Accounting for cross-section dependence turns out to be important and affects inference.
      • Panel unit root tests that account for this dependence have been suggested (Baltagi, 2008, ch. 12).
  • 28. Unobserved heterogeneity
      • From a statistical point of view, panel data address the issue of unobserved heterogeneity (i.e., controlling for it to avoid biased estimates).
      • Consider the model $y = \alpha + \beta x + \gamma z + e$, where x is an observable regressor and z is unobservable.
      • The "feasible" model is $y = \alpha + \beta x + e$ and may suffer from omitted variable bias: the OLS estimator $\hat\beta$ is unbiased and consistent only if z is uncorrelated with x or has no effect on y ($\gamma = 0$).
      • Example: agricultural production function (Mundlak, 1961). Output (y) depends on labour (x) and soil quality (z). Soil quality is correlated with effort (labour), hence $\hat\beta_{OLS}$ will be inconsistent for $\beta$.
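The size of the omitted variable bias follows from a standard textbook result, sketched here under the assumption that the error e is uncorrelated with x. Regressing y on x alone gives

$$
\hat\beta_{OLS} \;=\; \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)}
\;\xrightarrow{\;p\;}\;
\frac{\mathrm{Cov}(x,\ \alpha + \beta x + \gamma z + e)}{\mathrm{Var}(x)}
\;=\; \beta + \gamma\,\frac{\mathrm{Cov}(x, z)}{\mathrm{Var}(x)} .
$$

The second term vanishes only when $\gamma = 0$ or $\mathrm{Cov}(x, z) = 0$. In the agricultural example, if better soil raises output ($\gamma > 0$) and attracts more labour ($\mathrm{Cov}(x, z) > 0$), OLS overstates the effect of labour.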
  • 29. Panel data solution
      • The panel data model aims to avoid this problem:
        $y_{nt} = \alpha + \beta^T x_{nt} + (\mu_n + \upsilon_{nt}) \qquad (1)$
        where $\mu_n$ is time-invariant and represents the unobservable characteristics of the individuals.
      • The objective is to eliminate (wipe out) $\mu_n$ from the model.
  • 30. Eliminating the unobservables
      • If the unobservable characteristics are time invariant, $\mu_n$ is a good proxy for them, and the model in eq. (1) can be rewritten in terms of observables only.
      1 Differencing method (followed by OLS, which is consistent because Z is removed):
        $\Delta y_{nt} = \beta^T \Delta x_{nt} + \Delta\mu_n + \Delta\upsilon_{nt} \qquad (2)$
        where typical elements are $\Delta y_{nt} = y_{nt} - y_{n,t-1}$, $t = 2, \dots, T$, and $\Delta\mu_n = 0$ because $\mu_n$ does not vary over time.
      2 Fixed effects method (within transformation) (removes Z).
      3 Least squares dummy variables (LSDV) method (accounts for Z): include the time-invariant individual effects via individual intercepts (N dummy variables $\mu_n$, $n = 1, \dots, N$).
        - Degrees of freedom are $NT - N - K$ (estimation is impossible when T is too small).
        - $\hat\beta$ is consistent as $NT \to \infty$; the $\hat\mu_n$ are consistent only as $T \to \infty$.
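The differencing method can be illustrated on a noise-free toy panel. The data below are invented, with individual effects chosen to be correlated with the regressor; the idiosyncratic error is set to zero so that the result is exact.

```python
# Invented panel: y_nt = mu_n + beta * x_nt, with no idiosyncratic noise.
beta_true = 2.0
mu = {"n1": 0.0, "n2": 5.0, "n3": 10.0}          # unobserved effects
x = {"n1": [1.0, 2.0, 4.0], "n2": [3.0, 5.0, 6.0], "n3": [6.0, 7.0, 9.0]}
y = {n: [mu[n] + beta_true * v for v in x[n]] for n in x}

def first_differences(series):
    """Return series[t] - series[t-1] for t = 2, ..., T."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Difference within each individual, then stack the differences.
dx = [d for n in x for d in first_differences(x[n])]
dy = [d for n in y for d in first_differences(y[n])]

# OLS without intercept on the differenced data: mu_n has dropped out.
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
print(beta_fd)
```

Because each individual's effect is constant over time, it cancels in every difference and the slope is recovered without ever observing the effects.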
  • 31. Fixed effects method
      • LSDV consistency depends on N and T, and the method may be computationally inefficient.
      • An equivalent formulation is to transform the data by subtracting the individual averages from every variable:
        $y_{nt} - \bar y_{n.} = \beta^T (x_{nt} - \bar x_{n.}) + (\upsilon_{nt} - \bar\upsilon_{n.}) \qquad (3)$
        where $\bar y_{n.}$ and $\bar x_{n.}$ are the individual means of y and x.
      • $\alpha$ and $\mu_n$ disappear because they are time-invariant.
      • Estimating the fixed effects model is equivalent to estimating LSDV, but in LSDV the $\mu_n$ are estimated directly, while in fixed effects they are not (although it is possible to recover them).
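A minimal sketch of the within transformation, including the recovery of the individual intercepts afterwards, is given below. The panel is invented and noise-free, and `intercepts` stores the sum $\alpha + \mu_n$ for each hypothetical individual.

```python
# Invented noise-free panel: y_nt = (alpha + mu_n) + beta * x_nt.
beta_true = -0.5
intercepts = {"A": 1.0, "B": 3.0}   # alpha + mu_n, normally unobserved
x = {"A": [0.0, 1.0, 2.0], "B": [4.0, 5.0, 9.0]}
y = {n: [intercepts[n] + beta_true * v for v in x[n]] for n in x}

def mean(values):
    return sum(values) / len(values)

# Within transformation: subtract each individual's own mean.
x_dev = [v - mean(x[n]) for n in x for v in x[n]]
y_dev = [v - mean(y[n]) for n in x for v in y[n]]

# OLS on the demeaned data gives the fixed effects slope.
beta_within = sum(a * b for a, b in zip(x_dev, y_dev)) / sum(a * a for a in x_dev)

# The individual intercepts are recovered from the individual means.
recovered = {n: mean(y[n]) - beta_within * mean(x[n]) for n in x}
print(beta_within, recovered)
```

The slope is estimated without any dummy variable, and the intercepts that LSDV would estimate directly are obtained in a second step.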
  • 32. Application
      Application in R: example 1.1 (all solutions) and 1.2 (Croissant, Millo)
  • 33. General notation
      • P: probability, E: expected value, V: variance
      • tr: trace of a matrix (sum of the elements on the principal diagonal)
      • cor: correlation, $\sigma$: standard deviation
      • q: quadratic form
      • I: identity matrix
      • $P = X(X^T X)^{-1} X^T$, $M = I - P$
      • C: Cholesky matrix decomposition, such that $C A C^T = I$
      • lnL: objective function of the maximum likelihood
      • LR and LM are, respectively, the likelihood ratio and the Lagrange multiplier
  • 34. Index
      • A panel is composed of N individuals, indexed by n.
      • Each individual is observed during T time periods, indexed by t.
      • The sample size is O, with O = NT.
      • The K covariates are indexed by k (if we also have the column of ones for the intercept, we have K + 1 columns).
  • 35. Two way error component model (i)
        $y_{nt} = \alpha + \beta^T x_{nt} + e_{nt} = \gamma^T z_{nt} + e_{nt}$, with $e_{nt} = \mu_n + \lambda_t + \upsilon_{nt}$
      • y is the response, $\alpha$ the intercept, x the vector of K covariates with associated coefficients $\beta$; $z_{nt}^T = (1, x_{nt}^T)$ and $\gamma^T = (\alpha, \beta^T)$.
      • e is the sum of the time-invariant individual effect $\mu$, the individual-invariant time effect $\lambda$ and the iid residual error $\upsilon$.
      • Variances are denoted by $\sigma^2$, so $\sigma^2_\mu$, $\sigma^2_\lambda$, $\sigma^2_\upsilon$ and $\sigma^2_e$ are the variances of the four terms.
      • Estimated parameters are written with a hat ($\hat\beta$, $\hat\sigma^2$, etc.).
  • 36. Two way error component model (ii)
      • In matrix form:
        $y = \alpha j + X\beta + e = Z\gamma + e$
        $e = D_\mu \mu + D_\lambda \lambda + \upsilon$
        where j is a vector of ones, X and Z are the covariate matrices (Z includes the column of ones), $\mu$ is the vector of N individual effects, $\lambda$ the vector of T time effects, and $\upsilon$ the vector of O residual effects.
      • D denotes a matrix of dummy variables.
      • Denoting by $J = j j^T$ a square matrix of ones, and with observations sorted by individual and then by time, we have
        $D_\mu = I_N \otimes j_T$, $\quad D_\lambda = j_N \otimes I_T$
      • The covariance matrix of e is
        $\Omega_e = \sigma^2_\upsilon I_{NT} + \sigma^2_\mu (I_N \otimes J_T) + \sigma^2_\lambda (J_N \otimes I_T)$
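The dummy matrices and the covariance matrix can be built with Kronecker products. The NumPy sketch below assumes observations are sorted by individual and then by time, so that the individual dummies are $I_N \otimes j_T$ and the time dummies are $j_N \otimes I_T$; the panel size and the variance values are arbitrary illustrative choices.

```python
import numpy as np

N, T = 3, 2  # hypothetical small panel, sorted by individual, then time
j_N, j_T = np.ones((N, 1)), np.ones((T, 1))

D_mu = np.kron(np.eye(N), j_T)      # NT x N individual dummies
D_lambda = np.kron(j_N, np.eye(T))  # NT x T time dummies

# Assumed variance components, chosen only for illustration.
sigma2_u, sigma2_mu, sigma2_lambda = 1.0, 0.5, 0.25

# Covariance of e = D_mu mu + D_lambda lambda + upsilon.
Omega = (sigma2_u * np.eye(N * T)
         + sigma2_mu * D_mu @ D_mu.T
         + sigma2_lambda * D_lambda @ D_lambda.T)

# The same matrix written with square matrices of ones.
Omega_kron = (sigma2_u * np.eye(N * T)
              + sigma2_mu * np.kron(np.eye(N), np.ones((T, T)))
              + sigma2_lambda * np.kron(np.ones((N, N)), np.eye(T)))

print(np.allclose(Omega, Omega_kron))
print(Omega[0, :4])
```

Two errors share the individual variance when they belong to the same individual, the time variance when they belong to the same period, and are uncorrelated otherwise.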
  • 37. One way (error) component model - Transformation (i)
      In this model the time effects $\lambda_t$ are dropped, so that $e_{nt} = \mu_n + \upsilon_{nt}$.
      • $S = I_N \otimes J_T$ is the matrix that, if post-multiplied by a variable, returns a vector of length O containing the individual sums of the variable (each one repeated T times).
      • $\bar I = I - \bar J$, where $\bar J = J_{NT}/NT$, is the matrix that, if post-multiplied by a variable, returns the variable in deviation from its overall mean.
      • $B = \frac{1}{T} S$ and $W = I_{NT} - B$ are, respectively, the between and the within matrix.
      • $\Omega_e = \sigma^2_\upsilon \left( W + \frac{\sigma^2_1}{\sigma^2_\upsilon} B \right)$, where $\sigma^2_1 = \sigma^2_\upsilon + T\sigma^2_\mu$ and $\phi = \sigma_\upsilon / \sigma_1$.
      • $\theta = 1 - \phi$ is the fraction of the individual mean that is subtracted in the generalized least squares (GLS) model.
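The properties of the between and within matrices, and the role of $\theta$, can be verified numerically. In this NumPy sketch the panel size and the variance components are assumptions chosen so that $\phi$ is a round number, and $\phi$ is taken as the ratio of standard deviations $\sigma_\upsilon / \sigma_1$.

```python
import numpy as np

N, T = 3, 4                        # hypothetical panel dimensions
sigma2_u, sigma2_mu = 1.0, 0.75    # assumed variance components

I = np.eye(N * T)
B = np.kron(np.eye(N), np.ones((T, T))) / T   # between: individual means
W = I - B                                     # within: deviations from them

sigma2_1 = sigma2_u + T * sigma2_mu
Omega = sigma2_u * I + sigma2_mu * np.kron(np.eye(N), np.ones((T, T)))

phi = np.sqrt(sigma2_u / sigma2_1)
theta = 1 - phi

# Quasi-demeaning: subtract the fraction theta of each individual mean.
G = I - theta * B

print(np.allclose(Omega, sigma2_u * W + sigma2_1 * B))
print(np.allclose(G @ Omega @ G, sigma2_u * I))
print(theta)
```

The last check shows why $\theta$ matters: after subtracting the fraction $\theta$ of the individual means, the transformed errors have a scalar covariance matrix, so OLS on the transformed data corresponds to GLS.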
  • 38. Two way error component model - Transformation (i)
      In this model we have two different between matrices: $B_\mu = I_N \otimes J_T / T$ and $B_\lambda = J_N \otimes I_T / N$.
      • The matrix $\bar J = J_{NT}/NT$, if post-multiplied by a variable, returns the vector of the overall mean repeated NT times. The within matrix is $W = I - B_\mu - B_\lambda + \bar J$.
      • $\Omega_e = \sigma^2_\upsilon \left( W + \frac{1}{\phi^2_\mu} \bar B_\mu + \frac{1}{\phi^2_\lambda} \bar B_\lambda + \frac{1}{\phi^2_2} \bar J \right)$, where $\bar B_\mu = B_\mu - \bar J$, $\bar B_\lambda = B_\lambda - \bar J$, $\phi_\mu = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + T\sigma^2_\mu}}$, $\phi_\lambda = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + N\sigma^2_\lambda}}$, $\phi_2 = \frac{\sigma_\upsilon}{\sqrt{\sigma^2_\upsilon + T\sigma^2_\mu + N\sigma^2_\lambda}}$.
      • $\theta_i = 1 - \phi_i$, $i = \mu, \lambda, 2$.
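The two-way decomposition can be checked in the same way. The NumPy sketch below assumes observations sorted by individual and then by time, so that the time-period averaging matrix is built as $J_N \otimes I_T / N$; dimensions, variance components and the effect values are arbitrary illustrative choices.

```python
import numpy as np

N, T = 3, 2                                # hypothetical panel dimensions
s2_u, s2_mu, s2_lam = 1.0, 0.5, 0.25       # assumed variance components

I = np.eye(N * T)
B_mu = np.kron(np.eye(N), np.ones((T, T))) / T    # individual means
B_lam = np.kron(np.ones((N, N)), np.eye(T)) / N   # time-period means
J_bar = np.ones((N * T, N * T)) / (N * T)         # overall mean
W = I - B_mu - B_lam + J_bar                      # two-way within matrix

# Covariance of e = mu_n + lambda_t + upsilon_nt.
Omega = s2_u * I + s2_mu * T * B_mu + s2_lam * N * B_lam

# The same matrix as a weighted sum of four orthogonal projections.
Omega_spectral = (s2_u * W
                  + (s2_u + T * s2_mu) * (B_mu - J_bar)
                  + (s2_u + N * s2_lam) * (B_lam - J_bar)
                  + (s2_u + T * s2_mu + N * s2_lam) * J_bar)

# W removes any combination of individual and time effects.
effects = (np.kron(np.array([1.0, 2.0, 3.0]), np.ones(T))
           + np.kron(np.ones(N), np.array([10.0, 20.0])))

print(np.allclose(Omega, Omega_spectral))
print(np.allclose(W @ effects, 0.0))
```

The weights of the four projections are the variances that appear in the denominators of the $\phi$ terms, which is what makes the GLS transformation available in closed form.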
  • 39. References
      • Mundlak, Y. (1961). Empirical production function free of management bias. Journal of Farm Economics, 43(1), 44-56.
      • Baltagi, B. (2008). Econometric analysis of panel data. John Wiley & Sons.
      • Baltagi, B. H., & Levin, D. (1992). Cigarette taxation: Raising revenues and reducing consumption. Structural Change and Economic Dynamics, 3(2), 321-335.
      • Greene, W. H. (2000). Econometric analysis, 4th edition. International edition, New Jersey: Prentice Hall, 201-215.
      • Kalton, G., Kasprzyk, D., & McMillen, D. (1989). Panel surveys.
      • Stock, J. H., & Watson, M. W. (2012). Introduction to econometrics (Vol. 3). New York: Pearson.
      • Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data. MIT Press.