Chapter 8
Multicollinearity
Perfect Multicollinearity
• Perfect multicollinearity is a violation of Classical Assumption
VI.
• It is the case where the variation in one explanatory variable
can be completely explained by movements in another
explanatory variable.
• Such a case between two independent variables would be: X1i = α0 + α1X2i
• where the X’s are independent variables in: Yi = β0 + β1X1i + β2X2i + εi
Perfect Multicollinearity (continued)
• Other examples of perfect linear relationships:
• Real world examples?
-Distance between two cities measured in two different units (e.g., miles and kilometers).
-Percent of voters voting in favor of a proposition and percent voting against it.
Perfect Multicollinearity (continued)
• OLS is incapable of generating estimates of regression
coefficients where perfect multicollinearity is present.
• You cannot “hold all the other independent variables in the
equation constant.”
• A special case related to perfect multicollinearity is a
dominant variable.
• A dominant variable is so highly correlated with the
dependent variable that it masks the effects of other
independent variables.
• Don’t confuse dominant variables with highly significant
variables.
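A minimal numerical sketch (not from the slides; illustrative data, Python with NumPy assumed) of why OLS breaks down under perfect multicollinearity: when one regressor is an exact linear function of another, the design matrix loses rank, X'X is singular, and the normal equations have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 3.0 * x1 + 2.0                       # x2 is an exact linear function of x1
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

# Design matrix [1, x1, x2]: its rank is 2, not 3, so X'X is singular
# and OLS cannot attribute the joint effect to x1 versus x2 separately.
X = np.column_stack([np.ones(n), x1, x2])
print(np.linalg.matrix_rank(X))           # prints 2

# Depending on rounding, solving the normal equations either raises
# or returns numerically meaningless coefficients; either way the
# separate estimates are not defined.
try:
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    print("solver returned (numerically meaningless) values:", beta)
except np.linalg.LinAlgError as err:
    print("normal equations cannot be solved:", err)
```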
Imperfect Multicollinearity
• Imperfect multicollinearity: a linear functional relationship between two or more
independent variables that is so strong that it can significantly affect the
estimation of the coefficients.
• It occurs when two (or more) independent variables are imperfectly linearly
related, as in: X1i = α0 + α1X2i + ui (Equation 8.7)
• Note ui, the stochastic error term in Equation (8.7): the linear relationship
is strong but not exact.
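A quick sketch of the imperfect case (hypothetical numbers, NumPy assumed): x1 follows x2 closely but not exactly because of the stochastic term u, so the two regressors are highly, but not perfectly, correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x2 = rng.normal(size=n)
u = rng.normal(scale=0.3, size=n)     # stochastic error term, as in Eq. (8.7)
x1 = 4.0 + 2.0 * x2 + u               # strongly but imperfectly related to x2

# The simple correlation is high but strictly below 1 in absolute value.
print(np.corrcoef(x1, x2)[0, 1])
```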
The Consequences of Multicollinearity
• The major consequences of multicollinearity are:
1. Estimates will remain unbiased.
2. The variances and standard errors of the estimates
will increase.
3. The computed t-scores will fall.
4. Estimates will become sensitive to changes in
specification.
5. The overall fit of the equation and the estimation of the
coefficients of nonmulticollinear variables will be largely
unaffected.
The Consequences of
Multicollinearity (continued)
1. Estimates will remain unbiased.
• Even if an equation has significant multicollinearity, the
estimates of β will be unbiased if the first six Classical
Assumptions hold.
2. The variances and standard errors of the estimates will
increase.
• With multicollinearity, it becomes difficult to precisely identify
the separate effects of multicollinear variables.
• OLS is still BLUE with multicollinearity.
• But the “minimum variances” can be fairly large.
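A small simulation-style sketch (illustrative data, statsmodels assumed) of consequences 1 and 2, and of the falling t-scores discussed next: as the correlation between the regressors rises, the coefficient estimates stay centered on their true values, but their standard errors grow and the t-scores shrink.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100

def fit(rho):
    """Simulate y = 1 + 2*x1 + 2*x2 + e with corr(x1, x2) close to rho, then run OLS."""
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    return sm.OLS(y, X).fit()

for rho in (0.0, 0.9, 0.99):
    res = fit(rho)
    # beta1 stays near 2 (unbiased); se(beta1) rises and t falls as rho rises.
    print(f"rho={rho}: b1={res.params[1]:.2f}, se(b1)={res.bse[1]:.2f}, t={res.tvalues[1]:.1f}")
```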
The Consequences of
Multicollinearity (continued)
3. The computed t-scores will fall.
• Multicollinearity tends to decrease t-scores mainly because of
the formula for the t-statistic.
• If standard error increases, t-score must fall.
• Confidence intervals also widen because standard errors
increase.
The Consequences of
Multicollinearity (continued)
4. Estimates will become sensitive to changes in specification.
• Adding/dropping variables and/or observations will often
cause major changes in β estimates when significant
multicollinearity exists.
• This occurs because, with severe multicollinearity, OLS is
forced to emphasize small differences between variables in
order to distinguish the effect of one multicollinear variable
from that of another.
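A sketch of this sensitivity (hypothetical data, statsmodels assumed): with two highly collinear regressors, dropping one of them shifts the remaining coefficient substantially, because it absorbs the effect of the omitted, correlated variable.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + np.sqrt(1.0 - 0.95**2) * rng.normal(size=n)   # highly collinear pair
y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
restricted = sm.OLS(y, sm.add_constant(x1)).fit()

# The x1 coefficient roughly doubles once x2 is dropped: a small change in
# specification produces a large swing in the estimate.
print(f"with x2: b1={full.params[1]:.2f}   without x2: b1={restricted.params[1]:.2f}")
```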
The Consequences of
Multicollinearity (continued)
5. The overall fit of the equation and estimation of the
coefficients of nonmulticollinear variables will be largely
unaffected.
• The overall fit (e.g., R̄²) will not fall much, if at all, with significant multicollinearity.
• A combination of a high R̄² and no statistically significant individual variables
is an indication of multicollinearity.
• It is possible for an F-test of overall significance to reject the
null even though none of the individual t-tests do.
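A sketch of this last point (illustrative data, statsmodels assumed): with nearly collinear regressors, the overall F-test of the regression can be highly significant even though neither individual t-test is.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # nearly collinear with x1
y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print("overall F-test p-value:", round(float(res.f_pvalue), 6))       # typically near zero
print("individual t-test p-values:", np.round(res.pvalues[1:], 2))    # typically both large
```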
Two Examples of the Consequences
of Multicollinearity
Example: Student consumption function
COi = β0 + β1Ydi + β2LAi + εi (Equation 8.9)
where:
COi = annual consumption expenditures of the ith student on items other than
tuition and room and board
Ydi = annual disposable income (including gifts) of that student
LAi = liquid assets (savings, etc.) of the ith student
εi = stochastic error term
Two Examples of the Consequences
of Multicollinearity (continued)
• Estimate Equation 8.9 with OLS:
• Including only disposable income:
Two Examples of the Consequences
of Multicollinearity (continued)
Example: Demand for gasoline by state
PCONi = β0 + β1UHMi + β2TAXi + β3REGi + εi (Equation 8.12)
where:
PCONi = petroleum consumption in the ith state (trillions of BTUs)
UHMi = urban highway miles within the ith state
TAXi = gasoline tax in the ith state (cents per gallon)
REGi = motor vehicle registrations in the ith state (thousands)
Two Examples of the Consequences
of Multicollinearity (continued)
• Estimate Equation 8.12 with OLS:
• If you drop UHM:
The Detection of Multicollinearity
• Some degree of multicollinearity exists in every equation.
• The important question is how severe it is.
• The severity can change from sample to sample.
• There are no generally accepted, true statistical tests for
multicollinearity.
• Researchers develop a general feeling for the severity of
multicollinearity by examining a number of characteristics.
Two common ones are:
1. Simple correlation coefficient
2. Variance inflation factors
High Simple Correlation Coefficients
• The simple correlation coefficient, r, is a measure of the
strength and direction of the linear relationship of two
variables.
• r ranges from −1 to +1.
• The sign of r indicates the direction of the correlation.
• If r, in absolute value, is high, then the two variables are quite
correlated and multicollinearity is a potential problem.
High Simple Correlation Coefficients
(continued)
• How high is high?
• Some researchers select an arbitrary cutoff, such as 0.80.
• A better answer might be that r is high if it causes unacceptably
large variances.
• The use of r to detect multicollinearity has a major limitation:
groups of variables acting together can cause multicollinearity
without any single simple correlation coefficient being high.
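A sketch of the simple-correlation check in practice (simulated stand-in data, pandas assumed; the column names Yd and LA merely echo the student consumption example):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
yd = rng.normal(15000, 3000, size=40)            # disposable income (simulated)
la = 0.5 * yd + rng.normal(0, 800, size=40)      # liquid assets that track income closely
df = pd.DataFrame({"Yd": yd, "LA": la})

r = df.corr()                                    # simple correlation matrix
print(r)

# Flag pairs above the (arbitrary) 0.80 rule-of-thumb threshold.
print((r.abs() > 0.80) & (r.abs() < 1.0))
```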
High Variance Inflation Factors (VIFs)
• Variance inflation factor (VIF) is a method of detecting the
severity of multicollinearity by looking at the extent to which
a given explanatory variable can be explained by all other
explanatory variables in an equation.
• Suppose the following model with K independent variables:
Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi
• A VIF needs to be calculated for each of the K independent
variables.
High Variance Inflation
Factors (VIFs) (continued)
• To calculate VIFs:
1. Run an OLS regression that has Xi as a function of all the other
explanatory variables in the equation, and obtain Ri², the R² of that
auxiliary regression.
2. Calculate the variance inflation factor for the estimated coefficient of Xi:
VIFi = 1 / (1 − Ri²) (see the sketch below).
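A sketch of this two-step calculation (simulated data, statsmodels assumed; statsmodels also ships a ready-made variance_inflation_factor helper in statsmodels.stats.outliers_influence):

```python
import numpy as np
import statsmodels.api as sm

def vif(X):
    """Two-step VIF: regress each column of X on all the others (plus a constant),
    take that auxiliary R_i^2, and return 1 / (1 - R_i^2) for each column."""
    out = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)
        r2 = sm.OLS(X[:, i], sm.add_constant(others)).fit().rsquared
        out.append(1.0 / (1.0 - r2))
    return out

# Illustrative data: x2 is strongly (but imperfectly) related to x1.
rng = np.random.default_rng(6)
x1 = rng.normal(size=60)
x2 = 0.95 * x1 + 0.3 * rng.normal(size=60)
x3 = rng.normal(size=60)
print([round(v, 1) for v in vif(np.column_stack([x1, x2, x3]))])
```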
High Variance Inflation
Factors (VIFs) (continued)
• The higher the VIF, the more severe the effects of
multicollinearity.
• But, there are no formal critical VIF values.
• A common rule of thumb: if VIF > 5, multicollinearity is
severe.
• It’s possible to have large multicollinearity effects without
having a large VIF.
Remedies for Multicollinearity
Remedy 1: Do nothing
• The existence of multicollinearity might not matter (i.e., the
coefficients may still be significant and meet expectations).
• If you delete a multicollinear variable that belongs in the model,
you cause specification bias.
• Every time a regression is rerun, we risk encountering a
specification that accidentally works on the specific sample.
Remedies for Multicollinearity
(continued)
Remedy 2: Drop a redundant variable
• Two or more variables in an equation measuring essentially
the same thing might be called redundant.
• Dropping a redundant variable is nothing more than making up
for a specification error.
• In the case of severe multicollinearity, it makes no statistical
difference which variable is dropped.
• The theoretical underpinnings of the model should be the basis
for choosing which redundant variable to drop.
Remedies for Multicollinearity
(continued)
Example: Student consumption function:
Remedies for Multicollinearity
(continued)
Remedy 3: Increase the size of the sample
• Normally, a larger sample will reduce the variance of the
estimated coefficients, diminishing the impact of multicollinearity.
• Unfortunately, while this is a useful alternative to consider,
obtaining a larger sample may be impossible.
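A sketch of why a larger sample helps (same illustrative data-generating process as the earlier sketches, statsmodels assumed): holding the degree of collinearity fixed, the standard error of the estimated coefficient shrinks as n grows.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def se_b1(n, rho=0.95):
    """Standard error of the x1 coefficient for a sample of size n with corr(x1, x2) near rho."""
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)
    return sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().bse[1]

for n in (30, 300, 3000):
    print(n, round(float(se_b1(n)), 3))   # the standard error falls roughly with sqrt(n)
```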
An Example of Why Multicollinearity
Often is Best Left Unadjusted
Example: Impact of marketing on soft drink sales
St = β0 + β1Pt + β2At + β3Bt + εt
where:
St = sales of the soft drink in year t
Pt = average relative price of the drink in year t
At = advertising expenditures for the company in year t
Bt = advertising expenditures for the company’s main competitor in year t
An Example of Why Multicollinearity
Often is Best Left Unadjusted (continued)
• If variable Bt is dropped:
• Note the expected bias in the estimated coefficient of At: with Bt omitted,
the coefficient of At also picks up part of the effect of the correlated
competitor advertising.
CHAPTER 8: the end
