Quantitative Methods for Lawyers
Class #21
Regression Analysis - Part 4
@ computational
computationallegalstudies.com
professor daniel martin katz danielmartinkatz.com
lexpredict.com slideshare.net/DanielKatz
Building Regression
Tables in R
“Stargazer is a new R package that creates LaTeX code for well-formatted regression tables, with multiple models side-by-side, as well as for summary statistics tables. It can also output the content of data frames directly into LaTeX.”
If you want to go further in this area you probably need to learn some LaTeX.
LaTeX is the industry standard for typesetting technical documents.
First you need to install a TeX distribution:
MacTeX http://tug.org/mactex/
MiKTeX http://miktex.org/
Then it is useful to have an IDE:
LyX http://www.lyx.org/
If you do not like LyX:
http://en.wikipedia.org/wiki/Comparison_of_TeX_editors
Install the Stargazer Package:
Stargazer is going to give you LaTeX output, which you can paste and compile into a table
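The install step is shown on the slide only as a screenshot, so here is a minimal sketch of what it might look like at the R console. The CRAN mirror URL is an assumption; any mirror should work.

```r
# Install stargazer from CRAN only if it is not already present,
# then load it for the current session.
# The mirror URL below is an assumption; substitute your preferred CRAN mirror.
if (!requireNamespace("stargazer", quietly = TRUE)) {
  install.packages("stargazer", repos = "https://cloud.r-project.org")
}
library(stargazer)
```

Installation is needed once per machine; `library(stargazer)` is needed in every new R session.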
This is a very helpful website that you should consult regularly (and follow on FB) for all things R: R-bloggers
Let's Consult the ‘Stargazer’ Tutorial
http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
The tutorial uses the ‘attitude’ data frame (which should be available with your default installation of R)
Let's take a quick peek:
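A quick look could be taken along these lines. This is only a sketch using base R functions; the slide's own screenshot may have used a different command.

```r
# 'attitude' ships with base R (the datasets package), so nothing is downloaded.
data(attitude)

head(attitude)   # first six rows
str(attitude)    # variable names and types
dim(attitude)    # number of rows and columns
```

`head()` is usually enough to confirm that the data loaded; `str()` is handy for checking that every column is numeric before running a regression.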
Applying the basic stargazer command to the data frame gets you a set of LaTeX output, as shown to the left
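As a sketch of that basic call, assuming the stargazer package is already installed: passing a data frame produces the LaTeX for a summary statistics table. The object name `latex_lines` is just an illustrative choice.

```r
library(stargazer)

# With a data frame as input, stargazer() prints LaTeX code for a
# summary statistics table and invisibly returns the same lines
# as a character vector.
latex_lines <- stargazer(attitude)

# For a quick look in the console without compiling LaTeX,
# the same table can be rendered as plain text.
stargazer(attitude, type = "text")
```

The printed LaTeX is what gets pasted into a LaTeX editor in the next steps.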
(1) Open LyX: http://www.lyx.org/
(2) File > New
(3) Start a LaTeX Box
(4) Copy the LaTeX Code from the R Output (the slide marks where it starts and ends) + Then Paste It in the Box
(5) Then Hit this Button to See Output
http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
Download this as an alternative because it allows you to easily
override errors and push through to get a regression table
Okay, Let's Run a Few Regression Models
Now Let's Generate the LaTeX Code
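The models on the slide appear only as an image, so the following is a sketch under assumptions: the choice of predictors and the object names (`linear.1`, `linear.2`, `table_code`) are illustrative and use the built-in ‘attitude’ data.

```r
library(stargazer)

# Two OLS models of the overall rating; the second is a smaller
# specification nested inside the first.
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning,
               data = attitude)

# Passing several models puts them side by side in one regression table.
table_code <- stargazer(linear.1, linear.2, title = "Regression Results")
```

Each model becomes one column of the table, with coefficients and standard errors aligned by variable name.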
The Resulting LaTeX Code
Put this above:
\documentclass{article}
\begin{document}
Put this below:
\end{document}
These Tables are Typically How Regression Output is Reported
A Quick Primer on Interpreting
Regression Output
We are Working Through Selected Examples From this Fabulous Resource Created by Oscar Torres-Reyna @ Princeton
http://dss.princeton.edu/training/
How Should We Discuss the Relationship Between Independent Variables and Dependent Variables?
We Think in a Ceteris Paribus Manner (i.e. All Other Things Being Equal)
These are dummy variables for the respective regions
This Implies We Are Interested in a Thought Experiment:
If We Were To Change Some Independent Variable by 1 Unit -- What Would Be the Corresponding Effect on Y?
This Should be Considered Both in the Case of a Regular Variable and a Dummy/Indicator Variable
Start with the “College” Variable - Thinking in a Ceteris Paribus Manner
3.38 is the Beta Coefficient on College
Y = B0 + ( B1 * (X1) ) + ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4) ) + ( B5 * (X5) ) + ( B6 * (X6) ) + ( B7 * (X7) ) + ( B8 * (X8) ) + ε
(each B carries its own sign, so a negative estimate shows up as a minus in the fitted equation)
csat = 786.30 – 0.004*expense – 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε
(region2, region3 and region4 each equal 1 if the state is in that region and 0 otherwise)
All Else Equal - For Each 1 Unit Change in “College” there is a Corresponding 3.38 Unit Change in “Csat”
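The ceteris paribus thought experiment can be checked with a little arithmetic. The sketch below copies the coefficients from the fitted equation on the slide; the helper `predict_csat` and the input values for the example state are hypothetical, invented purely for illustration.

```r
# Coefficients copied from the fitted csat equation on the slide.
b <- c(intercept = 786.30, expense = -0.004, percent = -3.02, income = 0.48,
       high = 2.30, college = 3.38,
       region2 = 76.84, region3 = 27.26, region4 = 34.35)

# Hypothetical helper: predicted csat for one state (error term ignored).
predict_csat <- function(expense, percent, income, high, college, region = 1) {
  unname(b["intercept"] + b["expense"] * expense + b["percent"] * percent +
           b["income"] * income + b["high"] * high + b["college"] * college +
           (region == 2) * b["region2"] +
           (region == 3) * b["region3"] +
           (region == 4) * b["region4"])
}

# Invented inputs: the same state twice, differing only in 'college' by 1 unit.
base   <- predict_csat(expense = 5000, percent = 30, income = 35, high = 75, college = 20)
bumped <- predict_csat(expense = 5000, percent = 30, income = 35, high = 75, college = 21)

bumped - base   # equals the college coefficient, 3.38, up to rounding
```

Whatever values are chosen for the other variables, they cancel in the subtraction, which is exactly what "all else equal" means here.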
Thinking in a Ceteris Paribus Manner - the Region Dummy Variables
Add 76.84 if region = 2 is True
Add 27.26 if region = 3 is True
Add 34.35 if region = 4 is True
Otherwise, if region = 1 is True, nothing is added and we retain the default intercept (786.30)
Notice that there are really 4 separate models here, one per region: they share the same slope coefficients and differ only in the intercept
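One way to see the four region-specific models is to compute the intercept each region ends up with. This is a small sketch using the estimates quoted on the slide; the object names are illustrative.

```r
# Baseline intercept (region 1 is the omitted reference category).
baseline <- 786.30

# Shift added by each region dummy; region 1 adds nothing.
region_shift <- c(region1 = 0, region2 = 76.84, region3 = 27.26, region4 = 34.35)

# Effective intercept of the model for each region.
region_intercept <- baseline + region_shift
region_intercept
```

The slopes on expense, percent, income, high and college are identical across the four regions; only the starting level moves.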
Non Linearities and Transformations
Okay, This is the Interpretation in the Linear Case
Sometimes Data Does Not Neatly Conform to Our Linearity Assumption
From a Model / Prediction Standpoint, Failure to Account for Non-Linearity Might Lead to Type II Error
Simple Linear Model
Y = B0 + (B1 * (X1)) + ε
Polynomial Regression Model
Y = B0 + (B1 * (X1)^2) + ε
In this Case of X^2 this is a Negative Quadratic Function
“Lin-Log” Model
Y = B0 + (B1 * (ln X1)) + ε
Dependent Variable is Linear, 1 or More Independent Variables are in Log Form
How Do We Determine that a Transformation is Appropriate?
These Are the Variables From Our Model:
csat: Mean composite SAT score
expense: Per pupil expenditures prim&sec
percent: % HS graduates taking SAT
income: Median household income, $1,000
high: % adults HS diploma
college: % adults college degree
Take a Look at this
Plot the Relationship Between X & Y and Observe the Relationship
Let's Look at “Csat” and “Percent”
The Relationship Looks Non-Linear -- “Curvilinear” (aka Curve + Line)
It appears that a polynomial (quadratic) relationship probably exists between “Csat” and “Percent”; thus, it makes sense to add a squared version of “Percent” to the model
[Figure: augmented component-plus-residual plot; x-axis: % HS graduates taking SAT; y-axis: augmented component plus residual]
The command acprplot (augmented component-plus-residual plot) provides a graphical way to examine linearity. Run this command after running a regression:
regress csat percent
This is a Stata Command
There is an alternative in R
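The slide does not say which R alternative it has in mind. One commonly used option is `crPlots()` from the car package. The sketch below instead builds a plain (non-augmented) component-plus-residual plot by hand in base R, so it runs without extra packages. The data are simulated stand-ins, not the states data from the slides.

```r
set.seed(1)

# Simulated stand-in data (NOT the course data): a curved relationship.
percent <- runif(50, min = 5, max = 80)
csat    <- 1100 - 6 * percent + 0.05 * percent^2 + rnorm(50, sd = 20)

fit <- lm(csat ~ percent)

# Component-plus-residual (partial residual) for 'percent':
# the model residual plus the fitted linear contribution of 'percent'.
partial_resid <- residuals(fit) + coef(fit)["percent"] * percent

plot(percent, partial_resid,
     xlab = "% HS graduates taking SAT (simulated)",
     ylab = "Component plus residual")
abline(a = 0, b = coef(fit)["percent"], lty = 2)     # the fitted linear component
lines(lowess(percent, partial_resid), col = "red")   # smoother to reveal curvature
```

If the smoother bends away from the dashed line, as it does with this simulated curve, that suggests the linear specification is missing something such as a squared term.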
How Do I Generate a New Variable?
We Want to Generate a New Variable Called “Percent Squared”
Here is How We Do This In R
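The slide shows this step as a screenshot. A minimal sketch of the idea follows, assuming the data sit in a data frame called `states` and the new column is called `percent2`. Both names are assumptions, and the values are invented for illustration.

```r
# Hypothetical stand-in data frame; the numbers are made up.
states <- data.frame(csat    = c(1000, 950, 900, 880, 890),
                     percent = c(10, 30, 50, 70, 80))

# Create the squared term as a new column.
states$percent2 <- states$percent^2

states
```

An alternative that avoids creating a column is to write `I(percent^2)` directly inside the model formula.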
Okay, Let's Feed This Back Into the Regression Model
Now We Have Added “Percent Squared” to the Model
R^2 is not everything, but we can see the impact of an alternative specification of the model on R^2
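To illustrate the comparison, here is a sketch that fits the linear and quadratic specifications and puts their R^2 values side by side. The data are simulated under an assumed curved relationship, not taken from the slides, and all object names are illustrative.

```r
set.seed(42)

# Simulated stand-in data (NOT the course data).
percent <- runif(51, min = 5, max = 80)
csat    <- 1100 - 6 * percent + 0.05 * percent^2 + rnorm(51, sd = 20)
states  <- data.frame(csat, percent)
states$percent2 <- states$percent^2

# Linear specification versus the one with the squared term added.
linear_fit    <- lm(csat ~ percent, data = states)
quadratic_fit <- lm(csat ~ percent + percent2, data = states)

r2 <- c(linear    = summary(linear_fit)$r.squared,
        quadratic = summary(quadratic_fit)$r.squared)
r2
```

R^2 can never go down when a regressor is added to an OLS model, which is one reason it is "not everything"; adjusted R^2 (`summary(fit)$adj.r.squared`) penalizes the extra term and is a fairer basis for comparison.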
Other Transformations
We Might Have a Variable Whose Relationship with Y is Non-Linear and Follows a Natural Log
Include the Logged Variable in the Model and Look at the Corresponding Model Fit
Note: You Can Also Transform the Dependent Variable
ln Y = B0 + (B1 * (X1)) + ε
How To Understand Log Transformed Regression Output
Dependent Variable is Not in Log Form, Independent Variable is in Log Form (aka Linear-Log)
“A 1 Percent Change in the Independent Variable is associated with a (.01 * Beta) Unit Change in the Dependent Variable”
Dependent Variable is in Log Form, Independent Variable is Not in Log Form (aka Log-Linear)
“A Change in the Independent Variable by 1 Unit is associated with a (100 * Beta) Percent Change in the Dependent Variable”
Dependent Variable is in Log Form, Independent Variable is in Log Form (aka Log-Log)
“A 1 Percent Change in the Independent Variable is associated with a Beta Percent Change in the Dependent Variable”
These rules of thumb are approximations that work best for small changes and small coefficients
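The three rules of thumb can be compared with the exact changes implied by each model. The sketch below uses a hypothetical coefficient of 0.05, chosen only for illustration.

```r
beta <- 0.05   # hypothetical coefficient, not an estimate from the slides

# Linear-log: Y = B0 + beta * ln(X).
# A 1% rise in X changes Y by beta * ln(1.01) units.
lin_log_exact  <- beta * log(1.01)
lin_log_approx <- 0.01 * beta

# Log-linear: ln(Y) = B0 + beta * X.
# A 1 unit rise in X multiplies Y by exp(beta).
log_lin_exact  <- 100 * (exp(beta) - 1)   # percent change in Y
log_lin_approx <- 100 * beta

# Log-log: ln(Y) = B0 + beta * ln(X).
# A 1% rise in X multiplies Y by 1.01^beta.
log_log_exact  <- 100 * (1.01^beta - 1)   # percent change in Y
log_log_approx <- beta

rbind(linear_log = c(exact = lin_log_exact, approx = lin_log_approx),
      log_linear = c(exact = log_lin_exact, approx = log_lin_approx),
      log_log    = c(exact = log_log_exact, approx = log_log_approx))
```

With a small coefficient the shortcut and the exact figure are close. Trying a larger value such as `beta <- 0.5` shows the log-linear shortcut drifting noticeably from the exact percent change.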
Interaction Terms
Sometimes X1 Impacts Y and X2 Impacts Y, but when both X1 and X2 are Present there is an additional impact (+ or -) beyond their separate effects
Y = B0 + (B1 * (X1)) + (B2 * (X2)) + (B3 * (X1)(X2)) + ε
Income = B0 + B1 * Gender + B2 * Education + B3 * Gender * Education + ε
Our Beta Three Term Gives Us the Additional Effect of Gender and Education Together
Assuming Gender is Binary in the Model - The Interaction Will Explore the Differential Effect of Education on Income By Gender
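A small sketch of how such a model might be fit in R. The data are invented and deliberately noise-free so that the fitted coefficients match the ones used to build them; the variable coding (gender as 0/1, education in years) and the numbers 20, 5, 3 and 2 are assumptions for illustration only.

```r
# Invented, noise-free data covering every gender/education combination.
d <- expand.grid(gender = c(0, 1), education = 8:20)
d$income <- 20 + 5 * d$gender + 3 * d$education + 2 * d$gender * d$education

# In an R formula, gender * education expands to
# gender + education + gender:education (the interaction term).
fit <- lm(income ~ gender * education, data = d)
round(coef(fit), 6)

# The effect of one more year of education differs by gender:
slope_gender0 <- unname(coef(fit)["education"])
slope_gender1 <- unname(coef(fit)["education"] + coef(fit)["gender:education"])
c(gender0 = slope_gender0, gender1 = slope_gender1)
```

The coefficient on `gender:education` plays the role of B3: it is the amount by which the education slope for the group coded 1 differs from the slope for the group coded 0.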
A Visual Display of Interaction Terms
Image From - Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)
For More on
Interaction Terms ...
Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction
Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)
Daniel Martin Katz
@ computational
computationallegalstudies.com
lexpredict.com
danielmartinkatz.com
illinois tech - chicago kent college of law
