The fundamentals of regression
@theStephLocke
Steph Locke
• CEO @ Nightingale HQ
• Data Scientist
• Author
• Microsoft Data Platform
& Artificial Intelligence
MVP
• T: @theStephLocke
• Li: /stephanielocke
The fundamentals of regression
Machine Learning algorithms
• Fitting
• Supervised
• Loss function
• Error
Fitting a model
• The process of iteratively applying an algorithm to generate a
model
• Optimises model based on the loss function
• Can rely on hyperparameters to control how the algorithm proceeds (a sketch follows below)
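As a rough sketch of that loop, here is a minimal gradient-descent fit of y = mx + c in R, using the example data from the slides that follow; the learning rate lr and iteration count n_iter are the hyperparameters, and both values are arbitrary choices for illustration:

x <- c(1, 1, 2, 2, 3, 3, 4, 4)   # example data from the slides below
y <- c(3, 4, 4, 5, 5, 6, 6, 7)
m <- 0; c0 <- 0                  # initial guesses for slope and intercept
lr <- 0.05; n_iter <- 2000       # hyperparameters: learning rate, iterations
for (i in seq_len(n_iter)) {
  pred  <- m * x + c0
  resid <- pred - y              # error: predicted minus actual
  # step both parameters downhill on the mean squared error loss
  m  <- m  - lr * mean(2 * resid * x)
  c0 <- c0 - lr * mean(2 * resid)
}
c(slope = m, intercept = c0)     # converges towards y = x + 2.5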
The fundamentals of regression
Supervised fitting
• Produce a model based on a label (aka outcome, dependent variable)
• Expresses some combination of features (aka fields, independent variables, columns)
• Loss function typically minimises error: the difference between the predicted and actual value of a label
Example
X | Y | Y=1+2X | Y=2+1X | y=2.5+1X
1 | 3 | 3 | 3 | 3.5
1 | 4 | 3 | 3 | 3.5
2 | 4 | 5 | 4 | 4.5
2 | 5 | 5 | 4 | 4.5
3 | 5 | 7 | 5 | 5.5
3 | 6 | 7 | 5 | 5.5
4 | 6 | 9 | 6 | 6.5
4 | 7 | 9 | 6 | 6.5
[Chart: the candidate lines Y=1+2x, Y=2+x and y=2.5+x plotted against the actual Y values]
The fundamentals of regression
Loss function
• A loss function is a calculation used to determine the difference
between what did happen and what the algorithm has produced
• Algorithms typically minimise the output of this function
• Selecting the right loss function is important to fitting models
appropriately
Example
X | Y | Y=1+2x | Y=2+1x | y=2.5+1x
1 | 3 | 3 | 3 | 3.5
1 | 4 | 3 | 3 | 3.5
2 | 4 | 5 | 4 | 4.5
2 | 5 | 5 | 4 | 4.5
3 | 5 | 7 | 5 | 5.5
3 | 6 | 7 | 5 | 5.5
4 | 6 | 9 | 6 | 6.5
4 | 7 | 9 | 6 | 6.5
Error (P-A) | | 8 | -4 | 0
[Chart: the candidate lines Y=1+2x, Y=2+x and y=2.5+x plotted against the actual Y values]
Example
X | Y | y=2.5+1x | Y=0+2x
1 | 3 | 3.5 | 2
1 | 4 | 3.5 | 2
2 | 4 | 4.5 | 4
2 | 5 | 4.5 | 4
3 | 5 | 5.5 | 6
3 | 6 | 5.5 | 6
4 | 6 | 6.5 | 8
4 | 7 | 6.5 | 8
Error (P-A) | | 0 | 0
[Chart: the lines y=2.5+x and y=2x plotted against the actual Y values]
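These two tables show why the choice of loss function matters: summing raw errors lets over- and under-predictions cancel out. A quick check in R on the example data (a sketch, not part of the original deck):

x <- c(1, 1, 2, 2, 3, 3, 4, 4)
y <- c(3, 4, 4, 5, 5, 6, 6, 7)
err_a <- (2.5 + 1 * x) - y    # errors (P-A) for y = 2.5 + x
err_b <- (0   + 2 * x) - y    # errors (P-A) for y = 2x
sum(err_a); sum(err_b)        # both 0: raw errors cancel out
mean(err_a^2); mean(err_b^2)  # 0.25 vs 1.5: squaring exposes the worse fit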
The fundamentals of regression
Error
• Error due to simplification is called bias
• Error due to complexity is called variance
• Error can come from how data is collected / measured
• There is always some irreducible error
The fundamentals of regression
Regression algorithms
• Features
• Assumptions
• Link functions
• Loss functions
Regression
A numeric combination of variables used to predict another variable
Regression
• At its simplest: y = mx + c
• y is a combination of:
• m units of x
• c (represents a bunch of other
stuff that can’t be explained)
[Chart: the line y = x + 2.5]
Features
• All variables must be represented numerically
• Categorical and text values can be represented in a variety of ways (one approach is sketched below)
• The handling of missing values impacts the final model
• Variables can be processed to meet assumptions
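One common way to make a categorical variable numeric is sketched below with R's model.matrix(); the colour column is an invented example:

d <- data.frame(colour = c("red", "blue", "green", "blue"),  # invented data
                y      = c(1, 2, 3, 2))
# Dummy coding (R's default): one 0/1 column per level except a baseline,
# which gets absorbed into the intercept
model.matrix(y ~ colour, data = d)
# One-hot encoding proper: one 0/1 column per level, no intercept column
model.matrix(y ~ colour - 1, data = d)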
Assumptions
❑The sample represents the population
❑All features are represented numerically
❑Features are independent and uncorrelated
❑The outcome is dependent on a combination of the features
❑The relationship is consistent across observations
❑The linear combination of variables should be normally
distributed
Multivariate
normal
The combination of variables is
normally distributed
By Bscan - Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=25235145
Link function
A function to express a relationship between the linear combination
of features and the outcome
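In R this is what glm()'s family argument controls; a minimal sketch with invented count data, contrasting the identity link of plain linear regression with a log link:

d <- data.frame(x = 1:10,
                count = c(1, 1, 2, 3, 4, 6, 9, 13, 19, 28))   # invented counts
glm(count ~ x, data = d, family = gaussian(link = "identity"))  # identity link
glm(count ~ x, data = d, family = poisson(link = "log"))        # log link:
# the linear combination predicts log(count), so effects multiply the outcome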
The fundamentals of regression
Loss function
A function to express the performance of a model on the training
data
Ordinary Least Squares
• Square the residuals [Chart: residuals around y = 2x + 1]
• Sum the squares [Chart: y = 2x + 1]
• Divide by the number of observations [Chart: y = 2x + 1]
• Repeat with a new line [Chart: y = x + 2.5]
• Calculate [Chart: y = x + 2.5]
• Compare: y = x + 2.5 has the smaller loss, so it is better than y = 2x + 1
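The whole sequence above collapses to a few lines of R on the example data: square the residuals, sum them, divide by the number of observations, then compare the two candidate lines (a sketch):

x <- c(1, 1, 2, 2, 3, 3, 4, 4)
y <- c(3, 4, 4, 5, 5, 6, 6, 7)
mse <- function(pred, actual) mean((pred - actual)^2)  # square, sum, divide by n
mse(2 * x + 1, y)    # y = 2x + 1  -> 2.5
mse(x + 2.5, y)      # y = x + 2.5 -> 0.25: the smaller loss wins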
The fundamentals of regression
Linear regression
Features
Interpretation
Evaluation
Linear regression
• y = mx + c
• y is a numeric variable
• y is a linear combination of:
• m units of x
• c (represents a bunch of other
stuff that can’t be explained)
[Chart: the line y = x + 2.5]
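Fitting the same example data with R's built-in lm() recovers exactly that line (a minimal sketch):

x <- c(1, 1, 2, 2, 3, 3, 4, 4)
y <- c(3, 4, 4, 5, 5, 6, 6, 7)
fit <- lm(y ~ x)   # least-squares estimates of the intercept c and slope m
coef(fit)          # (Intercept) 2.5, x 1: i.e. y = x + 2.5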
Features
• OLS based models are sensitive to outliers
• Categorical variables are commonly included via one-hot encoding
• If two variables impact each other, you can include their
interaction as a feature
• Features on disparate numeric scales can be normalised to reduce
potential distortion
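The interaction and normalisation points look like this in R formula syntax; the data frame and effect sizes are invented for illustration:

set.seed(42)                                    # invented example data
d <- data.frame(x1 = runif(20, 0, 10),          # small-scale feature
                x2 = runif(20, 0, 1000))        # large-scale feature
d$y <- 2 * d$x1 + 0.01 * d$x2 + rnorm(20)
lm(y ~ x1 * x2, data = d)                # x1 * x2 = both main effects plus their interaction
lm(y ~ scale(x1) + scale(x2), data = d)  # scale() centres and standardises,
                                         # putting coefficients on comparable scales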
The fundamentals of regression
Interpretation
• Categoricals encoded via one-hot add to the intercept
• The coefficient for a feature represents its contribution to y for one unit of change in its value (rescaled features need translating back to their original units)
• Sign indicates correlation between variable and outcome
• P-value asterisks indicate confidence that there is a correlation
between the feature and the outcome
• Standard error indicates how precise the coefficient estimate is
An example
## Call:
## lm(formula = dist ~ speed.c, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.069 -9.525 -2.272 9.215 43.201
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.9800 2.1750 19.761 < 2e-16 ***
## speed.c 3.9324 0.4155 9.464 1.49e-12 ***
##
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
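Reading that output against the bullets above: the speed.c coefficient says each extra unit of (mean-centred) speed adds roughly 3.9 to the predicted stopping distance, the positive sign indicates a positive correlation, the standard error of 0.4155 suggests that estimate is fairly precise, and the *** flags high confidence that the relationship is not chance.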
Evaluation
• R2 describes performance over just guessing the average
• <0 Worse!
• 0 Same
• >0 Better
• 1 No error in model (you’ve probably done something wrong!)
• Various measures take the square or the absolute of errors
• Relative to dataset
• Smaller is usually better
• Options include:
• Root Mean Squared Error
• Mean Squared Error
• Mean Absolute Error
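All of these can be computed directly from a fitted model's residuals; a sketch reproducing the cars model from the example output (assuming speed.c is mean-centred speed):

cars$speed.c <- cars$speed - mean(cars$speed)  # centre speed, as in the output
fit <- lm(dist ~ speed.c, data = cars)
res <- residuals(fit)
sqrt(mean(res^2))        # Root Mean Squared Error
mean(res^2)              # Mean Squared Error
mean(abs(res))           # Mean Absolute Error
summary(fit)$r.squared   # 0.6511, matching the Multiple R-squared below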
An example
## Call:
## lm(formula = dist ~ speed.c, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.069 -9.525 -2.272 9.215 43.201
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.9800 2.1750 19.761 < 2e-16 ***
## speed.c 3.9324 0.4155 9.464 1.49e-12 ***
##
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
Evaluation
• The distribution of residuals (errors) helps indicate problems
• They should be normally distributed
• They should be distributed across the range of fitted values with a
similar range
• Some observations can have high influence on the model (usually
outliers)
• Compare the model against other versions (fewer features, more features, etc.)
• Beware the curse of dimensionality
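R's built-in diagnostics cover most of these checks in one call; a minimal sketch on the cars model (refitted with uncentred speed, which gives the same fit):

fit <- lm(dist ~ speed, data = cars)
par(mfrow = c(2, 2))  # four diagnostic plots on one page
plot(fit)             # residuals vs fitted, normal Q-Q, scale-location, leverage
sort(cooks.distance(fit), decreasing = TRUE)[1:3]  # most influential observations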
Residuals vs actuals
[Chart: residuals vs actuals for Y=1+2x, Y=2+x, y=2.5+x and y=2x]
The fundamentals of regression
Next steps
• Continue attending
• Learn R or Python
• Check out the resources
Resources
• Making Friends with Machine Learning – YouTube
• Machine Learning Flashcards (chrisalbon.com)
• Setosa data visualization and visual explanations
• Feature Engineering and Selection: A Practical Approach for Predictive Models
• RPubs - Residual Analysis in Linear Regression
• Regression Models for Data Science in R