REGRESSION ANALYSIS
SADIA KHAN
• Regression analysis is a statistical technique used to describe
relationships among variables.
• The simplest case to examine is one in which a variable Y, referred
to as the dependent or target variable, may be related to one variable
X, called an independent or explanatory variable, or simply a
regressor.
• If the relationship between Y and X is believed to be linear, then the
equation for a line may be appropriate: Y = β1 + β2X, where β1 is an
intercept term and β2 is a slope coefficient.
• In simplest terms, the purpose of regression is to find the best-fit
line or equation that expresses the relationship between Y and X.
LINEAR REGRESSION
• Consider the following data points:
  X: 1  2  3  4  5   6
  Y: 3  5  7  9  11  13
• A graph of the (x, y) pairs would appear as:
Fig. 1.1: Scatterplot of Y against X for the points above; the points fall exactly on a straight line.
LINEAR REGRESSION
No relationship vs. Strong relationship
• The regression line is flat when X gives no ability to predict Y whatsoever.
• The regression line is sloped at an angle when there is a relationship.
LINEAR REGRESSION FORMULA
• Mathematically, regression uses a linear function to approximate (predict) the dependent variable,
given as: Y = β0 + β1X + ε, where
• Y – Dependent variable: the variable we predict
• X – Independent variable: the variable we use to make a prediction
• β0 – Intercept: the prediction value you get when X = 0
• β1 – Slope: the change in Y when X changes by 1 unit
• ε – Error: the residual value, i.e. the difference between the actual and predicted
values
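The intercept and slope above can be estimated by least squares. The following plain-Python sketch (not from the deck itself) fits a line to the sample data from Fig. 1.1 using the standard formulas β1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and β0 = ȳ − β1·x̄:

```python
# Least-squares estimates of the intercept (b0) and slope (b1)
# for the line Y = b0 + b1*X.

def fit_line(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # slope: covariance-like numerator over variance-like denominator
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

xs = [1, 2, 3, 4, 5, 6]
ys = [3, 5, 7, 9, 11, 13]
b0, b1 = fit_line(xs, ys)
print(b0, b1)  # -> 1.0 2.0, i.e. the data lie exactly on Y = 1 + 2X
```

Because the sample points fall perfectly on a line, the fit recovers the intercept 1 and slope 2 with no error term.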
Prediction
• A perfect correlation between two variables
produces a line when plotted in a bivariate
scatterplot
• In this figure, every increase of the value of
X is associated with an increase in Y
without any exceptions.
• If we wanted to predict values of Y based
on a certain value of X, we would have no
problem in doing so with this figure.
• A value of 2 for X should be associated
with a value of 10 on the Y variable, as
indicated by this graph.
Total Variation
•The explained sum of squares and unexplained sum of squares
add up to equal the total sum of squares. The variation of the
scores is either explained by x or not.
Total sum of squares = explained sum of squares + unexplained
sum of squares.
Error of Prediction: “Unexplained Variance”
• Usually, prediction won't be so perfect.
Most often, not all the points will fall
perfectly on the line. There will be
some error in the prediction.
• For each value of X, we know the
approximate value of Y but not the
exact value.
Unexplained Variance
• We can look at how much each
point falls off the line by drawing a
little vertical line straight from the point
to the regression line.
• If we wanted to summarize how
much error in prediction we had
overall, we could sum up the
distances (or deviations)
represented by all those little lines.
• The middle line is called the
regression line.
Sum of Squares Residual
• Summing up the deviations of the points gives us an overall idea of how much error in
prediction there is. The sum of the squared deviations from the regression line (i.e., from the
predicted points) is a summary of that error.
• If we choose a line that goes exactly through the middle of the points, about half of the points
that fall off the line will be below it and about half above. Some of the deviations will be
negative and some positive, so the sum of the raw deviations will equal 0; this is why the
deviations are squared before summing.
• The (imaginary) scores that fall exactly on the regression line are called the predicted scores,
and there is a predicted score for each value of X. The predicted scores are represented by ŷ
(read as "y-hat", because of the little hat, or as "y-predict").
• So each y score is subtracted from its predicted score (on the line) and the difference is
squared. All the squared deviations are then summed, giving a measure of the residual
variation: SSE = Σ(y − ŷ)².
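The residual sum of squares can be computed directly once predictions are in hand. This short sketch uses hypothetical observed and predicted values (not from the deck) to show the subtract-square-sum steps:

```python
# Residual (unexplained) sum of squares: each observed y is compared
# with its predicted value y_hat on the regression line, the deviation
# is squared, and the squared deviations are summed.

def sse(ys, y_hats):
    return sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))

ys     = [3.0, 5.5, 6.5, 9.0]   # hypothetical observed scores
y_hats = [3.5, 5.0, 7.0, 8.5]   # predictions from some fitted line
print(sse(ys, y_hats))  # -> 1.0 (four deviations of 0.5, each squared)
```

Note that the raw deviations here (−0.5, +0.5, −0.5, +0.5) sum to 0, which is exactly why the squaring step is needed.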
Sum of Squares Regression: The
Explained Variance
• The extent to which the regression line is sloped
represents the amount we can predict y scores
based on x scores, and the extent to which the
regression line is beneficial in predicting y scores
over and above the mean of the y scores.
• To represent this, we could look at how much the
predicted points (which fall on the regression line)
deviate from the mean.
• This deviation is represented by the little vertical
lines.
Formula for Sum of Squares
Regression: Explained Variance
• The squared deviations of the predicted scores from the mean
score, SSR = Σ(ŷ − ȳ)²,
• represent the amount of variance explained in the y scores by the x
scores.
Total Variation
• The total variation in the y scores is measured simply by the sum of the
squared deviations of the y scores from the mean: SST = Σ(y − ȳ)².
R²
• The amount of variation explained by the regression line in
regression analysis is equal to the amount of shared
variation between the X and Y variables in correlation.
R²
• We can create a ratio of the amount of variance explained (sum of
squares regression, or SSR) relative to the overall variation of the y
variable (sum of squares total, or SST), which gives us r-square: R² = SSR / SST.
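The pieces above fit together in a few lines of code. This sketch fits a least-squares line to a small set of hypothetical scores (not from the deck), then forms SSR, SSE, and SST and checks both the decomposition SST = SSR + SSE and the r-square ratio:

```python
# r-square as explained variation (SSR) over total variation (SST),
# computed for a hypothetical data set.

def fit_line(xs, ys):
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b1 * x_bar, b1

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]                      # hypothetical y scores
b0, b1 = fit_line(xs, ys)
y_hats = [b0 + b1 * x for x in xs]        # predicted scores (on the line)
y_bar = sum(ys) / len(ys)

ssr = sum((yh - y_bar) ** 2 for yh in y_hats)          # explained
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))  # unexplained
sst = sum((y - y_bar) ** 2 for y in ys)                # total

print(round(ssr + sse, 10) == round(sst, 10))  # True: SST = SSR + SSE
print(ssr / sst)                               # approximately 0.6
```

The decomposition SST = SSR + SSE holds exactly for a least-squares fit, which is what makes the ratio SSR/SST interpretable as the proportion of variation explained.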
Multiple Regression
• Multiple regression is an extension of simple linear regression.
• In multiple regression, a dependent variable is predicted by more
than one independent variable:
• Y = a + b1x1 + b2x2 + . . . + bkxk
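The same least-squares idea extends to several predictors. This sketch estimates Y = a + b1·x1 + b2·x2 with NumPy's `lstsq`; the data are hypothetical, generated from known coefficients so the fit can be checked:

```python
# Multiple regression via ordinary least squares with NumPy.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = 1.0 + 2.0 * x1 + 0.5 * x2            # true a = 1, b1 = 2, b2 = 0.5

# Design matrix: a column of ones for the intercept, then x1 and x2.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # approximately [1.0, 2.0, 0.5]
```

The intercept a is handled by prepending a column of ones to the design matrix, so it is estimated jointly with the slopes.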
References
• Introduction to regression; YouTube videos
• https://www.youtube.com/watch?v=zPG4NjIkCjc&t=1s
• https://www.youtube.com/watch?v=owI7zxCqNY0
• Regression via SPSS, values explained
• https://www.youtube.com/watch?v=VvlqA-iO2HA
• Correlation and Regression
• https://www.youtube.com/watch?v=xTpHD5WLuoA