Chapter 4
Describing the Relation
Between Two Variables
4.1
Scatter Diagrams; Correlation
Bivariate data is data in which two
variables are measured on an individual.
The response variable is the variable
whose value can be explained or
determined based upon the value of the
predictor variable.
A lurking variable is one that is related to
the response and/or predictor variable, but
is excluded from the analysis.
A scatter diagram shows the relationship
between two quantitative variables
measured on the same individual. Each
individual in the data set is represented by a
point in the scatter diagram. The predictor
variable is plotted on the horizontal axis and
the response variable is plotted on the
vertical axis. Do not connect the points
when drawing a scatter diagram.
EXAMPLE Drawing a Scatter Diagram
The following data are based on a study for
drilling rock. The researchers wanted to
determine whether the time it takes to dry drill
a distance of 5 feet in rock increases with the
depth at which the drilling begins. So, depth
at which drilling begins is the predictor
variable, x, and time (in minutes) to drill five
feet is the response variable, y. Draw a
scatter diagram of the data.
Source: Penner, R., and Watts, D.G. “Mining Information.” The American Statistician, Vol.
45, No. 1, Feb. 1991, p. 6.
Math n Statistic
Two variables that are linearly related are said to
be positively associated when above-average
values of one variable are associated with
above-average values of the other variable.
That is, two variables are positively associated
if, as the values of the predictor variable increase,
the values of the response variable also increase.
Two variables that are linearly related are said to
be negatively associated when above-average
values of one variable are associated with
below-average values of the other variable.
That is, two variables are negatively associated
if, as the values of the predictor variable increase,
the values of the response variable decrease.
The linear correlation coefficient or Pearson
product moment correlation coefficient is a
measure of the strength of linear relation between
two quantitative variables. We use the Greek letter
ρ (rho) to represent the population correlation
coefficient and r to represent the sample correlation
coefficient. We shall only present the formula for
the sample correlation coefficient.
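The formula itself is not reproduced in the text here. A standard form of the sample correlation coefficient, written with standardized scores, is sketched below. The symbols are the usual conventions and are assumed rather than taken from the slides: $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and $n$ is the number of individuals.

```latex
\[
r = \frac{\displaystyle\sum_{i=1}^{n}
      \left(\frac{x_i - \bar{x}}{s_x}\right)
      \left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1}
\]
```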
Properties of the Linear Correlation Coefficient
1. The linear correlation coefficient is always
between -1 and 1, inclusive. That is, -1 ≤ r ≤ 1.
2. If r = +1, there is a perfect positive linear relation
between the two variables.
3. If r = -1, there is a perfect negative linear relation
between the two variables.
4. The closer r is to +1, the stronger the evidence of
positive association between the two variables.
5. The closer r is to -1, the stronger the evidence of
negative association between the two variables.
6. If r is close to 0, there is evidence of no linear
relation between the two variables. Because the
linear correlation coefficient is a measure of
strength of linear relation, r close to 0 does not
imply no relation, just no linear relation.
7. It is a unitless measure of association. So, the
unit of measure for x and y plays no role in the
interpretation of r.
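The properties listed above can be checked numerically. The following is a minimal sketch, not the deck's own worked example: `pearson_r` is a hypothetical helper name, and every data value is made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from standardized scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, made up for illustration only.
xs = [1, 2, 3, 4, 5]

# Properties 2 and 3: points lying exactly on a line give r = +1 or r = -1.
r_perfect_pos = pearson_r(xs, [2 * x + 1 for x in xs])
r_perfect_neg = pearson_r(xs, [10 - 3 * x for x in xs])

# Property 6: y = x^2 over symmetric x values is a clear relation,
# but not a linear one, so r comes out at 0.
sym = [-2, -1, 0, 1, 2]
r_parabola = pearson_r(sym, [x ** 2 for x in sym])

# Property 7: converting x from feet to inches leaves r unchanged.
ys = [2, 4, 5, 4, 5]
r_feet = pearson_r(xs, ys)
r_inches = pearson_r([12 * x for x in xs], ys)

print(r_perfect_pos, r_perfect_neg, r_parabola, r_feet, r_inches)
```

The parabola case is the one to remember: the points follow an exact rule, yet r gives no hint of it because the rule is not linear.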
EXAMPLE Drawing a Scatter Diagram and
Computing the Correlation Coefficient
For the following data:
(a) Draw a scatter diagram and comment on the
type of relation that appears to exist between x
and y.
(b) By hand, compute the linear correlation
coefficient.
EXAMPLE Determining the Linear
Correlation Coefficient
Determine the linear correlation coefficient
of the drilling data.
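Column headings of the computation table: $x_i$, $\dfrac{x_i - \bar{x}}{s_x}$, $y_i$, $\dfrac{y_i - \bar{y}}{s_y}$, $\left(\dfrac{x_i - \bar{x}}{s_x}\right)\left(\dfrac{y_i - \bar{y}}{s_y}\right)$
$\bar{x} =$
$\bar{y} =$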
A linear correlation coefficient that indicates
a strong positive or negative association,
when computed from observational data,
does not imply causation between the
variables.
4.2
Least-squares Regression
EXAMPLE Finding an Equation that Describes
a Linear Relation
Using the following sample data:
(a) Find a linear equation that relates x (the
predictor variable) and y (the response variable)
by selecting two points and finding the equation
of the line containing the points.
(b) Graph the equation on the scatter diagram.
(c) Use the equation to predict y if x = 5.
The difference between the observed value
of y and the predicted value of y is the error,
or residual. That is,
residual = observed - predicted
Compute the residual for the prediction
corresponding to x = 5.
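As a rough illustration of the two-point approach and the residual it produces, the sketch below fits a line through two chosen points and evaluates it at x = 5. The sample data table is not reproduced in the text, so the data values are hypothetical, as is the helper name `line_through`.

```python
def line_through(p, q):
    """Return (slope, intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# Hypothetical sample data, made up for illustration only.
data = [(1, 3), (2, 5), (4, 8), (5, 9), (7, 15)]

# Two arbitrarily chosen points: the first and the last.
slope, intercept = line_through(data[0], data[-1])

x_new = 5
predicted = slope * x_new + intercept   # value on the line at x = 5
observed = dict(data)[x_new]            # value actually recorded at x = 5
residual = observed - predicted         # residual = observed - predicted

print(f"y-hat = {slope}x + {intercept}; residual at x = 5 is {residual}")
```

A negative residual means the line overestimates the observed value at that x. Choosing a different pair of points would give a different line and a different residual.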
EXAMPLE Finding the Least-squares
Regression Line
Using the sample data:
(a) Find the least-squares regression line.
(b) Interpret the slope and intercept.
(c) Predict y if x = 5.
(d) Compute the residual for x = 5.
(e) Draw the least-squares regression line on the
scatter diagram of the data.
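Parts (a), (c) and (d) can be sketched in a few lines. This example assumes the standard least-squares formulas, slope b1 = Sxy / Sxx (equivalently r·s_y/s_x) and intercept b0 = ȳ - b1·x̄. The data are hypothetical because the sample table is not reproduced in the text, and `least_squares_line` is an illustrative name.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx                  # b1 = Sxy / Sxx
    intercept = mean_y - slope * mean_x  # b0 = y-bar - b1 * x-bar
    return slope, intercept

# Hypothetical sample data, made up for illustration only.
xs = [1, 2, 4, 5, 7]
ys = [3, 5, 8, 9, 15]

b1, b0 = least_squares_line(xs, ys)

# (c) Predict y at x = 5, and (d) compute the residual there.
predicted_at_5 = b1 * 5 + b0
residual_at_5 = ys[xs.index(5)] - predicted_at_5

# With an intercept in the model, the residuals sum to zero.
residuals = [y - (b1 * x + b0) for x, y in zip(xs, ys)]

print(f"y-hat = {b1:.4f}x + {b0:.4f}")
print(f"prediction at x = 5: {predicted_at_5:.4f}, residual: {residual_at_5:.4f}")
```

For part (b), the slope is the estimated change in y for each one-unit increase in x. The intercept is the predicted y at x = 0, which is only meaningful when x = 0 lies within or near the observed data.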
EXAMPLE Computing the Sum of Squared
Residuals
Compute the sum of squared residuals for
the line describing the relation between x
and y that was obtained using two points.
Compute the sum of squared residuals for
the least-squares regression line. Which is
smaller?
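The comparison can be sketched as follows, again with hypothetical data and illustrative helper names. The two-point line passes through the first and last points, and the least-squares line uses the standard formulas b1 = Sxy / Sxx and b0 = ȳ - b1·x̄.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of (observed - predicted)^2 over all data points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

# Hypothetical sample data, made up for illustration only.
xs = [1, 2, 4, 5, 7]
ys = [3, 5, 8, 9, 15]

# Line through the two chosen points (1, 3) and (7, 15): y-hat = 2x + 1.
two_point_slope = (15 - 3) / (7 - 1)
two_point_intercept = 3 - two_point_slope * 1
ssr_two_point = sum_squared_residuals(xs, ys, two_point_slope, two_point_intercept)

# Least-squares line fitted to all five points.
ls_slope, ls_intercept = least_squares_line(xs, ys)
ssr_least_squares = sum_squared_residuals(xs, ys, ls_slope, ls_intercept)

print(f"two-point line SSR:     {ssr_two_point:.4f}")
print(f"least-squares line SSR: {ssr_least_squares:.4f}")
```

The least-squares line cannot have a larger sum of squared residuals than any other line through the same data, because minimizing that sum is what defines it.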
EXAMPLE Finding the Least-squares
Regression Line
(a) Find the least-squares regression line
for the drilling data.
(b) Use the line to predict the drilling time
at x = 130 feet.
(c) Should the line be used to predict the
drilling time at x = 400 feet? Why?
(d) Interpret the slope and y-intercept.