INTRODUCTION TO
STATISTICS & PROBABILITY
Chapter 2:
Looking at Data–Relationships (Part 2)
Dr. Nahid Sultana
Chapter 2:
Looking at Data–Relationships
2.1: Scatterplots
2.2: Correlation
2.3: Least-Squares Regression
2.5: Data Analysis for Two-Way Tables
2.2: Correlation
Objectives
• The correlation coefficient “r”
• r does not distinguish between x and y
• r has no units of measurement
• r ranges from -1 to +1
• Influential points
The correlation coefficient "r"
• The correlation coefficient is a measure of the direction and strength of a linear relationship.
• It is calculated using the mean and the standard deviation of both the x and y variables.
• Correlation can only be used to describe quantitative variables. Categorical variables don’t have means and standard deviations.
The correlation coefficient “r” (Cont…)
• Suppose that we have data on variables x and y for n individuals.
• The means and standard deviations of the two variables are $\bar{x}$ and $s_x$ for the x-values, and $\bar{y}$ and $s_y$ for the y-values.
• The correlation r between x and y is

\[ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) \]

Example: time to swim, $\bar{x} = 35$, $s_x = 0.7$; pulse rate, $\bar{y} = 140$, $s_y = 9.5$.
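The formula can be computed directly, term by term. The following is a minimal sketch: the function name `pearson_r` and the five data pairs are assumptions made up for illustration, not the swim and pulse data from the slides.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divisor n - 1)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

Each term in the sum is positive when x and y fall on the same side of their means and negative otherwise, which is why the sign of r gives the direction of the association.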
“r” does not distinguish x & y
The correlation coefficient, r, treats x and y symmetrically: swapping x and y only swaps the two factors inside each term of the sum, so the value of r is unchanged.

\[ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) \]

"Time to swim" is the explanatory variable here, and belongs on the x axis. However, in either plot r is the same (r = -0.75).
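"r" has no unit
Changing the units of the variables does not change the correlation coefficient "r": r = -0.75 in both plots. The formula for r uses only standardized values, which are unitless:
• $(x_i - \bar{x}) / s_x$: standardized value of x (unitless)
• $(y_i - \bar{y}) / s_y$: standardized value of y (unitless)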
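A short derivation may make the unit-free property concrete. It assumes a change of units of the form $x_i' = a x_i + b$ with $a > 0$; the symbols $a$, $b$, and the primed quantities are introduced here for illustration and do not appear in the slides.

```latex
% Assumed change of units: x_i' = a x_i + b with a > 0 (a, b are illustrative symbols).
\begin{align*}
\bar{x}' &= a\bar{x} + b, \qquad s_{x'} = a\, s_x, \\
\frac{x_i' - \bar{x}'}{s_{x'}}
  &= \frac{a x_i + b - (a\bar{x} + b)}{a\, s_x}
   = \frac{x_i - \bar{x}}{s_x}.
\end{align*}
```

The standardized values are identical before and after the change, so every term in the sum for r is unchanged. The same argument applies to y.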
"r" ranges from -1 to +1
Properties of Correlation
• r is always a number between –1 and 1.
• r > 0 indicates a positive association; r < 0 indicates a negative association.
• Values of r near 0 indicate a very weak linear relationship.
• The strength of the linear relationship increases as r moves away from 0 toward –1 or 1.
• The extreme values r = –1 and r = 1 occur only in the case of a perfect linear relationship.
“r” gets stronger as variation decreases
When variability in one or both variables decreases, the correlation coefficient gets stronger (closer to +1 or -1).
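One way to see this effect is to keep a fixed linear trend and shrink the scatter of y around it. The sketch below is an illustration under stated assumptions: the line y = 2x, the residual pattern, and the scale factors are all hypothetical choices, and `pearson_r` is a helper name introduced here.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Hypothetical residual pattern: sums to zero and is uncorrelated with xs,
# so only the size of the scatter changes, not the trend.
pattern = [1, -1, -1, 1, 1, -1, -1, 1]
scales = [8, 4, 2, 1]  # scatter around the line shrinks from left to right

rs = []
for c in scales:
    ys = [2 * x + c * e for x, e in zip(xs, pattern)]
    rs.append(pearson_r(xs, ys))

for c, r in zip(scales, rs):
    print(f"scatter scale {c}: r = {r:.3f}")
```

With these made-up numbers, r moves closer to +1 each time the scatter is reduced.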
Correlation only describes linear relationships
No matter how strong the association, r does not describe curved relationships.
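A small example can show how badly r can miss a curved pattern. In the sketch below, y is determined exactly by x through y = x², yet r comes out at essentially zero. The data and the helper name `pearson_r` are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: a perfect curved (quadratic) relationship, symmetric about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(f"r for y = x^2: {r_curved:.6f}")
```

The positive and negative products cancel across the two halves of the curve, so a value of r near 0 here says nothing about how strongly y depends on x.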
Influential points
Correlations are calculated using means and standard deviations, and thus are NOT resistant to outliers.
Just moving one point away from the general trend here weakens the correlation from -0.91 to -0.75.
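The lack of resistance is easy to reproduce on a small data set. The sketch below uses hypothetical numbers, not the data behind the slide's plot, so it does not reproduce the values -0.91 and -0.75; it only shows the same kind of change when a single point is moved. `pearson_r` is a helper name introduced here.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data with a strong negative linear trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [20, 18, 17, 15, 14, 12, 11, 9, 8, 6]

# Move only the last point away from the trend (y changes from 6 to 20).
ys_moved = ys[:-1] + [20]

r_original = pearson_r(xs, ys)
r_moved = pearson_r(xs, ys_moved)
print(f"original: r = {r_original:.3f}")
print(f"one point moved: r = {r_moved:.3f}")
```

Nine of the ten points are identical in both runs, yet the correlation ends up much closer to 0 after the move.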
Influential points (Cont…)
