MIT Arts, Commerce and Science College, Alandi
Measures of Correlation
Presented by
Prof. Dr. Sangita Birajdar
Assistant Professor,
Department of Statistics,
MIT ACSC, Alandi
Copyright @ Dr. Sangita Birajdar 1
OBJECTIVES
In this Unit you are going to learn correlation and
its measures:
1. Types of data
2. Concept and meaning of correlation
3. Types of correlation
4. Scatter diagram, its interpretation and merits and demerits
5. Covariance and its properties
6. Karl Pearson’s Coefficient of Correlation, its properties and its interpretation
7. Spearman’s Rank correlation coefficient and its interpretation.
8. Concept and meaning of regression
9. Lines of regression
10. Regression coefficients, their properties and interpretation
11. Numerical examples and problems
TYPES OF DATA
Univariate data: Data in which a single characteristic (variable) is
measured on each unit under study are known as univariate data.
For Example: Marks of students in particular subject, Monthly
income of workers, Height of individuals, Blood pressure of adults,
etc.
Bivariate data: Data in which two characteristics (variables) are
measured at the same time on the same unit under study are known as
bivariate data.
For example: Day temperature and ice cream sales, income and
expenditure of families, height and weight of students, and monthly
electricity bill and consumption are examples of bivariate data.
TYPES OF DATA
Trivariate data: When the data involve three variables, they are
categorized as trivariate data.
For example: age, weight and blood pressure of a person; yield of
a crop, temperature and amount of fertiliser used; and price, demand
and supply of a commodity are examples of trivariate data.
Bivariate data (Definition): A set of n pairs of observations
$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$ on two variables
X and Y under study is called bivariate data.
CONCEPT OF CORRELATION
• The mathematical measure of correlation was given by the
mathematician and biostatistician Karl Pearson in 1896 in the form of the
correlation coefficient.
• This was extensively used by Sir Francis Galton to explain many
phenomena in biology and genetics.
• Correlation is a statistical technique that shows whether pairs of variables
are related to each other and how strongly.
• The extent of the linear relationship between two variables is called
correlation.
• It measures the strength of the relationship between two variables, not
causation. That is, correlation does not by itself imply a cause-and-effect
relationship.
• When two variables are correlated, an increase or decrease in the values of
one variable is accompanied, on average, by an increase or decrease in the
values of the other variable.
Types of Correlation
Depending on the direction of change in the pairs of
variables, correlation is classified into the following three
types:
• Positive Correlation.
• Negative Correlation.
• Zero Correlation.
Positive Correlation
A positive correlation indicates the extent to which both
variables increase or decrease together, i.e. the values of the
variables change in the same direction: the values of one variable
increase as the values of the other increase, or the values of one
variable decrease as the values of the other also decrease. This
type of correlation is called direct correlation.
For example:
1. Marks obtained in an examination by a group of students are
positively correlated with the number of hours the students
studied for examination.
2. Sale of ice cream is positively correlated with day temperature.
3. Height and weight of a group of persons are positively correlated: as
height increases, weight also increases on average.
Negative Correlation
A negative correlation indicates the extent to which the values of
the variables change in opposite directions, i.e. the values of one
variable increase (decrease) as the values of the other variable
decrease (increase). This type of correlation is called inverse
correlation.
For example:
1. Volume and pressure of a perfect gas (at constant temperature).
2. Supply and price of a commodity.
3. Slope of a hill and the speed of a walker: as the slope increases,
the speed the walker reaches tends to decrease.
Zero Correlation
Zero correlation means no linear relationship between the two variables
X and Y; i.e. a change in one variable (X) is not associated with
any systematic change in the other variable (Y).
For example, body weight and intelligence, shoe size and monthly
salary, amount of tea drunk and level of intelligence, etc.
Think about the following
• Age and weight of person.
• Blood pressure of a group of bulky persons and their weights.
• Speed of a vehicle and the time required to stop it after applying the
brakes.
• Selling prices of flats and their distance from the central place.
• The crop yield and rainfall (up to certain extent)
• Marks in English and Marks in Mathematics.
• Height and marks obtained by students.
• Demand and price of the commodity.
• Amount of cereal in meal and maintaining healthy weight.
• Sale of woolen garments and day temperature.
• Time spent by students on social media and their performance in
examinations.
• Shoe size and monthly salary.
• Amount of tea drunk and level of intelligence.
Remember that !!!
• If an increase or decrease in the values of one variable is not
accompanied by any systematic increase or decrease in the values of
the other variable, then the two variables are uncorrelated.
• Sometimes the relationship between two variables is pure
coincidence. For instance, the relation between the arrival of
migratory birds in a sanctuary and the birth rates in the
locality. Such a correlation may be attributed to chance.
• A third variable’s impact on two variables may give rise to a
relation between the two variables. For instance, the relation
between illiteracy and crime may arise due to an increase in
population.
Measures of correlation
1) Scatter Diagram
2) Karl Pearson’s Coefficient of Correlation.
3) Spearman’s Rank Correlation.
Scatter Diagram
• A scatter plot visualises the relationship or association between two variables.
• It is the simplest and most attractive method of diagrammatic representation of
bivariate data, and it gives an idea of whether the variables are correlated
or not.
• In this method each pair of observations is represented by a point in the XY
plane. It can be defined as follows:
Definition: Suppose $\{(x_i, y_i);\ i = 1, 2, \ldots, n\}$ are bivariate data on two
variables X and Y. If the n pairs of observations are plotted on the XY plane, taking
one variable on the X axis and the other on the Y axis, so that every ordered
pair $(x_i, y_i)$ is shown as a dot or point, the resulting diagram of dots is known
as a scatter diagram.
Merits and Demerits of Scatter Diagram
Merits of Scatter Diagram
1) The scatter diagram is the simplest measure of correlation; it gives a
rough idea of the nature of the relationship between two variables.
2) It is easy to understand.
3) It is not influenced by extreme values.
4) The scatter diagram enables a line of best fit to be drawn by the freehand method.
Demerits of Scatter Diagram
1) It fails to give the magnitude (numeric value) of correlation.
2) It is not useful for qualitative data.
3) It is a subjective method.
Exercise
What type of correlation do you expect in each of the following situations?
1) A student who has many absences has a decrease in grades.
2) The longer someone invests, the more compound interest he will earn.
3) The less time I spend marketing my business, the fewer new customers I
will have.
4) As the temperature goes up, ice cream sales also go up.
5) When an employee works more hours, his pay cheque increases
proportionately.
6) The more one exercises, the lower his marks in English.
7) The older a man gets, the less hair he has.
8) The more errors a computer program has, the longer it takes to
run.
Covariance
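Covariance: A measure of the joint variation between two variables.
If two variables are linearly correlated then Cov(X, Y) ≠ 0, and if they are
uncorrelated then Cov(X, Y) = 0.
If $\{(x_i, y_i);\ i = 1, 2, 3, \ldots, n\}$ is a bivariate data set on (X, Y), then the
covariance is defined as the arithmetic mean of the products of the deviations of
the observations from their respective means. It is denoted by Cov(X, Y) and
given by
$$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),$$
where $\bar{x}$ and $\bar{y}$ are the arithmetic means of X and Y.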
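The definition of covariance as the arithmetic mean of the products of deviations can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `covariance` and the sample values for hours studied and marks are hypothetical, and the divisor is n (not n - 1), following the "arithmetic mean" wording of the definition.

```python
def covariance(xs, ys):
    """Mean of the products of deviations from the respective means (divisor n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum the products of deviations, then divide by the number of pairs.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


hours = [1, 2, 3, 4, 5]  # hypothetical hours studied
marks = [2, 4, 5, 4, 5]  # hypothetical marks obtained
print(covariance(hours, marks))  # positive value: the two move together
```

For these values the deviations give products 4, 0, 0, 0 and 2, so the covariance is 6/5 = 1.2, which is positive as expected for variables that tend to increase together.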
Properties of Covariance
1. Cov ( X, Y) = Cov (Y, X)
2. Cov (X, X) = Var (X)
3. Cov(X, -Y)= Cov(-X, Y) = - Cov(X, Y)
4. Cov (X, constant) = 0
5. Effect of change of origin: Covariance is invariant to change of origin,
i.e. Cov (X - a, Y - b) = Cov (X, Y), where ‘a’ and ‘b’ are constants.
6. Effect of change of origin and scale: Covariance is invariant to change of
origin but not to change of scale, i.e. if U = (X - a)/h and V = (Y - b)/k,
then Cov (U, V) = Cov (X, Y)/(hk), where ‘h’ and ‘k’ are non-zero constants.
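The properties listed above can be checked numerically. The sketch below is illustrative only: the helper `cov`, the data and the constants a, b, h, k are hypothetical choices, and the scale property is checked in the standard form Cov((X - a)/h, (Y - b)/k) = Cov(X, Y)/(hk).

```python
import math


def cov(xs, ys):
    """Covariance with divisor n, as in the definition of Cov(X, Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


x = [1, 2, 3, 4, 5]          # hypothetical data
y = [2, 4, 5, 4, 5]
a, b, h, k = 3, 4, 2, 5      # arbitrary origin (a, b) and scale (h, k)

base = cov(x, y)
shifted = cov([xi - a for xi in x], [yi - b for yi in y])
scaled = cov([(xi - a) / h for xi in x], [(yi - b) / k for yi in y])

print(math.isclose(shifted, base))            # change of origin: unchanged
print(math.isclose(scaled, base / (h * k)))   # change of scale: divided by hk
print(math.isclose(cov(x, y), cov(y, x)))     # symmetry
print(math.isclose(cov(x, [7] * len(x)), 0))  # covariance with a constant is 0
```

A numerical check of this kind does not prove the properties, but it is a quick way to catch a mistake in a hand calculation.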
KARL PEARSON’S CORRELATION COEFFICIENT OR PRODUCT MOMENT CORRELATION COEFFICIENT
Definition: Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be bivariate data on (X, Y). Karl
Pearson’s coefficient of correlation, denoted by r, r(X, Y) or $r_{xy}$, is defined as the
ratio of the covariance to the product of the standard deviations:
$$r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y},$$
where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y.
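As a rough sketch of the definition, the ratio of the covariance to the product of the standard deviations can be computed as follows. The function name `pearson_r` and the data are hypothetical; the covariance and both standard deviations use the divisor n, so the n cancels in the ratio.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r = Cov(X, Y) / (sd(X) * sd(Y)), all computed with divisor n."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 pairs")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov_xy / (sd_x * sd_y)


hours = [1, 2, 3, 4, 5]  # hypothetical hours studied
marks = [2, 4, 5, 4, 5]  # hypothetical marks obtained
print(round(pearson_r(hours, marks), 4))
```

Here Cov = 1.2, Var(X) = 2 and Var(Y) = 1.2, so r = 1.2 / sqrt(2.4), roughly 0.77, a fairly strong positive correlation.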
Properties of Karl Pearson’s Correlation Coefficient
1) Limits of Pearson’s correlation coefficient: Pearson’s correlation coefficient lies
between -1 and +1, i.e. -1 ≤ r ≤ +1
2) Corr(X, Y) = Corr(Y, X)
3) Corr (X, X) = 1
4) Corr ( -X, Y) = Corr (X, -Y) = -Corr (X, Y)
5) Effect of change of origin
Corr(X - a, Y - b) = Corr(X,Y), where ‘a’ and ‘b’ are constants.
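6) Effect of Change of Origin and Scale
Statement: Pearson’s correlation coefficient is invariant to change of origin and scale, i.e. if U = (X - a)/h and V = (Y - b)/k, then Corr(U, V) = Corr(X, Y), provided ‘h’ and ‘k’ have the same sign. If ‘h’ and ‘k’ have opposite signs, the magnitude is unchanged and the sign is reversed (compare property 4).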
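The invariance properties can be illustrated numerically. In this sketch the helper `pearson_r`, the data and the constants are hypothetical; it also shows the standard caveat that a scale factor of opposite sign on one variable reverses the sign of r while leaving its magnitude unchanged.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from sums of deviations (the divisor n cancels)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Change of origin (a = 3, b = 4) and scale with h = 2, k = 5 (same sign).
same_sign = pearson_r([(xi - 3) / 2 for xi in x], [(yi - 4) / 5 for yi in y])
# Same transformation but with h = -2: the sign of r flips.
opposite_sign = pearson_r([(xi - 3) / -2 for xi in x], [(yi - 4) / 5 for yi in y])

print(math.isclose(same_sign, r))        # unchanged
print(math.isclose(opposite_sign, -r))   # sign reversed, magnitude unchanged
print(-1 <= r <= 1)                      # limits of r
```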
Interpretation of r
Merits and Demerits of Karl Pearson’s Coefficient of Correlation
Merits:
1. It depends upon all the observations.
2. It gives the extent of linear association between two variables.
3. It also indicates the type of correlation.
Demerits:
1. It fails to measure the non-linear relationship between two
variables.
2. It is unduly affected by extreme values.
3. It cannot be calculated for qualitative data.
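The first two demerits can be seen in a small experiment. The data below are hypothetical and chosen only for illustration: in the first case Y is completely determined by X through Y = X², yet r is 0 because the relationship is not linear; in the second, a single extreme point changes a perfect negative correlation into a strong positive one.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from sums of deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Demerit 1: an exact but non-linear relationship, y = x**2.
x_sym = [-2, -1, 0, 1, 2]
y_sq = [xi ** 2 for xi in x_sym]
r_nonlinear = pearson_r(x_sym, y_sq)

# Demerit 2: one extreme value dominates the coefficient.
x_lin = [1, 2, 3, 4, 5]
y_lin = [5, 4, 3, 2, 1]
r_clean = pearson_r(x_lin, y_lin)
r_outlier = pearson_r(x_lin + [20], y_lin + [20])

print(r_nonlinear)  # zero: no linear association despite exact dependence
print(r_clean)      # perfect negative correlation
print(r_outlier)    # strongly positive after adding the point (20, 20)
```

This is why a scatter diagram is usually inspected before r is interpreted.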
SPEARMAN’S RANK CORRELATION (R)
• Some qualitative variables, for example beauty, honesty and temperament,
cannot be measured numerically. In such cases the characteristics need to be
expressed in terms of ranks.
• Some quantitative variables, such as income and weight, may also be
more meaningful when expressed in terms of ranks.
• In 1904 the British psychologist C.E. Spearman developed a measure called
Spearman’s Rank Correlation, which measures the linear association
between the ranks assigned to qualitative variables measured on an ordinal scale,
as well as to quantitative variables measured on an interval or ratio scale and
converted into ranks.
• Spearman’s rank correlation coefficient is derived from Karl
Pearson’s coefficient of correlation by replacing the individual values of the
variables with their ranks, and it is interpreted in the same way as Karl
Pearson’s coefficient of correlation.
• Ranking: Ordered arrangement of items according to their merits.
• Rank: The number indicating the position in ranking.
SPEARMAN’S RANK CORRELATION (R)
• Spearman’s rank correlation coefficient is denoted by R
and given by
$$R = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
• where $d_i$ = Rank($x_i$) - Rank($y_i$) and
n = number of pairs of observations.
Note: The Spearman rank correlation coefficient R lies
between -1 and +1.
• The data under consideration must be ranked before
Spearman’s rank correlation is computed. Ranks are
assigned to both variables in the same order, either ascending or
descending.
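When there are no ties, the computation from rank differences is short enough to sketch directly, using the standard form R = 1 - 6Σd²/(n(n² - 1)). The function name `spearman_r` and the two sets of ranks (say, from two judges ranking five contestants) are hypothetical.

```python
def spearman_r(rank_x, rank_y):
    """Spearman's R from two lists of untied ranks: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    if len(rank_x) != len(rank_y) or len(rank_x) < 2:
        raise ValueError("need two rank lists of equal length with at least 2 pairs")
    n = len(rank_x)
    # d_i = Rank(x_i) - Rank(y_i), squared and summed over all pairs.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


judge_1 = [1, 2, 3, 4, 5]  # hypothetical ranks given by the first judge
judge_2 = [2, 1, 4, 3, 5]  # hypothetical ranks given by the second judge
print(spearman_r(judge_1, judge_2))
```

The differences are -1, 1, -1, 1 and 0, so Σd² = 4 and R = 1 - 24/120 = 0.8, indicating close agreement between the two rankings.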
Ranks with ties
• If an observation is repeated two or more times in the data set,
the repeated observations are said to be “tied”.
• Each tied observation is given a rank equal to the mean of the ranks of the
positions the tied observations occupy in the ordered data set, and the next
observation is assigned the rank following the last position occupied
by the tie.
• The number of observations sharing the same rank is called the
length of the tie and is denoted by m.
Ranks with ties
• For example, in the data set 70, 74, 74, 78 and 79 kg,
the 2nd and 3rd observations are tied; the mean of 2 and 3 is 2.5, so
the ranks of the five observations are 1, 2.5, 2.5, 4 and 5. The
length of the tie is m = 2.
• In the data set 1.6, 1.7, 1.9, 1.9 and 1.9, the 3rd, 4th and 5th
observations are tied; the mean of 3, 4 and 5 is 4, so the ranks of
the five observations are 1, 2, 4, 4 and 4. In this case the length of the tie
is m = 3.
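The rule of giving tied observations the mean of the positions they occupy can be sketched as a small ranking function. The name `average_ranks` is hypothetical, and ranks are assigned in ascending order (smallest value gets rank 1), matching the two examples above.

```python
def average_ranks(values):
    """Ascending ranks; tied values share the mean of the positions they occupy."""
    # Indices of the observations, ordered by their values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of observations tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 1-based, so the run occupies positions pos+1 .. end+1.
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


print(average_ranks([70, 74, 74, 78, 79]))       # [1.0, 2.5, 2.5, 4.0, 5.0]
print(average_ranks([1.6, 1.7, 1.9, 1.9, 1.9]))  # [1.0, 2.0, 4.0, 4.0, 4.0]
```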
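• The formula for Spearman’s rank correlation with ties is commonly written as
$$R = 1 - \frac{6\left[\sum_{i=1}^{n} d_i^2 + \sum \frac{m(m^2 - 1)}{12}\right]}{n(n^2 - 1)}$$
where the second sum is taken over every group of tied observations in both X and Y,
and m is the length of each tie.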
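A sketch of the tie-corrected calculation is given below. It assumes the commonly taught form R = 1 - 6[Σd² + Σ m(m² - 1)/12] / (n(n² - 1)), where a correction term m(m² - 1)/12 is added for every group of tied values in either variable; the slide's own formula is not recoverable from the text, so this form, the function names and the pairing of the two example data sets as X and Y are all assumptions made for illustration.

```python
from collections import Counter


def average_ranks(values):
    """Ascending ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for j in range(pos, end + 1):
            ranks[order[j]] = ((pos + 1) + (end + 1)) / 2
        pos = end + 1
    return ranks


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of tied values (tie length m > 1)."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    """Tie-corrected Spearman's R (assumed textbook form, see lead-in)."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


weights = [70, 74, 74, 78, 79]       # one tie of length m = 2
heights = [1.6, 1.7, 1.9, 1.9, 1.9]  # one tie of length m = 3 (hypothetical pairing)
print(spearman_with_ties(weights, heights))
```

With these values Σd² = 3.5 and the corrections are 0.5 and 2, so R = 1 - 6(6)/120 = 0.7. An alternative that avoids the correction term altogether is to apply Pearson's formula directly to the average ranks; the two approaches need not give exactly the same value when ties are present.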
Merits and Demerits of Spearman’s rank correlation
• Merits of Spearman’s rank correlation
1. It depends upon all the observations.
2. It measures the linear association between ranks assigned to
qualitative variables measured on an ordinal scale as well as
quantitative variables converted into ranks.
3. It also indicates the type of correlation.
• Demerits of Spearman’s rank correlation
1. It is only an approximate measure, as the actual values are not
used in the calculation.
2. It is difficult to calculate Spearman’s rank correlation when
there are many ties.
