Module–4: Correlation & Regression Analysis
By:
Kishlay Kumar
Assistant Professor
Faculty of Business Management
Correlation Analysis
Meaning
• Correlation analysis deals with the association between two or more variables.
• The degree of relationship between the variables under consideration is measured by
correlation analysis.
• The measure of correlation, called the “Correlation Coefficient” or “Correlation Index”,
summarizes in one figure the direction and degree of correlation.
Notes:
The direction is determined by whether one variable generally increases or decreases when the other
variable increases.
Examples:
1. Family income and expenditure on luxury items.
2. Sales revenue and expenses incurred on advertising
3. Frequency of smoking and lung damage
4. Weight and height of individuals.
5. Age and hours of TV viewing per day
• “Correlation is an analysis of the covariation between two or more variables.” —A.M. Tuttle
• “When the relationship is of a quantitative nature, the appropriate statistical tool for discovering
and measuring the relationship and expressing it in a brief formula is known as correlation”. —
Croxton and Cowden
Types of Correlation
There are three broad types of correlations:
1. Positive and negative,
2. Linear and non-linear,
3. Simple, partial, and multiple.
Positive and Negative Correlation
• A positive (or direct) correlation refers to the same direction of change in the values of variables.
In other words, if values of variables are varying (i.e., increasing or decreasing) in the same
direction, then such correlation is referred to as positive correlation.
• A negative (or inverse) correlation refers to changes in the values of variables in opposite
directions.
• Examples (Positive)
1. Heights and weights.
2. The family income and expenditure on luxury items.
3. Amount of rainfall and yield of crop (up to a point).
4. Price and supply of a commodity, and so on.
• Examples (Negative)
1. Price and demand of a commodity.
2. Volume and pressure of a perfect gas.
3. Sale of woolen garments and the day temperature, and so on.
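The direction of a correlation can be read from the sign of the covariance between the two series. The sketch below is only an illustration: the function name `covariance_sign` and the price, demand, and supply figures are hypothetical and do not come from the slides.

```python
def covariance_sign(xs, ys):
    """Return +1, -1, or 0 according to the sign of the covariance of xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations; its sign equals the sign of the covariance.
    cov_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (cov_sum > 0) - (cov_sum < 0)


# Hypothetical figures, chosen only to illustrate the two directions.
price = [10, 12, 14, 16, 18]
demand = [100, 90, 85, 70, 60]   # falls as price rises
supply = [40, 48, 55, 63, 70]    # rises as price rises

print(covariance_sign(price, demand))  # negative (inverse) correlation
print(covariance_sign(price, supply))  # positive (direct) correlation
```

A result of +1 corresponds to a positive (direct) correlation and -1 to a negative (inverse) one.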
Simple, Partial, and Multiple Correlation
• The distinction between simple, partial, and multiple correlation is based upon the number of variables
involved in the correlation analysis.
• If only two variables are chosen to study correlation between them, then such a correlation is referred to
as simple correlation. A study on the yield of a crop with respect to only amount of fertilizer, or sales
revenue with respect to amount of money spent on advertisement, are a few examples of simple
correlation.
• In partial correlation, two variables are chosen to study the correlation between them, while the effect of
other influencing variables is held constant. For example, (i) the yield of a crop is influenced by the amount
of fertilizer applied, rainfall, quality of seed, type of soil, and pesticides; (ii) sales revenue from a
product is influenced by the level of advertising expenditure, quality of the product, price, competitors,
distribution, and so on. In such cases, an attempt to measure the correlation between yield and seed
quality, with the other factors held at their average values, becomes a problem of partial correlation.
• In multiple correlation, the relationship among three or more variables is considered
simultaneously. For example, the employer-employee relationship in an organization may be
examined with reference to training and development facilities; medical, housing, and children's
education facilities; salary structure; grievance-handling system; and so on.
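The idea of holding a third variable constant can be made concrete with the standard first-order partial correlation coefficient. The slides do not give this formula, so the sketch below is an assumption about how one might compute it; the function name `partial_corr` and the three simple correlations are hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2)).
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical simple correlations: x = yield, y = seed quality, z = rainfall.
r_partial = partial_corr(r_xy=0.8, r_xz=0.6, r_yz=0.7)
print(round(r_partial, 4))
```

When z is uncorrelated with both x and y, the partial correlation reduces to the simple correlation between x and y.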
Linear and Non-Linear Correlation
• Linear Correlation
A linear correlation implies a constant change in the values of one variable with respect to a change
in the corresponding values of another variable. In other words, a correlation is referred to as linear
when variations in the values of the two variables have a constant ratio. The following
example illustrates a linear correlation between two variables x and y.
x   10   20   30   40   50
y   40   80  120  160  200
When these pairs of values of x and y are plotted on graph paper, the line joining the points
is a straight line.
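The constant-ratio property of the linear example can be checked directly. This is a minimal sketch using the x and y values from the table; the helper name `change_ratios` is hypothetical.

```python
def change_ratios(xs, ys):
    """Ratio of the change in y to the change in x between consecutive points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


x = [10, 20, 30, 40, 50]
y = [40, 80, 120, 160, 200]

ratios = change_ratios(x, y)
print(ratios)                 # [4.0, 4.0, 4.0, 4.0]
print(len(set(ratios)) == 1)  # True: every step has the same ratio
```

Every step of 10 in x goes with a step of 40 in y, so the ratio is the same throughout and the points lie on a straight line.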
• Non-Linear Correlation
Correlation is called non-linear or curvilinear if the amount of change in one variable does
not bear a constant ratio to the amount of change in the other variable.
The following example illustrates a non-linear correlation between two variables x and y.
x    8    9   10   12   13   18   22   29
y   80  130  170  150  230  560  460  600
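For the non-linear example, the same kind of check shows that the ratio of changes varies from step to step. This is a minimal sketch using the values from the table; the helper name `change_ratios` is hypothetical.

```python
def change_ratios(xs, ys):
    """Ratio of the change in y to the change in x between consecutive points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


x = [8, 9, 10, 12, 13, 18, 22, 29]
y = [80, 130, 170, 150, 230, 560, 460, 600]

ratios = change_ratios(x, y)
print(ratios)                 # [50.0, 40.0, -10.0, 80.0, 66.0, -25.0, 20.0]
print(len(set(ratios)) == 1)  # False: the ratio is not constant
```

Because the ratios differ, and even change sign, no single straight line passes through all the points.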
METHODS OF STUDYING CORRELATION
• The correlation between two ratio-scaled (numeric) variables is represented by the letter r, which
takes values between –1 and +1 only. This measure is sometimes called the ‘Pearson product
moment correlation’ or simply the correlation coefficient.
The following methods of finding the correlation coefficient between two variables x and y are
discussed:
1. Scatter Diagram method
2. Karl Pearson’s Coefficient of Correlation method
3. Spearman’s Rank Correlation method
4. Method of Least-squares
Karl Pearson’s Correlation Coefficient
• A mathematical method for measuring the intensity or magnitude of the linear relationship between two
variable series was suggested by Karl Pearson (1857-1936), a great British biometrician and
statistician, and it is by far the most widely used method in practice.
Karl Pearson’s measure, known as the Pearsonian correlation coefficient between two
variables (series) X and Y, usually denoted by r(X, Y), r_xy, or simply r, is a numerical measure of the
linear relationship between them and is defined as the ratio of the covariance between X and Y to the
product of their standard deviations:
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
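Pearson's coefficient is the covariance of X and Y divided by the product of their standard deviations. The sketch below computes it with the standard library only; the function name `pearson_r` is hypothetical, and the two data sets are the linear and non-linear examples given earlier in these slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient: Cov(X, Y) / (sd(X) * sd(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations; the common 1/n factor cancels.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


x_linear = [10, 20, 30, 40, 50]
y_linear = [40, 80, 120, 160, 200]
x_curved = [8, 9, 10, 12, 13, 18, 22, 29]
y_curved = [80, 130, 170, 150, 230, 560, 460, 600]

r_linear = pearson_r(x_linear, y_linear)
r_curved = pearson_r(x_curved, y_curved)
print(r_linear)  # perfect positive linear relationship, r = 1
print(r_curved)  # positive, but below 1
```

A value of r close to +1 or -1 signals a strong linear relationship, while a value near 0 signals little or no linear relationship.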
Spearman’s Rank Correlation Coefficient
• This method of finding the correlation coefficient between two variables was developed by the
British psychologist Charles Edward Spearman in 1904. It is applied to measure the
association between two variables when only ordinal (or rank) data are available. In other words,
it is used when a quantitative measure of certain qualitative factors,
such as judgement, brand personalities, TV programmes, leadership, colour, or taste, cannot be
fixed, but individual observations can be arranged in a definite order (also called rank).
$$R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where,
R = rank correlation coefficient
R1 = rank of observations with respect to first variable
R2 = rank of observations with respect to second variable
d = R1 – R2, difference in a pair of ranks
n = number of paired observations or individuals being ranked
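The rank-difference calculation can be sketched in a few lines, using the symbols d and n defined above. The function name `spearman_from_ranks` and the two judges' rankings are hypothetical, and the sketch assumes there are no tied ranks.

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """Rank correlation R = 1 - 6 * sum(d**2) / (n * (n**2 - 1)), no ties assumed."""
    n = len(ranks_1)
    # d = R1 - R2 for each pair of ranks.
    sum_d_squared = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_1, ranks_2))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical rankings of five brands by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

rank_corr = spearman_from_ranks(judge_1, judge_2)
print(round(rank_corr, 4))  # 0.8
```

Here the differences d are -1, 1, -1, 1, 0, so the sum of d squared is 4 and R = 1 - 24/120 = 0.8.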
• When Ranks are not Given
When the pairs of observations in the data set are not already ranked (as they were in Case 1), ranks are assigned by
taking either the highest value or the lowest value as rank 1, using the same convention for both variables.
• When Ranks are Equal
While ranking observations by taking either the highest value or the lowest value as rank
1, we may find that more than one observation has the same value. In such a case,
the rank assigned to each of these observations is the average of the ranks they
would have received had they differed from each other.
For example, if two observations are tied at third place, then the average rank of (3 + 4)/2 =
3.5 is assigned to both observations. Similarly, if three observations are tied at third
place, then the average rank of (3 + 4 + 5)/3 = 4 is assigned to all three observations.
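The averaging rule for tied observations can be sketched as follows. The function name `average_ranks`, its `highest_first` parameter, and the score lists are hypothetical; the sketch only covers assigning ranks, not any tie correction to the coefficient itself.

```python
def average_ranks(values, highest_first=True):
    """Assign ranks starting at 1, giving tied values the average of their ranks."""
    # Indices of the observations in ranking order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=highest_first)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' across the run of observations tied with order[pos].
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos+1 .. end+1 are shared; their mean is the assigned rank.
        shared_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


two_tied = average_ranks([90, 85, 80, 80, 70])
three_tied = average_ranks([90, 85, 80, 80, 80, 70])
print(two_tied)    # [1.0, 2.0, 3.5, 3.5, 5.0]
print(three_tied)  # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0]
```

The two outputs reproduce the cases in the text: two observations tied at third place each get 3.5, and three tied at third place each get 4.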
Regression Analysis
• The statistical tool with which we can estimate (or predict) the unknown
values of one variable from known values of another variable is called regression. With the help of
regression analysis we can find the average probable change in one variable
for a given amount of change in another.
• The literal or dictionary meaning of the word ‘Regression’ is ‘stepping back or returning to the
average value’.
• The term regression was used in 1877 by Sir Francis Galton while studying the relationship
between the heights of fathers and sons. His study of the heights of about one thousand fathers and sons
revealed a very interesting relationship: tall fathers tend to have tall sons and short fathers,
short sons; but the average height of the sons of a group of tall fathers is less than that of the tall
fathers, and the average height of the sons of a group of short fathers is greater than that of the short
fathers.
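Estimating one variable from another is usually done by fitting a straight line y = a + b·x by least squares, the fourth method listed earlier. The slides do not give the equations, so the sketch below uses the standard least-squares estimates as an assumption; the function name `fit_line` is hypothetical, and the data are the linear example from the slides.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx                   # average change in y per unit change in x
    intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)
    return intercept, slope


x = [10, 20, 30, 40, 50]
y = [40, 80, 120, 160, 200]

a, b = fit_line(x, y)
predicted = a + b * 35   # estimate y for a value of x not in the data
print(a, b, predicted)   # 0.0 4.0 140.0
```

The slope b is the average probable change in y for a unit change in x, which is the quantity regression analysis is used to estimate.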
• Differences between correlation and regression analysis are as follows:
1. Developing an algebraic equation between two variables from sample data and predicting the
value of one variable, given the value of the other, is referred to as regression analysis,
while measuring the strength (or degree) of the relationship between two variables is referred to as
correlation analysis. The sign of the correlation coefficient indicates the nature (direct or inverse) of the
relationship between the two variables, while its absolute value indicates
the extent of the relationship.
2. Correlation analysis determines an association between two variables x and y but not that they
have a cause-and-effect relationship. Regression analysis, in contrast to correlation, treats
x as the cause and y as the effect, that is, a change in the value of the independent
variable x is taken to produce a corresponding change (effect) in the value of the dependent variable y if all
other factors that affect y remain unchanged.
3. In linear regression analysis one variable is considered the dependent variable and the other the
independent variable, while in correlation analysis the two variables are treated symmetrically,
with neither designated as dependent.

