INTRODUCTION TO STATISTICS & PROBABILITY
Chapter 2: Looking at Data–Relationships (Part 1)
Dr. Nahid Sultana

Chapter 2: Looking at Data–Relationships
2.1: Scatterplots
2.2: Correlation
2.3: Least-Squares Regression
2.5: Data Analysis for Two-Way Tables

2.1: Scatterplots
Objectives
• Bivariate data
• Explanatory and response variables
• Scatterplots
• Interpreting scatterplots
• Outliers
• Categorical variables in scatterplots

Bivariate data
• For each individual studied, we record data on two variables.
• We then examine whether there is a relationship between these two variables: Do changes in one variable tend to be associated with specific changes in the other variable?
Student ID   Number of Beers   Blood Alcohol Content
1            5                 0.1
2            2                 0.03
3            9                 0.19
6            7                 0.095
7            3                 0.07
9            3                 0.02
11           4                 0.07
13           5                 0.085
4            8                 0.12
5            3                 0.04
8            5                 0.06
10           5                 0.05
12           6                 0.1
14           7                 0.09
15           1                 0.01
16           4                 0.05
Here we have two quantitative variables recorded for each of 16 students:
1. how many beers they drank
2. their resulting blood alcohol content (BAC)
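
As a rough illustration of what "associated" means for these 16 students, the sketch below groups the BAC values by number of beers and averages them. The variable names (students, bac_by_beers, mean_bac) are chosen here for illustration and do not come from the slides; averaging by group is just one simple way to look at the data before drawing a scatterplot.

```python
from collections import defaultdict
from statistics import mean

# (student ID, number of beers, blood alcohol content), copied from the table above.
students = [
    (1, 5, 0.10), (2, 2, 0.03), (3, 9, 0.19), (6, 7, 0.095),
    (7, 3, 0.07), (9, 3, 0.02), (11, 4, 0.07), (13, 5, 0.085),
    (4, 8, 0.12), (5, 3, 0.04), (8, 5, 0.06), (10, 5, 0.05),
    (12, 6, 0.10), (14, 7, 0.09), (15, 1, 0.01), (16, 4, 0.05),
]

# Collect the BAC values observed at each number of beers.
bac_by_beers = defaultdict(list)
for _student_id, beers, bac in students:
    bac_by_beers[beers].append(bac)

# Average BAC at each beer count, in increasing order of beers.
mean_bac = {beers: mean(values) for beers, values in sorted(bac_by_beers.items())}

for beers, avg in mean_bac.items():
    print(f"{beers} beers: mean BAC {avg:.3f} (n={len(bac_by_beers[beers])})")
```

The averages tend to rise with the number of beers, although not at every single step, which is the kind of tendency the question above is asking about.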

Associations Between Variables
• Many interesting examples of the use of statistics involve relationships between pairs of variables.
Two variables measured on the same cases are associated if knowing the value of one of the variables tells you something about the values of the other variable that you would not know without this information.
• A response (dependent) variable measures an outcome of a study.
• An explanatory (independent) variable explains changes in the response variable.

Scatterplot
• The most useful graph for displaying the relationship between two quantitative variables on the same individuals is a scatterplot.
How to Make a Scatterplot
1. Decide which variable should go on which axis.
2. Typically, the explanatory or independent variable is plotted on the x-axis, and the response or dependent variable is plotted on the y-axis.
3. Label and scale your axes.
4. Plot individual data values.

Scatterplot (Cont…)
Example: Make a scatterplot of the relationship between body weight and backpack weight for a group of hikers.
Body weight (lb)       120  187  109  103  131  165  158  116
Backpack weight (lb)    26   30   26   24   29   35   31   28
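
The four steps for making a scatterplot can be sketched on the hiker data as follows. This is a minimal character-grid sketch in plain Python, not how the original figure was drawn; in practice a plotting library would normally be used. The function name text_scatterplot and the grid size are made up for illustration, and treating body weight as the explanatory variable is an assumption, since the example does not say which variable is which. The function also assumes that neither variable is constant.

```python
def text_scatterplot(xs, ys, x_label, y_label, width=40, height=12):
    """Return a rough character-grid scatterplot of ys against xs."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index; row 0 is the top of the plot.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y_max - y) / (y_max - y_min) * (height - 1))
        grid[row][col] = "*"
    lines = [f"{y_label} (from {y_min} to {y_max}, bottom to top)"]
    lines += ["|" + "".join(row) for row in grid]
    lines.append("+" + "-" * width)
    lines.append(f"{x_label} (from {x_min} to {x_max}, left to right)")
    return "\n".join(lines)

# Steps 1-2 (assumption): body weight is explanatory (x), backpack weight is response (y).
body_weight = [120, 187, 109, 103, 131, 165, 158, 116]
backpack_weight = [26, 30, 26, 24, 29, 35, 31, 28]

# Steps 3-4: label the axes and plot each hiker as one point.
plot = text_scatterplot(body_weight, backpack_weight,
                        "Body weight (lb)", "Backpack weight (lb)")
print(plot)
```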

Interpreting Scatterplots
How to Examine a Scatterplot
• After plotting two variables on a scatterplot, we describe the overall pattern of the relationship. Specifically, we look for form, direction, and strength.
Form: linear, curved, clusters, no pattern
Direction: positive, negative, no direction
Strength: how closely the points fit the “form”
We also look for clear deviations from that pattern:
Outliers: individual values that fall outside the overall pattern of the relationship

Interpreting Scatterplots (Cont…) (Form)
[Figure: example scatterplots labeled Linear, Nonlinear, and No relationship]

Interpreting Scatterplots (Cont…) (Direction)
Positive association: High values of one variable tend to occur together with high values of the other variable.
Negative association: High values of one variable tend to occur together with low values of the other variable.
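
One informal way to check direction numerically is to count how many points lie on the same side of both means (high with high, or low with low) and how many lie on opposite sides. The sketch below applies that idea to the hiker data from the earlier example. The function name direction and the use of the means as the cut-off for "high" and "low" are assumptions made for illustration; the slides judge direction by eye from the plot.

```python
from statistics import mean

def direction(xs, ys):
    """Classify direction by counting points on each side of both means."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Same side of both means: high-with-high or low-with-low.
    same = sum(1 for x, y in zip(xs, ys) if (x - x_bar) * (y - y_bar) > 0)
    # Opposite sides: high-with-low or low-with-high.
    opposite = sum(1 for x, y in zip(xs, ys) if (x - x_bar) * (y - y_bar) < 0)
    if same > opposite:
        return "positive"
    if opposite > same:
        return "negative"
    return "no clear direction"

body_weight = [120, 187, 109, 103, 131, 165, 158, 116]
backpack_weight = [26, 30, 26, 24, 29, 35, 31, 28]
print(direction(body_weight, backpack_weight))
```

This count ignores how far each point is from the means, so it is only a rough check and says nothing about form or strength.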

Interpreting Scatterplots (Cont…)
No relationship: X and Y vary independently. Knowing X tells you nothing about Y.

Interpreting Scatterplots (Cont…) (Strength)
The strength of the relationship between the two variables can be seen by how much variation, or scatter, there is around the main form.

Interpreting Scatterplots (Cont…) (Outliers)
In a scatterplot, outliers are points that fall outside of the overall pattern of the relationship.

Interpreting Scatterplots (Cont…)
[Figure: scatterplot of the hiker data, labeled with Direction, Form, and Strength]
• There is a moderately strong, positive, linear relationship between body weight and backpack weight.
• It appears that lighter hikers are carrying lighter backpacks.
• There is one possible outlier: the hiker with a body weight of 187 pounds seems to be carrying relatively less weight than the other group members.
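
The remark about the 187-pound hiker can be checked with a little arithmetic. The sketch below expresses each backpack weight as a fraction of body weight and finds the hiker with the smallest share. Reading "relatively less weight" as this ratio is an assumption made here, not a formal outlier test, and the names load_fraction and lightest_load are illustrative only.

```python
body_weight = [120, 187, 109, 103, 131, 165, 158, 116]      # lb
backpack_weight = [26, 30, 26, 24, 29, 35, 31, 28]          # lb

# Backpack weight as a fraction of body weight, keyed by body weight
# (the body weights in this data set are all distinct).
load_fraction = {
    body: pack / body for body, pack in zip(body_weight, backpack_weight)
}

# Hiker (identified by body weight) carrying the smallest share of body weight.
lightest_load = min(load_fraction, key=load_fraction.get)

for body, fraction in sorted(load_fraction.items()):
    print(f"{body} lb hiker carries {fraction:.1%} of body weight")
print("Smallest relative load:", lightest_load, "lb hiker")
```

The 187-pound hiker carries about 16% of body weight, while the other hikers carry roughly 20% to 24%.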

How to scale a scatterplot
Using an inappropriate scale for a scatterplot can give an incorrect impression.
Both variables should be given a similar amount of space:
• Plot roughly square
• Points should occupy all the plot space (no blank space)
[Figure: four scatterplots, same data in all four plots]
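
The "no blank space" advice can be made concrete by choosing axis limits from the data. The sketch below, again using the hiker data, computes limits that hug the observed range with a small margin, and shows how much of an axis would be wasted if it started at zero instead. The helper axis_limits and the 5% margin are assumptions chosen for illustration. Making the plot roughly square is a matter of figure size and is not covered here.

```python
def axis_limits(values, padding=0.05):
    """Return (low, high) limits that hug the data with a small margin."""
    low, high = min(values), max(values)
    margin = (high - low) * padding
    return low - margin, high + margin

body_weight = [120, 187, 109, 103, 131, 165, 158, 116]
backpack_weight = [26, 30, 26, 24, 29, 35, 31, 28]

x_limits = axis_limits(body_weight)        # tight limits around 103..187
y_limits = axis_limits(backpack_weight)    # tight limits around 24..35

# Fraction of a zero-to-maximum axis that the data would actually span.
x_used = (max(body_weight) - min(body_weight)) / max(body_weight)
y_used = (max(backpack_weight) - min(backpack_weight)) / max(backpack_weight)

print("x limits:", x_limits, "y limits:", y_limits)
print(f"A zero-based x-axis would use only {x_used:.0%} of its length")
print(f"A zero-based y-axis would use only {y_used:.0%} of its length")
```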

Categorical variables in scatterplots
To add a categorical variable, use a different plot color or symbol for each category.
What may look like a positive linear relationship is in fact a series of negative linear associations. Plotting different habitats in different colors allows us to make that important distinction.
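
The habitat data behind this slide are not reproduced in the text, so the sketch below uses made-up numbers purely to illustrate the effect: pooled together, the points trend upward, but within each category they trend downward. The records, the habitat labels A, B, and C, and the direction helper (a rough count of points on the same or opposite sides of both means) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def direction(points):
    """Rough direction: compare counts of points on the same vs. opposite sides of both means."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    same = sum(1 for x, y in points if (x - x_bar) * (y - y_bar) > 0)
    opposite = sum(1 for x, y in points if (x - x_bar) * (y - y_bar) < 0)
    if same > opposite:
        return "positive"
    if opposite > same:
        return "negative"
    return "no clear direction"

# Made-up (habitat, x, y) records, invented only to illustrate the idea.
records = [
    ("A", 1, 5), ("A", 2, 4), ("A", 3, 3),
    ("B", 4, 8), ("B", 5, 7), ("B", 6, 6),
    ("C", 7, 11), ("C", 8, 10), ("C", 9, 9),
]

# Ignoring the categorical variable: all points pooled together.
overall = direction([(x, y) for _, x, y in records])

# Using the categorical variable: one group of points per habitat,
# as one would give each habitat its own color or symbol.
by_habitat = defaultdict(list)
for habitat, x, y in records:
    by_habitat[habitat].append((x, y))
within = {habitat: direction(points) for habitat, points in by_habitat.items()}

print("overall:", overall)
print("within each habitat:", within)
```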

Categorical variables in scatterplots (Cont…)
Comparison of men's and women's racing records over time. Each group shows a very strong negative linear relationship that would not be apparent without the gender categorization.
Relationship between lean body mass and metabolic rate in men and women. Both men and women follow the same positive linear trend, but women show a stronger association.

Categorical explanatory variables
When the explanatory variable is categorical, you cannot make a scatterplot, but you can compare the different categories side by side on the same graph (boxplots, or mean +/− standard deviation).
Example: comparison of income (quantitative response variable) for different education levels (five categories). But be careful in your interpretation: This is NOT a positive association, because education is not quantitative.
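
A side-by-side comparison of this kind can be sketched by summarizing the response variable within each category. The income figures and the labels "Level 1" to "Level 5" below are made up for illustration, since the slide's actual data are not given in the text. The sketch computes the mean and standard deviation per category, which is the "mean +/− standard deviation" option mentioned above.

```python
from statistics import mean, stdev

# Made-up incomes (in thousands) for five hypothetical education levels.
income_by_level = {
    "Level 1": [18, 22, 20],
    "Level 2": [25, 30, 26],
    "Level 3": [31, 40, 34],
    "Level 4": [45, 52, 50],
    "Level 5": [60, 75, 66],
}

# One (mean, standard deviation) summary per category, for side-by-side comparison.
summary = {
    level: (mean(values), stdev(values))
    for level, values in income_by_level.items()
}

for level, (avg, sd) in summary.items():
    print(f"{level}: mean {avg:.1f} +/- {sd:.1f}")
```

The categories are only labels here, so the summaries are compared group by group rather than read as a trend along a numeric axis.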
