Chapter No. 3
variability in data.
Measures of Variability
• Measures of variability describe the spread or
the dispersion of a set of data.
• Common Measures of Variability
–Range
–Interquartile Range
–Mean Absolute Deviation
–Variance
–Standard Deviation
– Z scores
–Coefficient of Variation
Variability
[Figure: two cash flow plots around the mean, one titled "No Variability in Cash Flow" and one titled "Variability in Cash Flow"]
Range
• The difference between the largest and
the smallest values in a set of data
• Simple to compute
• Ignores all data points except the two extremes
• Example:
35 37 37 39 40 40
41 41 43 43 43 43
44 44 44 44 44 45
45 46 46 46 46 48
Range = Largest - Smallest = 48 - 35 = 13
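The range calculation above can be sketched in a few lines of Python. The function name `value_range` is an assumption made for illustration; the data are the 24 values from the slide.

```python
# Data values from the Range example.
data = [35, 37, 37, 39, 40, 40, 41, 41, 43, 43, 43, 43,
        44, 44, 44, 44, 44, 45, 45, 46, 46, 46, 46, 48]


def value_range(values):
    """Return largest minus smallest; every other data point is ignored."""
    return max(values) - min(values)


print(value_range(data))  # 13
```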
Interquartile Range
• Range of values between the first and third
quartiles
• Range of the “middle half”
• Less influenced by extremes
Interquartile Range = Q3 - Q1
Deviation from the Mean
• Data set: 5, 9, 16, 17, 18
• Mean: $\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$
• Deviations from the mean: -8, -4, 3, 4, 5
[Figure: number line from 0 to 20 marking the deviations -8, -4, +3, +4, +5 from the mean]
Mean Absolute Deviation
• Average of the absolute deviations from
the mean
  X     X - μ    |X - μ|
  5      -8        +8
  9      -4        +4
 16      +3        +3
 17      +4        +4
 18      +5        +5
Sum       0        24

$M.A.D. = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$
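As a rough check of the arithmetic, the deviations and the mean absolute deviation for the data set 5, 9, 16, 17, 18 can be recomputed in Python. This is a minimal sketch; the variable names are illustrative and not from the slides.

```python
data = [5, 9, 16, 17, 18]

mean = sum(data) / len(data)                        # 65 / 5 = 13.0
deviations = [x - mean for x in data]               # -8, -4, +3, +4, +5
mad = sum(abs(d) for d in deviations) / len(data)   # 24 / 5

# The signed deviations always cancel out, which is why the
# absolute values are averaged instead.
print(sum(deviations), mad)  # 0.0 4.8
```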
Population Variance
• Average of the squared deviations from
the arithmetic mean
  X     X - μ    (X - μ)²
  5      -8        64
  9      -4        16
 16      +3         9
 17      +4        16
 18      +5        25
Sum       0       130

$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
Population Standard Deviation
• Square root of the variance
• For the data set 5, 9, 16, 17, 18, the squared deviations 64, 16, 9, 16, 25 sum to 130:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} = 5.1$
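The population variance and standard deviation for the data set 5, 9, 16, 17, 18 can be reproduced with a short Python sketch. It computes both values from the definition and, as a cross-check, compares them with `statistics.pvariance` and `statistics.pstdev` from the standard library. The variable names are illustrative.

```python
import math
import statistics

data = [5, 9, 16, 17, 18]

mu = sum(data) / len(data)                               # population mean, 13.0
pop_var = sum((x - mu) ** 2 for x in data) / len(data)   # 130 / 5
pop_sd = math.sqrt(pop_var)

# Cross-check against the standard library's population formulas.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(pop_sd, statistics.pstdev(data))

print(pop_var, round(pop_sd, 1))  # 26.0 5.1
```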
Empirical Rule
• Data are normally distributed (or approximately normal)

Distance from the Mean    Percentage of Values Falling Within Distance
μ ± 1σ                    68
μ ± 2σ                    95
μ ± 3σ                    99.7
• The Empirical Rule is a statement about normal
distributions.
Normal Distribution
A specific type of symmetrical distribution, also known as a bell-shaped distribution
Empirical Rule
In a normal distribution, about 68% of the data fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
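The three percentages in the Empirical Rule can be checked against the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
```

The exact shares are slightly different from the rounded figures quoted in the rule, which is why the rule is stated with "about".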
Example: Pulse Rates
• Suppose the pulse rates of 200 college men are
bell-shaped with a mean of 72 and standard
deviation of 6.
• About 68% of the men have pulse rates in the
interval
• About 95% of the men have pulse rates in the interval
• About 99.7% of the men have pulse rates in the
interval
Example: IQ scores
• IQ scores are normally distributed with a mean of
100 and a standard deviation of 15.
• About 68% of individuals have IQ scores in the
interval
• About 95% of individuals have IQ scores in the
interval
• About 99.7% of individuals have IQ scores in the
interval
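Answers
• Pulse rates: about 68% in 66 to 78, about 95% in 60 to 84, about 99.7% in 54 to 90
• IQ scores: about 68% in 85 to 115, about 95% in 70 to 130, about 99.7% in 55 to 145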
Sample Variance
• Average of the squared deviations from the arithmetic mean, with n - 1 rather than n in the denominator

  X        X - X̄     (X - X̄)²
2,398       625      390,625
1,844        71        5,041
1,539      -234       54,756
1,311      -462      213,444
7,092         0      663,866   (sums)

$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$
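The sample calculation can be sketched in Python as follows. The code divides by n - 1 by hand and then compares the result with `statistics.variance` and `statistics.stdev`, which use the same sample formulas. The square root of the sample variance is the sample standard deviation. Variable names are illustrative.

```python
import math
import statistics

data = [2398, 1844, 1539, 1311]

x_bar = sum(data) / len(data)                     # sample mean, 1773.0
sum_sq = sum((x - x_bar) ** 2 for x in data)      # 663866.0
s2 = sum_sq / (len(data) - 1)                     # divide by n - 1, not n
s = math.sqrt(s2)

# Cross-check against the standard library's sample formulas.
assert math.isclose(s2, statistics.variance(data))
assert math.isclose(s, statistics.stdev(data))

print(round(s2, 2), round(s, 2))  # 221288.67 470.41
```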
Sample Standard Deviation
• Square root of the sample variance
• For the four values 2,398; 1,844; 1,539; 1,311, the squared deviations from the mean sum to 663,866:
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$
$S = \sqrt{S^2} = \sqrt{221{,}288.67} = 470.41$
Coefficient of Variation
• Ratio of the standard deviation to the
mean, expressed as a percentage
• Measurement of relative dispersion
$C.V. = \frac{\sigma}{\mu}(100)$
Coefficient of Variation
$\mu_1 = 29$, $\sigma_1 = 4.6$
$C.V._1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$
$\mu_2 = 84$, $\sigma_2 = 10$
$C.V._2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$
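The two coefficient of variation examples can be recomputed with a small Python function. This is a sketch only; the function name `coefficient_of_variation` is an assumption made for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100


cv1 = coefficient_of_variation(4.6, 29)   # first example
cv2 = coefficient_of_variation(10, 84)    # second example

print(round(cv1, 2), round(cv2, 2))
```

Although the second data set has the larger standard deviation (10 versus 4.6), its dispersion relative to its mean is smaller, which is the point of a relative measure.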
Measures of Shape
• Skewness
– Absence of symmetry
– Extreme values on one side of a distribution
• Kurtosis
– Peakedness of a distribution
• Box and Whisker Plots
– Graphic display of a distribution
– Reveals skewness
Skewness
[Figure: three distribution curves: negatively skewed, symmetric (not skewed), and positively skewed]
[Figure: positions of the mode, median, and mean in negatively skewed, symmetric (not skewed), and positively skewed distributions]
Coefficient of Skewness
• Summary measure for skewness
• If S < 0, the distribution is negatively skewed
(skewed to the left).
• If S = 0, the distribution is symmetric (not
skewed).
• If S > 0, the distribution is positively skewed (skewed to the right).
$S = \frac{3(\mu - M_d)}{\sigma}$
Coefficient of Skewness
$\mu_1 = 23$, $M_{d1} = 26$, $\sigma_1 = 12.3$
$S_1 = \frac{3(\mu_1 - M_{d1})}{\sigma_1} = \frac{3(23 - 26)}{12.3} = -0.73$
$\mu_2 = 26$, $M_{d2} = 26$, $\sigma_2 = 12.3$
$S_2 = \frac{3(\mu_2 - M_{d2})}{\sigma_2} = \frac{3(26 - 26)}{12.3} = 0$
$\mu_3 = 29$, $M_{d3} = 26$, $\sigma_3 = 12.3$
$S_3 = \frac{3(\mu_3 - M_{d3})}{\sigma_3} = \frac{3(29 - 26)}{12.3} = +0.73$
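The three skewness examples can be reproduced with a short Python sketch of the formula S = 3(mean - median) / standard deviation. The function name `skewness_coefficient` and the `describe` helper are assumptions made for illustration.

```python
def skewness_coefficient(mean, median, std_dev):
    """Coefficient of skewness: three times (mean - median) over the standard deviation."""
    return 3 * (mean - median) / std_dev


def describe(s):
    """Map the sign of S to the wording used in the slides."""
    if s < 0:
        return "negatively skewed"
    if s > 0:
        return "positively skewed"
    return "symmetric"


# The three examples: same median and standard deviation, different means.
for mean in (23, 26, 29):
    s = skewness_coefficient(mean, 26, 12.3)
    print(mean, round(s, 2), describe(s))
```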
Kurtosis
• Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal in shape
– Platykurtic: flat and spread out
[Figure: leptokurtic, mesokurtic, and platykurtic curves]
Box and Whisker Plot
• Five specific values are used:
–Median, Q2
–First quartile, Q1
–Third quartile, Q3
–Minimum value in the data set
–Maximum value in the data set
Box and Whisker Plot, continued
• Inner Fences
– IQR = Q3 - Q1
– Lower inner fence = Q1 - 1.5 IQR
– Upper inner fence = Q3 + 1.5 IQR
• Outer Fences
– Lower outer fence = Q1 - 3.0 IQR
– Upper outer fence = Q3 + 3.0 IQR
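The fence formulas above can be sketched as a small Python function. The function name `fences` and the returned dictionary layout are assumptions made for illustration; the quartile values in the usage lines (Q1 = 11, Q3 = 61, maximum 220) are taken from the Example 3 solution later in this deck.

```python
def fences(q1, q3):
    """Return the IQR plus the inner and outer fences as (lower, upper) pairs."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


result = fences(11, 61)
print(result["iqr"])     # 50
print(result["inner"])   # (-64.0, 136.0)
print(result["outer"])   # (-139.0, 211.0)

# The maximum of that data set, 220, lies beyond the upper outer fence.
print(220 > result["outer"][1])  # True
```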
Box and Whisker Plot
[Figure: box-and-whisker plot marking the minimum, Q1, Q2, Q3, and the maximum]
Skewness: Box and Whisker Plots, and Coefficient of Skewness
[Figure: box-and-whisker plots for three cases: negatively skewed (S < 0), symmetric or not skewed (S = 0), and positively skewed (S > 0)]
Statistical Measures of Variability
• Range: The difference between the highest and lowest values
• Interquartile Range (IQR): Middle 50% of the data
Mean
• Add all the values and divide by the number of data points.
• Ex. Find the mean of the data set:
3, 4, 6, 8, 3
Median
• Middle point of the data
• To calculate:
– Put the values in order from smallest to largest
– If there is an odd number of data points, the median is the middle value
– If there is an even number of data points, the median is the average of the two middle values
• Ex. Find the median of the data set:
6, 3, 9, 1, 7, 8, 3, 5, 3, 2, 7, 3, 8, 4, 9, 10
Mode
• Value that occurs most frequently
• If no value occurs more than once, then the
data set has no mode
• Data set can have more than one mode
• Ex. Find the mode of the data set:
7, 4, 9, 2, 4, 3, 9, 1, 7, 8, 3, 2, 3, 3, 3, 5, 2
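The three practice data sets above can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and `statistics.multimode` (Python 3.8 or later) is used because a data set can have more than one mode.

```python
import statistics

mean_data = [3, 4, 6, 8, 3]
median_data = [6, 3, 9, 1, 7, 8, 3, 5, 3, 2, 7, 3, 8, 4, 9, 10]
mode_data = [7, 4, 9, 2, 4, 3, 9, 1, 7, 8, 3, 2, 3, 3, 3, 5, 2]

# Mean: sum of the values divided by the number of values (24 / 5).
print(statistics.mean(mean_data))        # 4.8

# Median: 16 values, so the two middle values (5 and 6) are averaged.
print(statistics.median(median_data))    # 5.5

# Mode: multimode returns a list of every most frequent value.
print(statistics.multimode(mode_data))   # [3]
```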
Example 1
• Data Set I
Mean is ____
Median is ____
Mode is ____
• Data Set II
Mean is ____
Median is ____
Mode is ____
[Tables: Data Set I and Data Set II]
Solution
• Data Set I
Mean is 483.8461538
Median is 400
Mode is 300
• Data Set II
Mean is 474
Median is 350
Mode is 300
Five Number Summary
Used to construct a box-and-whisker plot
1. Minimum
2. Quartile 1 (Q1) – median of the lower half of the data points
3. Median
4. Quartile 3 (Q3) – median of the upper half of the data points
5. Maximum
Example 2
Use this set of data to make a box-and-whisker plot:
59, 27, 18, 78, 61, 91, 52, 34, 54, 93, 100, 87, 85, 82, 68
STEP 1: Write the numbers in numerical order, from
smallest to greatest.
STEP 2: Create a five number summary.
Minimum:
Quartile 1:
Median:
Quartile 3:
Max:
STEP 3: Construct a box-and-whisker plot.
Example 3: You try!
Use the following data set to create a five number summary and construct the box-and-whisker plot.
87, 7, 41, 50, 15, 220, 23, 99, 11, 45, 11, 61, 3, 39, 21
Solution
Minimum: 3
Q1: 11
Median: 39
Q3: 61
Maximum: 220
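The five number summary for this data set can be reproduced with a short Python sketch. It assumes the quartile convention described in the Five Number Summary slide (Q1 and Q3 are the medians of the lower and upper data points) and, as a further assumption, leaves the median itself out of both halves when the number of data points is odd. Other quartile conventions exist and can give different values. The function name `five_number_summary` is illustrative.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


data = [87, 7, 41, 50, 15, 220, 23, 99, 11, 45, 11, 61, 3, 39, 21]
print(five_number_summary(data))  # (3, 11, 39, 61, 220)
```

The same function can be applied to the Example 2 data once the values are listed in the same way.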
