Measures of Dispersion
Measures of dispersion are also called measures of variation or measures of variability. To
see the need for such measures of dispersion, consider the two frequency curves shown in
the figure below. They are both unimodal and symmetric with the same means and
medians, but while one rises sharply on both sides of the mean, the other shows less
concentration of the data near the mean, with more dispersion outward.
Although the two curves differ markedly in dispersion, they have the same range, Xl – Xs.
Certainly, the range is important in identifying the outer limits of a data distribution, but it
gives no information on what is occurring between these limits. Also, the range is unreliable,
being highly sensitive to extreme values that tend to vary from sample to sample.
Range
The range of a data set is the difference between its largest and smallest values:
Range = Largest value (Xl) – Smallest value (Xs)
Example: Given the weights (in kg) of four students as 56, 63, 65, and 58, the range of the
weights is Range = 65 – 56 = 9 kg.
The Mean Deviation
Of all the measures of central tendency presented so far, the arithmetic mean is by far the
most important and commonly used. Because of this it is necessary to have a measure of
dispersion around the mean. In measures of dispersion, the term mean will refer to the
arithmetic mean unless otherwise specified. Such a measure should: (1) be
calculated from all the data, (2) show with a single number the typical or average dispersion
from the mean, and (3) increase, from data set to data set, with increasing dispersion. A first
candidate is the arithmetic mean of the deviations from the mean, given by:
$$ \frac{\sum_{i=1}^{n} (x_i - \bar{x})}{n} $$
Whatever the dispersion of the data, this formula always gives zero, because the positive
and negative deviations cancel. There are two accepted ways to solve this problem, both of which
eliminate the negative signs from the calculations. The first way is shown in this formula:
$$ \text{mean deviation} = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} $$
where the numerator is now the sum of the absolute values of the deviations, and absolute
values are never negative. This computation is called the mean deviation (or the
average deviation or the mean absolute deviation). It shows the average size of the
deviations from the mean without regard to the direction of deviation. It is zero when all values
in a sample are the same and increases across samples with increasing dispersion. While the
mean deviation is a legitimate measure of dispersion from the mean, it is rarely used
because it has limited value in theoretical statistics.
The second way to solve this problem, which we consider when we deal with the variance
and the standard deviation, is to square each deviation and use the sum of the squared
deviations in the calculations.
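EXAMPLE: Calculate the range and the mean deviation for the samples (a) x1 = 1 g, x2 = 3 g,
x3 = 2 g, x4 = 7 g, x5 = 5 g, x6 = 4 g, x7 = 2 g, (b) 1 g, 3 g, 2 g, 7 g, 5 g, 4 g, 200 g.
Solution:
(a) Range = 7 g – 1 g = 6 g. The mean is 24/7 ≈ 3.43 g and the absolute deviations sum to 80/7 ≈ 11.43 g, so the mean deviation is 11.43 g / 7 ≈ 1.63 g.
(b) Range = 200 g – 1 g = 199 g. The mean is 222/7 ≈ 31.71 g and the absolute deviations sum to 2356/7 ≈ 336.57 g, so the mean deviation is 336.57 g / 7 ≈ 48.08 g.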
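The two calculations in this example can be checked with a short Python sketch. The function names `data_range` and `mean_deviation` are illustrative choices, not names used in the lesson.

```python
from statistics import fmean


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Return the mean of the absolute deviations from the arithmetic mean."""
    centre = fmean(values)
    return sum(abs(x - centre) for x in values) / len(values)


sample_a = [1, 3, 2, 7, 5, 4, 2]    # weights in grams
sample_b = [1, 3, 2, 7, 5, 4, 200]  # same sample with one extreme value

for label, sample in (("a", sample_a), ("b", sample_b)):
    print(label, data_range(sample), round(mean_deviation(sample), 2))
```

Replacing a single observation with 200 g changes both measures sharply, which illustrates how sensitive the range is to extreme values.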
Frequency Distribution Formula for Mean Deviation
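Just as there is a frequency-distribution formula for the sample mean, there is a
frequency-distribution formula for the sample mean deviation, given by
$$ \text{mean deviation} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n} $$
where f_i is the frequency of the value x_i, k is the number of distinct values, and n is the sum of the frequencies.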
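As a rough illustration of the frequency-distribution approach, the sketch below weights each absolute deviation by its frequency and checks the result against the raw-data calculation. The table of values and frequencies is hypothetical (the lesson's own table is not reproduced here), and the function name is an illustrative choice.

```python
def mean_deviation_from_frequencies(values, freqs):
    """Mean deviation for distinct values x_i that occur f_i times each."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n


# Hypothetical values and frequencies, chosen only for illustration.
values = [4, 5, 6, 7]
freqs = [1, 3, 4, 2]

# Expanding the table back into raw observations gives the same answer.
raw = [x for x, f in zip(values, freqs) for _ in range(f)]
raw_mean = sum(raw) / len(raw)
raw_md = sum(abs(x - raw_mean) for x in raw) / len(raw)

print(mean_deviation_from_frequencies(values, freqs), raw_md)
```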
EXAMPLE: Calculate the range and the mean deviation for the sample data given below:
Solution:
Range = 1.8 cm – 1.2 cm = 0.6 cm.
Variance
The quantity called the sum of squares (denoted by SS) for a set of sample data is given
by
$$ SS = \sum_{i=1}^{n} (x_i - \bar{x})^2 $$
The variance (or mean squared deviation, or mean sum of squares) of a set of data is the
arithmetic mean of its squared deviations from the arithmetic mean. It is therefore defined
by the definitional formula for the variance
The numerator is the sample sum of squares (SS).
EXAMPLE: Calculate the variance for this sample of lengths (in cm): 3, 4, 5, 6, 7.
Solution:
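A minimal sketch of this calculation follows. The divisor is an assumption: samples are conventionally divided by n – 1, while a plain arithmetic mean of the squared deviations divides by n, so both results are shown.

```python
from statistics import fmean

lengths_cm = [3, 4, 5, 6, 7]
n = len(lengths_cm)
mean = fmean(lengths_cm)

# Sum of squares: squared deviations from the mean, added up.
ss = sum((x - mean) ** 2 for x in lengths_cm)

variance_n_minus_1 = ss / (n - 1)  # assumed sample convention
variance_n = ss / n                # plain mean of the squared deviations

print(ss, variance_n_minus_1, variance_n)
```

Either way the variance is in squared units (cm² here), and the numerator is the same sum of squares.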
The algebraically equivalent derived computational formulae for the sample variance are:
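One widely used computational form replaces the sum of squares with the sum of the squared values minus the squared sum divided by n. Whether this matches the lesson's own formulae exactly is an assumption; the sketch only checks that the two forms of SS agree on the sample above.

```python
lengths_cm = [3, 4, 5, 6, 7]
n = len(lengths_cm)

sum_x = sum(lengths_cm)                   # sum of the values
sum_x2 = sum(x * x for x in lengths_cm)   # sum of the squared values

# Computational form: no deviations from the mean are needed.
ss_computational = sum_x2 - sum_x ** 2 / n

# Definitional form, for comparison.
mean = sum_x / n
ss_definitional = sum((x - mean) ** 2 for x in lengths_cm)

print(ss_computational, ss_definitional)
```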
Standard Deviation
The sample standard deviation is defined by these definitional formulae:
And it has these computational formulae:
The standard deviation is the most important and most commonly used
measure of dispersion from the mean in both descriptive and inferential statistics.
Example: Compute the standard deviation of this sample of lengths (in cm): 3, 4, 5, 6, 7.
Solution:
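The calculation can be sketched in Python as the square root of the variance. The divisor n – 1 is assumed for the sample standard deviation; the n divisor is shown alongside for comparison, together with the matching standard-library functions.

```python
from math import sqrt
from statistics import pstdev, stdev

lengths_cm = [3, 4, 5, 6, 7]
n = len(lengths_cm)
mean = sum(lengths_cm) / n
ss = sum((x - mean) ** 2 for x in lengths_cm)

sd_n_minus_1 = sqrt(ss / (n - 1))  # assumed sample convention
sd_n = sqrt(ss / n)

# The standard library offers both conventions directly.
print(round(sd_n_minus_1, 2), round(stdev(lengths_cm), 2))
print(round(sd_n, 2), round(pstdev(lengths_cm), 2))
```

Unlike the variance, the standard deviation is in the original units (cm here).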
Calculating Standard Deviations from Non-grouped Frequency Distributions
The computational frequency-distribution formulas for sample standard deviations are:
Example: Calculate the standard deviation of the following data set
Solution:
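A computational frequency-distribution calculation might look like the sketch below. The table is hypothetical, the function name and its `ddof` parameter are illustrative, and the n – 1 divisor is an assumption.

```python
from math import sqrt


def sd_from_frequencies(values, freqs, ddof=1):
    """Standard deviation from a table of values x_i and frequencies f_i.

    ddof=1 divides by n - 1 (assumed here); ddof=0 divides by n.
    """
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    ss = sum_fx2 - sum_fx ** 2 / n
    return sqrt(ss / (n - ddof))


# Hypothetical table, chosen only for illustration.
values = [2, 3, 4, 5]
freqs = [3, 5, 4, 2]

print(round(sd_from_frequencies(values, freqs), 3))
```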
Calculating Approximate Standard Deviations from Grouped Frequency Distributions
A standard deviation calculated from a grouped frequency distribution will only
approximate the exact value calculated directly from the data, and it is therefore called an
approximate standard deviation. To make this calculation from grouped data requires the
assumption that all values in a class are equal to the class mark m_i. The computational
formula for the approximate standard deviation of sample data is given by:
Example: Calculate the approximate standard deviation from the grouped frequency
distribution given by:
Solution:
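To see why the grouped result is only approximate, the sketch below groups a small set of raw values into classes, replaces every value by its class mark, and compares the two standard deviations. The raw data and class limits are hypothetical, and the n – 1 divisor is an assumption.

```python
from math import sqrt
from statistics import stdev

# Hypothetical raw measurements and class limits (lower, upper).
raw = [11, 12, 14, 17, 18, 21, 22, 23, 26, 29]
classes = [(10, 14), (15, 19), (20, 24), (25, 29)]


def approximate_sd(raw_values, class_limits):
    """Standard deviation after replacing each value by its class mark."""
    marks = [(lo + hi) / 2 for lo, hi in class_limits]
    freqs = [sum(lo <= x <= hi for x in raw_values) for lo, hi in class_limits]
    n = sum(freqs)
    sum_fm = sum(f * m for f, m in zip(freqs, marks))
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, marks))
    return sqrt((sum_fm2 - sum_fm ** 2 / n) / (n - 1))


approx = approximate_sd(raw, classes)
exact = stdev(raw)
print(round(approx, 2), round(exact, 2))
```

The two values are close but not equal, because the grouping discards the position of each observation within its class.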
Variance from an Assumed Mean
Let A be the assumed mean and let d_i = X_i – A be the deviation of observation X_i from A.
The variance S^2 of the data set is given by
where \bar{d} is given by
When the observations have corresponding frequencies, the formulae change to
and
Example
Consider the data set 2, 3, 5, 10 and let the assumed mean A be 4. Use this information
to compute the variance of the data.
Solution
The data can be arranged in tabular form as below, and the computations made as shown.
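The assumed-mean computation can be sketched as follows. The function name, the optional `freqs` argument, and the `ddof` parameter are illustrative, and dividing by n (`ddof=0`) is an assumption about the lesson's convention. The key point is that subtracting a constant A from every observation does not change the variance.

```python
def variance_from_assumed_mean(values, assumed_mean, freqs=None, ddof=0):
    """Variance computed from deviations d_i = X_i - A.

    freqs defaults to a frequency of 1 per observation.
    ddof=0 divides by n (assumed here); ddof=1 divides by n - 1.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    d = [x - assumed_mean for x in values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    return (sum_fd2 - sum_fd ** 2 / n) / (n - ddof)


data = [2, 3, 5, 10]

# Any assumed mean gives the same variance; A = 4 is the example's choice.
print(variance_from_assumed_mean(data, 4))
print(variance_from_assumed_mean(data, 0))
```

A well-chosen A keeps the deviations small, which is what makes the method convenient for hand calculation.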
Example
Consider the set of observations in the table below.
Take the assumed mean A = 40 and compute the variance of the data.
Solution
One can arrange the data in table form and perform computations as shown below.
The Coefficient of Variation
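The coefficient of variation (also called the coefficient of variability, the coefficient of
dispersion, or the relative standard deviation) is defined for sample data both as a proportion and as a percentage:
$$ V = \frac{s}{\bar{x}} \qquad \text{and} \qquad V = \frac{s}{\bar{x}} \times 100\% $$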
The measures of dispersion we have dealt with previously (range, mean deviation, variance,
standard deviation) are called measures of absolute dispersion because they are calculated
directly from the data and have the units of the original measurements or those units
squared. The coefficient of variation, on the other hand, is called a measure of relative
dispersion because it expresses a measure of absolute dispersion as a proportion (or
percentage) of some measure of average value that is in the same units as the measure of
dispersion. Because the numerator and denominator of the ratios in the measure have the
same units, the resulting measure of relative dispersion has no units.
Example:
You are a biologist studying genetic variation within different species of rodents. One
measure you take for each rodent is body weight in grams. For a sample of 10 males of the
white-footed mouse, you get these results: mean = 12.9 g, s = 1.6 g; and for 8 males of the
plains pocket gopher you get these: mean = 545.0 g, s = 32.8 g. Compare the relative
dispersions of these two species.
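Solution:
White-footed mouse: V = 1.6 g / 12.9 g ≈ 0.124, or 12.4%.
Plains pocket gopher: V = 32.8 g / 545.0 g ≈ 0.060, or 6.0%.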
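The comparison can be sketched in a few lines of Python, taking the coefficient of variation as the standard deviation divided by the mean. The function name is an illustrative choice.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a proportion of the mean."""
    return sd / mean


mouse_cv = coefficient_of_variation(1.6, 12.9)      # white-footed mouse
gopher_cv = coefficient_of_variation(32.8, 545.0)   # plains pocket gopher

print(f"mouse: {mouse_cv:.1%}, gopher: {gopher_cv:.1%}")
print(f"relative dispersion ratio (mouse/gopher): {mouse_cv / gopher_cv:.2f}")
print(f"absolute dispersion ratio (gopher/mouse): {32.8 / 1.6:.1f}")
```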
These results show that there is about twice as much relative dispersion of body weight among the
mice as there is among the pocket gophers. This greater variation relative to the mean is not
apparent from the standard deviations, which show about twenty times more absolute variation
among the pocket gophers.
The Standard Score and The Standardized Variable
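For a sample, the standard score (also called the normal deviate, or z score) is defined as
$$ z_i = \frac{X_i - \bar{X}}{s} $$
where \bar{X} is the sample mean and s is the sample standard deviation.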
For any data distribution, the standard score shows how far any given data value Xi is from
the mean of the distribution in standard deviation units, that is, how many standard deviations the
value is from the mean. A positive z value indicates that Xi is larger than the mean (to its
right in a histogram or polygon) and a negative z value indicates that Xi is smaller than the
mean (to its left). Like the coefficient of variation, the standard score is a relative measure;
while the coefficient shows absolute dispersions relative to their means, the standard score
shows deviations from the mean relative to the standard deviation. Because its units are
numbers of standard deviations, the standard score allows comparisons of relative positions
within distributions that have very different means or different measurement units. When
for any variable X each measurement value in a sample or population is transformed into a z
value, this process is known as standardizing (or normalizing) the variable, and the
resulting variable Z is called a standardized variable.
Example:
Standardize the sample: 3, 5, 7, 9, 11.
Solution:
To standardize the sample is to calculate a standard score Zi for each Xi. These scores are
typically reported, as shown below, rounded to the nearest hundredth.
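A minimal sketch of the standardization follows. It assumes the sample standard deviation uses the n – 1 divisor (Python's `statistics.stdev`); dividing by n instead would give different scores.

```python
from statistics import fmean, stdev

sample = [3, 5, 7, 9, 11]
mean = fmean(sample)
s = stdev(sample)  # n - 1 divisor, an assumption about the lesson's convention

# Standard score of each value, rounded to the nearest hundredth.
z = [round((x - mean) / s, 2) for x in sample]
print(z)
```

The scores are symmetric about zero because the sample is symmetric about its mean of 7.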
