5. What is a standard deviation?
• It is the typical (standard) difference
(deviation) of an observation from the mean
• Think of it as the average distance a data
point is from the mean, although this is not
strictly true
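A small sketch of why the "average distance" reading is only approximate. It compares the standard deviation with the literal average distance from the mean (the mean absolute deviation). The data values and function names are made up for illustration, and the n - 1 divisor (sample standard deviation) is an assumption, since the slide does not say which divisor is used.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_std(xs):
    # Assumption: sample standard deviation with an n - 1 divisor.
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def mean_abs_dev(xs):
    # The literal "average distance from the mean".
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean = 5

print(round(sample_std(data), 3))  # about 2.138
print(mean_abs_dev(data))          # 1.5
```

The two numbers are close but not equal: squaring the deviations gives extra weight to points far from the mean, so the standard deviation comes out larger here.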
7. Measures of Variation -
Some Comments
• Range is the simplest, but is very sensitive to
outliers
• Variance is measured in the square of the original
units (e.g., cm² for data recorded in cm)
• Interquartile range is mainly used with skewed
data (or data with outliers)
• We will use the standard deviation as a
measure of variation often in this course
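The comments above can be checked on a toy example. This sketch computes the four measures with Python's standard `statistics` module and then replaces one value with an outlier. The data and the helper name `spread_summary` are hypothetical, and the "inclusive" quartile method is just one of several conventions that software packages use.

```python
import statistics

def spread_summary(xs):
    # Quartiles via the "inclusive" convention (an assumption; other
    # conventions give slightly different Q1 and Q3).
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "range": max(xs) - min(xs),
        "variance": statistics.variance(xs),  # in squared units
        "std_dev": statistics.stdev(xs),      # in original units
        "iqr": q3 - q1,
    }

clean = [10, 12, 13, 15, 16, 18, 20]
with_outlier = [10, 12, 13, 15, 16, 18, 90]  # largest value replaced

print(spread_summary(clean))
print(spread_summary(with_outlier))
```

In this example the range jumps from 10 to 80 when the outlier is introduced, while the interquartile range stays at 4.5.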
8. Boxplot - a 5 number summary
• smallest observation (min)
• Q1 (first quartile)
• Q2 (median)
• Q3 (third quartile)
• largest observation (max)
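A minimal sketch of the five-number summary, using `statistics.quantiles` from the Python standard library. The function name and the scores are illustrative only, and the "inclusive" quartile method is an assumed convention.

```python
import statistics

def five_number_summary(xs):
    # Returns (min, Q1, median, Q3, max).
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (min(xs), q1, q2, q3, max(xs))

scores = [55, 61, 64, 70, 72, 75, 78, 84, 95]  # made-up data
print(five_number_summary(scores))
```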
10. Creating a Boxplot
• Create a scale covering the smallest to largest values
• Mark the location of the five numbers
• Draw a rectangle beginning at Q1 and ending at Q3
• Draw a line in the box representing Q2, the median
• Draw lines from the ends of the box to the smallest
and largest values
• Some software packages that create boxplots include
an algorithm to detect outliers. They will plot points
considered to be outliers individually.
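The steps above can be sketched without any plotting library by computing the numbers a boxplot would draw. The slide does not say which outlier algorithm software uses. This sketch assumes the common 1.5 × IQR rule, and the function name `boxplot_stats` and the data are hypothetical.

```python
import statistics

def boxplot_stats(xs, k=1.5):
    data = sorted(xs)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed rule: points more than k * IQR beyond the box are outliers.
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        # Whiskers stop at the most extreme points that are not outliers.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

data = [10, 12, 13, 15, 16, 18, 90]  # made-up data
print(boxplot_stats(data))
```

With this rule the value 90 is plotted as an individual point, and the upper whisker ends at 18 rather than at the largest value.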
12. Boxplot Interpretation
• The box represents the middle 50% of the data,
i.e., IQR = length of box
• The distance between the ends of the whiskers is the
range (if there are no outliers)
• Software packages typically mark outliers with an
individual symbol such as *.
• Boxplots are useful for comparing two or more
samples
– Compare center (median line)
– Compare variation (length of box or whiskers)
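A rough sketch of the comparison a side-by-side boxplot supports: the median line for center and the box length (IQR) for variation. The two samples and the helper `center_and_spread` are invented for illustration, and the "inclusive" quartile method is an assumed convention.

```python
import statistics

def center_and_spread(xs):
    # Median = line in the box; IQR = length of the box.
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {"median": q2, "iqr": q3 - q1}

sample_a = [55, 61, 64, 70, 72, 75, 78, 84, 95]  # made-up data
sample_b = [68, 70, 71, 72, 74, 75, 76, 77, 79]  # made-up data

print("A:", center_and_spread(sample_a))
print("B:", center_and_spread(sample_b))
```

Here sample B has a slightly higher center and a much shorter box, so its values are more tightly clustered than those of sample A.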