DESCRIPTIVE STATISTICS
Descriptive Statistics 
Class A--IQs of 13 Students 
102 115 
128 109 
131 89 
98 106
140 119
93 97
110 
Class B--IQs of 13 Students 
127 162 
131 103 
96 111 
80 109 
93 87 
120 105 
109 
An Illustration: 
Which Group is Smarter? 
Each individual may be different. If you try to understand a group by remembering the 
qualities of each member, you become overwhelmed and fail to understand the group.
Descriptive Statistics 
Which group is smarter now? 
Class A--Average IQ Class B--Average IQ 
110.54 110.23 
They’re roughly the same! 
With a summary descriptive statistic, it is much 
easier to answer our question.
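As a quick check of the two averages, here is a minimal sketch that recomputes them with Python's standard library. The variable names `class_a` and `class_b` are just labels chosen for this example; the scores are the ones listed on the Class A and Class B slides.

```python
from statistics import mean

# IQ scores transcribed from the Class A and Class B slides.
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
class_b = [127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109]

mean_a = mean(class_a)
mean_b = mean(class_b)

print(f"Class A mean IQ: {mean_a:.2f}")  # 110.54
print(f"Class B mean IQ: {mean_b:.2f}")  # 110.23
```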
Ordered sample characteristics 
• Maximum – the last number in the ordered sample > max(x)
• Minimum – the first number in the ordered sample > min(x)
• Median – the number located at the center of the ordered sample > median(x)
• Quantile – a number x_p such that the lowest p-th part of the ordered sample is less than or equal to x_p > quantile(x,p)
• Quartiles – the 0.25-quantile is called the first (or lower) quartile; the 0.5-quantile is called the median or the second quartile; the 0.75-quantile is called the third (or upper) quartile
• Interquartile range – the difference between the third and the first quartile, i.e. x_0.75 − x_0.25 > IQR(x)
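The function names on this slide look like R calls. As a rough equivalent, the sketch below computes the same ordered-sample characteristics for Class A with Python's standard library. The choice of `method="inclusive"` is an assumption made here; other quantile conventions can give slightly different quartiles on small samples.

```python
from statistics import median, quantiles

# Class A IQ scores from the earlier slide.
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]

ordered = sorted(class_a)
minimum = ordered[0]    # first number in the ordered sample
maximum = ordered[-1]   # last number in the ordered sample
mid = median(class_a)   # number at the center of the ordered sample

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(class_a, n=4, method="inclusive")
iqr = q3 - q1           # interquartile range

print(minimum, maximum, mid)  # 89 140 109
print(q1, q2, q3, iqr)        # 98.0 109.0 119.0 21.0
```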
Descriptive Statistics 
Types of descriptive statistics: 
• Organize Data 
–Tables 
–Graphs 
• Summarize Data 
–Central Tendency 
–Variation
Descriptive Statistics 
Summarizing Data: 
– Central Tendency (or Groups’ “Middle Values”) 
• Mean 
• Median 
• Mode 
– Variation (or Summary of Differences Within Groups) 
• Range 
• Interquartile Range 
• Variance 
• Standard Deviation
Mean 
The mean is the “balance point.” 
Each person’s score is like 1 pound placed at the 
score’s position on a see-saw. Below, on a 200 cm 
see-saw, the mean equals 110, the place on the 
see-saw where a fulcrum finds balance: 
[Figure: see-saw with 1 lb weights at 93 cm, 106 cm, and 131 cm and the fulcrum at 110 cm. The weights sit 17 units below, 4 units below, and 21 units above the fulcrum.]
The scale is balanced because…
17 + 4 on the left = 21 on the right
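The see-saw picture can be checked numerically. This small sketch, using the three positions from the slide, shows that the distances below the mean add up to the distances above it. The variable names are illustrative only.

```python
# Positions (in cm) of the three 1 lb weights on the see-saw.
positions = [93, 106, 131]

# The mean is the balance point.
balance_point = sum(positions) / len(positions)

# Signed distance of each weight from the balance point.
deviations = [p - balance_point for p in positions]

left = sum(-d for d in deviations if d < 0)   # total distance below the mean
right = sum(d for d in deviations if d > 0)   # total distance above the mean

print(balance_point)  # 110.0
print(deviations)     # [-17.0, -4.0, 21.0]
print(left, right)    # 21.0 21.0
```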
Mean 
1. Means can be badly affected by outliers 
(data points with extreme values unlike 
the rest) 
2. Outliers can make the mean a bad 
measure of central tendency or common 
experience 
[Figure: income in the world, with labels "All of Us", "Bill Gates" (outlier), and "Mean".]
Median 
The middle value when a variable’s values are 
ranked in order; the point that divides a 
distribution into two equal halves. 
When data are listed in order, the median is the 
point at which 50% of the cases are above and 
50% below it. 
The 50th percentile.
Median 
Median = 109 
(six cases above, six below) 
Class A--IQs of 13 Students 
89 
93 
97 
98 
102 
106 
109 
110 
115 
119 
128 
131 
140
Median
If the first student (IQ 89) were to drop out of Class A,
there would be a new median:
93
97
98
102
106
109
110
115
119
128
131
140
Median = 109.5
(109 + 110) / 2 = 219 / 2 = 109.5
(six cases above, six below)
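A minimal sketch of the odd and even cases, using the ordered Class A scores: with 13 scores the median is the middle value, and with 12 scores it is the average of the two middle values.

```python
from statistics import median

# Class A IQ scores in rank order.
class_a_sorted = [89, 93, 97, 98, 102, 106, 109, 110, 115, 119, 128, 131, 140]

# Odd number of cases: the median is the single middle value.
median_13 = median(class_a_sorted)

# Drop the first student (89): even number of cases,
# so the median is the mean of the two middle values.
remaining = class_a_sorted[1:]
median_12 = median(remaining)

print(median_13)  # 109
print(median_12)  # 109.5
```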
Median 
1. The median is far less affected by outliers than the mean,
making it a better measure of central
tendency, better describing the "typical
person" when data are
skewed.
[Figure: "All of Us" and "Bill Gates" (outlier).]
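To see how differently the two measures react to an outlier, here is a small sketch. The income figures are made up purely for illustration and do not come from the slides.

```python
from statistics import mean, median

# Hypothetical incomes, invented for this example.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]

# Add one hypothetical extreme earner.
with_outlier = incomes + [1_000_000_000]

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # 40000 40000
print(mean_after, median_after)    # 166700000 42500.0
```

The mean jumps by several orders of magnitude, while the median moves only from 40,000 to 42,500.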
Median 
2. If the recorded values for a variable form 
a symmetric distribution, the median and 
mean are identical. 
3. In skewed data, the mean lies further 
toward the skew than the median. 
[Figure: two distributions, one symmetric and one skewed, each with the mean and median marked.]
Mode 
The most common data point is called the 
mode. 
The combined IQ scores for Classes A & B: 
80 87 89 93 93 96 97 98 102 103 105 106 109 109 109 110 111 115 119 
120 127 128 131 131 140 162 
Mode = 109 (it occurs three times)
By the way, it is possible to have more than one mode!
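As an illustration, the sketch below finds the mode of the combined scores with `statistics.multimode`, which returns a list so that ties are visible. The short `bimodal` list is a made-up example, not data from the slides.

```python
from statistics import multimode

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
class_b = [127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109]
combined = class_a + class_b

# multimode returns every value that shares the highest count.
modes = multimode(combined)
print(modes)  # [109]

# Hypothetical data with two equally common values.
bimodal = [1, 1, 2, 3, 3]
print(multimode(bimodal))  # [1, 3]
```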
Mode 
It may not be at the center of a distribution.
The data distribution on the right is "bimodal"
(even statistics can be open-minded).
[Figure: bar chart of Count (vertical axis, 1.0 to 2.0) by IQ (horizontal axis, 82 to 162).]
Descriptive Statistics 
Summarizing Data: 
Central Tendency (or Groups’ “Middle Values”) 
Mean 
Median 
Mode 
– Variation (or Summary of Differences Within Groups) 
• Range 
• Interquartile Range 
• Variance 
• Standard Deviation
Measures of Variability 
• Range (largest score – smallest score) 
• Variance: \( S^2 = \sum (x - M)^2 / N \)
• Standard deviation
– Square root of the variance, so it’s in the same units
as the mean
– In a normal distribution, about 68.27% of scores fall within
+/- 1 sd of the mean; about 95.45% fall within +/- 2 sd of the
mean.
• Coefficient of variation = the standard deviation 
divided by the sample mean
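A rough sketch of two items from this slide: the share of a normal distribution within 1 and 2 standard deviations of the mean, and the coefficient of variation for Class A. Using `pstdev` (division by N) follows the formula on this slide; that choice, and the variable names, are assumptions of this example.

```python
from statistics import NormalDist, mean, pstdev

# Standard normal distribution (mean 0, sd 1).
z = NormalDist()

within_1sd = z.cdf(1) - z.cdf(-1)   # share of scores within +/- 1 sd
within_2sd = z.cdf(2) - z.cdf(-2)   # share of scores within +/- 2 sd
print(f"{within_1sd:.4f} {within_2sd:.4f}")  # 0.6827 0.9545

# Coefficient of variation for Class A: sd divided by the mean.
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
cv = pstdev(class_a) / mean(class_a)
print(f"{cv:.3f}")  # 0.135
```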
Range 
The spread, or the distance, between the lowest and 
highest values of a variable. 
To get the range for a variable, you subtract its lowest 
value from its highest value. 
Class A--IQs of 13 Students 
102 115 
128 109 
131 89 
98 106
140 119
93 97
110 
Class A Range = 140 - 89 = 51 
Class B--IQs of 13 Students 
127 162 
131 103 
96 111
80 109
93 87
120 105 
109 
Class B Range = 162 - 80 = 82
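The two ranges can be reproduced in a couple of lines. This sketch defines a small helper, `value_range`, which is a name chosen here for illustration.

```python
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
class_b = [127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109]


def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


range_a = value_range(class_a)
range_b = value_range(class_b)
print(range_a, range_b)  # 51 82
```

Although the two classes have nearly the same mean, Class B's scores are spread over a noticeably wider interval.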
Interquartile Range 
A quartile is the value that marks one of the divisions that breaks a series of values 
into four equal parts. 
The median is a quartile and divides the cases in half. 
25th percentile is a quartile that divides the first ¼ of cases from the latter ¾. 
75th percentile is a quartile that divides the first ¾ of cases from the latter ¼. 
The interquartile range is the distance or range between the 25th percentile and the 
75th percentile. Below, what is the interquartile range? 
[Figure: number line from 0 to 1000 marked at 250, 500, and 750; each of the four segments holds 25% of cases.]
Variance 
A measure of the spread of the recorded values on a variable. A
measure of dispersion.
The larger the variance, the further the individual cases are from
the mean.
The smaller the variance, the closer the individual scores are to
the mean.
[Figures: one distribution for each case, with the mean marked.]
Variance 
Variance is a number that at first seems 
complex to calculate. 
Calculating variance starts with a “deviation.” 
A deviation is the distance away from the mean of a case’s 
score. 
\( Y_i - \bar{Y} \)
If the average person’s car costs $20,000 and mine cost $6,000,
my deviation from the mean is -$14,000!
6K - 20K = -14K
Variance 
The deviation of 102 from 110.54 is? 
Deviation of 115? 
Class A--IQs of 13 Students 
102 115 
128 109 
131 89 
98 106
140 119
93 97
110
\( \bar{Y}_A = 110.54 \)
Variance 
The deviation of 102 from 110.54 is 102 - 110.54 = -8.54.
The deviation of 115 is 115 - 110.54 = 4.46.
Class A--IQs of 13 Students 
102 115 
128 109 
131 89 
98 106
140 119
93 97
110
\( \bar{Y}_A = 110.54 \)
Variance 
• We want to add these to get total deviations, but if 
we were to do that, we would get zero every time. 
Why? 
• We need a way to eliminate negative signs. 
Squaring the deviations will eliminate negative signs... 
A Deviation Squared: \( (Y_i - \bar{Y})^2 \)
Back to the IQ example, a deviation squared
for 102 is: \( (102 - 110.54)^2 = (-8.54)^2 = 72.93 \)
for 115 is: \( (115 - 110.54)^2 = (4.46)^2 = 19.89 \)
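The "Why?" above can be checked directly. This sketch shows that the raw deviations for Class A cancel out, and then reproduces the two squared deviations using the mean rounded to 110.54, as the slide's hand calculation does.

```python
from statistics import mean

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
y_bar = mean(class_a)  # about 110.54

# Positive and negative deviations cancel, so their sum is zero
# (up to floating-point rounding).
deviations = [y - y_bar for y in class_a]
total_deviation = sum(deviations)
print(abs(total_deviation) < 1e-9)  # True

# Squaring removes the signs. The rounded mean matches the slide.
sq_102 = (102 - 110.54) ** 2
sq_115 = (115 - 110.54) ** 2
print(round(sq_102, 2), round(sq_115, 2))  # 72.93 19.89
```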
Variance 
If you were to add all the squared deviations 
together, you’d get what we call the 
“Sum of Squares.” 
Sum of Squares: \( SS = \sum (Y_i - \bar{Y})^2 \)
\( SS = (Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + \dots + (Y_n - \bar{Y})^2 \)
Variance 
Class A, sum of squares: 
\[
\begin{aligned}
SS = {} & (102 - 110.54)^2 + (115 - 110.54)^2 + (128 - 110.54)^2 + (109 - 110.54)^2 \\
 & + (131 - 110.54)^2 + (89 - 110.54)^2 + (98 - 110.54)^2 + (106 - 110.54)^2 \\
 & + (140 - 110.54)^2 + (119 - 110.54)^2 + (93 - 110.54)^2 + (97 - 110.54)^2 \\
 & + (110 - 110.54)^2
\end{aligned}
\]
Class A--IQs of 13 Students
102 115
128 109
131 89
98 106
140 119
93 97
110
\( \bar{Y} = 110.54 \)
Variance 
The last step… 
The variance is approximately the average of the
squared deviations.
SS/N = Variance for a population.
SS/(n - 1) = Variance for a sample.
\( \text{Variance} = \sum (Y_i - \bar{Y})^2 / (n - 1) \)
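The two formulas differ only in the divisor. As a sketch, the code below computes the sum of squares from the 13 Class A scores as listed on the slides, divides by N and by n - 1, and cross-checks the sample result against `statistics.variance`.

```python
from statistics import mean, variance

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
n = len(class_a)
y_bar = mean(class_a)

# Sum of squares: add up every squared deviation from the mean.
ss = sum((y - y_bar) ** 2 for y in class_a)

population_variance = ss / n        # SS / N
sample_variance = ss / (n - 1)      # SS / (n - 1)

print(f"SS = {ss:.2f}")                                   # SS = 2891.23
print(f"population variance = {population_variance:.2f}")  # 222.40
print(f"sample variance = {sample_variance:.2f}")          # 240.94

# The standard library's sample variance uses the same n - 1 divisor.
print(abs(sample_variance - variance(class_a)) < 1e-9)    # True
```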
Variance 
For Class A, Variance = 2891.23 / (n - 1)
= 2891.23 / 12 = 240.94
How helpful is that???
Standard Deviation
To convert variance into something of meaning, let’s
create standard deviation.
The square root of the variance gives the typical
deviation of the observations from the mean, in the original units.
\( \text{s.d.} = \sqrt{\sum (Y_i - \bar{Y})^2 / (n - 1)} \)
Standard Deviation
For Class A, the standard deviation is:
\( \sqrt{240.94} = 15.52 \)
A typical person’s deviation from the mean IQ of
110.54 is about 15.52 IQ points.
Review: 
1. Deviation 
2. Deviation squared 
3. Sum of squares 
4. Variance 
5. Standard deviation
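The five review steps can be walked through in order. This is a minimal sketch for Class A using the sample (n - 1) divisor, with the result compared against `statistics.stdev` as a sanity check.

```python
from math import sqrt
from statistics import mean, stdev

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
y_bar = mean(class_a)

deviations = [y - y_bar for y in class_a]      # 1. deviation
squared = [d ** 2 for d in deviations]         # 2. deviation squared
sum_of_squares = sum(squared)                  # 3. sum of squares
sample_var = sum_of_squares / (len(class_a) - 1)  # 4. variance
sd = sqrt(sample_var)                          # 5. standard deviation

print(f"{sd:.2f}")                          # 15.52
print(abs(sd - stdev(class_a)) < 1e-9)      # True
```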
Standard Deviation 
1. Larger s.d. = greater amounts of variation around the mean. 
For example: 
[Figure: two distributions, both with \( \bar{Y} = 25 \). Left: axis marked 19, 25, 31; s.d. = 3. Right: axis marked 13, 25, 37; s.d. = 6.]
2. s.d. = 0 only when all values are the same (only when you have a
constant and not a “variable”)
3. If you were to “rescale” a variable, the s.d. would change by the same
factor: if we changed units above so the mean equaled 250, the
s.d. on the left would be 30, and on the right, 60
4. Like the mean, the s.d. will be inflated by an outlier case value.
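Points 2 to 4 can be illustrated with a short sketch on the Class A scores. The factor of 10, the constant list, and the extra score of 300 are hypothetical values picked only for the demonstration.

```python
from statistics import stdev

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
sd_original = stdev(class_a)

# Point 2: a constant has no variation.
sd_constant = stdev([25, 25, 25])

# Point 3: multiplying every value by 10 multiplies the s.d. by 10.
sd_scaled = stdev([10 * y for y in class_a])

# Point 4: one hypothetical extreme score inflates the s.d.
sd_with_outlier = stdev(class_a + [300])

print(sd_constant)                                # 0.0
print(abs(sd_scaled - 10 * sd_original) < 1e-9)   # True
print(sd_with_outlier > sd_original)              # True
```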
Practical Application for Understanding 
Variance and Standard Deviation 
Even though we live in a world where we pay real dollars for 
goods and services (not percentages of income), most 
employers issue raises based on percent of salary. 
Why do supervisors think the most fair raise is a percentage 
raise? 
Answer: 1) Because higher paid persons win the most money. 
2) The easiest thing to do is raise everyone’s salary by 
a fixed percent. 
If your budget went up by 5%, salaries can go up by 5%. 
The problem is that the flat percent raise gives unequal 
increased rewards. . .
Practical Application for Understanding 
Variance and Standard Deviation 
Acme Toilet Cleaning Services 
Salary Pool: $200,000 
Incomes: 
President: $100K; Manager: 50K; Secretary: 40K; and Toilet Cleaner: 10K 
Mean: $50K 
Range: $90K 
Variance: $1,050,000,000 (computed as SS/N)
Standard Deviation: $32.4K
These can be considered “measures of inequality”
Now, let’s apply a 5% raise.
Practical Application for Understanding 
Variance and Standard Deviation 
After a 5% raise, the pool of money increases by $10K to $210,000 
Incomes: 
President: $105K; Manager: 52.5K; Secretary: 42K; and Toilet Cleaner: 10.5K 
Mean: $52.5K – went up by 5%
Range: $94.5K – went up by 5%
Variance: $1,157,625,000 – went up by 10.25%
Standard Deviation: $34K – went up by 5%
The range, variance, and standard deviation are the measures of inequality.
The flat percentage raise increased inequality. The top earner got 50% of the new
money. The bottom earner got 5% of the new money. The range and standard deviation
went up by 5%, and the variance by 10.25%.
Last year’s statistics: 
Acme Toilet Cleaning Services annual payroll of $200K 
Incomes: 
$100K, 50K, 40K, and 10K 
Mean: $50K 
Range: $90K; Variance: $1,050,000,000; Standard Deviation: $32.4K
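The before-and-after figures can be recomputed as a sketch. The slide's variance matches the population formula (SS/N), so `pvariance` and `pstdev` are used here; the `summarize` helper is a name invented for this example.

```python
from statistics import mean, pstdev, pvariance

# Acme salaries before the raise, in dollars.
salaries = {"President": 100_000, "Manager": 50_000,
            "Secretary": 40_000, "Toilet Cleaner": 10_000}


def summarize(values):
    """Return mean, range, population variance, and population s.d."""
    return {
        "mean": mean(values),
        "range": max(values) - min(values),
        "variance": pvariance(values),
        "sd": pstdev(values),
    }


before = summarize(list(salaries.values()))
# Integer arithmetic keeps the 5% raise exact.
after = summarize([s * 105 // 100 for s in salaries.values()])

for key in before:
    growth = after[key] / before[key] - 1
    print(f"{key}: {before[key]:,.0f} -> {after[key]:,.0f} ({growth:.2%})")
```

Mean, range, and standard deviation each grow by 5%, while the variance grows by 10.25% because it is measured in squared dollars (1.05 squared is 1.1025).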
Practical Application for Understanding 
Variance and Standard Deviation 
The flat percentage raise increased inequality. The top earner got 50% of 
the new money. The bottom earner got 5% of the new money. 
Inequality increased by 5%. 
Since we pay for goods and services in real dollars, not in percentages, 
there are substantially more new things the top earners can purchase 
compared with the bottom earner for the rest of their employment years. 
Acme Toilet Cleaning Services is giving the earners $5,000, $2,500, 
$2,000, and $500 more respectively each and every year forever. 
What does this mean in terms of compounding raises?
Acme is essentially saying: “Each year we’ll buy you a new TV in
addition to everything else you buy. Here’s what you’ll get:”
Practical Application for Understanding 
Variance and Standard Deviation 
[Figure: what each earner gets: Toilet Cleaner, Secretary, Manager, President.]
The gap between the rich and poor expands. 
This is why some progressive organizations give a percentage raise 
with a flat increase for lowest wage earners. For example, 5% or 
$1,000, whichever is greater.
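The "5% or $1,000, whichever is greater" rule could be sketched as follows. The function name `raise_amount` and its parameters are hypothetical, and integer arithmetic is used to keep the dollar amounts exact.

```python
def raise_amount(salary, percent=5, floor=1_000):
    """Return the larger of a percentage raise and a flat minimum raise."""
    return max(salary * percent // 100, floor)


# Acme salaries from the earlier slide, in dollars.
salaries = {"President": 100_000, "Manager": 50_000,
            "Secretary": 40_000, "Toilet Cleaner": 10_000}

raises = {role: raise_amount(pay) for role, pay in salaries.items()}
print(raises)
# {'President': 5000, 'Manager': 2500, 'Secretary': 2000, 'Toilet Cleaner': 1000}
```

Under this rule only the lowest earner's raise changes, from $500 to $1,000.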
Descriptive Statistics 
Summarizing Data: 
Central Tendency (or Groups’ “Middle Values”) 
Mean 
Median 
Mode 
Variation (or Summary of Differences Within Groups) 
Range 
Interquartile Range 
Variance 
Standard Deviation 
– …Wait! There’s more
Box-Plots 
A way to graphically portray almost all the 
descriptive statistics at once is the box-plot. 
A box-plot shows:
• Upper and lower quartiles
• Mean
• Median
• Range
• Outliers (points more than 1.5 × IQR beyond the quartiles)
Box-Plots 
[Figure: box-plot of IQ on an axis from 80 to 180, labeled 82, 96.5, 106.5, 123.5, and 162, with M = 110.5.]
IQR = 27; there is no outlier.
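The "no outlier" conclusion can be checked with the 1.5 × IQR rule. This sketch assumes that 96.5 and 123.5 in the figure are the lower and upper quartiles (which is consistent with IQR = 27) and that 82 and 162 are the smallest and largest values.

```python
# Values read off the box-plot (interpretation assumed, see above).
q1, q3 = 96.5, 123.5
smallest, largest = 82, 162

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this count as outliers
upper_fence = q3 + 1.5 * iqr     # values above this count as outliers

has_outlier = smallest < lower_fence or largest > upper_fence

print(iqr)                       # 27.0
print(lower_fence, upper_fence)  # 56.0 164.0
print(has_outlier)               # False
```

The largest score, 162, falls just inside the upper fence of 164, so it is not flagged.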
Confidence Intervals 
• Confidence intervals express the range in 
which the true value of a population 
parameter (as estimated by the sample 
statistic) falls, with a high degree of 
confidence (usually 95% or 99%).
Standard Deviation Versus 
Standard Error 
• The mean of the sampling distribution of the mean equals the
population mean.
• The standard deviation of the sampling
distribution (also called the standard error of the
mean) equals the population standard deviation
divided by the square root of the sample size:
\( SE = \sigma / \sqrt{n} \)
• The standard error is an index of sampling error:
an estimate of how much any sample mean can be
expected to vary from the actual population
value.
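As a rough illustration of both slides, the sketch below estimates the standard error of the Class A mean and an approximate 95% confidence interval. Two assumptions are made here that the slides do not state: the sample standard deviation stands in for the unknown population value, and a normal approximation is used, although with only 13 cases a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
n = len(class_a)
y_bar = mean(class_a)

# Standard error: s.d. divided by the square root of the sample size.
# Assumption: the sample s.d. is used in place of the population s.d.
se = stdev(class_a) / sqrt(n)

# Normal-approximation multiplier for 95% confidence (about 1.96).
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = y_bar - z * se, y_bar + z * se

print(f"SE = {se:.2f}")                        # SE = 4.31
print(f"95% CI: {ci_low:.1f} to {ci_high:.1f}")  # 95% CI: 102.1 to 119.0
```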
