Torturing Numbers
Dr. Jason S.T. Deveau
Application Technology Specialist
OMAFRA, Simcoe Station
A Grower’s Guide to Descriptive Statistics
"If you torture the data long enough,
it will confess"
– Ronald Harry Coase, Economist
why do we need statistics?
• Descriptive statistics are math tools we use to:
  - Describe data
  - Find trends in data against variation
  - Determine if a sample represents a population
  - Draw conclusions about data
describing data
• In 1950, 25 university graduates were asked what
they earned in their first year of work
$45,000
$15,000
$10,000
$5,700
$5,000
$3,700
$3,000
$2,000
$2,000
$2,000
$10,000
$5,000
$2,000
$2,000
$5,000
$2,000
$3,700
$3,700
$3,700
$2,000
$2,000
$2,000
$2,000
$2,000
$2,000
• What do these data tell you?
describing data
• Here are the same data ordered from greatest to least and weighted to show how many times each value occurs in the data set
• Now what do the data tell you?
• What is the average income?
Income      Graduates
$45,000     1
$15,000     1
$10,000     2
$5,700      1
$5,000      3
$3,700      4
$3,000      1
$2,000      12
describing data
• BEWARE! The reported ‘average’ might depend on what you are meant to see. Which would you use on your taxes?
[Chart: the same income values, with the mean, median and mode marked]
MEAN (arithmetic average): $5,700
MEDIAN (middle value of the ordered data): $3,000
MODE (most frequent value): $2,000
• So, to really understand the
data set you need more than
just the ‘average’
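A minimal Python sketch of the three ‘averages’, using the 25 incomes from the graduate example. It relies only on the standard-library statistics module; the variable names are illustrative choices, not part of the original deck.

```python
from statistics import mean, median, mode

# First-year incomes of the 25 graduates, in the order shown on the slide.
incomes = [
    45000, 15000, 10000, 5700, 5000, 3700, 3000, 2000, 2000, 2000,
    10000, 5000, 2000, 2000, 5000, 2000, 3700, 3700, 3700, 2000,
    2000, 2000, 2000, 2000, 2000,
]

mean_income = mean(incomes)      # arithmetic average, pulled up by the $45,000 earner
median_income = median(incomes)  # middle value once the data are sorted
mode_income = mode(incomes)      # most frequent value

print(f"mean   = ${mean_income:,.0f}")
print(f"median = ${median_income:,.0f}")
print(f"mode   = ${mode_income:,.0f}")
```

The same data set supports three different headline figures, which is why a report should always say which ‘average’ it is quoting.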
spread and variability
• You need to know the spread of the data
• This histogram shows the ages of smart people who attend spray demos
• Is it typical for 90-year-olds to attend spray demos?
spread and variability
• When the data are symmetrical and bell-shaped, the mean and median are the same, and you have a special situation called a ‘normal’ curve
• On this symmetrical curve, the variability can be described using standard deviations (SD)
spread and variability
• SD is a way to express how far a data point is from the mean
• You can now say that 90-year-olds fall more than 2 SD from the mean, or that they make up less than 2.5% of the data set
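The "less than 2.5%" figure can be checked with a short Python sketch. The slides do not give the mean or SD of the age data, so this assumes a standard normal curve (mean 0, SD 1); the proportions are the same for any normal curve once distances are measured in SD.

```python
from statistics import NormalDist

# Assumption: a standard normal curve stands in for the age histogram.
curve = NormalDist(mu=0, sigma=1)

# Share of the data lying more than 2 SD above the mean (the upper tail).
upper_tail = 1 - curve.cdf(2)

# Share of the data lying within 2 SD of the mean on either side.
within_2_sd = curve.cdf(2) - curve.cdf(-2)

print(f"more than 2 SD above the mean: {upper_tail:.2%}")
print(f"within 2 SD of the mean:       {within_2_sd:.2%}")
```

About 95% of a normal data set sits within 2 SD of the mean, which leaves a little over 2% in each tail.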
spread and variability
• If we collapse the whole data set to one bar, we
can show the mean with some measure of
variability (std dev, std error, etc.)
• Without some indication of variability, you cannot
effectively compare two data sets
spread and variability
• Often, data sets are skewed. Here is the effect of a
new herbicide on quackgrass.
• Means and standard deviations don’t help here…
spread and variability
• Perhaps the best way to describe any data set is with five numbers: Minimum, Q1, Median, Q3, Maximum. This helps when comparing data sets, and when there are oddities called outliers
[Box plot: Min, Q1, Median, Q3, Max, with 25% of the data in each of the four sections and an outlier marked *]
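As a sketch of the five-number summary, here it is computed in Python for the graduate income data from earlier in the deck. The function names are hypothetical, and the outlier rule (anything more than 1.5 times the Q1-to-Q3 spread beyond the quartiles) is a common convention assumed here; the slides do not say which rule was used.

```python
from statistics import quantiles

incomes = [
    45000, 15000, 10000, 5700, 5000, 3700, 3000, 2000, 2000, 2000,
    10000, 5000, 2000, 2000, 5000, 2000, 3700, 3700, 3700, 2000,
    2000, 2000, 2000, 2000, 2000,
]

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4)  # the three cut points between quarters
    return min(data), q1, q2, q3, max(data)

def flag_outliers(data, k=1.5):
    """Assumed rule: flag values more than k * (Q3 - Q1) outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(data)
    spread = q3 - q1
    low_fence, high_fence = q1 - k * spread, q3 + k * spread
    return sorted(x for x in data if x < low_fence or x > high_fence)

summary = five_number_summary(incomes)
outliers = flag_outliers(incomes)
print("min, Q1, median, Q3, max:", summary)
print("flagged as outliers:", outliers)
```

For these incomes the minimum and Q1 are both $2,000, the median is $3,000 and Q3 is $5,000, so the $45,000 earner stands far outside the bulk of the data.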
Torturing numbers - Descriptive Statistics for Growers (2013)
a sample study
• Researchers want to know which of three fertilizers produces the highest wheat yield in kg/plot
a sample study
• They design a study with three treatments and
five replications for each treatment
3 Treatments (Fertilizers 1, 2 and 3)
5 Replicates
a sample study
• Could a nearby forest or river be a confounding variable?
• Variables like soil type and other local influences
may have unexpected impacts…
a sample study
• This is why a good study is randomized, to
defeat potentially confounding variables
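One way to randomize the study is sketched below in Python: each of the 15 plots gets a treatment by shuffling, so that no fertilizer is systematically closer to the forest or river. This is a completely randomized layout, assumed here for illustration; the slides do not say which design the researchers used, and the function name and seed are hypothetical.

```python
import random

def randomize_plots(treatments, replicates, seed=None):
    """Return a shuffled list assigning one treatment to each plot."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    layout = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(layout)
    return layout

treatments = ["Fertilizer 1", "Fertilizer 2", "Fertilizer 3"]
layout = randomize_plots(treatments, replicates=5, seed=2013)

for plot_number, treatment in enumerate(layout, start=1):
    print(f"plot {plot_number:2d}: {treatment}")
```

Every treatment still appears exactly five times; only its position in the field is left to chance.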
• Does the sample plot in our study represent all the wheat in all the world?
[Diagram: POPULATION and SAMPLE]
uncertainty
• With all the unknown variables, there will always be a degree of uncertainty about whether our sample represents the population
• That’s why the more samples we have, the more
confident we are that our study represents the
population
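The effect of sample size can be sketched with the standard error of the mean, which is the SD divided by the square root of the number of samples. The SD of 6.0 kg/plot below is a made-up value for illustration only.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean for n samples with standard deviation sd."""
    return sd / sqrt(n)

assumed_sd = 6.0  # hypothetical SD in kg/plot, not taken from the study

for n in (5, 20, 80):
    print(f"n = {n:2d}: standard error = {standard_error(assumed_sd, n):.2f} kg/plot")
```

Quadrupling the number of samples halves the standard error, so confidence grows with sample size, but with diminishing returns.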
confidence
• Any confidence level could be used, but 95% is often chosen
• This means that if you repeated the study many times, about 95% of the intervals calculated this way would contain the true population value
• BEWARE reports with no confidence interval
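A rough Python sketch of a 95% confidence interval for a mean, using the five Fertilizer 1 yields from the sample study. It assumes the yields are roughly normally distributed and uses the t-based interval; the critical value 2.776 (two-sided 95%, 4 degrees of freedom) is taken from a standard t table and hard-coded, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

fertilizer_1 = [64.8, 60.5, 63.4, 48.2, 55.5]  # yields in kg/plot

# Assumption: two-sided 95% t critical value for n - 1 = 4 degrees of freedom.
T_CRITICAL_95_DF4 = 2.776

def confidence_interval(sample, t_critical):
    """Return (low, high) limits of the t-based interval for the mean."""
    centre = mean(sample)
    margin = t_critical * stdev(sample) / sqrt(len(sample))
    return centre - margin, centre + margin

low, high = confidence_interval(fertilizer_1, T_CRITICAL_95_DF4)
print(f"mean = {mean(fertilizer_1):.2f} kg/plot")
print(f"95% confidence interval: {low:.1f} to {high:.1f} kg/plot")
```

With only five replicates the interval is wide, roughly 50 to 67 kg/plot around a mean of about 58.5, which is exactly the information a bare ‘average’ hides.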
two ways to present data
Wheat yield (kg/plot), five replicates per treatment
Fertilizer 1    Fertilizer 2    Fertilizer 3
64.8            56.5            65.8
60.5            53.8            73.2
63.4            59.4            59.5
48.2            61.1            66.3
55.5            58.8            70.2
• Tables are the preferred way to show data, but
graphs paint a quick, easy and seductive picture
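Before turning a table like this into a graph, it helps to summarize each column with its sample size, mean and a measure of variability. The Python sketch below does that for the yields in the table; the dictionary layout and function name are illustrative choices.

```python
from math import sqrt
from statistics import mean, stdev

# Yields in kg/plot, copied from the table above.
yields = {
    "Fertilizer 1": [64.8, 60.5, 63.4, 48.2, 55.5],
    "Fertilizer 2": [56.5, 53.8, 59.4, 61.1, 58.8],
    "Fertilizer 3": [65.8, 73.2, 59.5, 66.3, 70.2],
}

def summarize(sample):
    """Return sample size, mean, standard deviation and standard error."""
    n = len(sample)
    sd = stdev(sample)  # sample SD (divides by n - 1)
    return {"n": n, "mean": mean(sample), "sd": sd, "se": sd / sqrt(n)}

summary = {name: summarize(values) for name, values in yields.items()}

for name, stats in summary.items():
    print(f"{name}: n={stats['n']}, mean={stats['mean']:.2f}, "
          f"SD={stats['sd']:.2f}, SE={stats['se']:.2f}")
```

Fertilizer 3 has the highest mean (67.0 kg/plot), but the spread within each treatment is what tells you whether that difference is worth believing.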
drawing conclusions
• A presenter may want you to see a relationship
between two variables
• Fertilizer 3 appears to increase the average yield
of wheat – but what kind of average is this? How
big was the sample? Where is the indication of
variability? Where is the confidence interval?
• Bad stats and bad experimental design may lead to bad conclusions
[Chart label: 2 SD]
drawing conclusions
• Correlation does not imply causation
• A faulty argument: “The more firemen fighting a fire, the bigger the fire is observed to be. Therefore more firemen cause an increase in the size of a fire.”
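The firemen example can be sketched as a small simulation in Python. All numbers here are invented: fire sizes are random, and the crew size is assumed to be set by the size of the fire plus some noise. The correlation comes out very strong, yet the code itself shows which way the cause runs.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rng = random.Random(1954)  # fixed seed so the sketch is repeatable

# Hypothetical fire sizes (arbitrary units).
fire_sizes = [rng.uniform(1, 100) for _ in range(200)]

# Assumption: bigger fires get more firefighters sent to them, not the reverse.
firefighters = [2 + 0.5 * size + rng.gauss(0, 1) for size in fire_sizes]

r = pearson_r(firefighters, fire_sizes)
print(f"correlation between crew size and fire size: r = {r:.3f}")
```

The correlation is identical whichever variable is listed first, so the number alone cannot say what causes what; that has to come from knowing how the data were generated.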
it’s all in how you show it
• Often, a presenter wants to lead you to a conclusion. Newspapers, TV and online articles should be scrutinized!
• BEWARE:
  • “This is not a scientific poll…”
  • “These results may not be representative of the population”
  • “…based on a list of those that responded”
  • “Data showed a trend but was not statistically significant” (I’ve used this one!!!)
it’s all in how you show it
• Pies are for eating, and possibly throwing…
• It’s very hard to see differences
• BEWARE CHARTJUNK!
it’s all in how you show it
• Amusing graphics are nothing but distractions
• Again, it’s very hard to see differences
• BEWARE CHARTJUNK!
it’s all in how you show it
• Here are the same population growth data shown on two scales. Which would you use to demonstrate rapid growth?
• BEWARE tricky scales!
it’s all in how you show it
• BEWARE statements with no context. Here’s a
made-up example, but it’s no worse than other
‘factoids’ I’ve encountered
Did you know that even speaking to someone that
once sprayed pesticides DOUBLES your chance
of getting cancer?!
• Your odds go from 0.000000001:1
to 0.000000002:1
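The arithmetic behind the made-up factoid, as a short Python sketch. The two odds are the ones on the slide; the helper function and variable names are illustrative.

```python
baseline_odds = 0.000000001  # odds before, from the slide (x:1)
exposed_odds = 0.000000002   # odds after, from the slide (x:1)

def odds_to_probability(odds):
    """Convert odds of the form x:1 into a probability."""
    return odds / (1 + odds)

# Relative change: the headline number.
relative_change = exposed_odds / baseline_odds

# Absolute change: the number that gives the headline its context.
absolute_increase = (odds_to_probability(exposed_odds)
                     - odds_to_probability(baseline_odds))

print(f"relative change:   {relative_change:.1f}x")
print(f"absolute increase: {absolute_increase:.1e}")
```

"DOUBLES" is technically true, but the absolute increase is about one in a billion, which is the context the headline leaves out.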
conclusion
• We started by stating that descriptive statistics
are tools
• Like any tool, stats can be misused
(intentionally or unintentionally)
• Maintain a healthy scepticism and question
charts, tables and conclusions where
insufficient information is provided
…one last joke
Three statisticians were hunting when they came across a big buck. The first statistician fired, but missed by a meter to the left. The second statistician fired, but missed by a meter to the right. The third statistician threw down his rifle and cheered “We got it!”
references
- The Cartoon Guide to Statistics (1993), Larry Gonick and Woollcott Smith
- How to Lie with Statistics (1954), Darrell Huff
Tom Wolf
@nozzle_guy
Jason Deveau
@spray_guy
Learn more about spraying
www.sprayers101.com
