THE BASICS


What are statistics?

Statistics can summarize and simplify large amounts
of numerical data.

Using statistics one can draw conclusions about data.

Statistics is a discipline that examines data and can
calculate numerical estimates of "true" values.

Statistics cannot prove anything; estimates are
normally presented in probabilistic terms (e.g. we are
95% confident ...)

Statistics cannot make bad data better - "garbage
in, garbage out".


Why use statistics?

We want to characterize something (species, community
composition, stratigraphic range, average grain size,
etc.) for which we have only a limited sample. We
must therefore estimate the "true" parameters by
employing statistical methods.
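
As a minimal sketch of estimating a "true" parameter from a limited sample, the Python below simulates a population of grain sizes, draws 30 of them, and reports the sample mean with an approximate 95% interval. The population (mean 2.0 mm, standard deviation 0.5 mm), the sample size and the normal critical value 1.96 are all assumptions chosen for illustration, not values from these notes.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical "population": grain sizes (mm) we could never measure in full.
population = [random.gauss(2.0, 0.5) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# The limited sample we actually have.
sample = random.sample(population, 30)
n = len(sample)
mean = statistics.fmean(sample)
std_err = statistics.stdev(sample) / math.sqrt(n)

# Approximate 95% interval using the normal critical value 1.96
# (a t critical value would give a slightly wider interval for n = 30).
low, high = mean - 1.96 * std_err, mean + 1.96 * std_err

print(f"sample mean = {mean:.2f} mm, 95% interval = {low:.2f} to {high:.2f} mm")
print(f"population mean (normally unknown) = {true_mean:.2f} mm")
```

In real work the population mean is not available for comparison; it is printed here only because the population was simulated.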

Statistics may reveal underlying patterns in data not
normally observable (especially true in multivariate
analyses).

If used correctly, statistics can separate the probable
from the possible.


Types of Data

• ratio-scale data. Measurements along a
  continuous scale with a true zero (e.g.
  lengths or widths in mm).

• interval-scale data. Same as ratio, but the zero
  point of the scale is arbitrary (e.g. temperature
  in °C).

• ordinal-scale data. Generally used for irregularly
  scaled data converted to ranks or relative position
  (e.g. position of stratigraphic stages).

• discrete data. Not continuous, usually counts
  (e.g. number of individuals per sample).

• nominal or categorical data. Includes binary
  data (e.g. presence/absence) or group data (e.g.
  sandstone/siltstone/mudstone).

Some Basic Definitions

• variable: Anything that varies and can be
  measured (e.g. measurement, property, quantity,
  or attribute). Determining the relationships
  between variables is the realm of R-mode analysis.

• object: Unit of study on which variables can be
  measured (e.g. case, individual, specimen).
  Determining the relationships between objects is
  the realm of Q-mode analysis.

• population: The total set of measurements. The
  limits of the population should be designated
  before any analysis. (e.g. the size of all specimens
  of brachiopod species x.)

  Usually the population is unknowable and must be
  estimated by a sample.

• sample: Collection of objects which are a subset
  of the population of interest and are taken as
  representative of the population. (e.g. the size of
  20 specimens of brachiopod species x from a
  particular outcrop.)


Note: if the sample is not representative of the
  population, it is said to be biased, e.g. collecting
  only large specimens of brachiopod species x, and
  only those within arm's reach on an outcrop, would
  lead to a biased sample.

• sample size: How big must the sample be to
  represent the population? There is no single answer,
  as it depends upon the variability of the population
  and the degree of precision one wants to achieve in
  answering the question.

 more on how to determine sample size later ....

• parametric statistics: Statistical procedures
  used on interval or ratio data. Usually many
  assumptions must be made (e.g. that the data are
  normally distributed).

• nonparametric statistics: Statistical procedures
  based on ranks, used on ordinal data. Fewer
  assumptions are necessary.

• precision: Reliability of a measurement. Usually
  determined by taking repeated measurements.




• accuracy: The closeness of a measurement to the
  true value. Usually unknown in biology, geology and
  paleontology, but can sometimes be
  determined from a known standard.
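
The difference between precision and accuracy can be sketched with repeated measurements of a known standard. The numbers below are hypothetical (a 10.00 mm standard measured five times) and serve only as an illustration: the spread of the repeats reflects precision, while the offset of their mean from the standard reflects accuracy.

```python
import statistics

# Hypothetical data: a standard of known length measured five times (mm).
TRUE_VALUE = 10.00
repeats = [10.21, 10.19, 10.20, 10.22, 10.18]

mean = statistics.fmean(repeats)
spread = statistics.stdev(repeats)  # precision: small spread = repeatable
offset = mean - TRUE_VALUE          # accuracy: small offset = close to truth

print(f"spread of repeats (precision): {spread:.3f} mm")
print(f"offset from standard (accuracy): {offset:+.2f} mm")
```

In this made-up case the measurements agree closely with each other but sit well away from the standard, so they are precise without being accurate.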


What statistics to use? The answer depends on ...

• what questions you want to ask

• types and quality of data available (e.g.
  parametric or nonparametric)

• whether the analysis is descriptive, comparative,
  or classificatory

• whether it involves one or more samples (one-way or
  two-way analyses) and one or more variables (e.g.
  univariate, bivariate, or multivariate)

• the topic of this course!




SIGNIFICANT FIGURES


•   Should always maintain significant digits through
    all calculations


•   The last digit should imply precision and is an
    estimate


•   For example:

     45.346 implies any number between 45.3455
     and 45.3465


•   The last digit, including 0 (zero) to the right of
    decimal points, is always significant!


•   Should use enough significant figures to have
    between 30 and 300 divisions (unit steps) between
    the lowest and highest measurements


•   This is because, with fewer than 30 divisions, an
    error of 1 in the last digit is more than 1/30 (about
    3%) of the range, which is more than is acceptable
    for most statistical tests
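
The two ideas above (the range implied by the last digit, and the number of divisions between the lowest and highest measurements) can be sketched in Python. The function names `implied_range` and `unit_steps` are hypothetical helpers written for this example, and the measurements 4.2 and 9.7 are made up. Values are passed as strings so that trailing zeros, which are significant, are not lost.

```python
from decimal import Decimal


def implied_range(recorded: str) -> tuple[Decimal, Decimal]:
    """Return the interval implied by the last recorded digit."""
    value = Decimal(recorded)
    # One unit in the last recorded decimal place, e.g. 0.001 for "45.346".
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    half = unit / 2
    return value - half, value + half


def unit_steps(lowest: str, highest: str) -> int:
    """Count divisions between two measurements recorded to the same place."""
    low, high = Decimal(lowest), Decimal(highest)
    unit = Decimal(1).scaleb(low.as_tuple().exponent)
    return int((high - low) / unit)


low, high = implied_range("45.346")
print(f"45.346 implies a value between {low} and {high}")

steps = unit_steps("4.2", "9.7")
print(f"4.2 to 9.7 recorded to 0.1 gives {steps} divisions")
print("within the 30-300 guideline" if 30 <= steps <= 300 else "outside the guideline")
```

If the same measurements were recorded only to whole units (4 to 10), there would be just 6 divisions, far below the guideline.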



ROUNDING NUMBERS
•   Should round numbers to get the desired number
    of significant digits


•   The Rules:


     – The last retained digit is not changed if the
       dropped digits are exactly half a unit of that
       digit or less (e.g. a lone 5, or 37)


     – Add 1 to the last retained digit if the dropped
       digits are more than half a unit (e.g. 52 or 58)


      Number       Significant Figures   Answer

      26.58                 2            27
      133.7137              5            133.71
      0.03725               3            0.0372
      0.037152              3            0.0372
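
The rounding rules above can be sketched in Python with the standard `decimal` module. The helper name `round_sig` is hypothetical, and the use of `ROUND_HALF_DOWN` is an assumption: it is the standard rounding mode that leaves the last retained digit unchanged when the dropped part is exactly half a unit, which matches the rule as stated here. Numbers are passed as strings to avoid binary floating-point surprises.

```python
from decimal import Decimal, ROUND_HALF_DOWN


def round_sig(number: str, sig_figs: int) -> Decimal:
    """Round to sig_figs significant figures; an exact half is not rounded up."""
    value = Decimal(number)
    if value == 0:
        return value
    # adjusted() is the power of ten of the leading digit, e.g. 1 for 26.58.
    last_place = value.adjusted() - sig_figs + 1
    quantum = Decimal(1).scaleb(last_place)
    return value.quantize(quantum, rounding=ROUND_HALF_DOWN)


for number, sig_figs in [("26.58", 2), ("133.7137", 5),
                         ("0.03725", 3), ("0.037152", 3)]:
    print(number, sig_figs, round_sig(number, sig_figs))
```

Note that Python's built-in `round()` follows a different convention (ties go to the nearest even digit) and works on binary floats, so it will not always agree with the rule given in these notes.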



