Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 1.1
Lesson 2:
STATISTICS
Data Collection and Sampling
Recall…
Statistics is a tool for converting data into information:
Data → Statistics → Information
• But where then does data come from?
• How is it gathered?
• How do we ensure it is accurate?
• Is the data reliable?
• Is it representative of the population from which it was drawn?
Methods of Collecting Data…
There are many methods used to collect or obtain data
for statistical analysis. Three of the most popular
methods are:
• Direct Observation,
• Experiments, and
• Surveys.
Surveys…
A survey solicits information from people.
Examples: polls, pre-election polls, and marketing surveys.
Surveys may be administered in a variety of ways,
for example:
• Personal Interview,
• Telephone Interview,
• Internet, and
• Self-Administered Questionnaire.
Questionnaire Design…
Over the years, a lot of thought has been put into the science of the design
of survey questions. Key design principles:
1. Keep the questionnaire as short as possible.
2. Ask short, simple, and clearly worded questions.
3. Start with demographic questions to help respondents get started
comfortably.
4. Use dichotomous (yes/no) and multiple-choice questions.
5. Use open-ended questions cautiously.
6. Avoid leading questions.
7. Pretest a questionnaire on a small number of people.
8. Think about the way you intend to use the collected data when preparing
the questionnaire.
Sampling Plans…
A sampling plan is just a method or procedure for specifying how a
sample will be taken from a population.
We will focus our attention on these three methods:
• Simple Random Sampling,
• Stratified Random Sampling, and
• Cluster Sampling.
Simple random sampling is, by far, the most commonly used of the three.
Simple Random Sampling…
A simple random sample is a sample selected in such a way that
every possible sample of the same size is equally likely to be
chosen.
Drawing three names from a hat containing all the names of the
students in the class is an example of a simple random sample: any
group of three names is as likely to be picked as any other group
of three names.
VERY EASY TO DEFINE!
VERY, VERY DIFFICULT TO DO!
Simple Random Sampling…
A government income tax auditor must choose a sample of 5 of 11
returns to audit. This can be done in many different ways; one way
is to generate a random number for each person:

Person    Random #
baker     0.87487
george    0.89068
ralph     0.11597
mary      0.58635
sally     0.34346
joe       0.24662
andrea    0.47609
mark      0.08350
greg      0.53542
aaron     0.37239
kim       0.73809

Then sort by random number and take the first five:

Rank  Person    Sorted Random #
1     mark      0.08350
2     ralph     0.11597
3     joe       0.24662
4     sally     0.34346
5     aaron     0.37239
      andrea    0.47609
      greg      0.53542
      mary      0.58635
      kim       0.73809
      baker     0.87487
      george    0.89068
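The sort-by-random-number procedure for the auditor example can be sketched in Python. This is only an illustration under stated assumptions: the function name `simple_random_sample` and the seed value are made up for the example, and the random numbers it generates will not match the ones listed on the slide.

```python
import random

def simple_random_sample(names, k, seed=None):
    """Pick k names by attaching a random number to each name and sorting."""
    rng = random.Random(seed)  # seed is an assumption, used only for repeatability
    keyed = [(rng.random(), name) for name in names]
    keyed.sort()  # smallest random numbers come first
    return [name for _, name in keyed[:k]]

returns = ["baker", "george", "ralph", "mary", "sally", "joe",
           "andrea", "mark", "greg", "aaron", "kim"]

audit = simple_random_sample(returns, 5, seed=2005)
print(audit)  # five distinct names chosen from the eleven
```

The standard library call `random.sample(returns, 5)` draws the same kind of sample (without replacement) in a single step.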
Stratified Random Sampling…
A stratified random sample is obtained by separating the population
into mutually exclusive sets, or strata, and then drawing simple
random samples from each stratum.
Strata 1 : Gender
Male
Female
Strata 2 : Age
< 20
20-30
31-40
41-50
51-60
> 60
Strata 3 : Occupation
professional
clerical
blue collar
other
We can acquire information about the total population, make inferences
within a stratum, or make comparisons across strata.
Stratified Random Sampling…
After the population has been stratified, we can use simple random
sampling to generate the complete sample:
If we only have sufficient resources to sample 400 people in total,
we would draw 100 of them (25%) from the low-income group…
…if we are sampling 1,000 people, we would draw
50 of them (5%) from the high-income group.
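The two figures above (100 of 400 and 50 of 1,000) are consistent with proportional allocation, in which each stratum's share of the sample matches its share of the population. The sketch below assumes that reading. Only the 25% and 5% shares are implied by the text; the two middle income groups and their shares are hypothetical placeholders, as is the function name.

```python
def proportional_allocation(shares, total_n):
    """Split total_n across strata in proportion to population shares."""
    return {stratum: round(share * total_n) for stratum, share in shares.items()}

# Only the 25% and 5% shares are implied by the text (100/400 and 50/1000).
# The two middle groups and their shares are hypothetical placeholders.
shares = {
    "low income": 0.25,
    "lower-middle income (hypothetical)": 0.40,
    "upper-middle income (hypothetical)": 0.30,
    "high income": 0.05,
}

print(proportional_allocation(shares, 400)["low income"])    # 100
print(proportional_allocation(shares, 1000)["high income"])  # 50
```

Within each stratum, the allocated number of people would then be drawn by simple random sampling. With other shares, rounding can make the stratum counts add up to slightly more or less than the total, so the counts may need a small adjustment.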
Cluster Sampling…
A cluster sample is a simple random sample of groups or clusters of
elements (vs. a simple random sample of individual objects).
This method is useful when it is difficult or costly to develop a
complete list of the population members or when the population
elements are widely dispersed geographically. It was used more in the
“old days”.
Cluster sampling may increase sampling error due to similarities
among cluster members.
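A minimal sketch of cluster sampling, assuming a made-up population of city blocks and households (the block names, sizes, seed, and function name are all hypothetical): whole clusters are chosen by simple random sampling, and every element of each chosen cluster enters the sample.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and return every element in them."""
    rng = random.Random(seed)  # seed is an assumption, used only for repeatability
    chosen = rng.sample(sorted(clusters), n_clusters)  # simple random sample of cluster labels
    members = [m for label in chosen for m in clusters[label]]
    return chosen, members

# Hypothetical population: 6 city blocks with 4 households each.
blocks = {f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 5)]
          for b in range(1, 7)}

chosen, households = cluster_sample(blocks, 2, seed=1)
print(chosen)
print(len(households))  # 8: every household in the 2 selected blocks
```

Only a list of blocks is needed up front, not a list of every household, which is the practical advantage described above.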
Sample Size…
Numerical techniques for determining sample sizes
will be described later, but suffice it to say that the
larger the sample size is, the more accurate we
can expect the sample estimates to be.
Sampling and Nonsampling Errors…
Two major types of error can arise when a sample of observations is
taken from a population:
sampling error and nonsampling error.
• Sampling error refers to differences between the sample and the
population that exist only because of the observations that happened to
be selected for the sample. It is random, and we have no control over it.
• Nonsampling errors are more serious and are due to mistakes made in
the acquisition of data or to the sample observations being selected
improperly. They are most likely caused by poor planning, sloppy work, an act of
the Goddess of Statistics, etc.
Sampling Error…
Sampling error refers to differences between the sample and the
population that exist only because of the observations that
happened to be selected for the sample.
Increasing the sample size will reduce this type of error.
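A small simulation can illustrate the effect of sample size on sampling error. Everything in it is an assumption made for the example: the population is 10,000 simulated values, the seed is arbitrary, and sampling error is summarised as the average absolute gap between the sample mean and the population mean over repeated simple random samples.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, trials, rng):
    """Average |sample mean - population mean| over repeated simple random samples."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # sampling without replacement
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

rng = random.Random(42)  # fixed seed so the run is repeatable (assumption)
# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]

small = mean_abs_sampling_error(population, 25, 500, rng)
large = mean_abs_sampling_error(population, 400, 500, rng)
print(f"n=25:  {small:.3f}")
print(f"n=400: {large:.3f}")
```

The average error for the larger sample should come out clearly smaller than for the smaller one, although the exact values depend on the simulated population.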
Nonsampling Error…
Nonsampling errors are more serious and are due to mistakes made
in the acquisition of data or due to the sample observations being
selected improperly. Three types of nonsampling errors:
• Errors in data acquisition,
• Nonresponse errors, and
• Selection bias.
Note: increasing the sample size will not reduce this type of error.
Errors in data acquisition…
…arise from the recording of incorrect responses, due to:
— incorrect measurements being taken because of faulty equipment,
— mistakes made during transcription from primary sources,
— inaccurate recording of data due to misinterpretation of terms, or
— inaccurate responses to questions concerning sensitive issues.
Nonresponse Error…
…refers to error (or bias) introduced when responses are not
obtained from some members of the sample; that is, the sample
observations that are collected may not be representative of the
target population.
As mentioned earlier, the Response Rate (i.e., the proportion of all
people selected who complete the survey) is a key survey parameter
and helps in understanding the validity of the survey and the
sources of nonresponse error.
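The response rate defined above is a simple ratio. The sketch below uses made-up counts and a hypothetical function name.

```python
def response_rate(completed, selected):
    """Proportion of all people selected who completed the survey."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return completed / selected

# Hypothetical survey: 1,000 people selected, 380 completed questionnaires.
rate = response_rate(380, 1000)
print(f"{rate:.0%}")  # 38%
```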
Selection Bias…
…occurs when the sampling plan is such that some members of the
target population cannot possibly be selected for inclusion in the
sample.
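The simulation below sketches why selection bias, as a nonsampling error, does not shrink as the sample grows. It rests entirely on assumptions made for illustration: a hypothetical population in which 30% of members can never be selected by the sampling plan, and in which that unreachable group has a lower average on the measured variable.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for a repeatable run (assumption)

# Hypothetical population: 70% reachable by the sampling plan, 30% not.
# The unreachable group has a lower average on the measured variable.
reachable = [rng.gauss(60, 10) for _ in range(7_000)]
unreachable = [rng.gauss(40, 10) for _ in range(3_000)]
true_mean = statistics.fmean(reachable + unreachable)

def biased_estimate(n):
    """Sample mean when only reachable members can ever be selected."""
    return statistics.fmean(rng.sample(reachable, n))

# Print the gap between the estimate and the true mean for growing samples.
for n in (50, 500, 5_000):
    print(n, round(biased_estimate(n) - true_mean, 2))
```

In this setup the estimate stays well above the true population mean at every sample size, because the excluded group never has a chance of being drawn.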

Editor's Notes

  • #2: April 13, 2024