2. CONTENT
1.1 What is statistics?
1.2 Need for Statistics
1.3 Statistical Problem Solving Methodology
1.4 Role of Computer in Statistics
3. OBJECTIVE
By the end of this chapter, you should be able to
Define the meaning of statistics, population, sample,
parameter, statistic, descriptive statistics and inferential
statistics.
Understand and explain why a knowledge of statistics is
needed
Outline the basic steps in the statistical problem solving
methodology.
Identify the various methods used to obtain samples.
Discuss the role of computers and data analysis software
in statistical work.
4. 1.1 What is Statistics?
Most people become familiar with probability and statistics
through radio, television, newspapers, and magazines. For
example, the following statements were found in newspapers:
• Based on the 2000 census, 40.5 million households have two vehicles.
• The average annual salary for a professional football player for the year 2001
was $1,100,500.
• The average cost of a wedding is nearly RM10,000.
• In the USA, the median salary for men with a bachelor’s degree is $49,982, while the
median salary for women with a bachelor’s degree is $35,408.
• Based on a survey of 250,000 individual auto leases signed from March 1
through April 15, 2002, 73% were for a Jaguar.
• Women who eat fish once a week are 29% less likely to develop heart disease.
5. Statistics
The science of conducting studies to collect, organize, summarize, analyze, present, interpret and draw conclusions from data.
Data
Any values (observations or measurements) that have been collected.
6. The basic idea behind all statistical methods of data analysis
is to make inferences about a population by studying a small
sample chosen from it.
Population
The complete collection of measurements, outcomes, objects
or individuals under study
Sample
A subset of a population, containing the objects or outcomes
that are actually observed
Parameter
A number that describes a population characteristic
Statistic
A number that describes a sample characteristic
Tangible population
Always finite; the total number of members is fixed and could be listed. After an item is sampled, the population size decreases by 1.
Conceptual population
Consists of all the values that might possibly have been observed, and has an unlimited number of members
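The difference between a parameter and a statistic can be seen in a small simulation. This is only a sketch: the population below is synthetic (hypothetical fitness scores for 20,000 students), and the names `population`, `sample`, `population_mean` and `sample_mean` are made up for illustration.

```python
import random
import statistics

# Hypothetical tangible population: fitness scores for 20,000 students
# (synthetic numbers, generated only for illustration).
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(20_000)]

# Parameter: a number that describes the whole population.
population_mean = statistics.mean(population)

# Sample: the subset that is actually observed (drawn without replacement).
sample = rng.sample(population, 100)

# Statistic: a number that describes the sample.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values are usually close but not equal. In practice only the statistic is available, and it is used to make an inference about the parameter.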
7. Descriptive & Inferential Statistics
Descriptive statistics
Consists of the collection, organization, classification, summarization, and presentation of data obtained from the sample.
Used to describe the characteristics of the sample
Used to determine whether the sample represents the target population by comparing the sample statistic and the population parameter
Inferential statistics
Consists of generalizing from samples to populations, performing estimations and hypothesis testing, determining relationships among variables, and making predictions.
Used when we want to draw a conclusion from the data obtained from the sample
Used to describe, infer, estimate or approximate the characteristics of the target population
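As a rough illustration of the two branches, the sketch below first describes a sample and then uses it to estimate a population mean. The data are hypothetical, and the interval assumes a normal approximation with z = 1.96, which is only one of several possible choices.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 10 items drawn from a larger population.
sample = [50.1, 49.8, 50.4, 50.0, 49.7, 50.3, 50.2, 49.9, 50.1, 49.5]

# Descriptive statistics: summarize the sample itself.
n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: estimate the population mean from the sample.
# Assumption: normal approximation with z = 1.96 for roughly 95% confidence;
# for a sample this small a t critical value would be more appropriate.
margin = 1.96 * sd / math.sqrt(n)
ci = (mean - margin, mean + margin)

print(f"sample mean = {mean:.2f}, sample sd = {sd:.3f}")
print(f"approximate 95% interval for the population mean: {ci[0]:.2f} to {ci[1]:.2f}")
```

The first two numbers only describe the ten observed items. The interval is a statement about the population that the items came from.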
8. An overview of descriptive statistics and statistical inference
(Flowchart)
START
Descriptive statistics:
1. Gathering of data
2. Classification, summarization, and processing of data
3. Presentation and communication of summarized information
Decision: Is the information from a sample?
No: Use census data to analyze the population characteristic under study
Yes: Continue to statistical inference
Statistical inference:
4. Use sample information to make inferences about the population
5. Draw conclusions about the population characteristic (parameter) under study
STOP
9. 1.2 Need for Statistics
You need a knowledge of statistics to help you:
1. Describe and understand numerical relationships
2. Make better decisions
10. Describing relationship between
variables
1. A management consultant wants to compare a client’s
investment return for this year with related figures from last
year. He summarizes masses of revenue and cost data
from both periods and based on his findings, presents his
recommendations to his client.
2. A college admission director needs to find an effective way
of selecting student applicants. He designs a statistical study
to see if there is a significant relationship between SPM
results and the GPA achieved by freshmen at his school. If
there is a strong relationship, a high SPM result will become
an important criterion for acceptance.
11. Aiding in Decision Making
1. Suppose that the manager of Big-Wig Executive Hair Stylist,
Hugo Bald, has advertised that 90% of the firm’s customers
are satisfied with the company’s services. If Pamela, a
consumer activist, feels that this is an exaggerated
statement that might require legal action, she can use
statistical inference techniques to decide whether or not to
sue Hugo.
2. Students and professional people can also use the
knowledge gained from studying statistics to become better
consumers and citizens. For example, they can make
intelligent decisions about what products to purchase based
on consumer studies, about government spending based on
utilization studies, and so on.
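One inference technique Pamela could apply to the 90% claim is a one-sided test of a proportion. The sketch below assumes a hypothetical survey of 200 customers with 168 satisfied; these figures are invented for illustration and the slides do not specify which test to use.

```python
import math
from statistics import NormalDist

claimed_p = 0.90          # advertised satisfaction rate (null hypothesis)
n = 200                   # hypothetical number of customers surveyed
satisfied = 168           # hypothetical number who said they were satisfied

p_hat = satisfied / n     # sample proportion (a statistic)

# One-sided z-test: H0: p = 0.90 versus H1: p < 0.90
standard_error = math.sqrt(claimed_p * (1 - claimed_p) / n)
z = (p_hat - claimed_p) / standard_error
p_value = NormalDist().cdf(z)   # probability of a z this low if H0 were true

print(f"sample proportion = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("The sample gives evidence that fewer than 90% are satisfied.")
else:
    print("The sample does not contradict the 90% claim.")
```

With these invented figures the p-value falls below 0.05, so the sample would cast doubt on the advertised rate. A different sample could easily support the claim.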
12. 1.3 Statistical Problem Solving Methodology
6 Basic Steps
1. Identifying the problem or opportunity
2. Deciding on the method of data collection
3. Collecting the data
4. Classifying and summarizing the data
5. Presenting and analyzing the data
6. Making the decision
13. STEP 1
Identifying the problem or opportunity
Must clearly understand & correctly define exactly what it is
that the study is to accomplish
If not, time & effort are wasted
Is the goal to study some population?
Is it to impose some treatment on the group & then gauge the
response?
Can the study goal be achieved through mere counts or
measurements of the group?
Must an experiment be performed on the group?
If samples are needed, how large should they be, and how should they be
taken?
14. STEP 2
Deciding on the Method of Data Collection
Data must be gathered that are accurate, as
complete as possible & relevant to the
problem
Data can be obtained in 3 ways
1. Data that are made available by others
(internal, external, primary or secondary data)
2. Data resulting from an experiment
(experimental study)
3. Data collected in an observational study
(observation, survey, questionnaire)
15. STEP 3
Collecting the data
Nonprobability sample
One in which the judgment of the experimenter,
the method by which the data are collected, or
other factors could affect the results of the
sample
Probability sample
One in which the chance of selection of each
item in the population is known before the
sample is picked
16. Nonprobability data samples
Judgment samples
Based on the opinion of one or more experts
Ex: A political campaign manager intuitively picks certain
voting districts as reliable places to measure the public
opinion of his candidate
Voluntary samples
Questions are posed to the public over
radio or TV, and people choose whether to respond (by phone or SMS)
Convenience samples
Take an ‘easy sample’
Ex: A surveyor stands in one location & asks passersby
questions
17. Probability data samples
Random samples
Selected using chance or random methods
Systematic samples
Number each subject of the population, then select every kth
subject
Stratified samples
Divide the population into groups according to some
characteristic that is important to the study, then sample from
each group
Cluster samples
Divide the population into sections (clusters), randomly
select some of those clusters, then choose all members of
the selected clusters
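Three of these methods (random, stratified and cluster) can be sketched in a few lines. The population below is hypothetical: 1,000 numbered subjects, each given an invented `stratum` label and an invented `block` label so that the grouping steps have something to work with.

```python
import random

rng = random.Random(0)

# Hypothetical population: 1,000 numbered subjects, each tagged with a
# group label (used as a stratum) and a block label (used as a cluster).
population = [
    {"id": i, "stratum": "A" if i % 2 else "B", "block": i // 100}
    for i in range(1000)
]

# Random sample: every subject has the same chance of selection.
random_sample = rng.sample(population, 50)

# Stratified sample: split by a characteristic, then sample from each group.
strata = {}
for subject in population:
    strata.setdefault(subject["stratum"], []).append(subject)
stratified_sample = [s for group in strata.values() for s in rng.sample(group, 25)]

# Cluster sample: randomly pick whole blocks and keep every member of them.
blocks = sorted({subject["block"] for subject in population})
chosen_blocks = set(rng.sample(blocks, 2))
cluster_sample = [s for s in population if s["block"] in chosen_blocks]

print(len(random_sample), len(stratified_sample), len(cluster_sample))
```

Stratified sampling takes some members from every group, whereas cluster sampling takes every member from a few groups.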
18. Identify the type of sample obtained
Example 1
A physical education professor wants to study the
physical fitness levels of students at her university. There are
20,000 students enrolled at the university, and she wants to draw
a sample of size 100 to take a physical fitness test. She obtains a
list of all 20,000 students, numbers it from 1 to 20,000 and then
invites the 100 students corresponding to those numbers to
participate in the study.
Example 2
A quality engineer wants to inspect rolls of wallpaper in order
to obtain information on the rate at which flaws in the printing are
occurring. She decides to draw a sample of 50 rolls of wallpaper from
a day’s production. Each hour for 5 hours, she takes the 10 most
recently produced rolls and counts the number of flaws on each. Is
this a simple random sample?
19. Example 3
Suppose we have a list of 1000 registered voters in a community and we
want to pick a probability sample of 50. We can use a random number table to
pick one of the first 20 voters (1000/50 = 20) on our list. If the table gave us the
number 16, the 16th voter on the list would be the first to be selected. We
would then pick every 20th name after this random start (the 36th voter, the 56th
voter, etc) to produce a systematic sample.
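The procedure in Example 3 can be sketched as follows. The function name `systematic_sample` and its parameters are hypothetical, and the voter list is simply the numbers 1 to 1000 standing in for names.

```python
import random

def systematic_sample(items, sample_size, start=None, rng=random):
    """Pick a start among the first k items, then take every k-th item after it."""
    k = len(items) // sample_size            # sampling interval, e.g. 1000 // 50 = 20
    if start is None:
        start = rng.randint(1, k)            # random 1-based start between 1 and k
    positions = range(start, len(items) + 1, k)
    return [items[pos - 1] for pos in positions][:sample_size]

voters = list(range(1, 1001))                # hypothetical list of 1000 voters

# Reproduce the example: a random start of 16 gives the 16th, 36th, 56th voter, ...
example = systematic_sample(voters, 50, start=16)
print(example[:3], len(example))

# In practice the start would be chosen at random.
random_start_sample = systematic_sample(voters, 50, rng=random.Random(1))
```

Only the starting point is random. Once it is fixed, the rest of the sample is determined by the interval k.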
Example 4
Consumer surveys of large cities often employ cluster sampling. The
usual procedure is to divide a map of the city into small blocks, each block
containing a cluster of households. A number of clusters are selected for the
sample, and all the households in a cluster are surveyed. Using cluster
sampling can reduce cost and time. Less energy and money are expended if an
interviewer stays within a specific area rather than traveling across stretches of
the city.
20. STEP 4
Classifying and Summarizing the data
Organize or group the facts for study
Classifying: identifying items with like
characteristics & arranging them into groups or
classes
Ex: Production data (product make, location, production
process, etc.)
Summarization
Graphical & descriptive statistics (tables, charts, measures
of central tendency, measures of variation, measures of
position)
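The summary measures named above can be computed with Python's standard `statistics` module. The data here are hypothetical flaw counts on 12 rolls of wallpaper, made up for illustration.

```python
import statistics
from collections import Counter

# Hypothetical data: number of flaws counted on 12 rolls of wallpaper.
flaws = [2, 0, 3, 1, 2, 4, 2, 1, 0, 4, 2, 3]

# Classifying: group like values together (a simple frequency table).
frequency_table = Counter(flaws)

# Measures of central tendency
mean = statistics.mean(flaws)
median = statistics.median(flaws)
mode = statistics.mode(flaws)

# Measures of variation
data_range = max(flaws) - min(flaws)
std_dev = statistics.stdev(flaws)      # sample standard deviation

# Measures of position: the three quartiles
q1, q2, q3 = statistics.quantiles(flaws, n=4)

print("frequencies:", sorted(frequency_table.items()))
print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "std dev:", round(std_dev, 3))
print("quartiles:", q1, q2, q3)
```

The second quartile is the median, so it ties the measures of position back to the measures of central tendency.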
21. Types of Data
Qualitative (Categorical/Attribute)
1* Data that refer only to name classification (may be coded using numbers 1, 2, …)
2* Can be placed into distinct categories according to some characteristic or attribute.
Nominal Data (can’t be ranked)
Ex: gender, race, citizenship, etc.
Ordinal Data (can be ranked)
Ex: feeling (dislike – like), color (dark – bright), etc.
Quantitative (Numerical)
1* Data that represent counts or measurements (can be counted or measured)
2* Are numerical in nature and can be ordered or ranked.
Discrete Variables
Assume values that can be counted and are finite
Ex: number of something
Continuous Variables
Can assume all values between any two specific values; obtained by measuring
Ex: weight, age, salary, height, temperature, etc.
22. Example
The Lemon Marketing Corporation has asked you for information about the car
you drive. For each question, identify each of the types of data requested as
either attribute data or numeric data. When numeric data is requested,
identify the variable as discrete or continuous.
1. What is the weight of your car?
2. In what city was your car made?
3. How many people can be seated in your car?
4. What’s the distance traveled from your home to your school?
5. What’s the color of your car?
6. How many cars are in your household?
7. What’s the length of your car?
8. What’s the normal operating temperature (in degrees Fahrenheit) of your car’s
engine?
9. What gas mileage (miles per gallon) do you get in city driving?
10. Who made your car?
11. How many cylinders are there in your car’s engine?
12. How many miles have you put on your car’s current set of tyres?
23. Levels of Measurement of Data
Nominal-level data
Classifies data into mutually exclusive (non-overlapping), exhaustive categories in which no order or ranking can be imposed on the data
Ordinal-level data
Classifies data into categories that can be ranked; however, precise differences between the ranks do not exist
Interval-level data
Ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero
Ratio-level data
Possesses all the characteristics of interval measurement, and there exists a true zero.
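Temperature is a common way to see the difference between interval-level and ratio-level data. It is used here only as an assumed illustration, with Celsius as the interval scale and Kelvin as the ratio scale.

```python
# Interval level: Celsius has no true zero, so differences are meaningful
# but ratios are not.
c_low, c_high = 10.0, 20.0
difference_c = c_high - c_low        # 10 degrees: a meaningful difference
naive_ratio = c_high / c_low         # 2.0, but 20 °C is not "twice as hot" as 10 °C

# Ratio level: Kelvin has a true zero, so ratios are meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
true_ratio = k_high / k_low          # about 1.035

print(f"difference: {difference_c} degrees")
print(f"ratio on the Celsius scale: {naive_ratio:.3f} (not meaningful)")
print(f"ratio on the Kelvin scale:  {true_ratio:.3f}")
```

The same two temperatures give a ratio of 2 on one scale and about 1.035 on the other, which is why a ratio should only be interpreted when the scale has a true zero.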
24. STEP 5
Presenting and Analyzing the data
Summarize & analyze the information given
by the graphical & descriptive statistics
Identify the relationships in the information
Make any relevant statistical inferences
(hypothesis testing, confidence intervals,
ANOVA, control charts, etc.)
25. STEP 6
Making the decision
The analyst weighs the options in light of
established goals to arrive at the plan or
decision that represents the ‘best’ solution
to the problem
The correctness of this choice depends on
analytical skill and information quality
26. Statistical Problem Solving Methodology
(Flowchart)
START
1. Identify the problem or opportunity
2. Gather available internal and external facts relevant to the problem
Decision: Are available facts sufficient?
No: Gather new data from populations and samples using instruments, interviews, questionnaires, etc.
Yes: Continue to step 3
3. Classify, summarize, and process data using tables, charts, and numerical descriptive measures
4. Present and communicate summarized information in the form of tables, charts and descriptive measures
Decision: Is information from a sample?
No: Use census information to evaluate alternative courses of action and make decisions
Yes: Use sample information to
   1. Estimate the value of a parameter
   2. Test assumptions about a parameter
5. Interpret the results, draw conclusions, and make decisions
STOP
27. 1.4 Role of the Computer in Statistics
Two software tools commonly used for data
analysis
1. Spreadsheets
Microsoft Excel & Lotus 1-2-3
2. Statistical Packages
MINITAB, SAS, SPSS and S-PLUS
28. Conclusion
The applications of statistics are many and varied. People encounter them in everyday life, such as in reading newspapers or magazines, listening to the radio, or watching television.