Data Analysis Part 1:
Preparation, Frequencies,
Hypothesis Testing
MBA2216 BUSINESS RESEARCH PROJECT
by
Stephen Ong
Visiting Fellow, Birmingham City University, UK
Visiting Professor, Shenzhen University
LEARNING OUTCOMES
After this lecture, you should be able to:
1. Know when a response is really an error and should be edited
2. Appreciate coding of pure qualitative research
3. Understand the way data are represented in a data file
4. Understand the coding of structured responses, including a dummy variable approach
5. Appreciate the ways that technological advances have simplified the coding process
6. Know what descriptive statistics are and why they are used
7. Create and interpret simple tabulation tables
8. Understand how cross-tabulations can reveal relationships
9. Perform basic data transformations
10. List different computer software products designed for descriptive statistical analysis
11. Understand a researcher’s role in interpreting the data
12. Implement the hypothesis-testing procedure
13. Use p-values to assess statistical significance
14. Test a hypothesis about an observed mean compared to some standard
15. Know the difference between Type I and Type II errors
16. Know when a univariate χ² test is appropriate and how to conduct one
17. Recognize when a bivariate statistical test is appropriate
18. Calculate and interpret a χ² test for a contingency table
19. Calculate and interpret an independent samples t-test comparing two means
Remember this:
 Garbage in, garbage out!
 If data are collected improperly or coded
incorrectly, then the research results
are "garbage".
Stages of Data Analysis
 Raw Data
 The unedited responses from a respondent
exactly as indicated by that respondent.
 Nonrespondent Error
 Error that the respondent is not responsible
for creating, such as when the interviewer
marks a response incorrectly.
 Data Integrity
 The notion that the data file actually contains
the information that the researcher is trying to
obtain to adequately address research
questions.
EXHIBIT 19.1 Overview of the Stages of Data Analysis
Editing
 Editing
 The process of checking the completeness,
consistency, and legibility of data and making the
data ready for coding and transfer to storage.
 E.g. the answer "45" to the question "How long have you
stayed at your current address?"
 The researcher may need to adjust or reconstruct
such responses.
 Field Editing – useful in personal interview
 Preliminary editing by a field supervisor on the same
day as the interview to catch technical omissions,
check legibility of handwriting, and clarify responses
that are logically or conceptually inconsistent.
 In-House Editing
 A rigorous editing job performed by a centralized
office staff.
Editing – what to do?
 Checking for Consistency
 Respondents match defined population – e.g.
SBS?
 Check for consistency within the data collection
framework – e.g. items listed by the respondents
are within the definition.
 Taking Action When Response is Obviously
in Error
 Change/correct responses only when there are
multiple pieces of evidence for doing so.
 Editing Technology
 Computer routines can check for consistency
automatically.
Editing for Completeness
 Item Nonresponse
 The technical term for an unanswered question on an
otherwise complete questionnaire, resulting in missing data.
 Most of the time, researchers leave the item blank.
 Sometimes, however, the question is linked to another question,
so the researcher has to fill in the blank.
 Plug Value
 An answer that an editor "plugs in" to replace blanks or
missing values so as to permit data analysis.
 Choice of value is based on a predetermined decision
rule, e.g. take an average value or neutral value.
 Several choices:
 Leave it blank
 Plug in alternate choices.
 Randomly select an answer.
 Impute a missing value.
Editing …
 Impute
 To fill in a missing data point through the use of a
statistical process providing an educated guess for
the missing response based on available
information.
 I.e. based on the respondent’s choices to other
questions.
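As a rough illustration of a plug-value decision rule, the sketch below fills blanks with the mean of the observed answers. The function name `plug_mean` and the ratings are hypothetical; a fuller imputation would normally draw on the respondent's answers to other questions, which this sketch does not attempt.

```python
from statistics import mean

def plug_mean(responses):
    """Replace missing answers (None) with the mean of the observed answers."""
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("no observed answers to compute a plug value from")
    plug = mean(observed)
    return [plug if r is None else r for r in responses]

# Hypothetical 5-point ratings; None marks item nonresponse.
ratings = [4, None, 2, 5, None, 3]
filled = plug_mean(ratings)
```

Using the mean keeps the variable's average unchanged but understates its variability, which is one reason the decision rule should be fixed before editing begins.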
Editing for Completeness
(cont’d)
 What about missing data? Many statistical software
programs require complete data for an analysis to
take place.
 List-wise deletion
 The entire record for a respondent that has left
a response missing is excluded from use in
statistical analysis.
 Pair-wise deletion
 Only the actual variables for a respondent that
do not contain information are eliminated from
use in statistical analysis.
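The difference between the two deletion rules can be sketched on a small set of records. The variable names and values below are hypothetical, and `pairwise_mean` shows only the simplest single-variable case of using every available answer.

```python
from statistics import mean

# Hypothetical records: None marks a missing answer.
records = [
    {"age": 25, "income": 40, "rating": 4},
    {"age": 31, "income": None, "rating": 5},
    {"age": None, "income": 55, "rating": 3},
    {"age": 42, "income": 61, "rating": None},
]

def listwise(rows):
    """Keep only records with no missing value on any variable."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise_mean(rows, variable):
    """Average one variable over every record that answered it."""
    return mean(r[variable] for r in rows if r[variable] is not None)

complete = listwise(records)               # only the first record survives
age_mean = pairwise_mean(records, "age")   # uses three of the four records
```

List-wise deletion leaves one usable record here, while the pair-wise approach still uses three records for the age variable.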
Please take note:
 When a questionnaire has too many
missing answers, it may not be suitable
for the planned data analysis. In such a
situation, that questionnaire
has to be dropped from the sample.
Facilitating the Coding
Process
 Editing and Tabulating "Don’t Know"
Answers
 Legitimate don’t know (no opinion)
 Reluctant don’t know (refusal to answer)
 Confused don’t know (does not
understand)
Editing (cont’d)
 Pitfalls of Editing
 Allowing subjectivity to enter into the editing process.
 Data editors should be intelligent, experienced, and
objective.
 A systematic procedure for assessing the
questionnaire should be developed by the research
analyst so that the editor has clearly defined decision
rules.
 Pretesting Edit
 Editing during the pretest stage can prove very
valuable for improving questionnaire format,
identifying poor instructions or inappropriate question
wording.
Coding Qualitative Responses
 Coding
 The process of assigning a numerical score
or other character symbol to previously
edited data.
 Codes
 Rules for interpreting, classifying, and
recording data in the coding process.
 The actual numerical or other character
symbols assigned to raw data.
 Dummy Coding
 Numeric "1" or "0" coding where each
number represents an alternate response
such as "female" or "male."
 If k is the number of categories for a
qualitative variable, k-1 dummy variables are
needed.
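The k-1 rule can be made concrete with a short sketch. The function `dummy_code`, the choice of reference category, and the region data are all hypothetical.

```python
def dummy_code(values, reference):
    """Return k-1 dummy columns (0/1) for a categorical variable with k categories.

    The reference category gets no column of its own: it is the case where
    every dummy column is 0.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError("reference must be one of the observed categories")
    return {
        category: [1 if v == category else 0 for v in values]
        for category in categories
        if category != reference
    }

# Hypothetical variable with k = 3 categories, so 2 dummy variables result.
region = ["north", "south", "east", "south", "north"]
dummies = dummy_code(region, reference="east")
```

A respondent from the reference category ("east") is identified by zeros on both dummy variables, which is why a third column would be redundant.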
Data File Terminology
 Field
 A collection of characters that represents a
single type of data—usually a variable.
 String Characters
 Computer terminology to represent formatting
a variable using a series of alphabetic
characters (nonnumeric characters) that may
form a word.
 Record
 A collection of related fields that represents
the responses from one sampling unit.
Data File Terminology (cont’d)
 Data File
 The way a data set is stored electronically
in spreadsheet-like form in which the rows
represent sampling units and the columns
represent variables.
 Value Labels
 Unique labels assigned to each possible
numeric code for a response.
Code Construction
 Two Basic Rules for Coding Categories:
1. They should be exhaustive, meaning that a coding
category should exist for all possible responses.
2. They should be mutually exclusive and independent,
meaning that there should be no overlap among the
categories to ensure that a subject or response can be
placed in only one category.
 Test Tabulation – especially useful for open-ended
questions
 Tallying of a small sample of the total number of replies
to a particular question in order to construct coding
categories.
 Purpose is to preliminarily identify the stability and
distribution of answers that will determine a coding
scheme.
Test Tabulation
 E.g.
 1st respondent: I don’t like to use Facebook
because it wastes time.
 2nd respondent: I don’t know what Facebook is.
 3rd respondent: Facebook takes up a lot of my time.
 Based on these three answers, you can form two
groups of answers:
 1st group: Time factor
 2nd group: No knowledge of Facebook
Devising the Coding Scheme
 A coding scheme should not be too
elaborate.
 The coder’s task is only to summarize the
data.
 Categories should be sufficiently
unambiguous that coders will not classify
items in different ways.
 Code book
 Identifies each variable in a study and gives
the variable’s description, code name, and
position in the data matrix.
The Nature of Descriptive
Analysis
 Descriptive Analysis
 The elementary transformation of raw data
in a way that describes the basic
characteristics such as central tendency,
distribution, and variability.
 Histogram
 A graphical way of showing a frequency
distribution in which the height of a bar
corresponds to the observed frequency of
the category.
EXHIBIT 20.1 Levels of Scale Measurement and Suggested Descriptive Statistics
Creating and Interpreting
Tabulation
 Tabulation
 The orderly arrangement of data in a table or
other summary format showing the number of
responses to each response category.
 Tallying is the term when the process is done
by hand.
 Frequency Table
 A table showing the different ways
respondents answered a question.
 Sometimes called a marginal tabulation.
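A frequency table of this kind can be sketched with the standard library. The answers are hypothetical, and the cumulative-percentage column is an addition modelled on what statistical packages typically report.

```python
from collections import Counter

def frequency_table(responses):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(responses)
    total = sum(counts.values())
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value],
                     round(100 * counts[value] / total, 1),
                     round(100 * running / total, 1)))
    return rows

# Hypothetical answers on a 1-3 scale.
answers = [1, 2, 2, 3, 3, 3, 2, 1, 3, 2]
table = frequency_table(answers)
```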
Frequency Table Example
Cross-Tabulation
 Cross-Tabulation
 Addresses research questions involving
relationships among multiple less-than-interval
variables.
 Results in a combined frequency table displaying
one variable in rows and another variable in
columns.
 Contingency Table
 A data matrix that displays the frequency of some
combination of responses to multiple variables.
 Marginals
 Row and column totals in a contingency table,
which are shown in its margins.
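A minimal sketch of a contingency table and its marginals follows. The paired answers are hypothetical and `crosstab` is an illustrative helper, not a library function.

```python
from collections import Counter

def crosstab(rows, cols):
    """Build cell counts plus row and column marginals from paired observations."""
    if len(rows) != len(cols):
        raise ValueError("both variables need one value per respondent")
    cells = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    return cells, row_totals, col_totals

# Hypothetical paired answers: sex and whether the respondent shops at a store.
sex = ["F", "F", "M", "M", "F", "M", "F", "M"]
shops = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
cells, row_totals, col_totals = crosstab(sex, shops)
```

Dividing a cell count by its row total (the statistical base) gives a row percentage, for example 3 of the 4 female respondents answering "yes".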
EXHIBIT 20.2 Cross-Tabulation Tables from a Survey Regarding AIG and
Government Bailouts
EXHIBIT 20.3 Different Ways of Depicting the Cross-Tabulation of Biological Sex
and Target Patronage
Cross-Tabulation (cont’d)
 Percentage Cross-Tabulations
 Statistical base – the number of respondents
or observations (in a row or column) used as
a basis for computing percentages.
 Elaboration and Refinement
 Elaboration analysis – an analysis of the basic
cross-tabulation for each level of a variable
not previously considered, such as
subgroups of the sample.
 Moderator variable – a third variable that
changes the nature of a relationship between
the original independent and dependent
variables.
EXHIBIT 20.4 Cross-Tabulation of Marital Status, Sex, and Responses to the
Question "Do You Shop at Target?"
Cross-Tabulation (cont’d)
 How Many Cross-Tabulations?
 Every possible response becomes a possible
explanatory variable.
 When hypotheses involve relationships
between two categorical variables, cross-tabulations
are the right tool for the job.
 Quadrant Analysis
 An extension of cross-tabulation in which
responses to two rating-scale questions are
plotted in four quadrants of a two-dimensional
table.
 Importance-performance analysis
EXHIBIT 20.5 An Importance-Performance or Quadrant Analysis of Hotels
Data Transformation
 Data Transformation
 Process of changing the data from their original
form to a format suitable for performing a data
analysis addressing research objectives.
Problems with Data
Transformations
 Median Split
 Dividing a data set into two categories by placing
respondents below the median in one category
and respondents above the median in another.
 The approach is best applied only when the data
do indeed exhibit bimodal characteristics.
 Inappropriate collapsing of continuous variables
into categorical variables ignores the information
contained within the untransformed values.
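A median split is simple to sketch, which is part of why it is overused. The scores are hypothetical, and the tie rule (scores equal to the median go to "low") is an arbitrary assumption of this example.

```python
from statistics import median

def median_split(scores):
    """Label each score 'low' or 'high' relative to the sample median.

    Scores equal to the median are assigned to 'low' here; that tie rule is
    an arbitrary choice made for this sketch.
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical satisfaction scores.
scores = [2, 9, 4, 7, 5, 8]
groups = median_split(scores)
```

Note that the scores 5 and 7 sit right next to the cut point yet land in different groups, while 2 and 5 are treated as identical: this is the information loss the slide warns about.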
EXHIBIT 20.6 Bimodal Distributions Are Consistent with
Transformations into Categorical Values
EXHIBIT 20.7 The Problem with Median Splits with Unimodal Data
Index Numbers
 Index Numbers
 Scores or observations recalibrated to
indicate how they relate to a base number.
 Price indexes
 Represent simple data transformations that
allow researchers to track a variable’s value
over time and compare a variable(s) with
other variables.
 Recalibration allows scores or observations
to be related to a certain base period or base
number.
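The recalibration can be sketched as follows, using the common convention that the base period is set to 100. The viewing hours and years are hypothetical, not the values in the exhibit.

```python
def index_numbers(values, base_period):
    """Express each period's value relative to the base period, with the base set to 100."""
    base = values[base_period]
    return {period: round(100 * value / base, 1) for period, value in values.items()}

# Hypothetical weekly viewing hours by year; 2005 is the base period.
hours = {2005: 40.0, 2006: 42.0, 2007: 38.0}
index = index_numbers(hours, base_period=2005)
```

An index of 105 reads directly as "5 percent above the base period", which is what makes indexed series easy to compare.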
EXHIBIT 20.8 Hours of Television Usage per Week
Calculating Rank Order
 Rank Order
 Ranking data can be summarized by
performing a data transformation.
 The transformation involves multiplying
the frequency by the ranking score for
each choice resulting in a new scale.
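The frequency-times-rank transformation might look like the sketch below. The destinations and counts are hypothetical, and the sketch assumes rank 1 means most preferred, so a lower total indicates a more preferred choice.

```python
def weighted_rank_scores(rank_frequencies):
    """Sum frequency x rank for each choice; with 1 = most preferred, lower totals rank higher."""
    return {
        choice: sum(rank * freq for rank, freq in freqs.items())
        for choice, freqs in rank_frequencies.items()
    }

# Hypothetical data: how many of 10 executives gave each destination rank 1, 2, or 3.
frequencies = {
    "City A": {1: 5, 2: 3, 3: 2},
    "City B": {1: 3, 2: 5, 3: 2},
    "City C": {1: 2, 2: 2, 3: 6},
}
scores = weighted_rank_scores(frequencies)
ordering = sorted(scores, key=scores.get)  # most preferred first
```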
EXHIBIT 20.9 Executive Rankings of Potential Conference Destinations
EXHIBIT 20.10 Frequencies of Conference Destination Rankings
EXHIBIT 20.11 Pie Charts Work Well with Tabulations and Cross-Tabulations
Computer Programs for
Analysis
 Statistical Packages
 Spreadsheets
 Excel
 Statistical software:
 SAS
 SPSS (Statistical Package for the Social Sciences)
 MINITAB
Computer Graphics and
Computer Mapping
 Box and Whisker Plots
 Graphic representations of central
tendencies, percentiles, variabilities, and
the shapes of frequency distributions.
 Interquartile Range
 A measure of variability.
 Outlier
 A value that lies outside the normal range
of the data.
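The quantities behind a box and whisker plot can be computed with the standard library. The 1.5 x IQR fence used to flag outliers is a common convention assumed here, not something the slide specifies, and the data are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(data):
    """Return the interquartile range and the values beyond 1.5 x IQR from the quartiles."""
    q1, _median, q3 = quantiles(data, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return iqr, [x for x in data if x < low or x > high]

# Hypothetical weekly usage hours with one unusually large value.
hours = [12, 15, 17, 18, 19, 21, 22, 24, 60]
iqr, outliers = iqr_outliers(hours)
```

Different packages interpolate quartiles slightly differently, so the exact fences can vary from one program to another.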
EXHIBIT 20.15 Computer-Drawn Box and Whisker Plot
SPSS Windows
 The main SPSS procedure for frequency distributions is FREQUENCIES. It produces a
table of frequency counts, percentages, and cumulative
percentages for the values of each variable, along with the
associated statistics.
 If the data are interval scaled and only the summary statistics
are desired, the DESCRIPTIVES procedure can be used.
 The EXPLORE procedure produces summary statistics and
graphical displays, either for all of the cases or separately for
groups of cases. Mean, median, variance, standard deviation,
minimum, maximum, and range are some of the statistics that
can be calculated.
SPSS Windows
To select these procedures click:
Analyze>Descriptive Statistics>Frequencies
Analyze>Descriptive Statistics>Descriptives
Analyze>Descriptive Statistics>Explore
The major cross-tabulation program is CROSSTABS.
This program will display the cross-classification tables and
provide cell counts, row and column percentages, the
chi-square test for significance, and all the measures of the
strength of the association that have been discussed.
To select these procedures, click:
Analyze>Descriptive Statistics>Crosstabs
SPSS Windows
The major program for conducting parametric tests in SPSS is
COMPARE MEANS. This program can be used to conduct t tests
on one sample or independent or paired samples. To select these
procedures using SPSS for Windows, click:
Analyze>Compare Means>Means …
Analyze>Compare Means>One-Sample T Test …
Analyze>Compare Means>Independent-Samples T Test …
Analyze>Compare Means>Paired-Samples T Test …
SPSS Windows
The nonparametric tests discussed in this chapter can
be conducted using NONPARAMETRIC TESTS.
To select these procedures using SPSS for Windows,
click:
Analyze>Nonparametric Tests>Chi-Square …
Analyze>Nonparametric Tests>Binomial …
Analyze>Nonparametric Tests>Runs …
Analyze>Nonparametric Tests>1-Sample K-S …
Analyze>Nonparametric Tests>2 Independent Samples …
Analyze>Nonparametric Tests>2 Related Samples …
SPSS Windows:
Frequencies
1. Select ANALYZE on the SPSS menu bar.
2. Click DESCRIPTIVE STATISTICS and
select FREQUENCIES.
3. Move the variable "Familiarity [familiar]"
to the VARIABLE(s) box.
4. Click STATISTICS.
5. Select MEAN, MEDIAN, MODE, STD.
DEVIATION, VARIANCE, and RANGE.
SPSS Windows:
Frequencies
6. Click CONTINUE.
7. Click CHARTS.
8. Click HISTOGRAMS, then
click CONTINUE.
9. Click OK.
Introduction of a Third Variable in
Cross-Tabulation
[Flowchart: starting from the original two variables]
 Some association between the two variables → introduce a third variable →
refined association between the two variables, no association between the
two variables, or no change in the initial pattern.
 No association between the two variables → introduce a third variable →
some association between the two variables, or no change in the initial pattern.
SPSS Windows: Cross-
tabulations
1. Select ANALYZE on the SPSS menu bar.
2. Click on DESCRIPTIVE STATISTICS and select
CROSSTABS.
3. Move the variable "Internet Usage Group [iusagegr]" to
the ROW(S) box.
4. Move the variable "Sex [sex]" to the COLUMN(S) box.
5. Click on CELLS.
6. Select OBSERVED under COUNTS and COLUMN under
PERCENTAGES.
SPSS Windows: Cross-
tabulations
7. Click CONTINUE.
8. Click STATISTICS.
9. Click on CHI-SQUARE, PHI AND
CRAMER’S V.
10. Click CONTINUE.
11. Click OK.
Interpretation
 Interpretation
 The process of drawing inferences from
the analysis results.
 Inferences drawn from interpretations
lead to managerial implications and
decisions.
 From a management perspective, the
qualitative meaning of the data and their
managerial implications are an important
aspect of the interpretation.
Hypothesis Testing
 Types of Hypotheses
 Relational hypotheses
 Examine how changes in one variable vary with
changes in another.
 Hypotheses about differences between
groups
 Examine how some variable varies from one group
to another.
 Hypotheses about differences from some
standard
 Examine how some variable differs from some
preconceived standard. These tests typify
univariate statistical tests.
Types of Statistical Analysis
 Univariate Statistical Analysis
 Tests of hypotheses involving only one
variable.
 Testing of statistical significance
 Bivariate Statistical Analysis
 Tests of hypotheses involving two variables.
 Multivariate Statistical Analysis
 Statistical analysis involving three or more
variables or sets of variables.
The Hypothesis-Testing
Procedure
 Process
1. The specifically stated hypothesis is derived
from the research objectives.
2. A sample is obtained and the relevant
variable is measured.
3. The measured sample value is compared to
the value either stated explicitly or implied in
the hypothesis.
 If the value is consistent with the hypothesis, the
hypothesis is supported.
 If the value is not consistent with the hypothesis,
the hypothesis is not supported.
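Univariate Hypothesis Test
Utilizing the t-Distribution: An
Example
Null hypothesis: the population mean is equal to 20.
$$H_0: \mu = 20$$
Alternative hypothesis: the population mean is not equal to 20.
$$H_1: \mu \neq 20$$
Standard error of the mean, with S = 5 and n = 25:
$$S_{\bar{X}} = \frac{S}{\sqrt{n}} = \frac{5}{\sqrt{25}} = 1$$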
Univariate Hypothesis Test
Utilizing the t-Distribution: An
Example (cont’d)
 The researcher desires 95 percent
confidence, so the significance level
is 0.05.
 The researcher must then find the upper
and lower limits of the confidence
interval to determine the region of
rejection.
 Thus, the critical value of t is needed.
 For 24 degrees of freedom (n − 1 = 25 − 1),
the two-tailed critical t-value at the 0.05 level is 2.064.
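Univariate Hypothesis Test Utilizing
the t-Distribution: An Example
(cont’d)
$$\text{Lower limit} = \mu - t_{c.l.} S_{\bar{X}} = 20 - 2.064\left(\frac{5}{\sqrt{25}}\right) = 17.936$$
$$\text{Upper limit} = \mu + t_{c.l.} S_{\bar{X}} = 20 + 2.064\left(\frac{5}{\sqrt{25}}\right) = 22.064$$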
Univariate Hypothesis Test
Utilizing the t-Distribution:
An Example (cont’d)
Univariate Hypothesis Test t-Test
$$t_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{22 - 20}{1} = \frac{2}{1} = 2$$
This is less than the critical t-value of 2.064 at the
0.05 level with 24 degrees of freedom, so the null hypothesis cannot be
rejected: the hypothesis that the mean differs from 20 is not supported.
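The arithmetic of this example can be checked with a short sketch. It uses the figures from the worked example (sample mean 22, hypothesized mean 20, S = 5, n = 25) and takes the critical value 2.064 as given rather than computing it; the function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return the standard error of the mean and the observed t-value."""
    standard_error = sample_sd / math.sqrt(n)
    t_obs = (sample_mean - hypothesized_mean) / standard_error
    return standard_error, t_obs

# Values from the example: mean 22, hypothesized mean 20, S = 5, n = 25.
se, t_obs = one_sample_t(22, 20, 5, 25)

# Critical value quoted in the example for 24 d.f. at the 0.05 level.
T_CRITICAL = 2.064
lower, upper = 20 - T_CRITICAL * se, 20 + T_CRITICAL * se
reject_null = abs(t_obs) > T_CRITICAL
```

Both routes agree: the sample mean of 22 lies inside the limits 17.936 to 22.064, and the observed t of 2 is below 2.064, so the null hypothesis is not rejected.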
The Chi-Square Test for
Goodness of Fit
 Chi-square (χ²) test
 Tests for statistical significance.
 Is particularly appropriate for testing
hypotheses about frequencies arranged in a
frequency or contingency table.
 Goodness-of-Fit (GOF)
 A general term representing how well some
computed table or matrix of values matches
some population or predetermined table or
matrix of the same size.
The Chi-Square Test for Goodness
of Fit: An Example
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
χ² = chi-square statistic
O_i = observed frequency in the ith cell
E_i = expected frequency in the ith cell
Chi-Square Test: Estimation for
Expected Number for Each Cell
$$E_{ij} = \frac{R_i C_j}{n}$$
R_i = total observed frequency in the ith row
C_j = total observed frequency in the jth column
n = sample size
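For a single variable, the chi-square statistic is a one-line sum over the cells. The counts below are hypothetical, and the cutoff 3.84 is the tabled critical value at the 0.05 level with 1 d.f., hard-coded rather than computed.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for a two-category variable, tested against an even split.
observed = [60, 40]
expected = [50, 50]
statistic = chi_square(observed, expected)

# 3.84 is the critical value at the 0.05 level with 1 d.f.
significant = statistic > 3.84
```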
Hypothesis Test of a Proportion
 Hypothesis Test of a Proportion
 Is conceptually similar to the test used when
the mean is the characteristic of interest, but
differs in the mathematical formulation
of the standard error of the proportion.
$$Z_{obs} = \frac{p - \pi}{S_p}$$
π is the population proportion
p is the sample proportion
π is estimated with p
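A minimal sketch of the proportion test follows. The slide does not give the standard error formula, so the sketch assumes the usual sample-based estimate, the square root of p(1 − p)/n; the survey figures are hypothetical.

```python
import math

def proportion_z(p, pi, n):
    """Observed Z for a sample proportion p against a hypothesized proportion pi.

    The standard error is estimated from the sample as sqrt(p * (1 - p) / n),
    which is an assumption of this sketch rather than a formula from the slide.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    return (p - pi) / standard_error

# Hypothetical survey: 100 respondents, 20 percent say yes, hypothesized 10 percent.
z_obs = proportion_z(p=0.2, pi=0.1, n=100)
```

The observed Z is then compared with a critical value from the standard normal distribution, for example 1.96 for a two-tailed test at the 0.05 level.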
What Is the Appropriate Test
of Difference?
 Test of Differences
 An investigation of a hypothesis that two (or
more) groups differ with respect to measures
on a variable.
 Behaviour, characteristics, beliefs, opinions,
emotions, or attitudes
 Bivariate Tests of Differences
 Involve only two variables: a variable that acts
like a dependent variable and a variable that
acts as a classification variable.
 Differences in mean scores between groups or in
comparing how two groups’ scores are distributed
across possible response categories.
EXHIBIT 22.1 Some Bivariate Hypotheses
Cross-Tabulation Tables: The χ²
Test for Goodness-of-Fit
 Cross-Tabulation (Contingency) Table
 A joint frequency distribution of observations
on two or more variables.
 χ² Distribution
 Provides a means for testing the statistical
significance of a contingency table.
 Involves comparing observed frequencies (Oi)
with expected frequencies (Ei) in each cell of the
table.
 Captures the goodness- (or closeness-) of-fit of
the observed distribution with the expected
distribution.
Chi-Square Test
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
χ² = chi-square statistic
O_i = observed frequency in the ith cell
E_i = expected frequency in the ith cell
$$E_{ij} = \frac{R_i C_j}{n}$$
R_i = total observed frequency in the ith row
C_j = total observed frequency in the jth column
n = sample size
Degrees of Freedom (d.f.)
$$d.f. = (R - 1)(C - 1)$$
where R is the number of rows and C is the number of columns.
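For a contingency table, the expected counts come from the marginals and the degrees of freedom from the table's shape. The sketch below assumes a table stored as a list of rows; the 2 x 2 counts are hypothetical and are not the figures from any exhibit.

```python
def expected_counts(table):
    """Expected cell counts E_ij = R_i * C_j / n for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_table(table):
    """Return the chi-square statistic and its degrees of freedom (R-1)(C-1)."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table: rows are location types, columns are profitable / not.
observed = [[30, 10],
            [20, 40]]
statistic, df = chi_square_table(observed)
```

With 1 d.f. the statistic is compared against the tabled critical value; a significant result still needs a check that the deviations run in the hypothesized direction.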
Example: Papa John’s Restaurants
Univariate Hypothesis:
Papa John’s restaurants are
more likely to be located in a
stand-alone location or in a
shopping center.
Bivariate Hypothesis:
Stand-alone locations
are more likely to be
profitable than are
shopping center
locations.
Example: Papa John’s
Restaurants (cont’d)
 In this example, χ² = 22.16 with 1 d.f.
 From Table A.4, the critical value at the
0.05 level with 1 d.f. is 3.84.
 Thus, we are 95 percent confident that
the observed values do not equal the
expected values.
 But are the deviations from the
expected values in the hypothesized
direction?
χ² Test for Goodness-of-Fit
Recap
Testing the hypothesis involves two key
steps:
1. Examine the statistical significance of the
observed contingency table.
2. Examine whether the differences between
the observed and expected values are
consistent with the hypothesized
prediction.
The t-Test for Comparing Two Means
 Independent Samples t-Test
 A test for hypotheses stating that the mean
scores for some interval- or ratio-scaled
variable grouped based on some less-than-
interval classificatory variable are not the
same.
$$t = \frac{\text{Sample Mean 1} - \text{Sample Mean 2}}{\text{Variability of random means}}$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$$
The t-Test for Comparing
Two Means (cont’d)
 Pooled Estimate of the Standard Error
 An estimate of the standard error for a t-
test of independent means that assumes
the variances of both groups are equal.
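$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$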
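The pooled standard error and the resulting t-value can be sketched from group summaries alone. The means, standard deviations, and group sizes below are hypothetical, and the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t using the pooled (equal-variance) standard error."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / standard_error
    return t, n1 + n2 - 2  # t-value and its degrees of freedom

# Hypothetical group summaries.
t, df = pooled_t(mean1=16.5, sd1=2.0, n1=20, mean2=12.5, sd2=2.0, n2=20)
```

The pooled estimate is only appropriate when the two group variances are roughly equal; statistical packages usually report an unequal-variance version alongside it.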
© 2010 South-Western/Cengage
Learning. All rights reserved. May not
be scanned, copied or duplicated, or
posted to a publically accessible
website, in whole or in part.
EXHIBIT 22.2 Independent Samples t-Test Results
Comparing Two Means (cont’d)
 Paired-Samples t-Test
 Compares the scores of two interval
variables drawn from related populations.
 Used when means need to be compared
that are not from independent samples.
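A paired-samples test works on the differences within each pair. The sketch below uses the standard formulation (mean difference divided by its standard error), which the slide does not spell out; the before and after scores are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic for the mean of paired differences; d.f. = number of pairs - 1."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1

# Hypothetical attitude scores from the same four respondents on two measures.
before = [3, 4, 2, 5]
after = [5, 5, 4, 6]
t, df = paired_t(before, after)
```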
EXHIBIT 22.4 Example Results for a Paired Samples t-Test
SPSS Windows: One
Sample t Test
1. Select ANALYZE from the SPSS
menu bar.
2. Click COMPARE MEANS and then
ONE SAMPLE T TEST.
3. Move "Familiarity [familiar]" into the
TEST VARIABLE(S) box.
4. Type "4" in the TEST VALUE box.
5. Click OK.
SPSS Windows:
Two Independent Samples t Test
1. Select ANALYZE from the SPSS menu bar.
2. Click COMPARE MEANS and then INDEPENDENT
SAMPLES T TEST.
3. Move "Internet Usage Hrs/Week [iusage]" into the TEST
VARIABLE(S) box.
4. Move "Sex [sex]" to the GROUPING VARIABLE box.
5. Click DEFINE GROUPS.
6. Type "1" in the GROUP 1 box and "2" in the GROUP 2 box.
7. Click CONTINUE.
8. Click OK.
SPSS Windows: Paired Samples t
Test
1. Select ANALYZE from the SPSS menu bar.
2. Click COMPARE MEANS and then PAIRED
SAMPLES T TEST.
3. Select "Attitude toward Internet [iattitude]" and
then select "Attitude toward technology
[tattitude]." Move these variables into the PAIRED
VARIABLE(S) box.
4. Click OK.
Further Reading
 Cooper, D.R. and Schindler, P.S. (2011) Business Research Methods, 11th edn, McGraw Hill.
 Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M. (2010) Business Research Methods, 8th edn, South-Western.
 Saunders, M., Lewis, P. and Thornhill, A. (2012) Research Methods for Business Students, 6th edn, Prentice Hall.
 Saunders, M. and Lewis, P. (2012) Doing Research in Business & Management, FT Prentice Hall.

Mba2216 week 11 data analysis part 01

  • 1. Data Analysis Part 1: Preparation, Frequencies, Hypothesis Testing MBA2216 BUSINESS RESEARCH PROJECT by Stephen Ong Visiting Fellow, Birmingham City University, UK Visiting Professor, Shenzhen University
  • 2. LEARNING OUTCOMES After this lecture, you should be able to: 1. Know when a response is really an error and should be edited 2. Appreciate coding of pure qualitative research 3. Understand the way data are represented in a data file 4. Understand the coding of structured responses including a dummy variable approach 5. Appreciate the ways that technological advances have simplified the coding process
  • 3. LEARNING OUTCOMES (cont’d) 6. Know what descriptive statistics are and why they are used 7. Create and interpret simple tabulation tables 8. Understand how cross-tabulations can reveal relationships 9. Perform basic data transformations 10. List different computer software products designed for descriptive statistical analysis 11. Understand a researcher’s role in interpreting the data 12. Implement the hypothesis-testing procedure 13. Use p-values to assess statistical significance
  • 4. LEARNING OUTCOMES (cont’d) 14. Test a hypothesis about an observed mean compared to some standard 15. Know the difference between Type I and Type II errors 16. Know when a univariate χ² test is appropriate and how to conduct one 17. Recognize when a bivariate statistical test is appropriate 18. Calculate and interpret a χ² test for a contingency table 19. Calculate and interpret an independent samples t-test comparing two means
  • 5. Remember this,  Garbage in, garbage out!  If data is collected improperly, or coded incorrectly, then the research results are “garbage”.
  • 6. Stages of Data Analysis  Raw Data  The unedited responses from a respondent exactly as indicated by that respondent.  Nonrespondent Error  Error that the respondent is not responsible for creating, such as when the interviewer marks a response incorrectly.  Data Integrity  The notion that the data file actually contains the information that the researcher is trying to obtain to adequately address research questions.
  • 7. 19–7 EXHIBIT 19.1 Overview of the Stages of Data Analysis
  • 8. Editing  Editing  The process of checking the completeness, consistency, and legibility of data and making the data ready for coding and transfer to storage.  E.g. “How long have you stayed at your current address?” answered only with “45”.  The researcher needs to adjust or reconstruct such responses.  Field Editing – useful in personal interviews  Preliminary editing by a field supervisor on the same day as the interview to catch technical omissions, check legibility of handwriting, and clarify responses that are logically or conceptually inconsistent.  In-House Editing  A rigorous editing job performed by a centralized office staff.
  • 9. Editing – what to do?  Checking for Consistency  Respondents match defined population – e.g. SBS?  Check for consistency within the data collection framework – e.g. items listed by the respondents are within the definition.  Taking Action When Response is Obviously in Error  Change/correct responses only when there are multiple pieces of evidence for doing so.  Editing Technology  Computer routines can check for consistency automatically.
  • 10. Editing for Completeness  Item Nonresponse  The technical term for an unanswered question on an otherwise complete questionnaire resulting in missing data.  Most of the time the researcher does nothing about it.  But sometimes the question is linked to another question, so the researcher has to fill in the blank.  Plug Value  An answer that an editor “plugs in” to replace blanks or missing values so as to permit data analysis.  Choice of value is based on a predetermined decision rule, e.g. take an average value or neutral value.  Several choices:  Leave it blank.  Plug in alternate choices.  Randomly select an answer.  Impute a missing value.
  • 11. Editing …  Impute  To fill in a missing data point through the use of a statistical process providing an educated guess for the missing response based on available information.  I.e. based on the respondent’s choices to other questions.
  • 12. Editing for Completeness (cont’d)  What about missing data? Many statistical software programs require complete data for an analysis to take place.  List-wise deletion  The entire record for a respondent that has left a response missing is excluded from use in statistical analysis.  Pair-wise deletion  Only the actual variables for a respondent that do not contain information are eliminated from use in statistical analysis.
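The difference between the two deletion rules is easiest to see on a tiny data set. The sketch below is only an illustration: the records, the variable names, and the use of `None` to mark item nonresponse are all assumptions, not part of the lecture.

```python
# Hypothetical survey records; None marks an item nonresponse.
records = [
    {"age": 34, "income": 52, "satisfaction": 4},
    {"age": 29, "income": None, "satisfaction": 5},
    {"age": None, "income": 61, "satisfaction": 3},
    {"age": 45, "income": 70, "satisfaction": 2},
]

def listwise(rows):
    """List-wise deletion: keep only respondents with no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise(rows, var_a, var_b):
    """Pair-wise deletion: keep respondents who answered both variables used."""
    return [r for r in rows if r[var_a] is not None and r[var_b] is not None]

complete = listwise(records)
age_satisfaction = pairwise(records, "age", "satisfaction")
income_satisfaction = pairwise(records, "income", "satisfaction")
print(len(complete), len(age_satisfaction), len(income_satisfaction))  # 2 3 3
```

List-wise deletion leaves two usable respondents for every analysis, while pair-wise deletion keeps three for each pair of variables, at the cost of each analysis resting on a slightly different subsample.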
  • 13. Please take note,  When a questionnaire has too many missing answers, it may not be suitable for the planned data analysis. In such a situation, that particular questionnaire has to be dropped from the sample.
  • 14. Facilitating the Coding Process  Editing and Tabulating “Don’t Know” Answers  Legitimate don’t know (no opinion)  Reluctant don’t know (refusal to answer)  Confused don’t know (does not understand)
  • 15. Editing (cont’d)  Pitfalls of Editing  Allowing subjectivity to enter into the editing process.  Data editors should be intelligent, experienced, and objective.  A systematic procedure for assessing the questionnaire should be developed by the research analyst so that the editor has clearly defined decision rules.  Pretesting Edit  Editing during the pretest stage can prove very valuable for improving questionnaire format, identifying poor instructions or inappropriate question wording.
  • 16. Coding Qualitative Responses  Coding  The process of assigning a numerical score or other character symbol to previously edited data.  Codes  Rules for interpreting, classifying, and recording data in the coding process.  The actual numerical or other character symbols assigned to raw data.  Dummy Coding  Numeric “1” or “0” coding where each number represents an alternate response such as “female” or “male.”  If k is the number of categories for a qualitative variable, k-1 dummy variables are needed.
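A minimal sketch of the k-1 rule, assuming a hypothetical three-category variable. The function name `dummy_code`, the `is_` prefix, and the choice of the first category as the reference are illustrative conventions, not something the lecture prescribes.

```python
def dummy_code(value, categories):
    """Return k-1 dummy variables (0/1); the first category is the reference."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{c}": int(value == c) for c in categories[1:]}

# Hypothetical variable with k = 3 categories, so 2 dummies are produced.
regions = ["north", "south", "east"]
print(dummy_code("south", regions))  # {'is_south': 1, 'is_east': 0}
print(dummy_code("north", regions))  # {'is_south': 0, 'is_east': 0}
```

The reference category is identified by all dummies being 0, which is why a k-th dummy would be redundant.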
  • 17. Data File Terminology  Field  A collection of characters that represents a single type of data—usually a variable.  String Characters  Computer terminology to represent formatting a variable using a series of alphabetic characters (nonnumeric characters) that may form a word.  Record  A collection of related fields that represents the responses from one sampling unit.
  • 18. Data File Terminology (cont’d)  Data File  The way a data set is stored electronically in spreadsheet-like form in which the rows represent sampling units and the columns represent variables.  Value Labels  Unique labels assigned to each possible numeric code for a response.
  • 19. Code Construction  Two Basic Rules for Coding Categories: 1. They should be exhaustive, meaning that a coding category should exist for all possible responses. 2. They should be mutually exclusive and independent, meaning that there should be no overlap among the categories to ensure that a subject or response can be placed in only one category.  Test Tabulation – especially useful for open-ended questions  Tallying of a small sample of the total number of replies to a particular question in order to construct coding categories.  Purpose is to preliminarily identify the stability and distribution of answers that will determine a coding scheme.
  • 20. Test Tabulation  E.g.  1st respondent: I don’t like to use Facebook because it is wasting time.  2nd respondent: I don’t know what is Facebook.  3rd respondent: Facebook takes me a lot of time.  Based on the above 3 answers, you can form 2 groups of answers:  1st group: Time factor  2nd group: No knowledge of Facebook
  • 21. Devising the Coding Scheme  A coding scheme should not be too elaborate.  The coder’s task is only to summarize the data.  Categories should be sufficiently unambiguous that coders will not classify items in different ways.  Code book  Identifies each variable in a study and gives the variable’s description, code name, and position in the data matrix.
  • 22. The Nature of Descriptive Analysis  Descriptive Analysis  The elementary transformation of raw data in a way that describes the basic characteristics such as central tendency, distribution, and variability.  Histogram  A graphical way of showing a frequency distribution in which the height of a bar corresponds to the observed frequency of the category.
  • 23. 20–23 EXHIBIT 20.1 Levels of Scale Measurement and Suggested Descriptive Statistics
  • 24. Creating and Interpreting Tabulation  Tabulation  The orderly arrangement of data in a table or other summary format showing the number of responses to each response category.  Tallying is the term when the process is done by hand.  Frequency Table  A table showing the different ways respondents answered a question.  Sometimes called a marginal tabulation.
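As a rough illustration of a frequency table, the sketch below tallies a handful of made-up responses and reports counts, percentages, and cumulative percentages. The data and the function name `frequency_table` are assumptions for the example.

```python
from collections import Counter

def frequency_table(responses):
    """Return rows of (category, count, percent, cumulative percent)."""
    counts = Counter(responses)
    total = sum(counts.values())
    rows, cumulative = [], 0
    for category in sorted(counts):
        cumulative += counts[category]
        rows.append((category, counts[category],
                     100 * counts[category] / total,
                     100 * cumulative / total))
    return rows

# Hypothetical answers to a 3-point familiarity item.
answers = [1, 2, 2, 3, 3, 3, 2, 1, 3, 3]
for category, n, pct, cum in frequency_table(answers):
    print(f"{category}: n={n} pct={pct:.1f} cum={cum:.1f}")
```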
  • 26. Cross-Tabulation  Cross-Tabulation  Addresses research questions involving relationships among multiple less-than interval variables.  Results in a combined frequency table displaying one variable in rows and another variable in columns.  Contingency Table  A data matrix that displays the frequency of some combination of responses to multiple variables.  Marginals  Row and column totals in a contingency table, which are shown in its margins.
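The contingency table and its marginals can be sketched in a few lines. The observations below are invented and the function name `cross_tab` is illustrative; the point is only to show cells, row totals, and column totals side by side.

```python
from collections import Counter

def cross_tab(pairs):
    """Return (cell counts, row marginals, column marginals) for (row, col) pairs."""
    cells = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    return cells, row_totals, col_totals

# Hypothetical (sex, shops at the store) observations.
observations = [("F", "yes"), ("F", "yes"), ("F", "no"),
                ("M", "yes"), ("M", "no"), ("M", "no")]
cells, rows, cols = cross_tab(observations)
for r in sorted(rows):
    print(r, [cells[(r, c)] for c in sorted(cols)], "row total:", rows[r])
print("column totals:", [cols[c] for c in sorted(cols)])
```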
  • 27. 20–27 EXHIBIT 20.2 Cross-Tabulation Tables from a Survey Regarding AIG and Government Bailouts
  • 28. 20–28 EXHIBIT 20.3 Different Ways of Depicting the Cross-Tabulation of Biological Sex and Target Patronage
  • 29. Cross-Tabulation (cont’d)  Percentage Cross-Tabulations  Statistical base – the number of respondents or observations (in a row or column) used as a basis for computing percentages.  Elaboration and Refinement  Elaboration analysis – an analysis of the basic cross-tabulation for each level of a variable not previously considered, such as subgroups of the sample.  Moderator variable – a third variable that changes the nature of a relationship between the original independent and dependent variables.
  • 30. EXHIBIT 20.4 Cross-Tabulation of Marital Status, Sex, and Responses to the Question “Do You Shop at Target?”
  • 31. Cross-Tabulation (cont’d)  How Many Cross-Tabulations?  Every possible response becomes a possible explanatory variable.  When hypotheses involve relationships between two categorical variables, cross-tabulations are the right tool for the job.  Quadrant Analysis  An extension of cross-tabulation in which responses to two rating-scale questions are plotted in four quadrants of a two-dimensional table.  Importance-performance analysis
  • 32. EXHIBIT 20.5 An Importance-Performance or Quadrant Analysis of Hotels
  • 33. Data Transformation  Data Transformation  Process of changing the data from their original form to a format suitable for performing a data analysis addressing research objectives.
  • 34. 20–34 Problems with Data Transformations  Median Split  Dividing a data set into two categories by placing respondents below the median in one category and respondents above the median in another.  The approach is best applied only when the data do indeed exhibit bimodal characteristics.  Inappropriate collapsing of continuous variables into categorical variables ignores the information contained within the untransformed values.
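The information loss from a median split shows up even on four made-up scores. In this sketch the data, the function name, and the tie rule (scores equal to the median go to the low group) are all assumptions.

```python
from statistics import median

def median_split(scores):
    """Label each score 'low' or 'high' relative to the sample median."""
    cut = median(scores)
    # Assumption: scores equal to the median are placed in 'low'.
    return ["high" if s > cut else "low" for s in scores]

# Hypothetical scores: 49 and 51 sit just either side of the median (50),
# yet land in different groups, while 10 and 49 are treated as identical.
scores = [10, 49, 51, 90]
print(median_split(scores))  # ['low', 'low', 'high', 'high']
```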
  • 35. 20–35 EXHIBIT 20.6 Bimodal Distributions Are Consistent with Transformations into Categorical Values
  • 36. 20–36 EXHIBIT 20.7 The Problem with Median Splits with Unimodal Data
  • 37. 20–37 Index Numbers  Index Numbers  Scores or observations recalibrated to indicate how they relate to a base number.  Price indexes  Represent simple data transformations that allow researchers to track a variable’s value over time and compare a variable(s) with other variables.  Recalibration allows scores or observations to be related to a certain base period or base number.
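An index number is just each value divided by the base value and multiplied by 100. The figures below are invented for illustration and are not taken from the lecture's exhibit.

```python
def index_numbers(values, base_period):
    """Express each period's value relative to the base period (base = 100)."""
    base = values[base_period]
    return {period: 100 * v / base for period, v in values.items()}

# Hypothetical weekly viewing hours by year; 2000 is the chosen base period.
hours = {2000: 40.0, 2005: 44.0, 2010: 50.0}
print(index_numbers(hours, 2000))  # {2000: 100.0, 2005: 110.0, 2010: 125.0}
```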
  • 38. 20–38 EXHIBIT 20.8 Hours of Television Usage per Week
  • 39. 20–39 Calculating Rank Order  Rank Order  Ranking data can be summarized by performing a data transformation.  The transformation involves multiplying the frequency by the ranking score for each choice resulting in a new scale.
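The frequency-times-rank transformation can be sketched as follows. The destinations and counts are hypothetical, and the sketch assumes rank 1 is the best rank, so a lower total indicates a more preferred choice.

```python
def rank_scores(rank_frequencies):
    """Sum frequency x rank for each choice (rank 1 assumed to be best)."""
    return {choice: sum(rank * freq for rank, freq in enumerate(freqs, start=1))
            for choice, freqs in rank_frequencies.items()}

# Hypothetical counts of how often each destination was ranked 1st, 2nd, 3rd.
frequencies = {"A": [5, 3, 2], "B": [3, 5, 2], "C": [2, 2, 6]}
totals = rank_scores(frequencies)
print(totals)  # {'A': 17, 'B': 19, 'C': 24}
```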
  • 40. 20–40 EXHIBIT 20.9 Executive Rankings of Potential Conference Destinations
  • 41. 20–41 EXHIBIT 20.10 Frequencies of Conference Destination Rankings
  • 42. 20–42 EXHIBIT 20.11 Pie Charts Work Well with Tabulations and Cross-Tabulations
  • 43. 20–43 Computer Programs for Analysis  Statistical Packages  Spreadsheets  Excel  Statistical software:  SAS  SPSS (Statistical Package for Social Sciences)  MINITAB
  • 44. 20–44 Computer Graphics and Computer Mapping  Box and Whisker Plots  Graphic representations of central tendencies, percentiles, variabilities, and the shapes of frequency distributions.  Interquartile Range  A measure of variability.  Outlier  A value that lies outside the normal range of the data.
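The interquartile range and a simple outlier check can be computed with the standard library. Note the assumptions: the slide only says an outlier lies outside the normal range, so the 1.5 x IQR fence used here is a common convention rather than the lecture's definition, and the data are made up.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return (IQR, outliers) using the k x IQR fence rule (an assumption)."""
    q1, _, q3 = quantiles(data, n=4)  # default 'exclusive' quartile method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return iqr, [x for x in data if x < low or x > high]

values = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # hypothetical observations
iqr, outliers = iqr_outliers(values)
print(iqr, outliers)  # 5.0 [30]
```

Different packages compute quartiles slightly differently, so the IQR reported by SPSS or Excel for the same data may not match exactly.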
  • 45. 20–45 EXHIBIT 20.15 Computer Drawn Box and Whisker Plot
  • 46. SPSS Windows  The main program in SPSS is FREQUENCIES. It produces a table of frequency counts, percentages, and cumulative percentages for the values of each variable. It gives all of the associated statistics.  If the data are interval scaled and only the summary statistics are desired, the DESCRIPTIVES procedure can be used.  The EXPLORE procedure produces summary statistics and graphical displays, either for all of the cases or separately for groups of cases. Mean, median, variance, standard deviation, minimum, maximum, and range are some of the statistics that can be calculated.
  • 47. SPSS Windows To select these procedures click: Analyze>Descriptive Statistics>Frequencies Analyze>Descriptive Statistics>Descriptives Analyze>Descriptive Statistics>Explore The major cross-tabulation program is CROSSTABS. This program will display the cross-classification tables and provide cell counts, row and column percentages, the chi-square test for significance, and all the measures of the strength of the association that have been discussed. To select these procedures, click: Analyze>Descriptive Statistics>Crosstabs
  • 48. SPSS Windows The major program for conducting parametric tests in SPSS is COMPARE MEANS. This program can be used to conduct t tests on one sample or independent or paired samples. To select these procedures using SPSS for Windows, click: Analyze>Compare Means>Means … Analyze>Compare Means>One-Sample T Test … Analyze>Compare Means>Independent-Samples T Test … Analyze>Compare Means>Paired-Samples T Test …
  • 49. SPSS Windows The nonparametric tests discussed in this chapter can be conducted using NONPARAMETRIC TESTS. To select these procedures using SPSS for Windows, click: Analyze>Nonparametric Tests>Chi-Square … Analyze>Nonparametric Tests>Binomial … Analyze>Nonparametric Tests>Runs … Analyze>Nonparametric Tests>1-Sample K-S … Analyze>Nonparametric Tests>2 Independent Samples … Analyze>Nonparametric Tests>2 Related Samples …
  • 51. SPSS Windows: Frequencies 1. Select ANALYZE on the SPSS menu bar. 2. Click DESCRIPTIVE STATISTICS and select FREQUENCIES. 3. Move the variable “Familiarity [familiar]” to the VARIABLE(s) box. 4. Click STATISTICS. 5. Select MEAN, MEDIAN, MODE, STD. DEVIATION, VARIANCE, and RANGE.
  • 52. SPSS Windows: Frequencies 6. Click CONTINUE. 7. Click CHARTS. 8. Click HISTOGRAMS, then click CONTINUE. 9. Click OK.
  • 53. Introduction of a Third Variable in Cross-Tabulation (flowchart) Original Two Variables → Some Association between the Two Variables, or No Association between the Two Variables → Introduce a Third Variable → possible outcomes: Refined Association between the Two Variables; No Association between the Two Variables; No Change in the Initial Pattern; Some Association between the Two Variables
  • 55. SPSS Windows: Cross-tabulations 1. Select ANALYZE on the SPSS menu bar. 2. Click on DESCRIPTIVE STATISTICS and select CROSSTABS. 3. Move the variable “Internet Usage Group [iusagegr]” to the ROW(S) box. 4. Move the variable “Sex [sex]” to the COLUMN(S) box. 5. Click on CELLS. 6. Select OBSERVED under COUNTS and COLUMN under PERCENTAGES.
  • 56. SPSS Windows: Cross-tabulations 7. Click CONTINUE. 8. Click STATISTICS. 9. Click on CHI-SQUARE, PHI AND CRAMER’S V. 10. Click CONTINUE. 11. Click OK.
  • 57. 20–57 Interpretation  Interpretation  The process of drawing inferences from the analysis results.  Inferences drawn from interpretations lead to managerial implications and decisions.  From a management perspective, the qualitative meaning of the data and their managerial implications are an important aspect of the interpretation.
  • 58. Hypothesis Testing  Types of Hypotheses  Relational hypotheses  Examine how changes in one variable vary with changes in another.  Hypotheses about differences between groups  Examine how some variable varies from one group to another.  Hypotheses about differences from some standard  Examine how some variable differs from some preconceived standard. These tests typify univariate statistical tests.
  • 59. 21–59 Types of Statistical Analysis  Univariate Statistical Analysis  Tests of hypotheses involving only one variable.  Testing of statistical significance  Bivariate Statistical Analysis  Tests of hypotheses involving two variables.  Multivariate Statistical Analysis  Statistical analysis involving three or more variables or sets of variables.
  • 60. 21–60 The Hypothesis-Testing Procedure  Process 1. The specifically stated hypothesis is derived from the research objectives. 2. A sample is obtained and the relevant variable is measured. 3. The measured sample value is compared to the value either stated explicitly or implied in the hypothesis.  If the value is consistent with the hypothesis, the hypothesis is supported.  If the value is not consistent with the hypothesis, the hypothesis is not supported.
  • 81. Univariate Hypothesis Test Utilizing the t-Distribution: An Example $H_0: \mu = 20$ (the population mean is equal to 20). $H_1: \mu \neq 20$ (the population mean is not equal to 20). $S_{\bar{X}} = S/\sqrt{n} = 5/\sqrt{25} = 1$
  • 82. Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d)  The researcher desired 95 percent confidence, so the significance level is 0.05.  The researcher must then find the upper and lower limits of the confidence interval to determine the region of rejection.  Thus, the value of t is needed.  For 24 degrees of freedom (n − 1 = 25 − 1), the critical t-value is 2.064.
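  • 83. Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d) Lower limit = $\mu - t_{c.l.} S_{\bar{X}} = 20 - 2.064\left(\frac{5}{\sqrt{25}}\right) = 17.936$ Upper limit = $\mu + t_{c.l.} S_{\bar{X}} = 20 + 2.064\left(\frac{5}{\sqrt{25}}\right) = 22.064$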
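  • 84. Univariate Hypothesis Test Utilizing the t-Distribution: An Example (cont’d) Univariate Hypothesis Test t-Test $t_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{22 - 20}{1} = \frac{2}{1} = 2$ This is less than the critical t-value of 2.064 at the 0.05 level with 24 degrees of freedom, so the null hypothesis cannot be rejected: the hypothesis that the mean differs from 20 is not supported.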
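The arithmetic of this example (a sample mean of 22 tested against a standard of 20, with S = 5 and n = 25) can be checked with a short sketch. The function name is illustrative, and the critical value 2.064 is simply copied from the slides rather than computed.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """Return (standard error of the mean, observed t) for a one-sample t-test."""
    se = s / math.sqrt(n)
    return se, (sample_mean - hypothesized_mean) / se

# Figures from the example: mean 22, hypothesized 20, S = 5, n = 25.
se, t_obs = one_sample_t(22, 20, 5, 25)
t_crit = 2.064  # critical value quoted on the slide for 24 d.f., alpha = 0.05
lower, upper = 20 - t_crit * se, 20 + t_crit * se
print(se, t_obs, abs(t_obs) > t_crit)     # 1.0 2.0 False
print(round(lower, 3), round(upper, 3))  # 17.936 22.064
```

Because the observed t of 2.0 does not exceed 2.064, and equivalently the sample mean of 22 falls inside the 17.936 to 22.064 limits, the null hypothesis is not rejected.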
  • 85. 21–85 The Chi-Square Test for Goodness of Fit  Chi-square (χ2) test  Tests for statistical significance.  Is particularly appropriate for testing hypotheses about frequencies arranged in a frequency or contingency table.  Goodness-of-Fit (GOF)  A general term representing how well some computed table or matrix of values matches some population or predetermined table or matrix of the same size.
  • 86. The Chi-Square Test for Goodness of Fit: An Example
  • 87. The Chi-Square Test for Goodness of Fit: An Example (cont’d) $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$ where $\chi^2$ = chi-square statistic, $O_i$ = observed frequency in the $i$th cell, $E_i$ = expected frequency in the $i$th cell
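A minimal sketch of the goodness-of-fit statistic, using invented frequencies (the data behind the lecture's own example are not reproduced in this transcript). The 3.84 threshold in the comment is the standard 0.05-level critical value for 1 d.f.

```python
def chi_square(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical: 100 respondents choose between two brands; H0 expects 50/50.
observed = [60, 40]
expected = [50, 50]
stat = chi_square(observed, expected)
print(stat)  # 4.0, above the 3.84 critical value for 1 d.f. at the 0.05 level
```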
  • 88. Chi-Square Test: Estimation for Expected Number for Each Cell $E_{ij} = \frac{R_i C_j}{n}$ where $R_i$ = total observed frequency in the $i$th row, $C_j$ = total observed frequency in the $j$th column, $n$ = sample size
  • 89. Hypothesis Test of a Proportion  Hypothesis Test of a Proportion  Is conceptually similar to the one used when the mean is the characteristic of interest but differs in the mathematical formulation of the standard error of the proportion. $Z_{obs} = \frac{p - \pi}{S_p}$ where $\pi$ is the population proportion, $p$ is the sample proportion, and $S_p$ is the standard error of the proportion; $\pi$ is estimated with $p$
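A sketch of the proportion test with made-up numbers. The standard-error formula used here, $S_p = \sqrt{p(1-p)/n}$, is an assumption based on the note that π is estimated with p; the slide itself does not spell it out.

```python
import math

def proportion_z(p, pi, n):
    """Z = (p - pi) / S_p, with S_p = sqrt(p(1 - p) / n) (assumed formula)."""
    s_p = math.sqrt(p * (1 - p) / n)
    return (p - pi) / s_p

# Hypothetical: 120 of 400 respondents agree (p = 0.30); hypothesized pi = 0.25.
z = proportion_z(0.30, 0.25, 400)
print(round(z, 2))  # 2.18
```

At the 0.05 level (two-tailed critical Z of 1.96), this hypothetical result would be judged significantly different from 0.25.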
  • 90. What Is the Appropriate Test of Difference?  Test of Differences  An investigation of a hypothesis that two (or more) groups differ with respect to measures on a variable.  Behaviour, characteristics, beliefs, opinions, emotions, or attitudes  Bivariate Tests of Differences  Involve only two variables: a variable that acts like a dependent variable and a variable that acts as a classification variable.  Differences in mean scores between groups or in comparing how two groups’ scores are distributed across possible response categories.
  • 91. EXHIBIT 22.1 Some Bivariate Hypotheses
  • 92. Cross-Tabulation Tables: The χ2 Test for Goodness-of-Fit  Cross-Tabulation (Contingency) Table  A joint frequency distribution of observations on two or more variables.  χ2 Distribution  Provides a means for testing the statistical significance of a contingency table.  Involves comparing observed frequencies (Oi) with expected frequencies (Ei) in each cell of the table.  Captures the goodness- (or closeness-) of-fit of the observed distribution with the expected distribution.
  • 93. Chi-Square Test $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$, where $\chi^2$ = chi-square statistic, $O_i$ = observed frequency in the ith cell, $E_i$ = expected frequency in the ith cell. $E_{ij} = \frac{R_i C_j}{n}$, where $R_i$ = total observed frequency in the ith row, $C_j$ = total observed frequency in the jth column, n = sample size
  • 94. Degrees of Freedom (d.f.) d.f.=(R-1)(C-1)
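The three pieces for a contingency table (expected cell counts from row and column totals, the chi-square statistic, and the degrees of freedom) can be combined as sketched below. The 2×2 table is hypothetical and not taken from the slides, and the function names are illustrative.

```python
def expected_counts(table):
    """Expected count for each cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_table(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1)(C - 1)
    return statistic, df

# Hypothetical 2x2 table of observed counts (rows: groups, columns: outcomes).
observed_table = [[30, 10],
                  [20, 40]]

expected_table = expected_counts(observed_table)
chi2_table, df_table = chi_square_table(observed_table)
print(f"chi-square = {chi2_table:.2f} with {df_table} d.f.")
```

A significant statistic only says the observed table differs from the expected one; the direction of the deviations still has to be read from the cells.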
  • 95. Example: Papa John’s Restaurants Univariate Hypothesis: Papa John’s restaurants are more likely to be located in a stand-alone location than in a shopping center. Bivariate Hypothesis: Stand-alone locations are more likely to be profitable than are shopping center locations.
  • 96. Example: Papa John’s Restaurants (cont’d)  In this example, χ2 = 22.16 with 1 d.f.  From Table A.4, the critical value at the 0.05 level with 1 d.f. is 3.84.  Thus, we are 95 percent confident that the observed values do not equal the expected values.  But are the deviations from the expected values in the hypothesized direction?
  • 97. χ2 Test for Goodness-of-Fit Recap Testing the hypothesis involves two key steps: 1. Examine the statistical significance of the observed contingency table. 2. Examine whether the differences between the observed and expected values are consistent with the hypothesized prediction.
  • 98. The t-Test for Comparing Two Means  Independent Samples t-Test  A test for hypotheses stating that the mean scores for some interval- or ratio-scaled variable grouped based on some less-than-interval classificatory variable are not the same. $t = \frac{\text{Sample Mean 1} - \text{Sample Mean 2}}{\text{Variability of random means}} = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}$
  • 99. The t-Test for Comparing Two Means (cont’d)  Pooled Estimate of the Standard Error  An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal. $S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\left(\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
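The pooled standard error and the resulting t statistic can be sketched as follows, under the equal-variance assumption stated on the slide. The two groups of scores are hypothetical, and `independent_t` is an illustrative name.

```python
import math
import statistics

def independent_t(group1, group2):
    """Independent-samples t using the pooled (equal-variance) standard error."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance S1^2 (n - 1 denominator)
    var2 = statistics.variance(group2)  # sample variance S2^2
    pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

# Hypothetical scores for two independent groups (not from the slides).
t_ind, df_ind = independent_t([5, 7, 9], [2, 4, 6])
print(f"t = {t_ind:.3f} with {df_ind} d.f.")
```

The observed t is then compared with the critical value from a t table for the given degrees of freedom.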
  • 100. © 2010 South-Western/Cengage Learning. All rights reserved. May not be scanned, copied or duplicated, or posted to a publically accessible website, in whole or in part. EXHIBIT 22.2 Independent Samples t-Test Results
  • 101. Comparing Two Means (cont’d)  Paired-Samples t-Test  Compares the scores of two interval variables drawn from related populations.  Used when means need to be compared that are not from independent samples.
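For related samples, the usual approach is to test the mean of the paired differences against zero. The slide does not give the formula, so the sketch below assumes the standard form $t = \bar{d} / (S_d / \sqrt{n})$ with n − 1 degrees of freedom. The scores are hypothetical, and `paired_t` is an illustrative name.

```python
import math
import statistics

def paired_t(first, second):
    """Paired-samples t: mean difference divided by its standard error."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical scores from the same four respondents on two measures.
first_measure = [4, 5, 6, 7]
second_measure = [5, 7, 7, 9]

t_paired, df_paired = paired_t(first_measure, second_measure)
print(f"t = {t_paired:.3f} with {df_paired} d.f.")
```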
  • 102. EXHIBIT 22.4 Example Results for a Paired Samples t-Test
  • 104. SPSS Windows: One Sample t Test 1. Select ANALYZE from the SPSS menu bar. 2. Click COMPARE MEANS and then ONE SAMPLE T TEST. 3. Move “Familiarity [familiar]” into the TEST VARIABLE(S) box. 4. Type “4” in the TEST VALUE box. 5. Click OK.
  • 105. SPSS Windows: Two Independent Samples t Test 1. Select ANALYZE from the SPSS menu bar. 2. Click COMPARE MEANS and then INDEPENDENT SAMPLES T TEST. 3. Move “Internet Usage Hrs/Week [iusage]” into the TEST VARIABLE(S) box. 4. Move “Sex [sex]” to the GROUPING VARIABLE box. 5. Click DEFINE GROUPS. 6. Type “1” in the GROUP 1 box and “2” in the GROUP 2 box. 7. Click CONTINUE. 8. Click OK.
  • 106. SPSS Windows: Paired Samples t Test 1. Select ANALYZE from the SPSS menu bar. 2. Click COMPARE MEANS and then PAIRED SAMPLES T TEST. 3. Select “Attitude toward Internet [iattitude]” and then select “Attitude toward technology [tattitude].” Move these variables into the PAIRED VARIABLE(S) box. 4. Click OK.
  • 107. Further Reading  Cooper, D.R. and Schindler, P.S. (2011) Business Research Methods, 11th edn, McGraw Hill.  Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M. (2010) Business Research Methods, 8th edn, South-Western.  Saunders, M., Lewis, P. and Thornhill, A. (2012) Research Methods for Business Students, 6th edn, Prentice Hall.  Saunders, M. and Lewis, P. (2012) Doing Research in Business & Management, FT Prentice Hall.