Data Analysis Part 2: Variances, Regression, Correlation
MBA2216 BUSINESS RESEARCH PROJECT
by
Stephen Ong
Visiting Fellow, Birmingham City University, UK
Visiting Professor, Shenzhen University
LEARNING OUTCOMES
17. Understand the concept of analysis of variance (ANOVA)
18. Interpret an ANOVA table
19. Apply and interpret simple bivariate correlations
22. Interpret a correlation matrix
23. Understand simple (bivariate) regression
24. Understand the least-squares estimation technique
25. Interpret regression output, including the tests of hypotheses tied to specific parameter coefficients
27. Understand what multivariate statistical analysis involves and know the two types of multivariate analysis
30. Interpret basic exploratory factor analysis results
31. Know what multiple discriminant analysis can be used to do
32. Understand how cluster analysis can identify market segments
Remember this:
 Garbage in, garbage out!
 If data are collected improperly or coded incorrectly, the research results are “garbage”.
EXHIBIT 19.1 Overview of the Stages of Data Analysis
Relationship Among t Test, Analysis of Variance, Analysis of Covariance, and Regression
All of these techniques relate one metric (interval-scaled) dependent variable to one or more independent variables; which technique applies depends on the nature of the independent variables:
 One binary independent variable: t test
 Categorical independent variables: analysis of variance
 One factor: one-way analysis of variance
 More than one factor (factorial): N-way analysis of variance
 Categorical and interval independent variables: analysis of covariance
 Interval independent variables: regression
The Z-Test for Comparing
Two Proportions
 Z-Test for Differences of Proportions
 Tests the hypothesis that proportions are
significantly different for two independent
samples or groups.
 Requires a sample size greater than thirty.
 The hypothesis is H0: π1 = π2, which may be restated as H0: π1 − π2 = 0.
The Z-Test for Comparing Two
Proportions
 Z-Test statistic for differences in large
random samples:
Z = [(p1 − p2) − (π1 − π2)] / Sp1−p2
where:
p1 = sample proportion of successes in Group 1
p2 = sample proportion of successes in Group 2
(π1 − π2) = hypothesized population proportion 1 minus hypothesized population proportion 2
Sp1−p2 = pooled estimate of the standard error of the difference in proportions
The Z-Test for Comparing Two
Proportions
 To calculate the standard error of the
differences in proportions:
Sp1−p2 = √[ p̄ q̄ (1/n1 + 1/n2) ]
where p̄ is the pooled proportion of successes across both groups, q̄ = 1 − p̄, and n1, n2 are the two sample sizes.
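The slides carry out these tests in SPSS; purely as an illustrative aside, here is a minimal Python sketch of the same pooled z-test (all counts below are made up):

```python
import math
from scipy.stats import norm

# Hypothetical counts: successes and sample sizes for two independent groups
x1, n1 = 120, 300   # Group 1: p1 = 0.40
x2, n2 = 90, 300    # Group 2: p2 = 0.30

p1, p2 = x1 / n1, x2 / n2
p_bar = (x1 + x2) / (n1 + n2)          # pooled proportion of successes
q_bar = 1 - p_bar
s_p1p2 = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))  # pooled standard error

z = (p1 - p2) / s_p1p2                 # under H0: pi1 - pi2 = 0
p_value = 2 * (1 - norm.cdf(abs(z)))   # two-tailed p-value
print(f"z = {z:.3f}, p = {p_value:.4f}")
```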
One-Way Analysis of
Variance (ANOVA)
 Analysis of Variance (ANOVA)
 An analysis involving the investigation of the
effects of one treatment variable on an
interval-scaled dependent variable.
 A hypothesis-testing technique to determine
whether statistically significant differences in
means occur between two or more groups.
 A method of comparing variances to make
inferences about the means.
 The substantive hypothesis tested is:
 At least one group mean is not equal to another
group mean.
Partitioning Variance in
ANOVA
 Total Variability
 Grand Mean
 The mean of a variable over all observations.
 SST = Σ (observed value − grand mean)²
Partitioning Variance in ANOVA
 Between-Groups Variance
 The sum of differences between the group mean
and the grand mean summed over all groups for a
given set of observations.
 SSB = Σ ngroup (group mean − grand mean)²
 Within-Group Error or Variance
 The sum of the differences between observed
values and the group mean for a given set of
observations
 Also known as total error variance.
 SSE = Σ (observed value − group mean)²
The F-Test
 F-Test
 Used to determine whether there is more
variability in the scores of one sample than
in the scores of another sample.
 Variance components are used to compute
F-ratios
 SSE, SSB, SST
F = Variance between groups / Variance within groups
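To make the variance partitioning concrete, a small Python sketch (fabricated scores for three groups) computes SST, SSB, and SSE by hand and checks the resulting F-ratio against SciPy:

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical scores for three treatment groups
groups = [np.array([10, 12, 11, 13]),
          np.array([14, 15, 13, 16]),
          np.array([9, 8, 10, 9])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

sst = ((all_obs - grand_mean) ** 2).sum()                         # total
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within

k, n = len(groups), len(all_obs)
f_ratio = (ssb / (k - 1)) / (sse / (n - k))
print(f"SST={sst:.1f}  SSB={ssb:.1f}  SSE={sse:.1f}  F={f_ratio:.2f}")

# Cross-check against SciPy's one-way ANOVA
f_check, p_value = f_oneway(*groups)
print(f"scipy: F={f_check:.2f}, p={p_value:.4f}")
```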
EXHIBIT 22.6 Interpreting ANOVA
SPSS Windows
One-way ANOVA can be efficiently
performed using the program COMPARE
MEANS and then One-way ANOVA. To
select this procedure using SPSS for
Windows, click:
Analyze>Compare Means>One-Way ANOVA …
N-way analysis of variance and analysis of
covariance can be performed using
GENERAL LINEAR MODEL. To select this
procedure using SPSS for Windows, click:
Analyze>General Linear Model>Univariate …
SPSS Windows: One-Way
ANOVA
1. Select ANALYZE from the SPSS menu bar.
2. Click COMPARE MEANS and then ONE-WAY ANOVA.
3. Move “Sales [sales]” into the DEPENDENT LIST box.
4. Move “In-Store Promotion[promotion]” to the FACTOR
box.
5. Click OPTIONS.
6. Click Descriptive.
7. Click CONTINUE.
8. Click OK.
SPSS Windows: Analysis of Covariance
1. Select ANALYZE from the SPSS menu bar.
2. Click GENERAL LINEAR MODEL and then UNIVARIATE.
3. Move “Sales [sales]” into the DEPENDENT VARIABLE box.
4. Move “In-Store Promotion [promotion]” to the FIXED FACTOR(S) box. Then move “Coupon [coupon]” also to the FIXED FACTOR(S) box.
5. Move “Clientel [clientel]” to the COVARIATE(S) box.
6. Click OK.
The Basics
 Measures of Association
 Refers to a number of bivariate statistical
techniques used to measure the strength of a
relationship between two variables.
 The chi-square (χ²) test provides information
about whether two or more less-than interval
variables are interrelated.
 Correlation analysis is most appropriate for
interval or ratio variables.
 Regression can accommodate either less-
than interval or interval independent
variables, but the dependent variable must
be continuous.
EXHIBIT 23.1 Bivariate Analysis—Common Procedures for Testing Association
Simple Correlation Coefficient
 Correlation coefficient
 A statistical measure of the covariation, or
association, between two at-least interval
variables.
 Covariance
 Extent to which two variables are
associated systematically with each other.
rxy = ryx = Σi=1..n (Xi − X̄)(Yi − Ȳ) / √[ Σi=1..n (Xi − X̄)² · Σi=1..n (Yi − Ȳ)² ]
Simple Correlation Coefficient
 Correlation coefficient (r)
 Ranges from +1 to -1
 Perfect positive linear relationship = +1
 Perfect negative (inverse) linear relationship =
-1
 No correlation = 0
 Correlation coefficient for two
variables (X,Y)
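For example, the deviation-score formula above can be computed directly; a Python sketch with invented paired observations, using scipy as a cross-check:

```python
import numpy as np
from scipy.stats import pearsonr

# Invented paired interval-scaled observations
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0])
y = np.array([1.5, 3.0, 4.5, 6.5, 9.0])

# Deviation-score formula, exactly as written above
dx, dy = x - x.mean(), y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum())

r_scipy, p_value = pearsonr(x, y)   # library cross-check, with a p-value
print(f"r = {r_manual:.3f}  (scipy: {r_scipy:.3f}, p = {p_value:.4f})")
```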
EXHIBIT 23.2 Scatter Diagram to Illustrate Correlation Patterns
Correlation, Covariance, and
Causation
 When two variables covary (i.e. vary
systematically), they display
concomitant variation.
 This systematic covariation does not in
and of itself establish causality.
 e.g., Rooster’s crow and the rising of
the sun
 Rooster does not cause the sun to rise.
Coefficient of Determination
 Coefficient of Determination (R2)
 A measure obtained by squaring the correlation coefficient; the proportion of the total variance of a variable that is accounted for by knowing the value of another variable.
 Measures that part of the total variance of
Y that is accounted for by knowing the
value of X.
R² = Explained variance / Total variance
Correlation Matrix
 Correlation matrix
 The standard form for reporting correlation
coefficients for more than two variables.
 Statistical Significance
 The procedure for determining statistical
significance is the t-test of the significance
of a correlation coefficient.
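As an illustration, pandas produces a correlation matrix in this standard form, and the t-test of any single coefficient uses the standard statistic t = r √(n − 2) / √(1 − r²) with n − 2 degrees of freedom. All data and variable names below are simulated:

```python
import numpy as np
import pandas as pd

# Simulated salesperson data: three interval-scaled variables
rng = np.random.default_rng(0)
calls = rng.normal(50, 10, 30)
sales = 2.0 * calls + rng.normal(0, 15, 30)
tenure = rng.normal(5, 2, 30)
df = pd.DataFrame({"calls": calls, "sales": sales, "tenure": tenure})

print(df.corr())                        # the correlation matrix

# t-test of one coefficient: t = r * sqrt(n - 2) / sqrt(1 - r^2)
r = df["calls"].corr(df["sales"])
n = len(df)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
print(f"r = {r:.3f}, t = {t:.2f} on {n - 2} d.f.")
```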
EXHIBIT 23.4 Pearson Product-Moment Correlation Matrix for Salesperson Example
Regression Analysis
 Simple (Bivariate) Linear Regression
 A measure of linear association that
investigates straight-line relationships
between a continuous dependent variable and
an independent variable that is usually
continuous, but can be a categorical dummy
variable.
 The Regression Equation (Y = α + βX )
 Y = the continuous dependent variable
 X = the independent variable
 α = the Y intercept (regression line intercepts
Y axis)
 β = the slope coefficient (rise over run)
Regression Line and Slope
(Figure: scatter plot of Y against X with the fitted line Ŷ = α̂ + β̂X.)
The Regression Equation
 Parameter Estimate Choices
 β is indicative of the strength and direction of the
relationship between the independent and
dependent variable.
 α (the Y intercept) is a fixed point treated as a constant: how much Y exists when X is zero.
 Standardized Regression Coefficient (β)
 Estimated coefficient of the strength of
relationship between the independent and
dependent variables.
 Expressed on a standardized scale where higher
absolute values indicate stronger relationships
(range is from -1 to 1).
The Regression Equation (cont’d)
 Parameter Estimate Choices
 Raw regression estimates (b1)
 Raw regression weights have the advantage of retaining
the scale metric—which is also their key disadvantage.
 If the purpose of the regression analysis is forecasting,
then raw parameter estimates must be used.
 This is another way of saying when the researcher is
interested only in prediction.
 Standardized regression estimates (β)
 Standardized regression estimates have the advantage
of a constant scale.
 Standardized regression estimates should be used when
the researcher is testing explanatory hypotheses.
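The two kinds of estimates are linked by a standard identity: the standardized coefficient rescales the raw one by the ratio of standard deviations, β = b(sx / sy). A quick Python check on simulated data (in simple regression this β also equals r):

```python
import numpy as np

# Simulated data: y regressed on x
rng = np.random.default_rng(1)
x = rng.normal(100, 20, 50)
y = 3.5 * x + rng.normal(0, 40, 50)

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # raw slope
beta = b * (np.std(x, ddof=1) / np.std(y, ddof=1))   # standardized slope

# Equivalent route: regress z-scores of y on z-scores of x
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta_z = np.cov(zx, zy, ddof=1)[0, 1] / np.var(zx, ddof=1)
print(f"b = {b:.3f}, beta = {beta:.3f} (z-score check: {beta_z:.3f})")
```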
EXHIBIT 23.5 The Advantage of Standardized Regression Weights
EXHIBIT 23.6 Relationship of Sales Potential to Building Permits Issued
EXHIBIT 23.7 The Best Fit Line or Knocking Out the Pins
Ordinary Least-Squares (OLS) Method of Regression Analysis
 Guarantees that the resulting straight line will produce the
least possible total error in using X to predict Y.
 Generates a straight line that minimizes the sum of
squared deviations of the actual values from this predicted
regression line.
 No straight line can completely represent every dot in the
scatter diagram.
 There will be a discrepancy between most of the actual scores (each dot) and the predicted scores (Ŷ).
 Uses the criterion of attempting to make the least amount
of total error in prediction of Y from X.
Ordinary Least-Squares Method of Regression Analysis (OLS) (cont’d)
Ŷi = α̂ + β̂Xi + ei
The equation means that the predicted value for any value of X (Xi) is determined as a function of the estimated intercept coefficient plus the estimated slope coefficient multiplied by Xi, plus some error.
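As a stand-in for the SPSS output shown in the exhibits, a minimal Python sketch of an OLS fit (the numbers below are invented, loosely echoing the building-permit example; statsmodels performs the least-squares estimation):

```python
import numpy as np
import statsmodels.api as sm

# Invented data: x = building permits issued, y = sales potential
x = np.array([25, 40, 55, 70, 85, 100, 115, 130])
y = np.array([80, 95, 92, 110, 118, 130, 126, 144])

X = sm.add_constant(x)        # adds the intercept (alpha-hat) column
model = sm.OLS(y, X).fit()    # minimizes the sum of squared residuals

print(model.params)           # [intercept, slope] = [alpha-hat, beta-hat]
print(f"R^2 = {model.rsquared:.3f}")
print(model.summary())        # t-tests on coefficients, F-test, etc.
```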
Ordinary Least-Squares Method of Regression Analysis (OLS) (cont’d)
Statistical Significance of the Regression Model
 F-test (regression)
 Determines whether the variability explained by the regression exceeds the variability left unexplained by the regression.
Ordinary Least-Squares Method
of Regression Analysis (OLS)
(cont’d)
 Statistical Significance of the Regression Model
 ANOVA table (general form):
Regression: SS = SSR, d.f. = k, MS = MSR = SSR/k, F = MSR/MSE
Residual (error): SS = SSE, d.f. = n − k − 1, MS = MSE = SSE/(n − k − 1)
Total: SS = SST, d.f. = n − 1
Ordinary Least-Squares Method
of Regression Analysis (OLS)
(cont’d)
 R2
 The proportion of variance in Y that is
explained by X (or vice versa)
 A measure obtained by squaring the
correlation coefficient; that proportion of
the total variance of a variable that is
accounted for by knowing the value of
another variable.
R² = 3,398.49 / 3,882.40 = 0.875
EXHIBIT 23.8 Simple Regression Results for Building Permit Example
EXHIBIT 23.9 OLS Regression Line
Simple Regression and
Hypothesis Testing
 The explanatory power of regression lies
in hypothesis testing. Regression is often
used to test relational hypotheses.
 The outcome of the hypothesis test involves
two conditions that must both be satisfied:
 The regression weight must be in the hypothesized
direction. Positive relationships require a positive
coefficient and negative relationships require a
negative coefficient.
 The t-test associated with the regression weight
must be significant.
What is Multivariate Data
Analysis?
 Research that involves three or more
variables, or that is concerned with underlying
dimensions among multiple variables, will
involve multivariate statistical analysis.
 Methods analyze multiple variables or even
multiple sets of variables simultaneously.
 Many business problems involve multivariate data analysis, for example:
 most employee motivation research
 customer psychographic profiles
 research that seeks to identify viable market segments
The “Variate” in Multivariate
 Variate
 A mathematical way in which a set of
variables can be represented with one
equation.
 A linear combination of variables, each
contributing to the overall meaning of the
variate based upon an empirically derived
weight.
 A function of the measured variables involved
in an analysis: Vk = f (X1, X2, . . . , Xm )
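For instance, with made-up weights, a variate score is just a weighted linear combination of the measured variables:

```python
import numpy as np

# V = w1*X1 + w2*X2 + w3*X3; the weights would be derived empirically
# (e.g., regression or discriminant coefficients) -- these are invented
weights = np.array([0.4, 0.3, 0.3])
observation = np.array([5.0, 7.0, 2.0])   # one respondent's X1, X2, X3
v = observation @ weights
print(f"variate score V = {v:.2f}")
```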
EXHIBIT 24.1 Which Multivariate Approach Is Appropriate?
Classifying Multivariate
Techniques
 Dependence Techniques
 Explain or predict one or more dependent
variables.
 Needed when hypotheses involve
distinction between independent and
dependent variables.
 Types:
 Multiple regression analysis
 Multiple discriminant analysis
 Multivariate analysis of variance
 Structural equation modeling
Classifying Multivariate
Techniques (cont’d)
 Interdependence Techniques
 Give meaning to a set of variables or seek
to group things together.
 Used when researchers examine
questions that do not distinguish between
independent and dependent variables.
 Types:
 Factor analysis
 Cluster analysis
 Multidimensional scaling
Classifying Multivariate
Techniques (cont’d)
 Influence of Measurement Scales
 The nature of the measurement scales will
determine which multivariate technique is
appropriate for the data.
 Selection of a multivariate technique
requires consideration of the types of
measures used for both independent and
dependent sets of variables.
 Nominal and ordinal scales are nonmetric.
 Interval and ratio scales are metric.
EXHIBIT 24.2 Which Multivariate Dependence Technique Should I Use?
EXHIBIT 24.3 Which Multivariate Interdependence Technique Should I Use?
Analysis of Dependence
 General Linear Model (GLM)
 A way of explaining and predicting a dependent
variable based on fluctuations (variation) from its
mean due to changes in independent variables:
Yi = μ + ∆X + ∆F + ∆XF
μ = a constant (overall mean of the dependent variable)
∆X and ∆F = changes due to main effect independent variables
(experimental variables) and blocking independent
variables (covariates or grouping variables)
∆XF = the change due to the combination (interaction effect) of those variables.
Interpreting Multiple Regression
 Multiple Regression Analysis
 An analysis of association in which the
effects of two or more independent variables
on a single, interval-scaled dependent
variable are investigated simultaneously.
Yi = b0 + b1X1i + b2X2i + b3X3i + … + bnXni + ei
 Dummy variable
 The way a dichotomous (two-group) independent variable is represented in regression analysis by assigning a 0 to one group and a 1 to the other.
Multiple Regression Analysis
 A Simple Example
 Assume that a toy manufacturer wishes to explain
store sales (dependent variable) using a sample
of stores from Canada and Europe.
 Several hypotheses are offered:
 H1: Competitor’s sales are related negatively to sales.
 H2: Sales are higher in communities with a sales office than when no sales office is present.
 H3: Grammar school enrollment in a community is related positively to sales.
Multiple Regression Analysis
(cont’d)
 Statistical Results of the Multiple
Regression
 Regression equation: Ŷ = 102.18 + 0.387X1 + 115.2X2 + 6.73X3
 Coefficient of multiple determination (R²) = 0.845
 F-value = 14.6, p < 0.05
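The slide reports results only; as a hedged illustration of the mechanics, the following Python sketch simulates store-level data of this general shape (all coefficients, names, and noise levels are invented, not the slide’s actual dataset) and fits the three-variable regression, with the sales-office dummy coded 0/1:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated store-level data (names and numbers invented)
rng = np.random.default_rng(2)
n = 40
df = pd.DataFrame({
    "competitor": rng.normal(200, 30, n),      # competitor's sales
    "office": rng.integers(0, 2, n),           # dummy: 1 = sales office
    "enrollment": rng.normal(1000, 150, n),    # grammar school enrollment
})
df["sales"] = (100 - 0.4 * df["competitor"] + 115 * df["office"]
               + 0.07 * df["enrollment"] + rng.normal(0, 20, n))

X = sm.add_constant(df[["competitor", "office", "enrollment"]])
fit = sm.OLS(df["sales"], X).fit()
print(fit.params)                              # b0, b1, b2, b3
print(f"R^2 = {fit.rsquared:.3f}, F = {fit.fvalue:.1f}")
```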
Multiple Regression Analysis
(cont’d)
 Regression Coefficients in Multiple
Regression
 Partial correlation
 The correlation between two variables after taking
into account the fact that they are correlated with
other variables too.
 R2 in Multiple Regression
 The coefficient of multiple determination in
multiple regression indicates the percentage
of variation in Y explained by all independent
variables.
Multiple Regression Analysis
(cont’d)
 Statistical Significance in Multiple
Regression
 F-test
 Tests statistical significance by comparing the
variation explained by the regression equation
to the residual error variation.
 Allows for testing of the relative magnitudes
of the sum of squares due to the regression
(SSR) and the error sum of squares (SSE).
F = MSR / MSE = (SSR / k) / (SSE / (n − k − 1))
Multiple Regression Analysis
(cont’d)
 Degrees of Freedom (d.f.)
 k = number of independent variables
 n = number of observations or
respondents
 Calculating Degrees of Freedom (d.f.)
 d.f. for the numerator = k
 d.f. for the denominator = n - k - 1
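A short Python sketch of this F computation (the fit is rebuilt on simulated data so the snippet runs on its own; attribute names follow statsmodels, where ess is the explained sum of squares and ssr is, confusingly, the residual sum of squares):

```python
import numpy as np
import statsmodels.api as sm

# Rebuild a small hypothetical multiple-regression fit
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 2.0, 0.5]) + rng.normal(0, 1, 40)
fit = sm.OLS(y, sm.add_constant(X)).fit()

ssr = fit.ess                    # sum of squares due to regression (SSR)
sse = fit.ssr                    # error sum of squares (SSE)
k = int(fit.df_model)            # numerator d.f. = k
df_resid = int(fit.df_resid)     # denominator d.f. = n - k - 1

f_manual = (ssr / k) / (sse / df_resid)   # F = MSR / MSE
print(f"F = {f_manual:.1f}  (statsmodels: {fit.fvalue:.1f})")
```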
EXHIBIT 24.4 Interpreting Multiple Regression Results
ANOVA (n-way) and MANOVA
 Multivariate Analysis of Variance
(MANOVA)
 A multivariate technique that predicts
multiple continuous dependent variables
with multiple categorical independent
variables.
ANOVA (n-way) and MANOVA
(cont’d)
Interpreting N-way (Univariate) ANOVA
1. Examine overall model F-test result. If
significant, proceed.
2. Examine individual F-tests for individual
variables.
3. For each significant categorical independent
variable, interpret the effect by examining the
group means.
4. For each significant, continuous covariate,
interpret the parameter estimate (b).
5. For each significant interaction, interpret the
means for each combination.
Discriminant Analysis
 A statistical technique for predicting the
probability that an object will belong in
one of two or more mutually exclusive
categories (dependent variable), based
on several independent variables.
 To calculate discriminant scores, the linear
function used is:
Zi = b1X1i + b2X2i + … + bnXni
Discriminant Analysis
Example
Z = b1X1 + b2X2 + b3X3 = 0.069X1 + 0.013X2 + 0.0007X3
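As an aside, linear discriminant analysis is available in scikit-learn; a minimal sketch on simulated two-group data (everything below is invented), showing the linear weights and the predicted membership probabilities:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical two-group data: predict group membership from X1..X3
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (30, 3)), rng.normal(1.5, 1, (30, 3))])
y = np.array([0] * 30 + [1] * 30)

lda = LinearDiscriminantAnalysis().fit(X, y)
print("discriminant weights (b):", lda.coef_)       # linear function weights
print("predicted groups:", lda.predict(X[:5]))
print("membership probabilities:", lda.predict_proba(X[:5]).round(2))
```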
EXHIBIT 24.5 Multivariate Dependence Techniques Summary
Factor Analysis
 Statistically identifies a reduced number
of factors from a larger number of
measured variables.
 Types:
 Exploratory factor analysis (EFA)—performed
when the researcher is uncertain about how
many factors may exist among a set of
variables.
 Confirmatory factor analysis (CFA)—
performed when the researcher has strong
theoretical expectations about the factor
structure before performing the analysis.
EXHIBIT 24.6 A Simple Illustration of Factor Analysis
Factor Analysis (cont’d)
 How Many Factors
 Eigenvalues are a measure of how much
variance is explained by each factor.
 Common rule:
 Base the number of factors on the number of
eigenvalues greater than 1.0.
 Factor Loading
 Indicates how strongly a measured
variable is correlated with a factor.
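A minimal Python illustration of the eigenvalue-greater-than-1 rule, using unrotated principal-components extraction on simulated data (a simplification of full EFA, not a replica of SPSS FACTOR; the loadings and the communalities discussed later fall out of the same decomposition):

```python
import numpy as np

# Simulate six measured variables with two underlying factors
rng = np.random.default_rng(4)
data = rng.normal(size=(200, 6))
data[:, 1] += data[:, 0]; data[:, 2] += data[:, 0]   # factor 1: v1-v3
data[:, 4] += data[:, 3]; data[:, 5] += data[:, 3]   # factor 2: v4-v6
R = np.corrcoef(data, rowvar=False)                  # correlation matrix

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]                    # largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
n_factors = int((eigvals > 1.0).sum())               # eigenvalue > 1 rule
print("eigenvalues:", eigvals.round(2), "-> retain", n_factors, "factors")

loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
communalities = (loadings ** 2).sum(axis=1)          # per-variable variance
print("communalities:", communalities.round(2))
```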
Factor Analysis (cont’d)
 Factor Rotation
 A mathematical way of simplifying factor
analysis results to better identify which
variables “load on” which factors.
 Most common procedure is varimax rotation.
 Data Reduction Technique
 Approaches that summarize the information
from many variables into a reduced set of
variates formed as linear combinations of
measured variables.
 The rule of parsimony: an explanation
involving fewer components is better than
one involving many more.
Factor Analysis (cont’d)
 Creating Composite Scales with Factor
Results
 When a clear pattern of loadings exists,
the researcher may take a simpler
approach by summing the variables with
high loadings and creating a summated
scale.
 Very low loadings suggest a variable does not
contribute much to the factor.
 The reliability of each summated scale is tested
by computing a coefficient alpha estimate.
Factor Analysis (cont’d)
 Communality
 A measure of the percentage of a
variable’s variation that is explained by the
factors.
 A relatively high communality indicates
that a variable has much in common with
the other variables taken as a group.
 Communality for any variable is equal to
the sum of the squared loadings for that
variable.
Factor Analysis (cont’d)
 Total Variance Explained
 Square and total each variable’s loading on a factor; dividing the total by the number of variables provides an estimate of the proportion of variance in a set of variables explained by that factor.
 This explanation of variance is much the same as R² in multiple regression.
SPSS Windows
To select this procedure using
SPSS for Windows, click:
Analyze>Data Reduction>Factor …
SPSS Windows: Principal Components
1. Select ANALYZE from the SPSS menu bar.
2. Click DATA REDUCTION and then FACTOR.
3. Move “Prevents Cavities [v1],” “Shiny Teeth [v2],” “Strengthen Gums [v3],”
“Freshens Breath [v4],” “Tooth Decay Unimportant [v5],” and “Attractive
Teeth [v6]” into the VARIABLES box.
4. Click on DESCRIPTIVES. In the pop-up window, in the STATISTICS box
check INITIAL SOLUTION. In the CORRELATION MATRIX box, check KMO
AND BARTLETT’S TEST OF SPHERICITY and also check REPRODUCED.
Click CONTINUE.
5. Click on EXTRACTION. In the pop-up window, for METHOD select
PRINCIPAL COMPONENTS (default). In the ANALYZE box, check
CORRELATION MATRIX. In the EXTRACT box, check EIGENVALUES OVER 1 (default). In the DISPLAY box, check UNROTATED FACTOR SOLUTION.
Click CONTINUE.
6. Click on ROTATION. In the METHOD box, check VARIMAX. In the DISPLAY
box, check ROTATED SOLUTION. Click CONTINUE.
7. Click on SCORES. In the pop-up window, check DISPLAY FACTOR SCORE
COEFFICIENT MATRIX. Click CONTINUE.
8. Click OK.
Cluster Analysis
 Cluster analysis
 A multivariate approach for grouping observations
based on similarity among measured variables.
 Cluster analysis is an important tool for identifying market
segments.
 Cluster analysis classifies individuals or objects into a
small number of mutually exclusive and exhaustive
groups.
 Objects or individuals are assigned to groups so that
there is great similarity within groups and much less
similarity between groups.
 The cluster should have high internal (within-cluster)
homogeneity and external (between-cluster)
heterogeneity.
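Before the SPSS walk-throughs below, a compact Python sketch of the same two approaches, Ward's hierarchical method and k-means, on simulated two-variable data (all values invented):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

# Hypothetical respondents measured on two attitude variables,
# drawn from three well-separated groups
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.5, (20, 2)),
               rng.normal(3, 0.5, (20, 2)),
               rng.normal([0, 3], 0.5, (20, 2))])

# Hierarchical clustering with Ward's method (squared Euclidean distances)
Z = linkage(X, method="ward")
hier_labels = fcluster(Z, t=3, criterion="maxclust")

# K-means with k = 3 as a cross-check
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("hierarchical cluster sizes:", np.bincount(hier_labels)[1:])
print("k-means cluster sizes:     ", np.bincount(km_labels))
```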
EXHIBIT 24.7 Clusters of Individuals on Two Dimensions
EXHIBIT 24.8 Cluster Analysis of Test-Market Cities
SPSS Windows
To select this procedure using SPSS for
Windows, click:
Analyze>Classify>Hierarchical Cluster …
Analyze>Classify>K-Means Cluster …
Analyze>Classify>Two-Step Cluster
SPSS Windows: Hierarchical Clustering
1. Select ANALYZE from the SPSS menu bar.
2. Click CLASSIFY and then HIERARCHICAL CLUSTER.
3. Move “Fun [v1],” “Bad for Budget [v2],” “Eating Out [v3],” “Best Buys [v4],”
“Don’t Care [v5],” and “Compare Prices [v6]” into the VARIABLES box.
4. In the CLUSTER box, check CASES (default option). In the DISPLAY box, check
STATISTICS and PLOTS (default options).
5. Click on STATISTICS. In the pop-up window, check AGGLOMERATION
SCHEDULE. In the CLUSTER MEMBERSHIP box, check RANGE OF SOLUTIONS.
Then, for MINIMUM NUMBER OF CLUSTERS, enter 2 and for MAXIMUM NUMBER
OF CLUSTERS, enter 4. Click CONTINUE.
6. Click on PLOTS. In the pop-up window, check DENDROGRAM. In the ICICLE
box, check ALL CLUSTERS (default). In the ORIENTATION box, check
VERTICAL. Click CONTINUE.
7. Click on METHOD. For CLUSTER METHOD, select WARD’S METHOD. In the
MEASURE box, check INTERVAL and select SQUARED EUCLIDEAN DISTANCE.
Click CONTINUE.
8. Click OK.
SPSS Windows: K-Means
Clustering
1. Select ANALYZE from the SPSS menu bar.
2. Click CLASSIFY and then K-MEANS CLUSTER.
3. Move “Fun [v1],” “Bad for Budget [v2],” “Eating Out [v3],” “Best
Buys [v4],” “Don’t Care [v5],” and “Compare Prices [v6]” into
the VARIABLES box.
4. For NUMBER OF CLUSTERS, select 3.
5. Click on OPTIONS. In the pop-up window, in the STATISTICS
box, check INITIAL CLUSTER CENTERS and CLUSTER
INFORMATION FOR EACH CASE. Click CONTINUE.
6. Click OK.
SPSS Windows: Two-Step
Clustering
1. Select ANALYZE from the SPSS menu bar.
2. Click CLASSIFY and then TWO-STEP CLUSTER.
3. Move “Fun [v1],” “Bad for Budget [v2],” “Eating Out [v3],” “Best
Buys [v4],” “Don’t Care [v5],” and “Compare Prices [v6]” into
the CONTINUOUS VARIABLES box.
4. For DISTANCE MEASURE, select EUCLIDEAN.
5. For NUMBER OF CLUSTERS, select DETERMINE AUTOMATICALLY.
6. For CLUSTERING CRITERION, select AKAIKE’S INFORMATION
CRITERION (AIC).
7. Click OK.
Multidimensional Scaling
 Multidimensional Scaling
 Measures objects in multidimensional
space on the basis of respondents’
judgments of the similarity of objects.
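As an illustrative aside, scikit-learn's metric MDS can play the role of ALSCAL for a precomputed distance matrix (the 4 × 4 matrix below is invented; lower stress indicates a better fit):

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical dissimilarity (distance) matrix for four objects,
# e.g., derived from similarity ratings as in the slides (8 - rating)
D = np.array([[0, 2, 6, 7],
              [2, 0, 5, 6],
              [6, 5, 0, 1],
              [7, 6, 1, 0]], dtype=float)

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)          # 2-D perceptual-map coordinates
print(coords.round(2))
print(f"stress = {mds.stress_:.3f}")   # lower is a better fit
```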
EXHIBIT 24.9 Perceptual Map of Six Graduate Business Schools: Simple Space
SPSS Windows
The multidimensional scaling program allows individual
differences as well as aggregate analysis using ALSCAL. The
level of measurement can be ordinal, interval or ratio. Both the
direct and the derived approaches can be accommodated.
To select multidimensional scaling procedures using SPSS
for Windows, click:
Analyze>Scale>Multidimensional Scaling …
The conjoint analysis approach can be implemented using
regression if the dependent variable is metric (interval or
ratio).
This procedure can be run by clicking:
Analyze>Regression>Linear …
SPSS Windows: MDS
First convert similarity ratings to distances by subtracting each
value of Table 21.1 from 8. The form of the data matrix has to
be square symmetric (diagonal elements zero, and distances above and below the diagonal; see SPSS file Table 21.1 Input).
1. Select ANALYZE from the SPSS menu bar.
2. Click SCALE and then MULTIDIMENSIONAL SCALING
(ALSCAL).
3. Move “Aqua-Fresh [AquaFresh],” “Crest [Crest],” “Colgate
[Colgate],” “Aim [Aim],” “Gleem [Gleem],” “Ultra Brite
[UltraBrite],” “Ultra-Brite [var00007],” “Close-Up [CloseUp],”
“Pepsodent [Pepsodent],” and “Sensodyne [Sensodyne]” into
the VARIABLES box.
SPSS Windows: MDS
4. In the DISTANCES box, check DATA ARE DISTANCES.
SHAPE should be SQUARE SYMMETRIC (default).
5. Click on MODEL. In the pop-up window, in the LEVEL OF
MEASUREMENT box, check INTERVAL. In the SCALING
MODEL box, check EUCLIDEAN DISTANCE. In the
CONDITIONALITY box, check MATRIX. Click CONTINUE.
6. Click on OPTIONS. In the pop-up window, in the DISPLAY
box, check GROUP PLOTS, DATA MATRIX and MODEL
AND OPTIONS SUMMARY. Click CONTINUE.
7. Click OK.
EXHIBIT 24.10 Summary of Multivariate Techniques for Analysis of Interdependence
Further Reading
 Cooper, D.R. and Schindler, P.S. (2011) Business Research Methods, 11th edn, McGraw-Hill.
 Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M. (2010) Business Research Methods, 8th edn, South-Western.
 Saunders, M., Lewis, P. and Thornhill, A. (2012) Research Methods for Business Students, 6th edn, Prentice Hall.
 Saunders, M. and Lewis, P. (2012) Doing Research in Business & Management, FT Prentice Hall.