Factor analysis

Factor Analysis
• Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.
• Factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined without making the distinction between dependent and independent variables.
• Factor analysis is used in the following circumstances:
o To identify underlying dimensions, or factors, that explain the correlations among a set of variables.
o To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis (regression or discriminant analysis).
o To identify a smaller set of salient variables from a larger set for use in subsequent multivariate analysis.

Factor Analysis Model
Mathematically, each variable is expressed as a linear combination
of underlying factors. The covariation among the variables is
described in terms of a small number of common factors plus a
unique factor for each variable. If the variables are standardized,
the factor model may be represented as:
$$X_i = A_{i1} F_1 + A_{i2} F_2 + A_{i3} F_3 + \dots + A_{im} F_m + V_i U_i$$
where
$X_i$ = i-th standardized variable
$A_{ij}$ = standardized multiple regression coefficient of variable i on common factor j
$F_j$ = common factor j
$V_i$ = standardized regression coefficient of variable i on unique factor i
$U_i$ = the unique factor for variable i
$m$ = number of common factors

The unique factors are uncorrelated with each other and with the
common factors. The common factors themselves can be
expressed as linear combinations of the observed variables.
$$F_i = W_{i1} X_1 + W_{i2} X_2 + W_{i3} X_3 + \dots + W_{ik} X_k$$
where
$F_i$ = estimate of the i-th factor
$W_{ij}$ = weight or factor score coefficient of variable j on factor i
$k$ = number of variables
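
The factor model above implies a particular correlation structure: with uncorrelated, standardized common factors, the correlation between two variables is the sum of the products of their loadings, and each variable's variance splits into a communality plus a unique part. The following is a minimal simulation sketch of that property. The loading matrix `A`, the sample size, and the random seed are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000  # hypothetical sample size, large so sampling noise is small

# Hypothetical loadings A[i, j] of 4 variables on 2 common factors.
A = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.8]])

communality = (A ** 2).sum(axis=1)   # variance explained by common factors
V = np.sqrt(1.0 - communality)       # coefficient on each unique factor

F = rng.standard_normal((n, 2))      # common factors, uncorrelated
U = rng.standard_normal((n, 4))      # unique factors, one per variable
X = F @ A.T + U * V                  # X_i = sum_j A_ij F_j + V_i U_i

implied = A @ A.T + np.diag(V ** 2)  # correlation matrix implied by the model
observed = np.corrcoef(X, rowvar=False)
max_gap = np.abs(observed - implied).max()
print("largest |observed - implied| correlation:", round(max_gap, 4))
```

The diagonal of `implied` is exactly 1 because the communality and the squared unique coefficient of each standardized variable sum to one.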

• It is possible to select weights or factor score coefficients so that the first factor explains the largest portion of the total variance.
• Then a second set of weights can be selected, so that the second factor accounts for most of the residual variance, subject to being uncorrelated with the first factor.
• This same principle could be applied to selecting additional weights for the additional factors.

Statistics Associated with Factor Analysis
• Bartlett's test of sphericity. Bartlett's test of sphericity is a test statistic used to examine the hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix; each variable correlates perfectly with itself (r = 1) but has no correlation with the other variables (r = 0).
• Correlation matrix. A correlation matrix is a lower triangle matrix showing the simple correlations, r, between all possible pairs of variables included in the analysis. The diagonal elements, which are all 1, are usually omitted.
• Communality. Communality is the amount of variance a variable shares with all the other variables being considered. This is also the proportion of variance explained by the common factors.
• Eigenvalue. The eigenvalue represents the total variance explained by each factor.
• Factor loadings. Factor loadings are simple correlations between the variables and the factors.
• Factor loading plot. A factor loading plot is a plot of the original variables using the factor loadings as coordinates.
• Factor matrix. A factor matrix contains the factor loadings of all the variables on all the factors extracted.
• Factor scores. Factor scores are composite scores estimated for each respondent on the derived factors.
• Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is an index used to examine the appropriateness of factor analysis. High values (between 0.5 and 1.0) indicate factor analysis is appropriate. Values below 0.5 imply that factor analysis may not be appropriate.
• Percentage of variance. The percentage of the total variance attributed to each factor.
• Residuals. Residuals are the differences between the observed correlations, as given in the input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.
• Scree plot. A scree plot is a plot of the eigenvalues against the number of factors in order of extraction.

A Factor Analysis Example
A sample of 30 respondents was interviewed using mall intercept interviewing. The respondents were asked to indicate their degree of agreement with the following statements using a seven-point scale (1 = strongly disagree, 7 = strongly agree).
• V1 = It is important to buy a toothpaste that prevents cavities
• V2 = I like a toothpaste that gives shiny teeth
• V3 = A toothpaste should strengthen your gums
• V4 = I prefer a toothpaste that freshens breath
• V5 = Prevention of tooth decay is not an important benefit offered by a toothpaste
• V6 = The most important consideration in buying a toothpaste is attractive teeth

Conducting Factor Analysis
RESPONDENT NUMBER V1 V2 V3 V4 V5 V6
1 7.00 3.00 6.00 4.00 2.00 4.00
2 1.00 3.00 2.00 4.00 5.00 4.00
3 6.00 2.00 7.00 4.00 1.00 3.00
4 4.00 5.00 4.00 6.00 2.00 5.00
5 1.00 2.00 2.00 3.00 6.00 2.00
6 6.00 3.00 6.00 4.00 2.00 4.00
7 5.00 3.00 6.00 3.00 4.00 3.00
8 6.00 4.00 7.00 4.00 1.00 4.00
9 3.00 4.00 2.00 3.00 6.00 3.00
10 2.00 6.00 2.00 6.00 7.00 6.00
11 6.00 4.00 7.00 3.00 2.00 3.00
12 2.00 3.00 1.00 4.00 5.00 4.00
13 7.00 2.00 6.00 4.00 1.00 3.00
14 4.00 6.00 4.00 5.00 3.00 6.00
15 1.00 3.00 2.00 2.00 6.00 4.00
16 6.00 4.00 6.00 3.00 3.00 4.00
17 5.00 3.00 6.00 3.00 3.00 4.00
18 7.00 3.00 7.00 4.00 1.00 4.00
19 2.00 4.00 3.00 3.00 6.00 3.00
20 3.00 5.00 3.00 6.00 4.00 6.00
21 1.00 3.00 2.00 3.00 5.00 3.00
22 5.00 4.00 5.00 4.00 2.00 4.00
23 2.00 2.00 1.00 5.00 4.00 4.00
24 4.00 6.00 4.00 6.00 4.00 7.00
25 6.00 5.00 4.00 2.00 1.00 4.00
26 3.00 5.00 4.00 6.00 4.00 7.00
27 4.00 4.00 7.00 2.00 2.00 5.00
28 3.00 7.00 2.00 6.00 4.00 3.00
29 4.00 6.00 3.00 7.00 2.00 7.00
30 2.00 3.00 2.00 4.00 7.00 2.00

Conducting Factor Analysis: Steps
1. Problem Formulation
2. Construction of the Correlation Matrix
3. Method of Factor Analysis
4. Determination of Number of Factors
5. Rotation of Factors
6. Interpretation of Factors
7. Calculation of Factor Scores or Selection of Surrogate Variables
8. Determination of Model Fit

Formulate the Problem
• The objectives of factor analysis should be identified.
• The variables to be included in the factor analysis should be specified based on past research, theory, and judgment of the researcher. It is important that the variables be appropriately measured on an interval or ratio scale.
• An appropriate sample size should be used. As a rough guideline, there should be at least four or five times as many observations (sample size) as there are variables.

Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 1.000
V2 -0.053 1.000
V3 0.873 -0.155 1.000
V4 -0.086 0.572 -0.248 1.000
V5 -0.858 0.020 -0.778 -0.007 1.000
V6 0.004 0.640 -0.018 0.640 -0.136 1.000
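
The correlation matrix can be computed directly from the 30 responses in the respondent table. A minimal sketch using NumPy follows; the array name `data` and the rounding to three decimals are choices made here for illustration.

```python
import numpy as np

# Ratings of the 30 respondents on V1..V6, copied from the respondent table.
data = np.array([
    [7, 3, 6, 4, 2, 4], [1, 3, 2, 4, 5, 4], [6, 2, 7, 4, 1, 3],
    [4, 5, 4, 6, 2, 5], [1, 2, 2, 3, 6, 2], [6, 3, 6, 4, 2, 4],
    [5, 3, 6, 3, 4, 3], [6, 4, 7, 4, 1, 4], [3, 4, 2, 3, 6, 3],
    [2, 6, 2, 6, 7, 6], [6, 4, 7, 3, 2, 3], [2, 3, 1, 4, 5, 4],
    [7, 2, 6, 4, 1, 3], [4, 6, 4, 5, 3, 6], [1, 3, 2, 2, 6, 4],
    [6, 4, 6, 3, 3, 4], [5, 3, 6, 3, 3, 4], [7, 3, 7, 4, 1, 4],
    [2, 4, 3, 3, 6, 3], [3, 5, 3, 6, 4, 6], [1, 3, 2, 3, 5, 3],
    [5, 4, 5, 4, 2, 4], [2, 2, 1, 5, 4, 4], [4, 6, 4, 6, 4, 7],
    [6, 5, 4, 2, 1, 4], [3, 5, 4, 6, 4, 7], [4, 4, 7, 2, 2, 5],
    [3, 7, 2, 6, 4, 3], [4, 6, 3, 7, 2, 7], [2, 3, 2, 4, 7, 2],
], dtype=float)

# Pearson correlations between columns (variables), not rows (respondents).
corr = np.corrcoef(data, rowvar=False)

# Print the lower triangle in the same layout as the table.
for i in range(6):
    print(f"V{i + 1}", " ".join(f"{corr[i, j]:6.3f}" for j in range(i + 1)))
```

V1, V3, and V5 correlate strongly with one another, as do V2, V4, and V6, which already hints at two underlying factors.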

Construct the Correlation Matrix
• The analytical process is based on a matrix of correlations between the variables.
• Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population: in other words, the population correlation matrix is an identity matrix. If this hypothesis cannot be rejected, then the appropriateness of factor analysis should be questioned.
• Another useful statistic is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate.
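
Bartlett's test of sphericity can be sketched from the correlation matrix alone. The block below uses the usual chi-square approximation, chi2 = -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom; the slides do not state the formula, so treat it as an assumption about how the reported statistic was obtained. The V1-V2 entry is entered as -0.053, the value the 30 responses give.

```python
import numpy as np

# Correlation matrix of V1..V6 for the toothpaste example (rounded values).
R = np.array([
    [ 1.000, -0.053,  0.873, -0.086, -0.858,  0.004],
    [-0.053,  1.000, -0.155,  0.572,  0.020,  0.640],
    [ 0.873, -0.155,  1.000, -0.248, -0.778, -0.018],
    [-0.086,  0.572, -0.248,  1.000, -0.007,  0.640],
    [-0.858,  0.020, -0.778, -0.007,  1.000, -0.136],
    [ 0.004,  0.640, -0.018,  0.640, -0.136,  1.000],
])
n, p = 30, 6  # respondents, variables

# ln|R| via slogdet for numerical stability; |R| = 1 for an identity matrix.
sign, logdet = np.linalg.slogdet(R)
chi2 = -((n - 1) - (2 * p + 5) / 6.0) * logdet
df = p * (p - 1) // 2

print(f"chi-square = {chi2:.2f}, df = {df}")
```

Because the inputs are rounded to three decimals, the result should be close to, but not necessarily identical with, the approximate chi-square reported in the results tables.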

Determine the Method of Factor Analysis
• In principal components analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and full variance is brought into the factor matrix. Principal components analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.
• In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring.
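
Principal components extraction can be sketched as an eigen-decomposition of the correlation matrix: the eigenvalues give the variance explained by each component, and each eigenvector scaled by the square root of its eigenvalue gives the unrotated loadings. This is a minimal illustration, not a reproduction of the SPSS procedure; the sign convention (largest loading in each column made positive) is a choice made here, since eigenvector signs are arbitrary.

```python
import numpy as np

# Correlation matrix of V1..V6 (V1-V2 entered as -0.053, from the raw data).
R = np.array([
    [ 1.000, -0.053,  0.873, -0.086, -0.858,  0.004],
    [-0.053,  1.000, -0.155,  0.572,  0.020,  0.640],
    [ 0.873, -0.155,  1.000, -0.248, -0.778, -0.018],
    [-0.086,  0.572, -0.248,  1.000, -0.007,  0.640],
    [-0.858,  0.020, -0.778, -0.007,  1.000, -0.136],
    [ 0.004,  0.640, -0.018,  0.640, -0.136,  1.000],
])

# eigh returns eigenvalues in ascending order; reorder to descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Unrotated loadings on the first two components.
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])
for j in range(2):
    i = np.argmax(np.abs(loadings[:, j]))
    if loadings[i, j] < 0:          # fix the arbitrary sign of each column
        loadings[:, j] *= -1

pct_variance = 100 * eigvals / eigvals.sum()
print("eigenvalues:", np.round(eigvals, 3))
print("% of variance:", np.round(pct_variance, 2))
print("loadings:\n", np.round(loadings, 3))
```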

Results of Principal Components Analysis
Communalities
Variables Initial Extraction
V1 1.000 0.926
V2 1.000 0.723
V3 1.000 0.894
V4 1.000 0.739
V5 1.000 0.878
V6 1.000 0.790
Initial Eigenvalues
Factor Eigenvalue % of variance Cumulative %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
3 0.442 7.360 89.848
4 0.341 5.688 95.536
5 0.183 3.044 98.580
6 0.085 1.420 100.000

Extraction Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulative %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
Factor Matrix
Variables Factor 1 Factor 2
V1 0.928 0.253
V2 -0.301 0.795
V3 0.936 0.131
V4 -0.342 0.789
V5 -0.869 -0.351
V6 -0.177 0.871
Rotation Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulative %
1 2.688 44.802 44.802
2 2.261 37.687 82.488

Rotated Factor Matrix
Variables Factor 1 Factor 2
V1 0.962 -0.027
V2 -0.057 0.848
V3 0.934 -0.146
V4 -0.098 0.854
V5 -0.933 -0.084
V6 0.083 0.885
Factor Score Coefficient Matrix
Variables Factor 1 Factor 2
V1 0.358 0.011
V2 -0.001 0.375
V3 0.345 -0.043
V4 -0.017 0.377
V5 -0.350 -0.059
V6 0.052 0.395

Reproduced Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 0.926 0.024 -0.029 0.031 0.038 -0.053
V2 -0.078 0.723 0.022 -0.158 0.038 -0.105
V3 0.902 -0.177 0.894 -0.031 0.081 0.033
V4 -0.117 0.730 -0.217 0.739 -0.027 -0.107
V5 -0.895 -0.018 -0.859 0.020 0.878 0.016
V6 0.057 0.746 -0.051 0.748 -0.152 0.790
The lower left triangle contains the reproduced correlation matrix; the diagonal, the communalities; the upper right triangle, the residuals between the observed correlations and the reproduced correlations.

Determine the Number of Factors
• A Priori Determination. Sometimes, because of prior knowledge, the researcher knows how many factors to expect and thus can specify the number of factors to be extracted beforehand.
• Determination Based on Eigenvalues. In this approach, only factors with eigenvalues greater than 1.0 are retained. An eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with variance less than 1.0 are no better than a single variable, since, due to standardization, each variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors.

• Determination Based on Scree Plot. A scree plot is a plot of the eigenvalues against the number of factors in order of extraction. Experimental evidence indicates that the point at which the scree begins denotes the true number of factors. Generally, the number of factors determined by a scree plot will be one or a few more than that determined by the eigenvalue criterion.
• Determination Based on Percentage of Variance. In this approach the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. It is recommended that the factors extracted should account for at least 60% of the variance.

Scree Plot
(Figure: eigenvalue, from 0.0 to 3.0, plotted against component number, 1 to 6.)

• Determination Based on Split-Half Reliability. The sample is split in half and factor analysis is performed on each half. Only factors with high correspondence of factor loadings across the two subsamples are retained.
• Determination Based on Significance Tests. It is possible to determine the statistical significance of the separate eigenvalues and retain only those factors that are statistically significant. A drawback is that with large samples (size greater than 200), many factors are likely to be statistically significant, although from a practical viewpoint many of these account for only a small proportion of the total variance.
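
The eigenvalue and percentage-of-variance criteria described above are simple to apply once the eigenvalues are known. A minimal sketch, using the initial eigenvalues from the principal components results; the 60% threshold follows the recommendation in the text, and the variable names are illustrative.

```python
import numpy as np

# Initial eigenvalues from the principal components results table.
eigenvalues = np.array([2.731, 2.218, 0.442, 0.341, 0.183, 0.085])

# Eigenvalue criterion: retain factors with eigenvalue greater than 1.0.
n_by_eigenvalue = int((eigenvalues > 1.0).sum())

# Percentage-of-variance criterion: smallest number of factors whose
# cumulative share of total variance reaches the chosen threshold.
threshold = 60.0
cumulative_pct = 100 * np.cumsum(eigenvalues) / eigenvalues.sum()
n_by_variance = int(np.argmax(cumulative_pct >= threshold)) + 1

print("cumulative % of variance:", np.round(cumulative_pct, 2))
print("factors by eigenvalue criterion:", n_by_eigenvalue)
print("factors by percentage of variance:", n_by_variance)
```

Both criteria point to two factors for the toothpaste data, which agrees with the two-factor solution reported in the results.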

Rotate Factors
• Although the initial or unrotated factor matrix indicates the relationship between the factors and individual variables, it seldom results in factors that can be interpreted, because the factors are correlated with many variables. Therefore, through rotation the factor matrix is transformed into a simpler one that is easier to interpret.
• In rotating the factors, we would like each factor to have nonzero, or significant, loadings or coefficients for only some of the variables. Likewise, we would like each variable to have nonzero or significant loadings with only a few factors, if possible with only one.
• The rotation is called orthogonal rotation if the axes are maintained at right angles.

• The most commonly used method for rotation is the varimax procedure. This is an orthogonal method of rotation that minimizes the number of variables with high loadings on a factor, thereby enhancing the interpretability of the factors. Orthogonal rotation results in factors that are uncorrelated.
• The rotation is called oblique rotation when the axes are not maintained at right angles, and the factors are correlated. Sometimes, allowing for correlations among factors can simplify the factor pattern matrix. Oblique rotation should be used when factors in the population are likely to be strongly correlated.
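
A varimax rotation can be sketched with a short iterative routine. The function below is a commonly used SVD-based implementation of raw varimax, applied to the unrotated principal components loadings from the results tables. It does not apply Kaiser normalization, so its output may differ slightly from SPSS output, and the signs and order of the rotated columns are not guaranteed to match. The function name and parameters are illustrative.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns (rotated loadings, rotation matrix)."""
    p, k = loadings.shape
    T = np.eye(k)            # orthogonal rotation matrix, starts at identity
    objective = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Gradient of the varimax criterion with respect to the rotation.
        grad = loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt           # nearest orthogonal matrix to the gradient step
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break            # no meaningful improvement: converged
        objective = new_objective
    return loadings @ T, T

# Unrotated factor matrix (principal components) for V1..V6.
A = np.array([[ 0.928,  0.253],
              [-0.301,  0.795],
              [ 0.936,  0.131],
              [-0.342,  0.789],
              [-0.869, -0.351],
              [-0.177,  0.871]])

rotated, T = varimax(A)
print(np.round(rotated, 3))
```

Rotation redistributes variance between the factors but leaves each variable's communality (the row sum of squared loadings) unchanged, which is a quick check on any implementation.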

Interpret Factors
• A factor can then be interpreted in terms of the variables that load high on it.
• Another useful aid in interpretation is to plot the variables, using the factor loadings as coordinates. Variables at the end of an axis are those that have high loadings on only that factor, and hence describe the factor.

Factor Loading Plot
(Figure: Component Plot in Rotated Space. Variables V1 to V6 are plotted using their loadings on Component 1 and Component 2 as coordinates; both axes run from -1.0 to 1.0.)
Rotated Component Matrix
Variable Component 1 Component 2
V1 0.962 -0.0266
V2 -0.0572 0.848
V3 0.934 -0.146
V4 -0.0983 0.854
V5 -0.933 -0.0840
V6 0.08337 0.885

Calculate Factor Scores
The factor scores for the i-th factor may be estimated as follows:
$$F_i = W_{i1} X_1 + W_{i2} X_2 + W_{i3} X_3 + \dots + W_{ik} X_k$$
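
For a principal components solution, the factor score coefficients can be recovered from the rotated loadings alone as W = L (L'L)^-1, and a respondent's factor scores are then the standardized responses multiplied by W. The sketch below checks this against the factor score coefficient matrix in the results. It uses the loading of 0.854 for V4 on factor 2, as shown in the rotated component matrix. The standardized response vector `z` is hypothetical and only shows how the equation is applied.

```python
import numpy as np

# Rotated loadings (principal components, varimax) for V1..V6.
L = np.array([[ 0.962, -0.027],
              [-0.057,  0.848],
              [ 0.934, -0.146],
              [-0.098,  0.854],
              [-0.933, -0.084],
              [ 0.083,  0.885]])

# Factor score coefficients: one column of weights per factor.
W = L @ np.linalg.inv(L.T @ L)

# Coefficients reported in the results, for comparison.
W_reported = np.array([[ 0.358,  0.011],
                       [-0.001,  0.375],
                       [ 0.345, -0.043],
                       [-0.017,  0.377],
                       [-0.350, -0.059],
                       [ 0.052,  0.395]])
print("max difference from reported:", np.abs(W - W_reported).max().round(4))

# Hypothetical standardized responses of one respondent on V1..V6.
z = np.array([1.0, -0.5, 0.8, 0.0, -1.2, 0.1])
scores = z @ W   # F_i = W_i1 X_1 + ... + W_ik X_k for each factor i
print("factor scores:", np.round(scores, 3))
```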

Select Surrogate Variables
• By examining the factor matrix, one could select for each factor the variable with the highest loading on that factor. That variable could then be used as a surrogate variable for the associated factor.
• However, the choice is not as easy if two or more variables have similarly high loadings. In such a case, the choice between these variables should be based on theoretical and measurement considerations.

Determine the Model Fit
• The correlations between the variables can be deduced or reproduced from the estimated correlations between the variables and the factors.
• The differences between the observed correlations (as given in the input correlation matrix) and the reproduced correlations (as estimated from the factor matrix) can be examined to determine model fit. These differences are called residuals.
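
With orthogonal factors, the reproduced correlation between two variables is the sum of the products of their loadings, so the whole reproduced matrix is A A' and the residuals are R minus A A'. A minimal sketch using the unrotated principal components loadings and the observed correlations; the 0.05 cutoff used to count large residuals is an illustrative choice, not a rule from the slides.

```python
import numpy as np

# Observed correlations (V1-V2 entered as -0.053, from the raw data).
R = np.array([
    [ 1.000, -0.053,  0.873, -0.086, -0.858,  0.004],
    [-0.053,  1.000, -0.155,  0.572,  0.020,  0.640],
    [ 0.873, -0.155,  1.000, -0.248, -0.778, -0.018],
    [-0.086,  0.572, -0.248,  1.000, -0.007,  0.640],
    [-0.858,  0.020, -0.778, -0.007,  1.000, -0.136],
    [ 0.004,  0.640, -0.018,  0.640, -0.136,  1.000],
])

# Unrotated factor matrix (principal components, two factors).
A = np.array([[ 0.928,  0.253],
              [-0.301,  0.795],
              [ 0.936,  0.131],
              [-0.342,  0.789],
              [-0.869, -0.351],
              [-0.177,  0.871]])

reproduced = A @ A.T           # diagonal holds the communalities
residuals = R - reproduced     # only off-diagonal entries are of interest

lower = np.tril_indices(6, k=-1)
n_large = int((np.abs(residuals[lower]) > 0.05).sum())
print("reproduced r(V3, V1):", round(reproduced[2, 0], 3))
print("residual r(V4, V2):", round(residuals[3, 1], 3))
print("residuals larger than 0.05 in absolute value:", n_large)
```

Many large residuals would suggest that the two-factor model does not fit the data well and that the solution should be reconsidered.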

Results of Common Factor Analysis
Communalities
Variables Initial Extraction
V1 0.859 0.928
V2 0.480 0.562
V3 0.814 0.836
V4 0.543 0.600
V5 0.763 0.789
V6 0.587 0.723
Bartlett's test of sphericity
• Approx. Chi-Square = 111.314
• df = 15
• Significance = 0.00000
• Kaiser-Meyer-Olkin measure of sampling adequacy = 0.660
Initial Eigenvalues
Factor Eigenvalue % of variance Cumulative %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
3 0.442 7.360 89.848
4 0.341 5.688 95.536
5 0.183 3.044 98.580
6 0.085 1.420 100.000

Extraction Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulative %
1 2.570 42.837 42.837
2 1.868 31.126 73.964
Factor Matrix
Variables Factor 1 Factor 2
V1 0.949 0.168
V2 -0.206 0.720
V3 0.914 0.038
V4 -0.246 0.734
V5 -0.850 -0.259
V6 -0.101 0.844
Rotation Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulative %
1 2.541 42.343 42.343
2 1.897 31.621 73.964

Rotated Factor Matrix
Variables Factor 1 Factor 2
V1 0.963 -0.030
V2 -0.054 0.747
V3 0.902 -0.150
V4 -0.090 0.769
V5 -0.885 -0.079
V6 0.075 0.847
Factor Score Coefficient Matrix
Variables Factor 1 Factor 2
V1 0.628 0.101
V2 -0.024 0.253
V3 0.217 -0.169
V4 -0.023 0.271
V5 -0.166 -0.059
V6 0.083 0.500

Reproduced Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 0.928 0.022 -0.000 0.024 -0.008 -0.042
V2 -0.075 0.562 0.006 -0.008 0.031 0.012
V3 0.873 -0.161 0.836 -0.005 0.008 0.042
V4 -0.110 0.580 -0.197 0.600 -0.025 -0.004
V5 -0.850 -0.012 -0.786 0.019 0.789 0.003
V6 0.046 0.629 -0.060 0.645 -0.133 0.723
The lower left triangle contains the reproduced correlation matrix; the diagonal, the communalities; the upper right triangle, the residuals between the observed correlations and the reproduced correlations.

SPSS Windows
To select this procedure using SPSS for Windows, click:
Analyze > Data Reduction > Factor …

Exploratory Factor Analysis

Exploratory Factor Analysis Defined
Exploratory factor analysis is an interdependence technique whose primary purpose is to define the underlying structure among the variables in the analysis.

What is Exploratory Factor Analysis?
Exploratory Factor Analysis . . .
• Examines the interrelationships among a large number of variables and then attempts to explain them in terms of their common underlying dimensions.
• These common underlying dimensions are referred to as factors.
• A summarization and data reduction technique that does not have independent and dependent variables, but is an interdependence technique in which all variables are considered simultaneously.

Correlation Matrix for Store Image Elements
V1 V2 V3 V4 V5 V6 V7 V8 V9
V1 Price Level 1.00
V2 Store Personnel .427 1.00
V3 Return Policy .302 .771 1.00
V4 Product Availability .470 .497 .427 1.00
V5 Product Quality .765 .406 .307 .472 1.00
V6 Assortment Depth .281 .445 .423 .713 .325 1.00
V7 Assortment Width .354 .490 .471 .719 .378 .724 1.00
V8 In-Store Service .242 .719 .733 .428 .240 .311 .435 1.00
V9 Store Atmosphere .372 .737 .774 .479 .326 .429 .466 .710 1.00

Correlation Matrix of Variables After Grouping Using Factor Analysis
Shaded areas represent variables likely to be grouped together by factor analysis.
V3 V8 V9 V2 V6 V7 V4 V1 V5
V3 Return Policy 1.00
V8 In-Store Service .733 1.00
V9 Store Atmosphere .774 .710 1.00
V2 Store Personnel .741 .719 .787 1.00
V6 Assortment Depth .423 .311 .429 .445 1.00
V7 Assortment Width .471 .435 .468 .490 .724 1.00
V4 Product Availability .427 .428 .479 .497 .713 .719 1.00
V1 Price Level .302 .242 .372 .427 .281 .354 .470 1.00
V5 Product Quality .307 .240 .326 .406 .325 .378 .472 .765 1.00

Application of Factor Analysis to a Fast-Food Restaurant
Variables and the factors they form:
• Waiting Time, Cleanliness, Friendly Employees: Service Quality
• Taste, Temperature, Freshness: Food Quality

Factor Analysis Decision Process
Stage 1: Objectives of Factor Analysis
Stage 2: Designing a Factor Analysis
Stage 3: Assumptions in Factor Analysis
Stage 4: Deriving Factors and Assessing Overall Fit
Stage 5: Interpreting the Factors
Stage 6: Validation of Factor Analysis
Stage 7: Additional uses of Factor Analysis Results

Stage 1: Objectives of Factor Analysis
1. Is the objective exploratory or confirmatory?
2. Specify the unit of analysis.
3. Data summarization and/or reduction?
4. Using factor analysis with other techniques.

Factor Analysis Outcomes
1. Data summarization = derives underlying
dimensions that, when interpreted and
understood, describe the data in a much
smaller number of concepts than the original
individual variables.
2. Data reduction = extends the process of
data summarization by deriving an empirical
value (factor score or summated scale) for
each dimension (factor) and then substituting
this value for the original values.

Types of Factor Analysis
1. Exploratory Factor Analysis (EFA) = used to discover the factor structure of a construct and examine its reliability. It is data driven.
2. Confirmatory Factor Analysis (CFA) = used to confirm the fit of the hypothesized factor structure to the observed (sample) data. It is theory driven.

Stage 2: Designing a Factor Analysis
Three Basic Decisions:
1. Calculation of input data – R vs. Q
analysis.
2. Design of study in terms of number of
variables, measurement properties of
variables, and the type of variables.
3. Sample size necessary.

Rules of Thumb 1
Factor Analysis Design
o Factor analysis is performed most often only on metric
variables, although specialized methods exist for the use of
dummy variables. A small number of “dummy variables” can
be included in a set of metric variables that are factor
analyzed.
o If a study is being designed to reveal factor structure, strive
to have at least five variables for each proposed factor.
o For sample size:
• the sample must have more observations than variables.
• the minimum absolute sample size should be 50
observations.
o Maximize the number of observations per variable, with a
minimum of five and hopefully at least ten observations per
variable.

Stage 3: Assumptions in Factor Analysis

Assumptions
• Multicollinearity
o Assessed using MSA (measure of sampling adequacy).
• Homogeneity of sample factor solutions
The MSA is measured by the Kaiser-Meyer-Olkin (KMO) statistic. As a measure of sampling adequacy, the KMO predicts whether data are likely to factor well, based on correlations and partial correlations. KMO can be used to identify which variables to drop from the factor analysis because they lack multicollinearity. There is a KMO statistic for each individual variable, and an overall KMO statistic computed across all variables. KMO varies from 0 to 1.0. Overall KMO should be .50 or higher to proceed with factor analysis. If it is not, remove the variable with the lowest individual KMO statistic, one at a time, until the overall KMO rises above .50 and each individual variable's KMO is above .50.
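
The KMO statistic compares squared correlations with squared partial correlations. A minimal sketch follows, using the usual definition in which partial correlations are taken from the inverse of the correlation matrix; the slides do not give the formula, so this is an assumption about what the reported statistic measures. The input is the toothpaste correlation matrix, with V1-V2 entered as -0.053 (the value the raw data give).

```python
import numpy as np

R = np.array([
    [ 1.000, -0.053,  0.873, -0.086, -0.858,  0.004],
    [-0.053,  1.000, -0.155,  0.572,  0.020,  0.640],
    [ 0.873, -0.155,  1.000, -0.248, -0.778, -0.018],
    [-0.086,  0.572, -0.248,  1.000, -0.007,  0.640],
    [-0.858,  0.020, -0.778, -0.007,  1.000, -0.136],
    [ 0.004,  0.640, -0.018,  0.640, -0.136,  1.000],
])
p = R.shape[0]

# Partial correlation of each pair, controlling for all other variables.
R_inv = np.linalg.inv(R)
scale = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(scale, scale)

# Use only off-diagonal entries.
off_diag = ~np.eye(p, dtype=bool)
r2 = np.where(off_diag, R ** 2, 0.0)
q2 = np.where(off_diag, partial ** 2, 0.0)

kmo_overall = r2.sum() / (r2.sum() + q2.sum())
kmo_per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))

print("overall KMO:", round(kmo_overall, 3))
for i, value in enumerate(kmo_per_variable, start=1):
    print(f"V{i}: {value:.3f}")
```

A variable whose individual value falls below .50 would be the first candidate for removal under the rule described above.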

Rules of Thumb 2
Testing Assumptions of Factor Analysis
• There must be a strong conceptual foundation to
support the assumption that a structure does exist
before the factor analysis is performed.
• A statistically significant Bartlett’s test of sphericity
(sig. < .05) indicates that sufficient correlations exist
among the variables to proceed.
• Measure of Sampling Adequacy (MSA) values must
exceed .50 for both the overall test and each
individual variable. Variables with values less than
.50 should be omitted from the factor analysis one at
a time, with the smallest one being omitted each time.

Stage 4: Deriving Factors and Assessing Overall Fit
• Selecting the factor extraction method
– common vs. component analysis.
• Determining the number of factors to
represent the data.

Extraction Decisions
o Which method?
• Principal Components Analysis
• Common Factor Analysis
o How to rotate?
• Orthogonal or Oblique rotation

Extraction Method Determines the Types of Variance Carried into the Factor Matrix
Diagonal Value: Variance
• Unity (1): total variance is extracted.
• Communality: common variance is extracted; specific and error variance are not used.

Principal Components vs. Common?
Two Criteria . . .
• Objectives of the factor analysis.
• Amount of prior knowledge about
the variance in the variables.

Number of Factors?
• A Priori Criterion
• Latent Root Criterion
• Percentage of Variance
• Scree Test Criterion

Eigenvalue Plot for Scree Test Criterion

Rules of Thumb 3
Choosing Factor Models and Number of Factors
• Although both component and common factor analysis models yield similar results in common research settings (30 or more variables or communalities of .60 for most variables):
o the component analysis model is most appropriate when data reduction is paramount.
o the common factor model is best in well-specified theoretical applications.
• Any decision on the number of factors to be retained should be based on several considerations:
o Use of several stopping criteria to determine the initial number of factors to retain:
– Factors with eigenvalues greater than 1.0.
– A predetermined number of factors based on research objectives and/or prior research.
– Enough factors to meet a specified percentage of variance explained, usually 60% or higher.
– Factors shown by the scree test to have substantial amounts of common variance (i.e., factors before the inflection point).
– More factors when there is heterogeneity among sample subgroups.
o Consideration of several alternative solutions (one more and one less factor than the initial solution) to ensure the best structure is identified.

Processes of Factor Interpretation
• Estimate the Factor Matrix
• Factor Rotation
• Factor Interpretation
• Respecification of factor model, if needed, may
involve . . .
o Deletion of variables from analysis
o Desire to use a different rotational approach
o Need to extract a different number of factors
o Desire to change method of extraction

Rotation of Factors
Factor rotation = the reference axes of the factors are turned about the origin until some other position has been reached. Unrotated factor solutions extract factors in order of how much variance they account for, with each subsequent factor accounting for less variance. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones to achieve a simpler, theoretically more meaningful factor pattern.

Two Rotational Approaches
1. Orthogonal = axes are maintained
at 90 degrees.
2. Oblique = axes are not maintained
at 90 degrees.

Orthogonal Factor Rotation
(Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II, with the axes for Rotated Factor I and Rotated Factor II overlaid at right angles; both axes run from -1.0 to +1.0.)

Oblique Factor Rotation
(Figure: variables V1 to V5 plotted against Unrotated Factor I and Unrotated Factor II, with the axes for Orthogonal Rotation: Factor I and Factor II and for Oblique Rotation: Factor I and Factor II overlaid; both axes run from -1.0 to +1.0.)

Orthogonal Rotation Methods
• Quartimax (simplify rows)
• Varimax (simplify columns)
• Equimax (combination)

Rules of Thumb 4
Choosing Factor Rotation Methods
• Orthogonal rotation methods . . .
o are the most widely used rotational methods.
o are the preferred method when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques.
• Oblique rotation methods . . .
o are best suited to the goal of obtaining several theoretically meaningful factors or constructs because, realistically, very few constructs in the “real world” are uncorrelated.

Which Factor Loadings Are Significant?
• Customary Criteria = Practical Significance.
• Sample Size & Statistical Significance.
• Number of Factors (more factors call for larger loadings) and/or Variables (more variables allow smaller loadings).

Guidelines for Identifying Significant Factor Loadings Based on Sample Size
Factor Loading   Sample Size Needed for Significance*
.30   350
.35   250
.40   200
.45   150
.50   120
.55   100
.60   85
.65   70
.70   60
.75   50
*Significance is based on a .05 significance level (α), a power level of 80 percent, and standard errors assumed to be twice those of conventional correlation coefficients.
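
The guideline table can be read in the other direction: for a given sample size, what is the smallest loading that counts as significant? A minimal lookup sketch follows; the function name and the choice to return `None` for samples smaller than 50 are illustrative assumptions, since the table does not cover that range.

```python
# (factor loading, sample size needed for significance), from the table.
GUIDELINE = [
    (0.30, 350), (0.35, 250), (0.40, 200), (0.45, 150), (0.50, 120),
    (0.55, 100), (0.60, 85), (0.65, 70), (0.70, 60), (0.75, 50),
]

def min_significant_loading(sample_size):
    """Smallest tabled loading whose required sample size is met.

    Returns None when the sample is smaller than every tabled size.
    """
    eligible = [loading for loading, needed in GUIDELINE
                if sample_size >= needed]
    return min(eligible) if eligible else None

for n in (30, 100, 120, 350):
    print(n, min_significant_loading(n))
```

For the 30-respondent toothpaste example the table gives no threshold at all, which is one reason practical significance and interpretability matter more than the tabled cutoffs in small samples.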

Rules of Thumb 5
Assessing Factor Loadings
• While factor loadings of ±.30 to ±.40 are minimally acceptable, values greater than ±.50 are considered necessary for practical significance.
• To be considered significant:
o A smaller loading is needed given either a larger sample
size, or a larger number of variables being analyzed.
o A larger loading is needed given a factor solution with a
larger number of factors, especially in evaluating the
loadings on later factors.
• Statistical tests of significance for factor loadings are
generally very conservative and should be considered
only as starting points needed for including a variable for
further consideration.

Stage 5: Interpreting the Factors

Interpreting a Factor Matrix:
1. Examine the factor matrix of
loadings.
2. Identify the highest loading across
all factors for each variable.
3. Assess communalities of the
variables.
4. Label the factors.

Rules of Thumb 6
Interpreting The Factors
• An optimal structure exists when all variables have high loadings only on a single factor.
• Variables that cross-load (load highly on two or more factors) are usually deleted unless theoretically justified or the objective is strictly data reduction.
• Variables should generally have communalities of greater than .50 to be retained in the analysis.
• Respecification of a factor analysis can include options such as:
o deleting a variable(s),
o changing rotation methods, and/or
o increasing or decreasing the number of factors.

Stage 6: Validation of Factor Analysis
• Confirmatory Perspective.
• Assessing Factor Structure Stability.
• Detecting Influential Observations.

Stage 7: Additional Uses of Factor
Analysis Results
• Selecting Surrogate Variables
• Creating Summated Scales
• Computing Factor Scores

Rules of Thumb 7
Summated Scales
• A summated scale is only as good as the items used to
represent the construct. While it may pass all empirical
tests, it is useless without theoretical justification.
• Never create a summated scale without first assessing its
unidimensionality with exploratory or confirmatory factor
analysis.
• Once a scale is deemed unidimensional, its reliability score,
as measured by Cronbach’s alpha:
o should exceed a threshold of .70, although a .60 level
can be used in exploratory research.
o the threshold should be raised as the number of items
increases, especially as the number of items
approaches 10 or more.
• With reliability established, validity should be assessed in
terms of:
o convergent validity = scale correlates with other like
scales.
o discriminant validity = scale is sufficiently different
from other related scales.
o nomological validity = scale “predicts” as theoretically

Rules of Thumb 8
Representing Factor Analysis In Other Analyses
• The single surrogate variable:
o Advantages: simple to administer and interpret.
o Disadvantages:
1) does not represent all “facets” of a factor.
2) prone to measurement error.
• Factor scores:
o Advantages:
1) represent all variables loading on the factor.
2) best method for complete data reduction.
3) are by default orthogonal and can avoid complications caused by multicollinearity.
o Disadvantages:
1) interpretation more difficult since all variables contribute through loadings.
2) difficult to replicate across studies.

Rules of Thumb 8 Continued . . .
Representing Factor Analysis In Other Analyses
• Summated scales:
o Advantages:
1) compromise between the surrogate variable and factor score options.
2) reduces measurement error.
3) represents multiple facets of a concept.
4) easily replicated across studies.
o Disadvantages:
1) includes only the variables that load highly on the factor and excludes those having little or marginal impact.
2) not necessarily orthogonal.
3) requires extensive analysis of reliability and validity issues.

Description of HBAT Primary Database Variables
Variable Description Variable Type
Data Warehouse Classification Variables
X1 Customer Type nonmetric
X2 Industry Type nonmetric
X3 Firm Size nonmetric
X4 Region nonmetric
X5 Distribution System nonmetric
Performance Perceptions Variables
X6 Product Quality metric
X7 E-Commerce Activities/Website metric
X8 Technical Support metric
X9 Complaint Resolution metric
X10 Advertising metric
X11 Product Line metric
X12 Salesforce Image metric
X13 Competitive Pricing metric
X14 Warranty & Claims metric
X15 New Products metric
X16 Ordering & Billing metric
X17 Price Flexibility metric
X18 Delivery Speed metric
Outcome/Relationship Measures
X19 Satisfaction metric
X20 Likelihood of Recommendation metric
X21 Likelihood of Future Purchase metric
X22 Current Purchase/Usage Level metric
X23 Consider Strategic Alliance/Partnership in Future nonmetric

Rotated Component Matrix
“Reduced Set” of HBAT Perceptions Variables
Variable                        Comp. 1   Comp. 2   Comp. 3   Comp. 4   Communality
X9 – Complaint Resolution        .933                                     .890
X18 – Delivery Speed             .931                                     .894
X16 – Order & Billing            .886                                     .806
X12 – Salesforce Image                     .898                           .860
X7 – E-Commerce Activities                 .868                           .780
X10 – Advertising                          .743                           .585
X8 – Technical Support                               .940                 .894
X14 – Warranty & Claims                              .933                 .891
X6 – Product Quality                                           .892       .798
X13 – Competitive Pricing                                     -.730       .661
Sum of Squares                  2.589     2.216     1.846     1.406      8.057
Percentage of Trace            25.893    22.161    18.457    14.061     80.572

Scree Test for HBAT Component Analysis

Factor Analysis Learning Checkpoint
1. What are the major uses of factor analysis?
2. What is the difference between component
analysis and common factor analysis?
3. Is rotation of factors necessary?
4. How do you decide how many factors to extract?
5. What is a significant factor loading?
6. How and why do you name a factor?
7. Should you use factor scores or summated ratings
in follow-up analyses?
