Factor Analysis
 Factor analysis is a general name denoting a class of
procedures primarily used for data reduction and
summarization.
 Factor analysis is an interdependence technique in that an
entire set of interdependent relationships is examined without
making the distinction between dependent and independent
variables.
 Factor analysis is used in the following circumstances:
 To identify underlying dimensions, or factors, that explain
the correlations among a set of variables.
 To identify a new, smaller, set of uncorrelated variables to
replace the original set of correlated variables in subsequent
multivariate analysis (regression or discriminant analysis).
 To identify a smaller set of salient variables from a larger set
for use in subsequent multivariate analysis.
Factor Analysis Model
Mathematically, each variable is expressed as a linear combination
of underlying factors. The covariation among the variables is
described in terms of a small number of common factors plus a
unique factor for each variable. If the variables are standardized,
the factor model may be represented as:
X_i = A_{i1}F_1 + A_{i2}F_2 + A_{i3}F_3 + . . . + A_{im}F_m + V_iU_i
where
X_i = i th standardized variable
A_{ij} = standardized multiple regression coefficient of
variable i on common factor j
F_j = j th common factor
V_i = standardized regression coefficient of variable i on
unique factor i
U_i = the unique factor for variable i
m = number of common factors
The unique factors are uncorrelated with each other and with the
common factors. The common factors themselves can be
expressed as linear combinations of the observed variables.
F_i = W_{i1}X_1 + W_{i2}X_2 + W_{i3}X_3 + . . . + W_{ik}X_k
where
F_i = estimate of i th factor
W_{ij} = weight or factor score coefficient of variable j on factor i
k = number of variables
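A minimal sketch of what the factor model implies, assuming the common factors are standardized and uncorrelated with each other: the correlation between two variables is the sum of the products of their loadings, and each variable's unique variance is 1 minus its communality. The loadings below are hypothetical values chosen for illustration; they are not taken from the example that follows.

```python
import numpy as np

# Hypothetical loadings A (3 variables x 2 common factors); illustrative only.
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
])

# Communality of each variable: variance explained by the common factors.
communality = (A ** 2).sum(axis=1)

# Squared unique-factor coefficient V_i^2, so each standardized variable
# keeps a total variance of 1.
unique_var = 1.0 - communality

# Correlation matrix implied by X_i = sum_j A_ij F_j + V_i U_i when the
# factors are standardized and mutually uncorrelated.
implied_corr = A @ A.T + np.diag(unique_var)

print(np.round(implied_corr, 3))
```

The diagonal of the implied matrix is 1 by construction, and the off-diagonal entries depend only on the common-factor loadings.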
 It is possible to select weights or factor score
coefficients so that the first factor explains the largest
portion of the total variance.
 Then a second set of weights can be selected, so
that the second factor accounts for most of the
residual variance, subject to being uncorrelated with
the first factor.
 This same principle could be applied to selecting
additional weights for the additional factors.
Statistics Associated with Factor Analysis
 Bartlett's test of sphericity. Bartlett's test of
sphericity is a test statistic used to examine the
hypothesis that the variables are uncorrelated in the
population. In other words, the population
correlation matrix is an identity matrix; each variable
correlates perfectly with itself (r = 1) but has no
correlation with the other variables (r = 0).
 Correlation matrix. A correlation matrix is a lower
triangle matrix showing the simple correlations, r,
between all possible pairs of variables included in the
analysis. The diagonal elements, which are all 1, are
usually omitted.
 Communality. Communality is the amount of
variance a variable shares with all the other variables
being considered. This is also the proportion of
variance explained by the common factors.
 Eigenvalue. The eigenvalue represents the total
variance explained by each factor.
 Factor loadings. Factor loadings are simple
correlations between the variables and the factors.
 Factor loading plot. A factor loading plot is a plot
of the original variables using the factor loadings as
coordinates.
 Factor matrix. A factor matrix contains the factor
loadings of all the variables on all the factors
extracted.
 Factor scores. Factor scores are composite scores
estimated for each respondent on the derived factors.
 Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy. The Kaiser-Meyer-Olkin (KMO) measure of
sampling adequacy is an index used to examine the
appropriateness of factor analysis. High values (between
0.5 and 1.0) indicate factor analysis is appropriate. Values
below 0.5 imply that factor analysis may not be
appropriate.
 Percentage of variance. The percentage of the total
variance attributed to each factor.
 Residuals. Residuals are the differences between the observed
correlations, as given in the input correlation matrix, and
the reproduced correlations, as estimated from the factor
matrix.
 Scree plot. A scree plot is a plot of the Eigenvalues
against the number of factors in order of extraction.
A Factor Analysis Example
A sample of 30 respondents was interviewed using
mall intercept interviewing. The respondents were
asked to indicate their degree of agreement with
the following statements using a seven-point scale (1
= strongly disagree, 7 = strongly agree).
 V1 = It is important to buy a toothpaste that prevents
cavities
 V2 = I like a toothpaste that gives shiny teeth
 V3 = A toothpaste should strengthen your gums
 V4 = I prefer a toothpaste that freshens breath
 V5 = Prevention of tooth decay is not an important
benefit offered by a toothpaste
 V6 = The most important consideration in buying a
toothpaste is attractive teeth
Conducting Factor Analysis
RESPONDENT NUMBER V1 V2 V3 V4 V5 V6
1 7.00 3.00 6.00 4.00 2.00 4.00
2 1.00 3.00 2.00 4.00 5.00 4.00
3 6.00 2.00 7.00 4.00 1.00 3.00
4 4.00 5.00 4.00 6.00 2.00 5.00
5 1.00 2.00 2.00 3.00 6.00 2.00
6 6.00 3.00 6.00 4.00 2.00 4.00
7 5.00 3.00 6.00 3.00 4.00 3.00
8 6.00 4.00 7.00 4.00 1.00 4.00
9 3.00 4.00 2.00 3.00 6.00 3.00
10 2.00 6.00 2.00 6.00 7.00 6.00
11 6.00 4.00 7.00 3.00 2.00 3.00
12 2.00 3.00 1.00 4.00 5.00 4.00
13 7.00 2.00 6.00 4.00 1.00 3.00
14 4.00 6.00 4.00 5.00 3.00 6.00
15 1.00 3.00 2.00 2.00 6.00 4.00
16 6.00 4.00 6.00 3.00 3.00 4.00
17 5.00 3.00 6.00 3.00 3.00 4.00
18 7.00 3.00 7.00 4.00 1.00 4.00
19 2.00 4.00 3.00 3.00 6.00 3.00
20 3.00 5.00 3.00 6.00 4.00 6.00
21 1.00 3.00 2.00 3.00 5.00 3.00
22 5.00 4.00 5.00 4.00 2.00 4.00
23 2.00 2.00 1.00 5.00 4.00 4.00
24 4.00 6.00 4.00 6.00 4.00 7.00
25 6.00 5.00 4.00 2.00 1.00 4.00
26 3.00 5.00 4.00 6.00 4.00 7.00
27 4.00 4.00 7.00 2.00 2.00 5.00
28 3.00 7.00 2.00 6.00 4.00 3.00
29 4.00 6.00 3.00 7.00 2.00 7.00
30 2.00 3.00 2.00 4.00 7.00 2.00
Conducting Factor Analysis
1. Problem formulation
2. Construction of the correlation matrix
3. Method of factor analysis
4. Determination of number of factors
5. Rotation of factors
6. Interpretation of factors
7. Calculation of factor scores, or selection of surrogate variables
8. Determination of model fit
Conducting Factor Analysis
Formulate the Problem
 The objectives of factor analysis should be identified.
 The variables to be included in the factor analysis
should be specified based on past research, theory,
and judgment of the researcher. It is important that
the variables be appropriately measured on an
interval or ratio scale.
 An appropriate sample size should be used. As a
rough guideline, there should be at least four or five
times as many observations (sample size) as there
are variables.
Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 1.000
V2 -0.053 1.000
V3 0.873 -0.155 1.000
V4 -0.086 0.572 -0.248 1.000
V5 -0.858 0.020 -0.778 -0.007 1.000
V6 0.004 0.640 -0.018 0.640 -0.136 1.000
 The analytical process is based on a matrix of
correlations between the variables.
 Bartlett's test of sphericity can be used to test the
null hypothesis that the variables are uncorrelated in
the population: in other words, the population
correlation matrix is an identity matrix. If this
hypothesis cannot be rejected, then the
appropriateness of factor analysis should be
questioned.
 Another useful statistic is the Kaiser-Meyer-Olkin
(KMO) measure of sampling adequacy. Small values
of the KMO statistic indicate that the correlations
between pairs of variables cannot be explained by
other variables and that factor analysis may not be
appropriate.
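As a rough sketch, Bartlett's test of sphericity can be computed directly from the correlation matrix above with n = 30 respondents. The formula used here is the usual chi-square approximation, -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. One assumption is flagged in the code: the V1-V2 correlation is entered as -0.053, the value consistent with the reproduced correlation and residual reported for that pair in the results.

```python
import numpy as np

# Lower triangle of the correlation matrix for V1..V6.
# Assumption: the V1-V2 entry is -0.053, the value consistent with the
# reproduced correlation and residual reported in the results tables.
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
p = len(lower)
R = np.eye(p)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

n = 30  # respondents in the example

# Bartlett's statistic: -(n - 1 - (2p + 5) / 6) * ln|R|, with p(p - 1) / 2 df.
sign, logdet = np.linalg.slogdet(R)
chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
df = p * (p - 1) // 2

print(f"chi-square = {chi_square:.1f}, df = {df}")
```

Because the input correlations are rounded to three decimals, the statistic should land close to, but not exactly on, the value a statistics package reports from the raw data.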
Conducting Factor Analysis
Construct the Correlation Matrix
 In principal components analysis, the total variance in
the data is considered. The diagonal of the correlation
matrix consists of unities, and full variance is brought into
the factor matrix. Principal components analysis is
recommended when the primary concern is to determine
the minimum number of factors that will account for
maximum variance in the data for use in subsequent
multivariate analysis. The factors are called principal
components.
 In common factor analysis, the factors are estimated
based only on the common variance. Communalities are
inserted in the diagonal of the correlation matrix. This
method is appropriate when the primary concern is to
identify the underlying dimensions and the common
variance is of interest. This method is also known as
principal axis factoring.
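A minimal sketch of principal components extraction, assuming the example's correlation matrix as input (with the V1-V2 entry taken as -0.053, the value consistent with the reproduced correlation and residual reported in the results). The eigenvalues of the correlation matrix give the variance attributed to each component, and the loadings are the eigenvectors scaled by the square roots of their eigenvalues. The choice of two retained factors follows the example; the sign of each loading column is arbitrary.

```python
import numpy as np

# Lower triangle of the correlation matrix for V1..V6.
# Assumption: V1-V2 is -0.053 (see the lead-in).
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
p = len(lower)
R = np.eye(p)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

# Eigen-decomposition of the symmetric matrix, sorted largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Unities on the diagonal mean the total variance equals the number of variables.
pct_variance = 100.0 * eigenvalues / p

# Loadings of the variables on the first m components.
m = 2
loadings = eigenvectors[:, :m] * np.sqrt(eigenvalues[:m])
communalities = (loadings ** 2).sum(axis=1)

print(np.round(eigenvalues, 3))
print(np.round(pct_variance, 2))
print(np.round(communalities, 3))
```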
Conducting Factor Analysis
Determine the Method of Factor Analysis
Results of Principal Components Analysis
Communalities
Variables Initial Extraction
V1 1.000 0.926
V2 1.000 0.723
V3 1.000 0.894
V4 1.000 0.739
V5 1.000 0.878
V6 1.000 0.790
Initial Eigenvalues
Factor Eigenvalue % of variance Cumulat. %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
3 0.442 7.360 89.848
4 0.341 5.688 95.536
5 0.183 3.044 98.580
6 0.085 1.420 100.000
Results of Principal Components Analysis
Extraction Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulat. %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
Factor Matrix
Variables Factor 1 Factor 2
V1 0.928 0.253
V2 -0.301 0.795
V3 0.936 0.131
V4 -0.342 0.789
V5 -0.869 -0.351
V6 -0.177 0.871
Rotation Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulat. %
1 2.688 44.802 44.802
2 2.261 37.687 82.488
Results of Principal Components Analysis
Rotated Factor Matrix
Variables Factor 1 Factor 2
V1 0.962 -0.027
V2 -0.057 0.848
V3 0.934 -0.146
V4 -0.098 0.854
V5 -0.933 -0.084
V6 0.083 0.885
Factor Score Coefficient Matrix
Variables Factor 1 Factor 2
V1 0.358 0.011
V2 -0.001 0.375
V3 0.345 -0.043
V4 -0.017 0.377
V5 -0.350 -0.059
V6 0.052 0.395
Reproduced Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 0.926 0.024 -0.029 0.031 0.038 -0.053
V2 -0.078 0.723 0.022 -0.158 0.038 -0.105
V3 0.902 -0.177 0.894 -0.031 0.081 0.033
V4 -0.117 0.730 -0.217 0.739 -0.027 -0.107
V5 -0.895 -0.018 -0.859 0.020 0.878 0.016
V6 0.057 0.746 -0.051 0.748 -0.152 0.790
The lower left triangle contains the reproduced
correlation matrix; the diagonal, the communalities;
the upper right triangle, the residuals between the
observed correlations and the reproduced
correlations.
 A Priori Determination. Sometimes, because of
prior knowledge, the researcher knows how many
factors to expect and thus can specify the number of
factors to be extracted beforehand.
 Determination Based on Eigenvalues. In this
approach, only factors with Eigenvalues greater than
1.0 are retained. An Eigenvalue represents the
amount of variance associated with the factor.
Hence, only factors with a variance greater than 1.0
are included. Factors with variance less than 1.0 are
no better than a single variable, since, due to
standardization, each variable has a variance of 1.0.
If the number of variables is less than 20, this
approach will result in a conservative number of
factors.
Conducting Factor Analysis
Determine the Number of Factors
 Determination Based on Scree Plot. A scree
plot is a plot of the Eigenvalues against the number
of factors in order of extraction. Experimental
evidence indicates that the point at which the scree
begins denotes the true number of factors.
Generally, the number of factors determined by a
scree plot will be one or a few more than that
determined by the Eigenvalue criterion.
 Determination Based on Percentage of
Variance. In this approach the number of factors
extracted is determined so that the cumulative
percentage of variance extracted by the factors
reaches a satisfactory level. It is recommended that
the factors extracted should account for at least 60%
of the variance.
Conducting Factor Analysis
Determine the Number of Factors
[Figure: Scree Plot. Eigenvalue (vertical axis, 0.0 to 3.0) plotted against Component Number (horizontal axis, 1 to 6).]
 Determination Based on Split-Half Reliability.
The sample is split in half and factor analysis is
performed on each half. Only factors with high
correspondence of factor loadings across the two
subsamples are retained.
 Determination Based on Significance Tests.
It is possible to determine the statistical significance
of the separate Eigenvalues and retain only those
factors that are statistically significant. A drawback is
that with large samples (size greater than 200),
many factors are likely to be statistically significant,
although from a practical viewpoint many of these
account for only a small proportion of the total
variance.
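The eigenvalue and percentage-of-variance criteria can be sketched in a few lines using the initial eigenvalues reported in the principal components results. The 60% threshold is the guideline quoted above; the variable names are illustrative.

```python
# Initial eigenvalues from the principal components results.
eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 0.085]
total = sum(eigenvalues)

# Eigenvalue criterion: retain factors with eigenvalues greater than 1.0.
n_by_eigenvalue = sum(1 for ev in eigenvalues if ev > 1.0)

# Percentage-of-variance criterion: retain factors until the cumulative
# percentage of variance reaches the chosen threshold.
threshold = 60.0
cumulative = 0.0
n_by_variance = 0
for ev in eigenvalues:
    cumulative += 100.0 * ev / total
    n_by_variance += 1
    if cumulative >= threshold:
        break

print(n_by_eigenvalue, n_by_variance, round(cumulative, 1))
```

For this example both criteria point to two factors.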
Conducting Factor Analysis
Determine the Number of Factors
 Although the initial or unrotated factor matrix
indicates the relationship between the factors and
individual variables, it seldom results in factors that
can be interpreted, because the factors are
correlated with many variables. Therefore, through
rotation the factor matrix is transformed into a
simpler one that is easier to interpret.
 In rotating the factors, we would like each factor to
have nonzero, or significant, loadings or coefficients
for only some of the variables. Likewise, we would
like each variable to have nonzero or significant
loadings with only a few factors, if possible with only
one.
 The rotation is called orthogonal rotation if the
axes are maintained at right angles.
Conducting Factor Analysis
Rotate Factors
 The most commonly used method for rotation is the
varimax procedure. This is an orthogonal method
of rotation that minimizes the number of variables
with high loadings on a factor, thereby enhancing the
interpretability of the factors. Orthogonal rotation
results in factors that are uncorrelated.
 The rotation is called oblique rotation when the
axes are not maintained at right angles, and the
factors are correlated. Sometimes, allowing for
correlations among factors can simplify the factor
pattern matrix. Oblique rotation should be used
when factors in the population are likely to be
strongly correlated.
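A sketch of varimax rotation applied to the unrotated factor matrix from the principal components results. This is one common iterative formulation based on a singular value decomposition; it omits Kaiser normalization, so the loadings may differ slightly from those printed by a statistics package, and the sign or order of the rotated columns is arbitrary. The function name `varimax` and its parameters are illustrative.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix to increase the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Unrotated factor matrix (V1..V6 on Factor 1 and Factor 2).
unrotated = np.array([
    [0.928, 0.253],
    [-0.301, 0.795],
    [0.936, 0.131],
    [-0.342, 0.789],
    [-0.869, -0.351],
    [-0.177, 0.871],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the split of variance between the factors moves.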
Conducting Factor Analysis
Rotate Factors
 A factor can then be interpreted in terms of the
variables that load high on it.
 Another useful aid in interpretation is to plot the
variables, using the factor loadings as coordinates.
Variables at the end of an axis are those that have
high loadings on only that factor, and hence describe
the factor.
Conducting Factor Analysis
Interpret Factors
Factor Loading Plot
[Figure: Component Plot in Rotated Space. Variables V1 to V6 plotted with Component 1 on the horizontal axis and Component 2 on the vertical axis, both running from -1.0 to 1.0.]
Rotated Component Matrix
Variable Component 1 Component 2
V1 0.962 -0.0266
V2 -0.0572 0.848
V3 0.934 -0.146
V4 -0.0983 0.854
V5 -0.933 -0.0840
V6 0.0834 0.885
The factor scores for the i th factor may be estimated
as follows:
F_i = W_{i1}X_1 + W_{i2}X_2 + W_{i3}X_3 + . . . + W_{ik}X_k
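A small worked sketch of this weighted sum, using the factor score coefficients from the principal components results. The standardized responses in `z` are hypothetical values for a single respondent, included only to show the arithmetic.

```python
# Factor score coefficients for V1..V6 (principal components results).
W = {
    "Factor 1": [0.358, -0.001, 0.345, -0.017, -0.350, 0.052],
    "Factor 2": [0.011, 0.375, -0.043, 0.377, -0.059, 0.395],
}

# Hypothetical standardized responses (z-scores) of one respondent on V1..V6.
z = [1.0, -0.5, 1.2, 0.0, -1.0, 0.3]

# Each factor score is the coefficient-weighted sum of the standardized variables.
scores = {
    name: sum(w * x for w, x in zip(weights, z))
    for name, weights in W.items()
}

for name, value in scores.items():
    print(name, round(value, 4))
```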
Conducting Factor Analysis
Calculate Factor Scores
 By examining the factor matrix, one could select for
each factor the variable with the highest loading on
that factor. That variable could then be used as a
surrogate variable for the associated factor.
 However, the choice is not as easy if two or more
variables have similarly high loadings. In such a
case, the choice between these variables should be
based on theoretical and measurement
considerations.
Conducting Factor Analysis
Select Surrogate Variables
 The correlations between the variables can be
deduced or reproduced from the estimated
correlations between the variables and the factors.
 The differences between the observed correlations
(as given in the input correlation matrix) and the
reproduced correlations (as estimated from the factor
matrix) can be examined to determine model fit.
These differences are called residuals.
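The model-fit check can be sketched as follows, assuming the unrotated factor matrix from the principal components results and the observed correlation matrix of the example (with the V1-V2 entry taken as -0.053, the value consistent with the reproduced correlation of -0.078 and residual of 0.024 reported for that pair). The reproduced correlations are the products of the loadings; the diagonal of the product holds the communalities.

```python
import numpy as np

# Unrotated factor matrix (V1..V6 on Factor 1 and Factor 2).
loadings = np.array([
    [0.928, 0.253],
    [-0.301, 0.795],
    [0.936, 0.131],
    [-0.342, 0.789],
    [-0.869, -0.351],
    [-0.177, 0.871],
])

# Observed correlations, lower triangle. Assumption: V1-V2 is -0.053.
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
p = len(lower)
observed = np.eye(p)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        observed[i, j] = observed[j, i] = r

# Reproduced correlations; the diagonal contains the communalities.
reproduced = loadings @ loadings.T
communalities = np.diag(reproduced).copy()

# Residuals: observed minus reproduced, off-diagonal entries only.
residuals = observed - reproduced
np.fill_diagonal(residuals, 0.0)

print(np.round(communalities, 3))
print(np.round(residuals, 3))
```

Small residuals throughout suggest that the retained factors reproduce the observed correlations well.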
Conducting Factor Analysis
Determine the Model Fit
Results of Common Factor Analysis
Communalities
Variables Initial Extraction
V1 0.859 0.928
V2 0.480 0.562
V3 0.814 0.836
V4 0.543 0.600
V5 0.763 0.789
V6 0.587 0.723
Bartlett's test of sphericity
• Approx. Chi-Square = 111.314
• df = 15
• Significance = 0.00000
• Kaiser-Meyer-Olkin measure of
sampling adequacy = 0.660
Initial Eigenvalues
Factor Eigenvalue % of variance Cumulat. %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
3 0.442 7.360 89.848
4 0.341 5.688 95.536
5 0.183 3.044 98.580
6 0.085 1.420 100.000
Results of Common Factor Analysis
Extraction Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulat. %
1 2.570 42.837 42.837
2 1.868 31.126 73.964
Factor Matrix
Variables Factor 1 Factor 2
V1 0.949 0.168
V2 -0.206 0.720
V3 0.914 0.038
V4 -0.246 0.734
V5 -0.850 -0.259
V6 -0.101 0.844
Rotation Sums of Squared Loadings
Factor Eigenvalue % of variance Cumulat. %
1 2.541 42.343 42.343
2 1.897 31.621 73.964
Rotated Factor Matrix
Variables Factor 1 Factor 2
V1 0.963 -0.030
V2 -0.054 0.747
V3 0.902 -0.150
V4 -0.090 0.769
V5 -0.885 -0.079
V6 0.075 0.847
Factor Score Coefficient Matrix
Variables Factor 1 Factor 2
V1 0.628 0.101
V2 -0.024 0.253
V3 0.217 -0.169
V4 -0.023 0.271
V5 -0.166 -0.059
V6 0.083 0.500
Results of Common Factor Analysis
Reproduced Correlation Matrix
Variables V1 V2 V3 V4 V5 V6
V1 0.928 0.022 -0.000 0.024 -0.008 -0.042
V2 -0.075 0.562 0.006 -0.008 0.031 0.012
V3 0.873 -0.161 0.836 -0.005 0.008 0.042
V4 -0.110 0.580 -0.197 0.600 -0.025 -0.004
V5 -0.850 -0.012 -0.786 0.019 0.789 0.003
V6 0.046 0.629 -0.060 0.645 -0.133 0.723
The lower left triangle contains the reproduced
correlation matrix; the diagonal, the communalities;
the upper right triangle, the residuals between the
observed correlations and the reproduced correlations.
SPSS Windows
To select this procedure using SPSS for Windows, click:
Analyze>Data Reduction>Factor …
Factor analysis
Exploratory Factor Analysis
Exploratory Factor Analysis Defined
Exploratory factor analysis is an
interdependence technique whose primary
purpose is to define the underlying structure
among the variables in the analysis.
What is Exploratory Factor Analysis?
Exploratory Factor Analysis . . .
• Examines the interrelationships among a large
number of variables and then attempts to explain
them in terms of their common underlying
dimensions.
• These common underlying dimensions are referred
to as factors.
• A summarization and data reduction technique that
does not have independent and dependent
variables, but is an interdependence technique in
which all variables are considered simultaneously.
Correlation Matrix for Store Image Elements
V1 V2 V3 V4 V5 V6 V7 V8 V9
V1 Price Level 1.00
V2 Store Personnel .427 1.00
V3 Return Policy .302 .771 1.00
V4 Product Availability .470 .497 .427 1.00
V5 Product Quality .765 .406 .307 .472 1.00
V6 Assortment Depth .281 .445 .423 .713 .325 1.00
V7 Assortment Width .354 .490 .471 .719 .378 .724 1.00
V8 In-Store Service .242 .719 .733 .428 .240 .311 .435 1.00
V9 Store Atmosphere .372 .737 .774 .479 .326 .429 .466 .710 1.00
Correlation Matrix of Variables After
Grouping Using Factor Analysis
Shaded areas represent variables likely to be grouped together by factor analysis.
V3 V8 V9 V2 V6 V7 V4 V1 V5
V3 Return Policy 1.00
V8 In-store Service .733 1.00
V9 Store Atmosphere .774 .710 1.00
V2 Store Personnel .741 .719 .787 1.00
V6 Assortment Depth .423 .311 .429 .445 1.00
V7 Assortment Width .471 .435 .468 .490 .724 1.00
V4 Product Availability .427 .428 .479 .497 .713 .719 1.00
V1 Price Level .302 .242 .372 .427 .281 .354 .470 1.00
V5 Product Quality .307 .240 .326 .406 .325 .378 .472 .765 1.00
Application of Factor Analysis to a Fast-Food Restaurant
Variables → Factors
Waiting Time, Cleanliness, Friendly Employees → Service Quality
Taste, Temperature, Freshness → Food Quality
Factor Analysis Decision Process
Stage 1: Objectives of Factor Analysis
Stage 2: Designing a Factor Analysis
Stage 3: Assumptions in Factor Analysis
Stage 4: Deriving Factors and Assessing Overall Fit
Stage 5: Interpreting the Factors
Stage 6: Validation of Factor Analysis
Stage 7: Additional uses of Factor Analysis Results
Stage 1: Objectives of Factor Analysis
1. Is the objective exploratory or confirmatory?
2. Specify the unit of analysis.
3. Data summarization and/or reduction?
4. Using factor analysis with other techniques.
Factor Analysis Outcomes
1. Data summarization = derives underlying
dimensions that, when interpreted and
understood, describe the data in a much
smaller number of concepts than the original
individual variables.
2. Data reduction = extends the process of
data summarization by deriving an empirical
value (factor score or summated scale) for
each dimension (factor) and then substituting
this value for the original values.
Types of Factor Analysis
1. Exploratory Factor Analysis (EFA) = used to
discover the factor structure of a
construct and examine its reliability. It is
data driven.
2. Confirmatory Factor Analysis (CFA) = used to
confirm the fit of the hypothesized
factor structure to the observed (sample)
data. It is theory driven.
Stage 2: Designing a Factor Analysis
Three Basic Decisions:
1. Calculation of input data – R vs. Q
analysis.
2. Design of study in terms of number of
variables, measurement properties of
variables, and the type of variables.
3. Sample size necessary.
Rules of Thumb 1
Factor Analysis Design
o Factor analysis is performed most often only on metric
variables, although specialized methods exist for the use of
dummy variables. A small number of “dummy variables” can
be included in a set of metric variables that are factor
analyzed.
o If a study is being designed to reveal factor structure, strive
to have at least five variables for each proposed factor.
o For sample size:
• the sample must have more observations than variables.
• the minimum absolute sample size should be 50
observations.
o Maximize the number of observations per variable, with a
minimum of five and hopefully at least ten observations per
variable.
Stage 3: Assumptions in Factor Analysis
Assumptions
• Multicollinearity
 Assessed using MSA (measure of sampling
adequacy).
• Homogeneity of sample factor solutions
The MSA is measured by the Kaiser-Meyer-Olkin (KMO)
statistic. As a measure of sampling adequacy, the KMO predicts whether
data are likely to factor well, based on correlations and partial
correlations. KMO can be used to identify which variables to drop
from the factor analysis because they lack multicollinearity.
There is a KMO statistic for each individual variable, and an
overall KMO statistic computed across all variables. KMO varies from 0 to 1.0.
Overall KMO should be .50 or higher to proceed with factor
analysis. If it is not, remove variables one at a time, starting with the
lowest individual KMO value, until the overall KMO rises
above .50 and each individual variable's KMO is above .50.
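A sketch of the overall and per-variable KMO computation, assuming the standard definition based on squared correlations and squared partial correlations (the latter obtained from the inverse of the correlation matrix). The input is the toothpaste correlation matrix from the earlier example, with the V1-V2 entry taken as -0.053, the value consistent with the reproduced correlation and residual reported for that pair.

```python
import numpy as np

# Lower triangle of the toothpaste correlation matrix (V1..V6).
# Assumption: V1-V2 is -0.053 (see the lead-in).
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
p = len(lower)
R = np.eye(p)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

# Partial correlations between each pair, controlling for all other variables.
inv = np.linalg.inv(R)
d = np.sqrt(np.diag(inv))
partial = -inv / np.outer(d, d)

# Keep only the off-diagonal squared terms.
off_diagonal = ~np.eye(p, dtype=bool)
r2 = np.where(off_diagonal, R ** 2, 0.0)
q2 = np.where(off_diagonal, partial ** 2, 0.0)

# Overall KMO pools all pairs; individual KMO pools the pairs of one variable.
kmo_overall = r2.sum() / (r2.sum() + q2.sum())
kmo_individual = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))

print(round(kmo_overall, 3))
print(np.round(kmo_individual, 3))
```

Note that the overall statistic is computed from the pooled sums rather than by adding up the individual values.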
Rules of Thumb 2
Testing Assumptions of Factor Analysis
• There must be a strong conceptual foundation to
support the assumption that a structure does exist
before the factor analysis is performed.
• A statistically significant Bartlett’s test of sphericity
(sig. < .05) indicates that sufficient correlations exist
among the variables to proceed.
• Measure of Sampling Adequacy (MSA) values must
exceed .50 for both the overall test and each
individual variable. Variables with values less than
.50 should be omitted from the factor analysis one at
a time, with the smallest one being omitted each time.
Stage 4: Deriving Factors and Assessing Overall Fit
• Selecting the factor extraction method
– common vs. component analysis.
• Determining the number of factors to
represent the data.
Extraction Decisions
o Which method?
• Principal Components Analysis
• Common Factor Analysis
o How to rotate?
• Orthogonal or Oblique rotation
Extraction Method Determines the Types of Variance Carried into the Factor Matrix
Diagonal Value: Unity (1). Variance extracted: total variance (common, specific, and error).
Diagonal Value: Communality. Variance extracted: common variance only; specific and error variance are not used.
Principal Components vs. Common?
Two Criteria . . .
• Objectives of the factor analysis.
• Amount of prior knowledge about
the variance in the variables.
Number of Factors?
• A Priori Criterion
• Latent Root Criterion
• Percentage of Variance
• Scree Test Criterion
[Figure: Eigenvalue Plot for Scree Test Criterion]
Rules of Thumb 3
Choosing Factor Models and Number of Factors
• Although both component and common factor analysis models yield similar
results in common research settings (30 or more variables or communalities of
.60 for most variables):
 the component analysis model is most appropriate when data reduction is
paramount.
 the common factor model is best in well-specified theoretical applications.
• Any decision on the number of factors to be retained should be based on several
considerations:
 use of several stopping criteria to determine the initial number of factors to
retain.
 Factors With Eigenvalues greater than 1.0.
 A pre-determined number of factors based on research objectives and/or
prior research.
 Enough factors to meet a specified percentage of variance explained, usually
60% or higher.
 Factors shown by the scree test to have substantial amounts of common
variance (i.e., factors before inflection point).
 More factors when there is heterogeneity among sample subgroups.
• Consideration of several alternative solutions (one more and one less factor than
the initial solution) to ensure the best structure is identified.
Processes of Factor Interpretation
• Estimate the Factor Matrix
• Factor Rotation
• Factor Interpretation
• Respecification of factor model, if needed, may
involve . . .
o Deletion of variables from analysis
o Desire to use a different rotational approach
o Need to extract a different number of factors
o Desire to change method of extraction
Rotation of Factors
Factor rotation = the reference axes of the factors
are turned about the origin until some other position
has been reached. Unrotated factor solutions
extract factors in order of how much variance they
account for, with each subsequent factor accounting
for less variance. The ultimate effect of rotating the
factor matrix is to redistribute the variance from earlier
factors to later ones to achieve a simpler, theoretically
more meaningful factor pattern.
Two Rotational Approaches
1. Orthogonal = axes are maintained
at 90 degrees.
2. Oblique = axes are not maintained
at 90 degrees.
[Figure: Orthogonal Factor Rotation. Variables V1 to V5 plotted against Unrotated Factors I and II, with axes running from -1.0 to +1.0; Rotated Factors I and II are overlaid, still at right angles.]
[Figure: Oblique Factor Rotation. Variables V1 to V5 plotted against Unrotated Factors I and II, with axes running from -1.0 to +1.0; the orthogonally rotated and obliquely rotated Factors I and II are overlaid for comparison.]
Orthogonal Rotation Methods
• Quartimax (simplify rows)
• Varimax (simplify columns)
• Equimax (combination)
Rules of Thumb 4
Choosing Factor Rotation Methods
• Orthogonal rotation methods . . .
o are the most widely used rotational methods.
o are the preferred method when the research
goal is data reduction to either a smaller number
of variables or a set of uncorrelated measures for
subsequent use in other multivariate techniques.
• Oblique rotation methods . . .
o are best suited to the goal of obtaining several
theoretically meaningful factors or constructs
because, realistically, very few constructs in the
“real world” are uncorrelated.
Which Factor Loadings Are Significant?
• Customary Criteria = Practical Significance.
• Sample Size & Statistical Significance.
• Number of Factors (more factors call for larger loadings) and/or Variables (more variables allow smaller loadings).
Guidelines for Identifying Significant Factor Loadings Based on Sample Size
Factor Loading   Sample Size Needed for Significance*
.30 350
.35 250
.40 200
.45 150
.50 120
.55 100
.60 85
.65 70
.70 60
.75 50
*Significance is based on a .05 significance level (α), a power level of 80 percent, and
standard errors assumed to be twice those of conventional correlation coefficients.
Rules of Thumb 5
Assessing Factor Loadings
• While factor loadings of ±.30 to ±.40 are minimally
acceptable, values greater than ±.50 are considered
necessary for practical significance.
• To be considered significant:
o A smaller loading is needed given either a larger sample
size, or a larger number of variables being analyzed.
o A larger loading is needed given a factor solution with a
larger number of factors, especially in evaluating the
loadings on later factors.
• Statistical tests of significance for factor loadings are
generally very conservative and should be considered
only as starting points needed for including a variable for
further consideration.
Stage 5: Interpreting the Factors
Interpreting a Factor Matrix:
1. Examine the factor matrix of
loadings.
2. Identify the highest loading across
all factors for each variable.
3. Assess communalities of the
variables.
4. Label the factors.
Rules of Thumb 6
Interpreting The Factors
 An optimal structure exists when all variables have high
loadings only on a single factor.
 Variables that cross-load (load highly on two or more factors)
are usually deleted unless theoretically justified or the
objective is strictly data reduction.
 Variables should generally have communalities of greater
than .50 to be retained in the analysis.
 Respecification of a factor analysis can include options such
as:
o deleting a variable(s),
o changing rotation methods, and/or
o increasing or decreasing the number of factors.
Stage 6: Validation of Factor Analysis
• Confirmatory Perspective.
• Assessing Factor Structure Stability.
• Detecting Influential Observations.
Stage 7: Additional Uses of Factor
Analysis Results
• Selecting Surrogate Variables
• Creating Summated Scales
• Computing Factor Scores
Rules of Thumb 7
Summated Scales
• A summated scale is only as good as the items used to
represent the construct. While it may pass all empirical
tests, it is useless without theoretical justification.
• Never create a summated scale without first assessing its
unidimensionality with exploratory or confirmatory factor
analysis.
• Once a scale is deemed unidimensional, its reliability score,
as measured by Cronbach’s alpha:
o should exceed a threshold of .70, although a .60 level
can be used in exploratory research.
o the threshold should be raised as the number of items
increases, especially as the number of items
approaches 10 or more.
• With reliability established, validity should be assessed in
terms of:
o convergent validity = scale correlates with other like
scales.
o discriminant validity = scale is sufficiently different
from other related scales.
o nomological validity = scale “predicts” as theoretically
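A minimal sketch of Cronbach's alpha for a summated scale, using the usual formula k/(k - 1) × (1 - sum of item variances / variance of the total score). For illustration, the items are V1, V3, and reverse-coded V5 (8 minus the response) for the first six respondents of the toothpaste example; treating these three items as one scale is an assumption made here, not a step taken in the slides.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)
    # Summated scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# First six respondents of the toothpaste example.
v1 = [7, 1, 6, 4, 1, 6]
v3 = [6, 2, 7, 4, 2, 6]
v5 = [2, 5, 1, 2, 6, 2]

# V5 is worded negatively, so reverse-code it on the seven-point scale.
v5_reversed = [8 - x for x in v5]

alpha = cronbach_alpha([v1, v3, v5_reversed])
print(round(alpha, 3))
```

The result can then be compared against the .70 threshold (or .60 in exploratory research) mentioned above.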
Rules of Thumb 8
Representing Factor Analysis In Other Analyses
• The single surrogate variable:
 Advantages: simple to administer and interpret.
 Disadvantages:
1) does not represent all “facets” of a factor
2) prone to measurement error.
• Factor scores:
 Advantages:
1) represents all variables loading on the factor,
2) best method for complete data reduction.
3) Are by default orthogonal and can avoid
complications caused by multicollinearity.
 Disadvantages:
1) interpretation more difficult since all variables
contribute through loadings
2) Difficult to replicate across studies.
Rules of Thumb 8 Continued . . .
Representing Factor Analysis In Other Analyses
• Summated scales:
o Advantages:
1) compromise between the surrogate variable and factor
score options.
2) reduce measurement error.
3) represent multiple facets of a concept.
4) easily replicated across studies.
o Disadvantages:
1) include only the variables that load highly on the
factor and exclude those having little or marginal
impact.
2) not necessarily orthogonal.
3) require extensive analysis of reliability and validity
issues.
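
To make the three options concrete, the sketch below builds a surrogate variable, a summated scale, and a factor score from the same small data set. It is a minimal illustration only: the ratings, loadings, and factor score coefficients are hypothetical, and a real factor score would weight every variable in the analysis, not just the three shown.

```python
from statistics import mean, pstdev

# Hypothetical 7-point ratings; columns are three items loading on one factor.
ratings = [
    [6, 7, 5],
    [2, 3, 3],
    [5, 5, 6],
    [3, 2, 2],
    [7, 6, 6],
]
# Hypothetical loadings and factor score coefficients (illustrative only).
loadings = [0.93, 0.89, 0.74]
weights = [0.40, 0.37, 0.30]


def standardize(col):
    """Convert a column of raw scores to z-scores."""
    m, s = mean(col), pstdev(col)
    return [(x - m) / s for x in col]


columns = list(zip(*ratings))

# 1) Surrogate variable: the single item with the highest absolute loading.
best = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
surrogate = list(columns[best])

# 2) Summated scale: unweighted average of the high-loading items.
summated = [mean(row) for row in ratings]

# 3) Factor score: weighted sum of the standardized items.
z = [standardize(col) for col in columns]
factor_scores = [
    sum(w * z[j][i] for j, w in enumerate(weights))
    for i in range(len(ratings))
]

print(surrogate, summated, [round(f, 2) for f in factor_scores])
```

The summated scale needs nothing from the sample beyond the item list, which is why it replicates easily across studies; the factor score depends on sample-specific weights.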
Description of HBAT Primary Database Variables

Variable  Description                                         Variable Type
Data Warehouse Classification Variables
X1        Customer Type                                       nonmetric
X2        Industry Type                                       nonmetric
X3        Firm Size                                           nonmetric
X4        Region                                              nonmetric
X5        Distribution System                                 nonmetric
Performance Perceptions Variables
X6        Product Quality                                     metric
X7        E-Commerce Activities/Website                       metric
X8        Technical Support                                   metric
X9        Complaint Resolution                                metric
X10       Advertising                                         metric
X11       Product Line                                        metric
X12       Salesforce Image                                    metric
X13       Competitive Pricing                                 metric
X14       Warranty & Claims                                   metric
X15       New Products                                        metric
X16       Ordering & Billing                                  metric
X17       Price Flexibility                                   metric
X18       Delivery Speed                                      metric
Outcome/Relationship Measures
X19       Satisfaction                                        metric
X20       Likelihood of Recommendation                        metric
X21       Likelihood of Future Purchase                       metric
X22       Current Purchase/Usage Level                        metric
X23       Consider Strategic Alliance/Partnership in Future   nonmetric
Rotated Component Matrix
“Reduced Set” of HBAT Perceptions Variables

                                          Component
Variable                             1       2       3       4   Communality
X9 – Complaint Resolution         .933                                  .890
X18 – Delivery Speed              .931                                  .894
X16 – Order & Billing             .886                                  .806
X12 – Salesforce Image                    .898                          .860
X7 – E-Commerce Activities                .868                          .780
X10 – Advertising                         .743                          .585
X8 – Technical Support                            .940                  .894
X14 – Warranty & Claims                           .933                  .891
X6 – Product Quality                                      .892          .798
X13 – Competitive Pricing                                -.730          .661
Sum of Squares                   2.589   2.216   1.846   1.406         8.057
Percentage of Trace             25.893  22.161  18.457  14.061        80.572
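
The summary rows of the rotated component matrix follow from simple arithmetic. In a component analysis of a correlation matrix the trace equals the number of variables (10 here), so each Percentage of Trace figure is the component's sum of squared loadings divided by 10. The sketch below checks this with the reported values; the variable names are illustrative, and small differences from the slide arise because the slide was computed from unrounded loadings.

```python
# Sum of squared loadings per component, as reported in the table.
sum_of_squares = [2.589, 2.216, 1.846, 1.406]
n_variables = 10  # ten variables in the reduced set, so the trace is 10

# Percentage of trace: share of total variance carried by each component.
pct_of_trace = [100 * ss / n_variables for ss in sum_of_squares]
total_pct = sum(pct_of_trace)

# A communality is a variable's squared loadings summed across components.
# For X9 only the primary loading (.933) is displayed; its square already
# accounts for most of the reported communality of .890, and the remainder
# comes from the small loadings the table leaves blank.
x9_primary_share = 0.933 ** 2

print([round(p, 2) for p in pct_of_trace], round(total_pct, 2))
print(round(x9_primary_share, 3))
```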
Scree Test for HBAT Component Analysis
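
The HBAT eigenvalues behind this plot are not reproduced in the text, so the sketch below instead uses the six eigenvalues reported in this deck for the toothpaste example to show how the stopping rules are applied. The "largest drop" rule is only a crude stand-in for reading the elbow of a scree plot by eye, and the variable names are illustrative.

```python
# Eigenvalues from the six-variable toothpaste example in this deck.
eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 0.085]
total = sum(eigenvalues)  # equals the number of variables for standardized data

# Latent root criterion: retain factors with eigenvalues greater than 1.0.
n_latent_root = sum(1 for ev in eigenvalues if ev > 1.0)

# Percentage of variance criterion: retain factors until 60% is explained.
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev
    cumulative.append(100 * running / total)
n_pct = next(i + 1 for i, c in enumerate(cumulative) if c >= 60.0)

# Crude scree heuristic: keep the factors before the largest drop in eigenvalue.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(len(eigenvalues) - 1)]
elbow = drops.index(max(drops)) + 1

print(n_latent_root, n_pct, elbow)
```

When the criteria disagree, the rules of thumb in this deck suggest examining solutions with one more and one fewer factor before settling on a structure.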
Factor Analysis Learning Checkpoint
1. What are the major uses of factor analysis?
2. What is the difference between component
analysis and common factor analysis?
3. Is rotation of factors necessary?
4. How do you decide how many factors to extract?
5. What is a significant factor loading?
6. How and why do you name a factor?
7. Should you use factor scores or summated ratings
in follow-up analyses?

Factor analysis

  • 2. Factor Analysis  Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.  Factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined without making the distinction between dependent and independent variables.  Factor analysis is used in the following circumstances:  To identify underlying dimensions, or factors, that explain the correlations among a set of variables.  To identify a new, smaller, set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis (regression or discriminant analysis).  To identify a smaller set of salient variables from a larger set for use in subsequent multivariate analysis.
  • 3. Factor Analysis Model Mathematically, each variable is expressed as a linear combination of underlying factors. The covariation among the variables is described in terms of a small number of common factors plus a unique factor for each variable. If the variables are standardized, the factor model may be represented as: Xi = Ai 1F1 + Ai 2F2 + Ai 3F3 + . . . + AimFm + ViUi where Xi = i th standardized variable Aij = standardized multiple regression coefficient of variable i on common factor j F = common factor Vi = standardized regression coefficient of variable i on unique factor i Ui = the unique factor for variable i m = number of common factors
  • 4. The unique factors are uncorrelated with each other and with the common factors. The common factors themselves can be expressed as linear combinations of the observed variables. Fi = Wi1X1 + Wi2X2 + Wi3X3 + . . . + WikXk where Fi = estimate of i th factor Wi = weight or factor score coefficient k = number of variables Factor Analysis Model
  • 5.  It is possible to select weights or factor score coefficients so that the first factor explains the largest portion of the total variance.  Then a second set of weights can be selected, so that the second factor accounts for most of the residual variance, subject to being uncorrelated with the first factor.  This same principle could be applied to selecting additional weights for the additional factors. Factor Analysis Model
  • 6. Statistics Associated with Factor Analysis  Bartlett's test of sphericity. Bartlett's test of sphericity is a test statistic used to examine the hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix; each variable correlates perfectly with itself (r = 1) but has no correlation with the other variables (r = 0).  Correlation matrix. A correlation matrix is a lower triangle matrix showing the simple correlations, r, between all possible pairs of variables included in the analysis. The diagonal elements, which are all 1, are usually omitted.
  • 7.  Communality. Communality is the amount of variance a variable shares with all the other variables being considered. This is also the proportion of variance explained by the common factors.  Eigenvalue. The eigenvalue represents the total variance explained by each factor.  Factor loadings. Factor loadings are simple correlations between the variables and the factors.  Factor loading plot. A factor loading plot is a plot of the original variables using the factor loadings as coordinates.  Factor matrix. A factor matrix contains the factor loadings of all the variables on all the factors extracted. Statistics Associated with Factor Analysis
  • 8.  Factor scores. Factor scores are composite scores estimated for each respondent on the derived factors.  Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is an index used to examine the appropriateness of factor analysis. High values (between 0.5 and 1.0) indicate factor analysis is appropriate. Values below 0.5 imply that factor analysis may not be appropriate.  Percentage of variance. The percentage of the total variance attributed to each factor.  Residuals are the differences between the observed correlations, as given in the input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.  Scree plot. A scree plot is a plot of the Eigenvalues against the number of factors in order of extraction. Statistics Associated with Factor Analysis
  • 9. A Factor Analysis Example A sample of 30 respondents was interviewed using mall intercept interviewing. The respondents were asked to to indicate their degree of agreement with the following statements using a seven-point scale (1 = strongly disagree, 7 =strongly agree).  V1 = It is important to buy a toothpaste that prevents cavities  V2 = I like a toothpaste that gives a shiny teeth  V3 = A toothpaste should strengthen your gums  V4= I prefer a toothpaste that freshens breath  V5 = Prevention of tooth decay is not an important benefit offered by a toothpaste  V6 = The most important consideration in buying a toothpaste is attractive teeth
  • 10. Conducting Factor Analysis RESPONDENT NUMBER V1 V2 V3 V4 V5 V6 1 7.00 3.00 6.00 4.00 2.00 4.00 2 1.00 3.00 2.00 4.00 5.00 4.00 3 6.00 2.00 7.00 4.00 1.00 3.00 4 4.00 5.00 4.00 6.00 2.00 5.00 5 1.00 2.00 2.00 3.00 6.00 2.00 6 6.00 3.00 6.00 4.00 2.00 4.00 7 5.00 3.00 6.00 3.00 4.00 3.00 8 6.00 4.00 7.00 4.00 1.00 4.00 9 3.00 4.00 2.00 3.00 6.00 3.00 10 2.00 6.00 2.00 6.00 7.00 6.00 11 6.00 4.00 7.00 3.00 2.00 3.00 12 2.00 3.00 1.00 4.00 5.00 4.00 13 7.00 2.00 6.00 4.00 1.00 3.00 14 4.00 6.00 4.00 5.00 3.00 6.00 15 1.00 3.00 2.00 2.00 6.00 4.00 16 6.00 4.00 6.00 3.00 3.00 4.00 17 5.00 3.00 6.00 3.00 3.00 4.00 18 7.00 3.00 7.00 4.00 1.00 4.00 19 2.00 4.00 3.00 3.00 6.00 3.00 20 3.00 5.00 3.00 6.00 4.00 6.00 21 1.00 3.00 2.00 3.00 5.00 3.00 22 5.00 4.00 5.00 4.00 2.00 4.00 23 2.00 2.00 1.00 5.00 4.00 4.00 24 4.00 6.00 4.00 6.00 4.00 7.00 25 6.00 5.00 4.00 2.00 1.00 4.00 26 3.00 5.00 4.00 6.00 4.00 7.00 27 4.00 4.00 7.00 2.00 2.00 5.00 28 3.00 7.00 2.00 6.00 4.00 3.00 29 4.00 6.00 3.00 7.00 2.00 7.00 30 2.00 3.00 2.00 4.00 7.00 2.00
  • 11. Conducting Factor Analysis Construction of the Correlation Matrix Method of Factor Analysis Determination of Number of Factors Determination of Model Fit Problem formulation Calculation of Factor Scores Interpretation of Factors Rotation of Factors Selection of Surrogate Variables
  • 12. Conducting Factor Analysis Formulate the Problem  The objectives of factor analysis should be identified.  The variables to be included in the factor analysis should be specified based on past research, theory, and judgment of the researcher. It is important that the variables be appropriately measured on an interval or ratio scale.  An appropriate sample size should be used. As a rough guideline, there should be at least four or five times as many observations (sample size) as there are variables.
  • 13. Correlation Matrix Variables V1 V2 V3 V4 V5 V6 V1 1.000 V2 -0.530 1.000 V3 0.873 -0.155 1.000 V4 -0.086 0.572 -0.248 1.000 V5 -0.858 0.020 -0.778 -0.007 1.000 V6 0.004 0.640 -0.018 0.640 -0.136 1.000
  • 14.  The analytical process is based on a matrix of correlations between the variables.  Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population: in other words, the population correlation matrix is an identity matrix. If this hypothesis cannot be rejected, then the appropriateness of factor analysis should be questioned.  Another useful statistic is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate. Conducting Factor Analysis Construct the Correlation Matrix
  • 15.  In principal components analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and full variance is brought into the factor matrix. Principal components analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.  In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring. Conducting Factor Analysis Determine the Method of Factor Analysis
  • 16. Results of Principal Components Analysis Communalities Variables Initial Extraction V1 1.000 0.926 V2 1.000 0.723 V3 1.000 0.894 V4 1.000 0.739 V5 1.000 0.878 V6 1.000 0.790 Initial Eigen values Factor Eigen value % of variance Cumulat. % 1 2.731 45.520 45.520 2 2.218 36.969 82.488 3 0.442 7.360 89.848 4 0.341 5.688 95.536 5 0.183 3.044 98.580 6 0.085 1.420 100.000
  • 17. Results of Principal Components Analysis Extraction Sums of Squared Loadings Factor Eigen value % of variance Cumulat. % 1 2.731 45.520 45.520 2 2.218 36.969 82.488 Factor Matrix Variables Factor 1 Factor 2 V1 0.928 0.253 V2 -0.301 0.795 V3 0.936 0.131 V4 -0.342 0.789 V5 -0.869 -0.351 V6 -0.177 0.871 Rotation Sums of Squared Loadings Factor Eigenvalue % of variance Cumulat. % 1 2.688 44.802 44.802 2 2.261 37.687 82.488
  • 18. Results of Principal Components Analysis Rotated Factor Matrix Variables Factor 1 Factor 2 V1 0.962 -0.027 V2 -0.057 0.848 V3 0.934 -0.146 V4 -0.098 0.845 V5 -0.933 -0.084 V6 0.083 0.885 Factor Score Coefficient Matrix Variables Factor 1 Factor 2 V1 0.358 0.011 V2 -0.001 0.375 V3 0.345 -0.043 V4 -0.017 0.377 V5 -0.350 -0.059 V6 0.052 0.395
  • 19. Factor Score Coefficient Matrix Variables V1 V2 V3 V4 V5 V6 V1 0.926 0.024 -0.029 0.031 0.038 -0.053 V2 -0.078 0.723 0.022 -0.158 0.038 -0.105 V3 0.902 -0.177 0.894 -0.031 0.081 0.033 V4 -0.117 0.730 -0.217 0.739 -0.027 -0.107 V5 -0.895 -0.018 -0.859 0.020 0.878 0.016 V6 0.057 0.746 -0.051 0.748 -0.152 0.790 The lower left triangle contains the reproduced correlation matrix; the diagonal, the communalities; the upper right triangle, the residuals between the observed correlations and the reproduced correlations. Results of Principal Components Analysis
  • 20.  A Priori Determination. Sometimes, because of prior knowledge, the researcher knows how many factors to expect and thus can specify the number of factors to be extracted beforehand.  Determination Based on Eigenvalues. In this approach, only factors with Eigenvalues greater than 1.0 are retained. An Eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with variance less than 1.0 are no better than a single variable, since, due to standardization, each variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors. Conducting Factor Analysis Determine the Number of Factors
  • 21.  Determination Based on Scree Plot. A scree plot is a plot of the Eigenvalues against the number of factors in order of extraction. Experimental evidence indicates that the point at which the scree begins denotes the true number of factors. Generally, the number of factors determined by a scree plot will be one or a few more than that determined by the Eigenvalue criterion.  Determination Based on Percentage of Variance. In this approach the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. It is recommended that the factors extracted should account for at least 60% of the variance. Conducting Factor Analysis Determine the Number of Factors
  • 22. Scree Plot 0.5 2 543 6 Component Number 0.0 2.0 3.0 Eigenvalue 1.0 1.5 2.5 1
  • 23.  Determination Based on Split-Half Reliability. The sample is split in half and factor analysis is performed on each half. Only factors with high correspondence of factor loadings across the two subsamples are retained.  Determination Based on Significance Tests. It is possible to determine the statistical significance of the separate Eigenvalues and retain only those factors that are statistically significant. A drawback is that with large samples (size greater than 200), many factors are likely to be statistically significant, although from a practical viewpoint many of these account for only a small proportion of the total variance. Conducting Factor Analysis Determine the Number of Factors
  • 24.  Although the initial or unrotated factor matrix indicates the relationship between the factors and individual variables, it seldom results in factors that can be interpreted, because the factors are correlated with many variables. Therefore, through rotation the factor matrix is transformed into a simpler one that is easier to interpret.  In rotating the factors, we would like each factor to have nonzero, or significant, loadings or coefficients for only some of the variables. Likewise, we would like each variable to have nonzero or significant loadings with only a few factors, if possible with only one.  The rotation is called orthogonal rotation if the axes are maintained at right angles. Conducting Factor Analysis Rotate Factors
  • 25.  The most commonly used method for rotation is the varimax procedure. This is an orthogonal method of rotation that minimizes the number of variables with high loadings on a factor, thereby enhancing the interpretability of the factors. Orthogonal rotation results in factors that are uncorrelated.  The rotation is called oblique rotation when the axes are not maintained at right angles, and the factors are correlated. Sometimes, allowing for correlations among factors can simplify the factor pattern matrix. Oblique rotation should be used when factors in the population are likely to be strongly correlated. Conducting Factor Analysis Rotate Factors
  • 26.  A factor can then be interpreted in terms of the variables that load high on it.  Another useful aid in interpretation is to plot the variables, using the factor loadings as coordinates. Variables at the end of an axis are those that have high loadings on only that factor, and hence describe the factor. Conducting Factor Analysis Interpret Factors
  • 27. Factor Loading Plot 1.0 0.5 0.0 -0.5 -1.0 Component2  Component 1 Component Variable 1 2 V1 0.962 -2.66E-02 V2 -5.72E-02 0.848 V3 0.934 -0.146 V4 -9.83E-02 0.854 V5 -0.933 -8.40E-02 V6 8.337E-02 0.885 Component Plot in Rotated Space      1.0 0.5 0.0 -0.5 -1.0 V1 V3 V6 V2 V5 V4 Rotated Component Matrix
  • 28. The factor scores for the ith factor may be estimated as follows: Fi = Wi1 X1 + Wi2 X2 + Wi3 X3 + . . . + Wik Xk Conducting Factor Analysis Calculate Factor Scores
  • 29.  By examining the factor matrix, one could select for each factor the variable with the highest loading on that factor. That variable could then be used as a surrogate variable for the associated factor.  However, the choice is not as easy if two or more variables have similarly high loadings. In such a case, the choice between these variables should be based on theoretical and measurement considerations. Conducting Factor Analysis Select Surrogate Variables
  • 30.  The correlations between the variables can be deduced or reproduced from the estimated correlations between the variables and the factors.  The differences between the observed correlations (as given in the input correlation matrix) and the reproduced correlations (as estimated from the factor matrix) can be examined to determine model fit. These differences are called residuals. Conducting Factor Analysis Determine the Model Fit
  • 31. Results of Common Factor Analysis Communalities Variables Initial Extraction V1 0.859 0.928 V2 0.480 0.562 V3 0.814 0.836 V4 0.543 0.600 V5 0.763 0.789 V6 0.587 0.723 Barlett test of sphericity • Approx. Chi-Square = 111.314 • df = 15 • Significance = 0.00000 • Kaiser-Meyer-Olkin measure of sampling adequacy = 0.660 Initial Eigenvalues Factor Eigenvalue % of variance Cumulat. % 1 2.731 45.520 45.520 2 2.218 36.969 82.488 3 0.442 7.360 89.848 4 0.341 5.688 95.536 5 0.183 3.044 98.580 6 0.085 1.420 100.000
  • 32. Results of Common Factor Analysis Extraction Sums of Squared Loadings Factor Eigenvalue % of variance Cumulat. % 1 2.570 42.837 42.837 2 1.868 31.126 73.964 Factor Matrix Variables Factor 1 Factor 2 V1 0.949 0.168 V2 -0.206 0.720 V3 0.914 0.038 V4 -0.246 0.734 V5 -0.850 -0.259 V6 -0.101 0.844 Rotation Sums of Squared Loadings Factor Eigenvalue % of variance Cumulat. % 1 2.541 42.343 42.343 2 1.897 31.621 73.964
  • 33. Rotated Factor Matrix Variables Factor 1 Factor 2 V1 0.963 -0.030 V2 -0.054 0.747 V3 0.902 -0.150 V4 -0.090 0.769 V5 -0.885 -0.079 V6 0.075 0.847 Factor Score Coefficient Matrix Variables Factor 1 Factor 2 V1 0.628 0.101 V2 -0.024 0.253 V3 0.217 -0.169 V4 -0.023 0.271 V5 -0.166 -0.059 V6 0.083 0.500 Results of Common Factor Analysis
  • 34. Results of Common Factor Analysis Factor Score Coefficient Matrix Variables V1 V2 V3 V4 V5 V6 V1 0.928 0.022 -0.000 0.024 -0.008 -0.042 V2 -0.075 0.562 0.006 -0.008 0.031 0.012 V3 0.873 -0.161 0.836 -0.005 0.008 0.042 V4 -0.110 0.580 -0.197 0.600 -0.025 -0.004 V5 -0.850 -0.012 -0.786 0.019 0.789 0.003 V6 0.046 0.629 -0.060 0.645 -0.133 0.723 The lower left triangle contains the reproduced correlation matrix; the diagonal, the communalities; the upper right triangle, the residuals between the observed correlations and the reproduced correlations.
  • 35. SPSS Windows To select this procedures using SPSS for Windows click: Analyze>Data Reduction>Factor …
  • 38. 38 Exploratory factor analysis . . . is an interdependence technique whose primary purpose is to define the underlying structure among the variables in the analysis. Exploratory Factor Analysis Defined
  • 39. 39 Exploratory Factor Analysis . . . • Examines the interrelationships among a large number of variables and then attempts to explain them in terms of their common underlying dimensions. • These common underlying dimensions are referred to as factors. • A summarization and data reduction technique that does not have independent and dependent variables, but is an interdependence technique in which all variables are considered simultaneously. What is Exploratory Factor Analysis?
  • 40. 40 Correlation Matrix for Store Image Elements VV11 VV22 VV33 VV44 VV55 VV66 VV77 VV88 VV99 VV11 PPrriiccee LLeevveell 1.00 VV22 SSttoorree PPeerrssoonnnneell .427 1.00 VV33 RReettuurrnn PPoolliiccyy .302 .771 1.00 VV44 PPrroodduucctt AAvvaaiillaabbiilliittyy .470 .497 .427 1.00 VV55 PPrroodduucctt QQuuaalliittyy .765 .406 .307 .472 1.00 VV66 AAssssoorrttmmeenntt DDeepptthh .281 .445 .423 .713 .325 1.00 VV77 AAssssoorrttmmeenntt WWiiddtthh .354 .490 .471 .719 .378 .724 1.00 VV88 IInn--SSttoorree SSeerrvviiccee .242 .719 .733 .428 .240 .311 .435 1.00 VV99 SSttoorree AAttmmoosspphheerree .372 .737 .774 .479 .326 .429 .466 .710 1.00
  • 41. 41 Correlation Matrix of Variables After Grouping Using Factor Analysis Shaded areas represent variables likely to be grouped together by factor analysis. VV33 VV88 VV99 VV22 VV66 VV77 VV44 VV11 VV55 VV33 RReettuurrnn PPoolliiccyy 1.00 VV88 IInn--ssttoorree SSeerrvviiccee .733 1.00 VV99 SSttoorree AAttmmoosspphheerree .774 .710 1.00 VV22 SSttoorree PPeerrssoonnnneell .741 .719 .787 1.00 VV66 AAssssoorrttmmeenntt DDeepptthh .423 .311 .429 .445 1.00 VV77 AAssssoorrttmmeenntt WWiiddtthh .471 .435 .468 .490 .724 1.00 VV44 PPrroodduucctt AAvvaaiillaabbiilliittyy .427 .428 .479 .497 .713 .719 1.00 VV11 PPrriiccee LLeevveell .302 .242 .372 .427 .281 .354 .470 1. 00 VV55 PPrroodduucctt QQuuaalliittyy .307 .240 .326 .406 .325 .378 .472 .765 1.00
  • 42. 42 Application of Factor Analysis to a Fast-Food Restaurant Service Quality Food Quality FactorsVariables Waiting Time Cleanliness Friendly Employees Taste Temperature Freshness
  • 43. 43 Factor Analysis Decision Process Stage 1: Objectives of Factor Analysis Stage 2: Designing a Factor Analysis Stage 3: Assumptions in Factor Analysis Stage 4: Deriving Factors and Assessing Overall Fit Stage 5: Interpreting the Factors Stage 6: Validation of Factor Analysis Stage 7: Additional uses of Factor Analysis Results
  • 44. 44 Stage 1: Objectives of Factor Analysis 1. Is the objective exploratory or confirmatory? 2. Specify the unit of analysis. 3. Data summarization and/or reduction? 4. Using factor analysis with other techniques.
  • 45. 45 Factor Analysis Outcomes 1. Data summarization = derives underlying dimensions that, when interpreted and understood, describe the data in a much smaller number of concepts than the original individual variables. 2. Data reduction = extends the process of data summarization by deriving an empirical value (factor score or summated scale) for each dimension (factor) and then substituting this value for the original values.
  • 46. 46 Types of Factor Analysis 1. Exploratory Factor Analysis (EFA) = is used to discover the factor structure of a construct and examine its reliability. It is data driven. 2. Confirmatory Factor Analysis (CFA) = is used to confirm the fit of the hypothesized factor structure to the observed (sample) data. It is theory driven.
  • 47. 47 Stage 2: Designing a Factor Analysis Three Basic Decisions: 1. Calculation of input data – R vs. Q analysis. 2. Design of study in terms of number of variables, measurement properties of variables, and the type of variables. 3. Sample size necessary.
  • 48. 48 Rules of Thumb 1 Factor Analysis Design o Factor analysis is performed most often only on metric variables, although specialized methods exist for the use of dummy variables. A small number of “dummy variables” can be included in a set of metric variables that are factor analyzed. o If a study is being designed to reveal factor structure, strive to have at least five variables for each proposed factor. o For sample size: • the sample must have more observations than variables. • the minimum absolute sample size should be 50 observations. o Maximize the number of observations per variable, with a minimum of five and hopefully at least ten observations per variable.
  • 49. 49 Stage 3: Assumptions in Factor Analysis Three Basic Decisions . . . 1. Calculation of input data – R vs. Q analysis. 2. Design of study in terms of number of variables, measurement properties of variables, and the type of variables. 3. Sample size required.
  • 50. 50 Assumptions • Multicollinearity  Assessed using MSA (measure of sampling adequacy). • Homogeneity of sample factor solutions The MSA is measured by the Kaiser-Meyer-Olkin (KMO) statistic. As a measure of sampling adequacy, the KMO predicts if data are likely to factor well based on correlation and partial correlation. KMO can be used to identify which variables to drop from the factor analysis because they lack multicollinearity. There is a KMO statistic for each individual variable, and their sum is the KMO overall statistic. KMO varies from 0 to 1.0. Overall KMO should be .50 or higher to proceed with factor analysis. If it is not, remove the variable with the lowest individual KMO statistic value one at a time until KMO overall rises above .50, and each individual variable KMO is above .50.
  • 51. 51 Rules of Thumb 2 Testing Assumptions of Factor Analysis • There must be a strong conceptual foundation to support the assumption that a structure does exist before the factor analysis is performed. • A statistically significant Bartlett’s test of sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed. • Measure of Sampling Adequacy (MSA) values must exceed .50 for both the overall test and each individual variable. Variables with values less than .50 should be omitted from the factor analysis one at a time, with the smallest one being omitted each time.
  • 52. 52 Stage 4: Deriving Factors and Assessing Overall Fit • Selecting the factor extraction method – common vs. component analysis. • Determining the number of factors to represent the data.
  • 53. 53 Extraction Decisions o Which method? • Principal Components Analysis • Common Factor Analysis o How to rotate? • Orthogonal or Oblique rotation
  • 54. 54 Diagonal Value Variance Unity (1) Communality Total Variance Common Specific and Error Variance extracted Variance not used Extraction Method Determines the Types of Variance Carried into the Factor Matrix
  • 55. 55 Principal Components vs. Common? Two Criteria . . . • Objectives of the factor analysis. • Amount of prior knowledge about the variance in the variables.
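The difference between the two extraction methods comes down to what sits on the diagonal of the correlation matrix before factoring. The sketch below assumes NumPy and uses squared multiple correlations as a common (but not the only) initial communality estimate; the function name `first_factor_loadings` and the demo matrix are hypothetical, and no iteration of the communalities is performed.

```python
import numpy as np

def first_factor_loadings(R, common=False):
    """Loadings on the first factor from a correlation matrix R.

    common=False: component analysis, unities (1.0) stay on the diagonal.
    common=True:  common factor analysis, the diagonal is replaced by
                  communality estimates (squared multiple correlations).
    """
    R = np.asarray(R, dtype=float).copy()
    if common:
        smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
        np.fill_diagonal(R, smc)
    eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
    top = int(np.argmax(eigvals))
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
    if loadings.sum() < 0:                       # eigenvector sign is arbitrary
        loadings = -loadings
    return loadings

# Hypothetical correlation matrix with every correlation equal to .50.
R_demo = np.array([[1.0, 0.5, 0.5],
                   [0.5, 1.0, 0.5],
                   [0.5, 0.5, 1.0]])
component_loadings = first_factor_loadings(R_demo, common=False)
common_loadings = first_factor_loadings(R_demo, common=True)
print(np.round(component_loadings, 3), np.round(common_loadings, 3))
```

Because the common factor version excludes specific and error variance, its loadings are smaller than the component loadings for the same data.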
  • 56. 56 Number of Factors? • A Priori Criterion • Latent Root Criterion • Percentage of Variance • Scree Test Criterion
  • 57. 57 Eigenvalue Plot for Scree Test Criterion
  • 58. 58 Rules of Thumb 3 Choosing Factor Models and Number of Factors • Although both component and common factor analysis models yield similar results in common research settings (30 or more variables or communalities of .60 for most variables):  the component analysis model is most appropriate when data reduction is paramount.  the common factor model is best in well-specified theoretical applications. • Any decision on the number of factors to be retained should be based on several considerations:  use of several stopping criteria to determine the initial number of factors to retain.  factors with eigenvalues greater than 1.0.  a pre-determined number of factors based on research objectives and/or prior research.  enough factors to meet a specified percentage of variance explained, usually 60% or higher.  factors shown by the scree test to have substantial amounts of common variance (i.e., factors before the inflection point).  more factors when there is heterogeneity among sample subgroups. • Consideration of several alternative solutions (one more and one less factor than the initial solution) to ensure the best structure is identified.
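Two of the stopping criteria above, the latent root criterion (eigenvalues greater than 1.0) and the percentage of variance criterion (60% here), are easy to compute from the eigenvalues of the correlation matrix. The following is a minimal sketch assuming NumPy; the function name `stopping_criteria` and the demo matrix are hypothetical. The scree test is a visual judgment and is not automated here.

```python
import numpy as np

def stopping_criteria(R, variance_target=0.60):
    """Return eigenvalues plus the factor counts suggested by two criteria."""
    eigvals = np.sort(np.linalg.eigvalsh(np.asarray(R, dtype=float)))[::-1]
    # Latent root criterion: keep factors with eigenvalues greater than 1.0.
    latent_root = int(np.sum(eigvals > 1.0))
    # Percentage of variance: smallest number of factors reaching the target.
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    pct_variance = int(np.argmax(cumulative >= variance_target) + 1)
    return eigvals, latent_root, pct_variance

# Hypothetical correlation matrix: two pairs of variables, each correlated .80.
R_demo = np.array([[1.0, 0.8, 0.0, 0.0],
                   [0.8, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.8],
                   [0.0, 0.0, 0.8, 1.0]])
eigvals, n_latent_root, n_pct_variance = stopping_criteria(R_demo)
print(np.round(eigvals, 2), n_latent_root, n_pct_variance)
```

When the criteria disagree, the rule of thumb is to examine the neighbouring solutions (one more and one fewer factor) rather than to trust a single number.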
  • 59. 59 Processes of Factor Interpretation • Estimate the Factor Matrix • Factor Rotation • Factor Interpretation • Respecification of factor model, if needed, may involve . . . o Deletion of variables from analysis o Desire to use a different rotational approach o Need to extract a different number of factors o Desire to change method of extraction
  • 60. 60 Rotation of Factors Factor rotation = the reference axes of the factors are turned about the origin until some other position has been reached. Unrotated factor solutions extract factors in order of how much variance they account for, with each subsequent factor accounting for less variance. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones to achieve a simpler, theoretically more meaningful factor pattern.
  • 61. 61 Two Rotational Approaches 1. Orthogonal = axes are maintained at 90 degrees. 2. Oblique = axes are not maintained at 90 degrees.
  • 62. 62 Orthogonal Factor Rotation [Figure: variables V1 to V5 plotted against the Unrotated Factor I and Unrotated Factor II axes (scaled from -1.0 to +1.0), with the Rotated Factor I and Rotated Factor II axes overlaid.]
  • 63. 63 Oblique Factor Rotation [Figure: variables V1 to V5 plotted against the Unrotated Factor I and Unrotated Factor II axes (scaled from -1.0 to +1.0), with the Orthogonal Rotation axes (Factor I, Factor II) and the Oblique Rotation axes (Factor I, Factor II) overlaid.]
  • 64. 64 Orthogonal Rotation Methods • Quartimax (simplify rows) • Varimax (simplify columns) • Equimax (combination)
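An orthogonal rotation can be sketched with the widely used SVD-based orthomax iteration, assuming NumPy. In this family, `gamma=1.0` corresponds to varimax and `gamma=0.0` to quartimax. The function name `orthomax` and the unrotated loadings are hypothetical illustrations, and statistical packages may differ in details such as Kaiser normalization.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix.

    gamma=1.0 gives varimax, gamma=0.0 gives quartimax.
    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        gradient = L.T @ (rotated ** 3 - (gamma / p) * rotated @ column_ss)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                        # nearest orthogonal matrix
        current = s.sum()
        if current < previous * (1.0 + tol):     # stop when no further gain
            break
        previous = current
    return L @ rotation, rotation

# Hypothetical unrotated loadings for five variables on two factors.
unrotated = np.array([[0.70,  0.50],
                      [0.65,  0.55],
                      [0.60, -0.45],
                      [0.72, -0.40],
                      [0.68,  0.48]])
rotated, rotation = orthomax(unrotated, gamma=1.0)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the distribution of variance across factors shifts.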
  • 65. 65 Rules of Thumb 4 Choosing Factor Rotation Methods • Orthogonal rotation methods . . . o are the most widely used rotational methods. o are the preferred method when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques. • Oblique rotation methods . . . o are best suited to the goal of obtaining several theoretically meaningful factors or constructs because, realistically, very few constructs in the “real world” are uncorrelated.
  • 66. 66 Which Factor Loadings Are Significant? • Customary Criteria = Practical Significance. • Sample Size & Statistical Significance. • Number of Factors (more factors = larger loadings needed) and/or Variables (more variables = smaller loadings needed).
  • 67. 67 Guidelines for Identifying Significant Factor Loadings Based on Sample Size
      Factor Loading    Sample Size Needed for Significance*
      .30               350
      .35               250
      .40               200
      .45               150
      .50               120
      .55               100
      .60                85
      .65                70
      .70                60
      .75                50
      *Significance is based on a .05 significance level (α), a power level of 80 percent, and standard errors assumed to be twice those of conventional correlation coefficients.
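The sample-size guideline above can be applied as a simple lookup: given a sample size, find the smallest loading that the guideline treats as significant. The sketch below encodes the values from the slide; the function name `smallest_significant_loading` is a hypothetical helper.

```python
# (loading, sample size needed for significance), taken from the guideline above.
SAMPLE_SIZE_FOR_LOADING = [
    (0.30, 350), (0.35, 250), (0.40, 200), (0.45, 150), (0.50, 120),
    (0.55, 100), (0.60, 85), (0.65, 70), (0.70, 60), (0.75, 50),
]

def smallest_significant_loading(n):
    """Smallest tabled loading considered significant with n observations.

    Returns None when n is below the smallest sample size in the guideline.
    """
    eligible = [loading for loading, needed in SAMPLE_SIZE_FOR_LOADING
                if n >= needed]
    return min(eligible) if eligible else None

for n in (40, 100, 200, 350):
    print(n, smallest_significant_loading(n))
```

With 100 respondents, for example, the guideline asks for loadings of .55 or higher.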
  • 68. 68 Rules of Thumb 5 Assessing Factor Loadings • While factor loadings of ±.30 to ±.40 are minimally acceptable, values greater than ±.50 are considered necessary for practical significance. • To be considered significant: o A smaller loading is needed given either a larger sample size, or a larger number of variables being analyzed. o A larger loading is needed given a factor solution with a larger number of factors, especially in evaluating the loadings on later factors. • Statistical tests of significance for factor loadings are generally very conservative and should be considered only as starting points needed for including a variable for further consideration.
  • 69. 69 Stage 5: Interpreting the Factors • Selecting the factor extraction method – common vs. component analysis. • Determining the number of factors to represent the data.
  • 70. 70 Interpreting a Factor Matrix: 1. Examine the factor matrix of loadings. 2. Identify the highest loading across all factors for each variable. 3. Assess communalities of the variables. 4. Label the factors.
  • 71. 71 Rules of Thumb 6 Interpreting The Factors  An optimal structure exists when all variables have high loadings only on a single factor.  Variables that cross-load (load highly on two or more factors) are usually deleted unless theoretically justified or the objective is strictly data reduction.  Variables should generally have communalities of greater than .50 to be retained in the analysis.  Respecification of a factor analysis can include options such as: o deleting a variable(s), o changing rotation methods, and/or o increasing or decreasing the number of factors.
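The interpretation steps and rules of thumb above (highest loading per variable, cross-loadings, communalities above .50) can be sketched as a small screening routine. The loadings are hypothetical, the .50 cut-offs follow the rules of thumb, and the function name `screen_loadings` is an assumption for illustration.

```python
def screen_loadings(loadings, threshold=0.50, min_communality=0.50):
    """Flag each variable's primary factor, cross-loading, and low communality.

    loadings maps a variable name to its list of loadings, one per factor.
    """
    report = {}
    for name, row in loadings.items():
        # Factors (numbered from 1) on which the variable loads highly.
        salient = [j + 1 for j, a in enumerate(row) if abs(a) >= threshold]
        # Communality is the sum of squared loadings across all factors.
        communality = sum(a * a for a in row)
        primary = max(range(len(row)), key=lambda j: abs(row[j])) + 1
        report[name] = {
            "primary_factor": primary,
            "cross_loads": len(salient) >= 2,
            "communality": round(communality, 4),
            "low_communality": communality < min_communality,
        }
    return report

# Hypothetical rotated loadings for four variables on two factors.
demo_loadings = {
    "V1": [0.85, 0.10],
    "V2": [0.62, 0.58],    # loads highly on both factors
    "V3": [0.20, 0.35],    # weak on both factors
    "V4": [0.05, -0.78],
}
report = screen_loadings(demo_loadings)
for name, row in report.items():
    print(name, row)
```

Flagged variables are candidates for deletion or for a respecified solution; the decision still rests on theory, as the rules of thumb note.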
  • 72. 72 Stage 6: Validation of Factor Analysis • Confirmatory Perspective. • Assessing Factor Structure Stability. • Detecting Influential Observations.
  • 73. 73 Stage 7: Additional Uses of Factor Analysis Results • Selecting Surrogate Variables • Creating Summated Scales • Computing Factor Scores
  • 74. 74 Rules of Thumb 7 Summated Scales • A summated scale is only as good as the items used to represent the construct. While it may pass all empirical tests, it is useless without theoretical justification. • Never create a summated scale without first assessing its unidimensionality with exploratory or confirmatory factor analysis. • Once a scale is deemed unidimensional, its reliability score, as measured by Cronbach’s alpha: o should exceed a threshold of .70, although a .60 level can be used in exploratory research. o the threshold should be raised as the number of items increases, especially as the number of items approaches 10 or more. • With reliability established, validity should be assessed in terms of: o convergent validity = scale correlates with other like scales. o discriminant validity = scale is sufficiently different from other related scales. o nomological validity = scale “predicts” as theoretically suggested.
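Cronbach's alpha, the reliability measure named above, can be computed directly from item scores using only the Python standard library. The item responses below are hypothetical, and the function name `cronbach_alpha` is an assumption for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses: three items answered by five respondents.
demo_items = [
    [4, 5, 3, 5, 2],
    [4, 4, 3, 5, 1],
    [5, 5, 2, 4, 2],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 3), "meets .70 threshold:", alpha > 0.70)
```

A value above .70 satisfies the general threshold; .60 may be acceptable in exploratory research.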
  • 75. 75 Rules of Thumb 8 Representing Factor Analysis In Other Analyses • The single surrogate variable:  Advantages: simple to administer and interpret.  Disadvantages: 1) does not represent all “facets” of a factor. 2) prone to measurement error. • Factor scores:  Advantages: 1) represent all variables loading on the factor. 2) best method for complete data reduction. 3) are by default orthogonal and can avoid complications caused by multicollinearity.  Disadvantages: 1) interpretation more difficult since all variables contribute through loadings. 2) difficult to replicate across studies.
  • 76. 76 Rules of Thumb 8 Continued . . . Representing Factor Analysis In Other Analyses • Summated scales:  Advantages: 1) compromise between the surrogate variable and factor score options. 2) reduce measurement error. 3) represent multiple facets of a concept. 4) easily replicated across studies.  Disadvantages: 1) include only the variables that load highly on the factor and exclude those having little or marginal impact. 2) not necessarily orthogonal. 3) require extensive analysis of reliability and validity issues.
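The three ways of carrying a factor into later analyses can be compared side by side. The sketch below assumes NumPy, a single factor, items measured on a 0 to 10 scale (used to reverse-score negatively loading items), and the regression method for factor scores. The data, loadings, and function name `represent_factor` are hypothetical.

```python
import numpy as np

def represent_factor(X, loadings, threshold=0.50, scale_min=0.0, scale_max=10.0):
    """Return (surrogate index, surrogate, summated scale, factor scores)."""
    X = np.asarray(X, dtype=float)
    loadings = np.asarray(loadings, dtype=float)

    # 1) Surrogate variable: the single variable with the highest loading.
    surrogate_idx = int(np.argmax(np.abs(loadings)))
    surrogate = X[:, surrogate_idx]

    # 2) Summated scale: average of the high-loading items only,
    #    reverse-scoring items that load negatively.
    high = np.abs(loadings) >= threshold
    items = X[:, high].copy()
    negative = loadings[high] < 0
    items[:, negative] = scale_min + scale_max - items[:, negative]
    summated = items.mean(axis=1)

    # 3) Factor scores (regression method): every variable contributes,
    #    weighted by factor score coefficients R^-1 times the loadings.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    weights = np.linalg.solve(R, loadings)
    scores = Z @ weights
    return surrogate_idx, surrogate, summated, scores

# Hypothetical data: six respondents, three variables on a 0-10 scale.
X_demo = np.array([[8, 7, 3],
                   [6, 6, 5],
                   [9, 8, 2],
                   [4, 5, 6],
                   [7, 6, 5],
                   [5, 4, 6]])
loadings_demo = [0.90, 0.80, -0.70]
surrogate_idx, surrogate, summated, scores = represent_factor(X_demo, loadings_demo)
print(surrogate_idx, np.round(summated, 2), np.round(scores, 2))
```

The surrogate uses one column, the summated scale uses only the high-loading columns, and the factor scores use all of them, which mirrors the trade-offs listed above.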
  • 77. 77 Description of HBAT Primary Database Variables
      Variable   Description                                         Variable Type
      Data Warehouse Classification Variables
      X1         Customer Type                                       nonmetric
      X2         Industry Type                                       nonmetric
      X3         Firm Size                                           nonmetric
      X4         Region                                              nonmetric
      X5         Distribution System                                 nonmetric
      Performance Perceptions Variables
      X6         Product Quality                                     metric
      X7         E-Commerce Activities/Website                       metric
      X8         Technical Support                                   metric
      X9         Complaint Resolution                                metric
      X10        Advertising                                         metric
      X11        Product Line                                        metric
      X12        Salesforce Image                                    metric
      X13        Competitive Pricing                                 metric
      X14        Warranty & Claims                                   metric
      X15        New Products                                        metric
      X16        Ordering & Billing                                  metric
      X17        Price Flexibility                                   metric
      X18        Delivery Speed                                      metric
      Outcome/Relationship Measures
      X19        Satisfaction                                        metric
      X20        Likelihood of Recommendation                        metric
      X21        Likelihood of Future Purchase                       metric
      X22        Current Purchase/Usage Level                        metric
      X23        Consider Strategic Alliance/Partnership in Future   nonmetric
  • 78. 78 Rotated Component Matrix: “Reduced Set” of HBAT Perceptions Variables
      Variable                        Component 1   Component 2   Component 3   Component 4   Communality
      X9 – Complaint Resolution       .933                                                    .890
      X18 – Delivery Speed            .931                                                    .894
      X16 – Order & Billing           .886                                                    .806
      X12 – Salesforce Image                        .898                                      .860
      X7 – E-Commerce Activities                    .868                                      .780
      X10 – Advertising                             .743                                      .585
      X8 – Technical Support                                      .940                        .894
      X14 – Warranty & Claims                                     .933                        .891
      X6 – Product Quality                                                      .892          .798
      X13 – Competitive Pricing                                                 -.730         .661
      Sum of Squares                  2.589         2.216         1.846         1.406         8.057 (total)
      Percentage of Trace             25.893        22.161        18.457        14.061        80.572 (total)
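The summary rows of the rotated component matrix can be checked with simple arithmetic. With 10 standardized variables the trace of the correlation matrix is 10, so each percentage of trace is the component's sum of squares divided by 10. The sketch below uses only the figures reported in the matrix; small differences are rounding.

```python
# Figures copied from the rotated component matrix for the reduced HBAT set.
sum_of_squares = [2.589, 2.216, 1.846, 1.406]
communalities = [0.890, 0.894, 0.806, 0.860, 0.780,
                 0.585, 0.894, 0.891, 0.798, 0.661]
n_variables = len(communalities)                 # trace of the correlation matrix

# Percentage of trace per component and in total.
pct_of_trace = [100.0 * s / n_variables for s in sum_of_squares]
total_extracted = sum(sum_of_squares)
total_pct = 100.0 * total_extracted / n_variables

# The communalities should add up to the same total variance extracted.
communality_total = sum(communalities)

print([round(p, 2) for p in pct_of_trace])
print(round(total_extracted, 3), round(total_pct, 2), round(communality_total, 3))
```

The roughly 80.6% of variance extracted by the four components comfortably exceeds the 60% guideline.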
  • 79. 79 Scree Test for HBAT Component Analysis
  • 80. 80 Factor Analysis Learning Checkpoint 1. What are the major uses of factor analysis? 2. What is the difference between component analysis and common factor analysis? 3. Is rotation of factors necessary? 4. How do you decide how many factors to extract? 5. What is a significant factor loading? 6. How and why do you name a factor? 7. Should you use factor scores or summated ratings in follow-up analyses?