QCI 5-Day Online Workshop on
Enhancing Research Capability
Dr. Jai Singh
FACTOR ANALYSIS
Impact of Covid-19 on the Health Sector
• Insurance Industry
• Pharmaceutical Companies
• Physicians
• Patients
• Government
• Nurses/carers, staff, unions
• Voluntary organisations
• Social services
• Local health authority
• Primary care groups
• Local health groups
Factor Analysis
• Factor Analysis (FA) is an exploratory technique applied to
a large set of observed variables to identify a small number of
underlying factors (subsets of variables) from which the
observed variables could have been generated.
Ex.- An individual’s response to the questions on a college
entrance test is influenced by underlying variables such as
intelligence, years in school, age, emotional state on the day
of the test, amount of practice taking tests, and so on.
Ex.- The analyst hopes to reduce the interpretation of a 200-
question test to the study of 4 or 5 factors.
Continued
• Factor analysis is a procedure used to reduce a
large number of questions (condense the data) into a
few variables (factors) according to their
relevance.
• Factors are assumed to represent dimensions
within data.
• The answers to the questions are the observed
variables. The underlying, influential variables are
the factors.
• The most common technique is known as
Principal Component Analysis (PCA).
[Figure: observed variables grouped under Factor-1 and Factor-2]
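As a rough illustration of the PCA route, the sketch below (with hypothetical question names Q1–Q6 and simulated 7-point responses, not data from the workshop) standardizes the items and inspects how each question weights on the first two components:

```python
# A minimal, illustrative sketch (hypothetical data): extract two principal
# components from standardized survey responses and inspect the weights.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
responses = pd.DataFrame(rng.integers(1, 8, size=(200, 6)),
                         columns=[f"Q{i}" for i in range(1, 7)])

z = StandardScaler().fit_transform(responses)      # work on the correlation scale
pca = PCA(n_components=2).fit(z)

# Unit-length component weights; multiplying each column by the square root
# of its eigenvalue would give conventional loadings.
weights = pd.DataFrame(pca.components_.T, index=responses.columns,
                       columns=["Factor-1", "Factor-2"])
print(weights.round(2))
print("Variance explained:", pca.explained_variance_ratio_.round(2))
```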
Continued
• Factor analysis is a useful tool for investigating variable
relationships for complex concepts such as socioeconomic
status, dietary patterns, or psychological scales.
• It allows researchers to investigate concepts that are not easily
measured directly by collapsing a large number of variables
into a few interpretable underlying factors.
• Factor analysis is commonly used in:
-Scale development
-The evaluation of the psychometric quality of a measure, and
-The assessment of the dimensionality of a set of variables.
Factor Analysis –Structural Aspect
• Factor analysis provides a tool
for analyzing the structure of
interrelationships
(correlations) among
variables by defining sets of
variables that are highly
correlated, known as factors.
What is a factor?
• Multiple observed variables have similar patterns of
responses because they are all associated with a latent (i.e.
not directly measured) variable.
- Ex.- People may respond similarly to questions about
income, education, and occupation, which are all associated
with the latent variable socioeconomic status.
• In every factor analysis, there are the same number of
factors as there are variables. Each factor captures a certain
amount of the overall variance in the observed variables.
• Factors are always listed in order of how much variation
they explain.
Exploratory Factor Analysis (EFA)
Exploratory - When the dimensions/factors are theoretically unknown
- Exploratory factor analysis is a statistical approach that can be used
to analyze interrelationships (correlation) among a large number of
variables in a data set and to explain these variables in terms of a
smaller number of common underlying dimensions.
- This involves finding a way of condensing the information
contained in some of the original variables into a smaller set of
implicit variables (called factors) with a minimum loss of
information.
- This type of analysis provides a factor structure (a grouping of
variables based on strong correlations).
Confirmatory Factor Analysis
• Confirmatory – When the researcher has prior expectations
about the actual structure of the data based on theoretical
support or prior research.
• The researcher may wish to test hypotheses about which
variables should be grouped together on a factor.
• Example - A retail firm identified 80 characteristics of retail
stores and their services that consumers mentioned as
affecting their patronage choice among stores. The retailer
wants to find the broader dimensions on which to base a
survey.
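A confirmatory analysis can be sketched, for example, with the confirmatory module of the `factor_analyzer` package (assuming it is installed; the item names and two-factor grouping below are hypothetical, not the retail study):

```python
# Hedged sketch of a confirmatory factor analysis: test a pre-specified
# grouping of items onto two factors. Items and grouping are hypothetical.
import numpy as np
import pandas as pd
from factor_analyzer import ConfirmatoryFactorAnalyzer, ModelSpecificationParser

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(300, 6)),
                  columns=["v1", "v2", "v3", "v4", "v5", "v6"])

# Hypothesized structure: v1-v3 load on F1, v4-v6 load on F2
model_dict = {"F1": ["v1", "v2", "v3"], "F2": ["v4", "v5", "v6"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(df, model_dict)

cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa.fit(df.values)
print(pd.DataFrame(cfa.loadings_, index=df.columns, columns=["F1", "F2"]).round(2))
```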
Exploratory Factor Analysis
• Assumptions:
Metric Data (Interval)
Correlation among variables must be present
(i.e. some degree of multicollinearity)
Adequate sample size
• Purpose
Obtaining independent factors
Data Reduction
Extract Factor (Example-1)
• Several questions closely related to aspects of
customer satisfaction
-How satisfied are you with our product?
-Would you recommend our product to a friend or family
member?
-How likely are you to purchase our product in the future?
Continued
• We want one variable to represent a customer satisfaction
score.
- One option would be to average the three question
responses.
- Another option would be to create a factor dependent
variable.
• This can be done by running PCA and keeping the first principal
component (also known as a factor).
- The advantage of PCA over an average is that it automatically
weights each of the variables in the calculation.
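A small sketch of this idea, using simulated responses to three hypothetical items (the column names are illustrative):

```python
# Illustrative sketch: derive a single satisfaction score from the first
# principal component of three hypothetical items, and compare it with a
# plain average of the same items.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
items = pd.DataFrame(rng.integers(1, 8, size=(100, 3)),
                     columns=["q_satisfied", "q_recommend", "q_repurchase"])

z = StandardScaler().fit_transform(items)
pc1 = PCA(n_components=1).fit(z)

scores = pd.DataFrame({
    "satisfaction_pca": pc1.transform(z)[:, 0],     # data-driven weighting
    "satisfaction_avg": items.mean(axis=1),         # equal weighting
})
print("PC1 weights:", pc1.components_.round(2))
print(scores.head())
```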
Example-2
• Purchase barriers of potential customers
• Possible barriers to purchase
- Factor analysis can uncover how responses to these
questions move together.
- Loadings for 3 factors for each of the variables.
Retrieved from- https://www.qualtrics.com/xm-institute
Principal components weights for the variables
• The first component heavily weights variables
related to cost,
• The second weights variables related to IT, and
• The third weights variables related to
organizational factors.
- We can give our new super variables appropriate
names.
If we were to cluster the customers based on these three
components, we can see some trends. Customers tend to be high
in Cost barriers or Org barriers, but not both.
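For instance, a rough sketch of clustering respondents on three component scores (with simulated, hypothetical barrier items) might look like this:

```python
# Illustrative sketch: compute three component scores from hypothetical
# purchase-barrier items, then cluster respondents on those scores.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
barriers = pd.DataFrame(rng.integers(1, 8, size=(300, 9)),
                        columns=[f"barrier_{i}" for i in range(1, 10)])

component_scores = PCA(n_components=3).fit_transform(
    StandardScaler().fit_transform(barriers))
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(component_scores)
print(pd.Series(clusters).value_counts())   # respondents per cluster
```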
Scope of Factor analysis
 Psychographics (Agree/Disagree):
• I value family
• I believe brand represents value
 Behavioral (Agree/Disagree):
• I purchase the cheapest option
• I am a bargain shopper
 Attitudinal (Agree/Disagree):
• The economy is not improving
• I am pleased with the product
 Activity-Based (Agree/Disagree):
• I love sports
• I sometimes shop online during
work hours
Behavioral and psychographic questions are especially suited
for factor analysis.
Eigenvalue
• The eigenvalue is a measure of how much of the variance of the
observed variables a factor explains. Eigenvalues represent the total
amount of variance that can be explained by a given principal component.
• Any factor with an eigenvalue ≥1 explains more variance than a single
observed variable.
• If the factor for socioeconomic status had an eigenvalue of 2.3, it would explain
as much variance as 2.3 of the observed variables.
• If eigenvalues are greater than zero, then it’s a good sign.
• Since variance cannot be negative, negative eigenvalues imply the model is
ill-conditioned.
• Eigenvalues close to zero imply there is item multicollinearity, since all the
variance can be taken up by the first component.
• The factors that explain the least amount of variance are generally
discarded.
[Figure: Variables 1–4 loading on Factor-1 with eigenvalue ≥ 1]
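A minimal sketch of computing eigenvalues directly from a correlation matrix (simulated, hypothetical items):

```python
# Illustrative sketch: eigenvalues of the item correlation matrix; components
# with an eigenvalue >= 1 explain more variance than any single standardized item.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
items = pd.DataFrame(rng.normal(size=(200, 6)),
                     columns=[f"item_{i}" for i in range(1, 7)])

corr = items.corr().to_numpy()
eigenvalues = np.linalg.eigvalsh(corr)[::-1]     # sorted largest first
print(eigenvalues.round(2))
print("Components with eigenvalue >= 1:", int((eigenvalues >= 1).sum()))
```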
What are factor loadings?
Variables                                            Factor 1   Factor 2
Income                                                 0.65       0.11
Education                                              0.59       0.25
Occupation                                             0.48       0.19
House value                                            0.38       0.60
Number of public parks in neighborhood                 0.13       0.57
Number of violent crimes per year in neighborhood      0.23       0.55
The relationship of each variable to the underlying factor is expressed
by the so-called factor loading.
Example- indicators of wealth with six variables and two resulting
factors.
Interpretation
• The variable income has the strongest association with the underlying latent
variable, Factor 1, with a factor loading of 0.65.
• Two other variables, education and occupation, are also associated with
Factor 1. Based on the variables loading highly onto Factor 1, we could
call it “Individual socioeconomic status.”
• House value, number of public parks, and number of violent crimes per
year, however, have high factor loadings on the other factor, Factor 2.
They seem to indicate the overall wealth within the neighborhood, so we
may want to call Factor 2 “Neighborhood socioeconomic status.”
• Since factor loadings can be interpreted like standardized regression
coefficients, one could also say that the variable income has a correlation
of 0.65 with Factor 1.
Continued
• Variable house value also is marginally important
in Factor 1 (loading = 0.38).
• This makes sense, since the value of a person’s
house should be associated with his or her
income.
• A loading of .50 denotes that 25% of a variable's variance is
accounted for by the factor.
• A loading must exceed .70 for the factor to account for
50% of the variance.
• Loadings of .5 are considered practically significant;
loadings of .7 indicate a well-defined structure.
Communalities
• Communalities indicate the amount of variance in each
variable that is accounted for by the components.
• Initial communalities are estimates of the variance in each
variable accounted for by all components or factors.
• For principal components extraction, this is always equal to
1.0 for correlation analyses.
• Extraction communalities are estimates of the variance in each
variable accounted for by the components.
• High communalities indicate that the extracted
components represent the variables well.
• If any communalities are very low in a principal components
extraction, you may need to extract another component.
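For example, using the loadings from the wealth example above, the extraction communalities can be sketched as the sum of squared loadings per variable:

```python
# Illustrative sketch: communality of each variable = sum of its squared
# loadings across the retained factors (loadings from the wealth example).
import pandas as pd

loadings = pd.DataFrame(
    {"Factor 1": [0.65, 0.59, 0.48, 0.38, 0.13, 0.23],
     "Factor 2": [0.11, 0.25, 0.19, 0.60, 0.57, 0.55]},
    index=["Income", "Education", "Occupation", "House value",
           "Public parks", "Violent crimes"])

communalities = (loadings ** 2).sum(axis=1)   # variance explained per variable
print(communalities.round(2))
```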
Variance in factor analysis
• Two types of variance- common and unique
• Common variance is the amount of variance that is shared
among a set of items. Items that are highly correlated will share
a lot of variance.
– Communality (also called h²) is a definition of common variance
that ranges between 0 and 1. Values closer to 1 suggest that the
extracted factors explain more of the variance of an individual item.
• Unique variance is any portion of variance that’s not common.
There are two types:
– Specific variance: is variance that is specific to a particular item
– Error variance: comes from errors of measurement and basically
anything unexplained by common or specific variance.
Kaiser-Meyer-Olkin (KMO) Test
- Measure of how suited data is for Factor
Analysis.
- Sampling adequacy for each variable in the
model and for the complete model.
- Measure of the proportion of variance among
variables that might be common variance.
- The higher this proportion, the better suited the data
are to Factor Analysis.
KMO Values
• KMO values between 0 and 1.
• A rule of thumb for interpreting the statistic
- KMO values between 0.8 and 1 indicate the
sampling is adequate.
- KMO values less than 0.6 indicate the sampling is
not adequate and that remedial action should be
taken.
- Sometimes KMO values between 0.5
and 0.6 are accepted.
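A hedged sketch of running the KMO test with the `factor_analyzer` package (assuming it is installed; data are simulated):

```python
# Illustrative sketch of the KMO test using the factor_analyzer package.
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_kmo

rng = np.random.default_rng(5)
items = pd.DataFrame(rng.normal(size=(200, 6)),
                     columns=[f"item_{i}" for i in range(1, 7)])

kmo_per_variable, kmo_total = calculate_kmo(items)
print("Overall KMO:", round(kmo_total, 2))   # >= 0.8 adequate, < 0.6 problematic
print("Per-variable KMO:", kmo_per_variable.round(2))
```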
Bartlett test of sphericity
Checks whether the correlation matrix has significant
correlations among at least some variables.
• The Bartlett test should be significant (i.e. p less than 0.05); this means that
the variables are correlated highly enough to provide a reasonable
basis for factor analysis.
• Correlation matrix is significantly different from an identity matrix
in which correlations between variables are all zero.
• Test compares an observed correlation matrix to the identity matrix.
Null hypothesis of the test –variables are orthogonal, i.e. not correlated.
Alternative hypothesis - variables are not orthogonal, i.e. they are
correlated enough - correlation matrix diverges significantly from
identity matrix.
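A matching sketch for Bartlett's test with the same package (simulated data; a small p-value rejects the identity-matrix null):

```python
# Illustrative sketch of Bartlett's test of sphericity using factor_analyzer.
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity

rng = np.random.default_rng(6)
items = pd.DataFrame(rng.normal(size=(200, 6)),
                     columns=[f"item_{i}" for i in range(1, 7)])

chi_square, p_value = calculate_bartlett_sphericity(items)
print(f"chi-square = {chi_square:.1f}, p = {p_value:.4f}")   # p < 0.05 supports FA
```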
Bartlett Test
[Figure: two illustrative correlation patterns – Case-1 with Variables 1–4 and Case-2 with Variables 1–7]
How many factors should be extracted?
There are some criteria, but no 100% foolproof statistical test exists.
• Drawing a screeplot: Connect the eigenvalues (representing the variance
explained by each factor) for all possible factors from maximum to
minimum. The adequate number of factors is the number before the sudden
downward inflection of the plot.
• Parallel analysis: Compare actual screeplot with the possible
screeplot based on randomly resampled data. The adequate number of
factors is at the crossing point of the two plots.
• Eigenvalues > 1: Eigenvalues sum to the number of items, so a factor with an
eigenvalue greater than 1 is more informative than a single average item.
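A sketch of the scree plot and the eigenvalue > 1 rule, again assuming the `factor_analyzer` package and simulated items:

```python
# Illustrative sketch: scree plot plus the Kaiser (eigenvalue > 1) criterion.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(7)
items = pd.DataFrame(rng.normal(size=(200, 8)),
                     columns=[f"item_{i}" for i in range(1, 9)])

fa = FactorAnalyzer(rotation=None)
fa.fit(items)
eigenvalues, _ = fa.get_eigenvalues()

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, "o-")
plt.axhline(1.0, linestyle="--")            # Kaiser criterion line
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
print("Factors with eigenvalue > 1:", int((eigenvalues > 1).sum()))
```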
Factor rotation
• Unrotated factors often do not provide the
required information.
• The axes of the factors can be rotated within
the multidimensional variable space while
determining the best fit between the
variables and the latent factors.
• Factor rotation improves interpretation of
the data by reducing ambiguities.
• In rotation, the reference axes of the factors are
turned about the origin until some best position
has been achieved.
• The unrotated factor solution extracts factors
in order of the variance they extract.
• The effect of rotation is to redistribute the variance
from earlier factors to later ones.
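A hedged sketch contrasting an orthogonal (varimax) and an oblique (promax) rotation with `factor_analyzer`, on simulated items:

```python
# Illustrative sketch: extract three factors and compare loadings under an
# orthogonal (varimax) and an oblique (promax) rotation.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(8)
items = pd.DataFrame(rng.normal(size=(300, 9)),
                     columns=[f"item_{i}" for i in range(1, 10)])

for rotation in ("varimax", "promax"):
    fa = FactorAnalyzer(n_factors=3, rotation=rotation)
    fa.fit(items)
    loadings = pd.DataFrame(fa.loadings_, index=items.columns,
                            columns=["F1", "F2", "F3"])
    print(f"\n{rotation} loadings:\n{loadings.round(2)}")
```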
Orthogonal and oblique factor rotation
• Orthogonal factor rotation
- Axes are maintained at 90 degrees.
- More suitable when the research goal is data
reduction.
- Rotations that assume the factors are not
correlated are called orthogonal rotations.
• Oblique factor rotation
- Axes are rotated, but they don’t retain the
90-degree angle between reference axes.
- Oblique is more flexible.
- Best suited to the goal of obtaining
several theoretically meaningful factors.
- Rotations that allow for correlation are
called oblique rotations.
Correlation between factors
• A person with high individual socioeconomic status (Factor
1) also tends to live in an area that has a high
neighborhood socioeconomic status (Factor
2).
• That means the factors should be correlated.
• The two axes of the two factors are therefore probably
closer together than under an orthogonal rotation.
• The angle between the two factors is now
smaller than 90 degrees,
• meaning the factors are now correlated.
• In this example, an oblique rotation
accommodates the data better than an
orthogonal rotation.
Sample Example
 A researcher is
interested in investigating
the reasons for choosing
a university. Several
variables were identified
that influence
individuals
(guardians/students) when
choosing a university.
• Variables-
1. Cost of education
2. Quality of education
3. Availability of experts and modern
laboratories
4. Having its own campus and security
5. Number of years operating
6. Number of graduates in jobs and abroad
7. International recognition, and
8. Accommodation and food
Seven-point scale; 1 = Not Important to 7 = Very
Important
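A hedged end-to-end sketch for this example, assuming the `factor_analyzer` package and simulated 7-point responses standing in for real survey data (variable names abbreviated from the list above):

```python
# Illustrative end-to-end sketch for the university-choice example.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                              calculate_kmo)

cols = ["cost", "quality", "experts_labs", "campus_security",
        "years_operating", "graduate_outcomes", "intl_recognition",
        "accommodation_food"]
rng = np.random.default_rng(9)
df = pd.DataFrame(rng.integers(1, 8, size=(250, 8)), columns=cols)

# 1. Suitability checks (KMO and Bartlett)
_, kmo_total = calculate_kmo(df)
chi_square, p_value = calculate_bartlett_sphericity(df)
print(f"KMO = {kmo_total:.2f}, Bartlett p = {p_value:.4f}")

# 2. Choose the number of factors with the eigenvalue > 1 rule
fa_unrotated = FactorAnalyzer(rotation=None)
fa_unrotated.fit(df)
eigenvalues, _ = fa_unrotated.get_eigenvalues()
n_factors = max(int((eigenvalues > 1).sum()), 1)

# 3. Extract and rotate, then inspect loadings and communalities
fa = FactorAnalyzer(n_factors=n_factors, rotation="varimax")
fa.fit(df)
print(pd.DataFrame(fa.loadings_, index=cols).round(2))
print(pd.Series(fa.get_communalities(), index=cols).round(2))
```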