
Resource type: this resource contains a lecture or lecture notes.

Exploratory factor analysis lecture notes

About these notes

These lecture notes:

  1. were converted from .odp slides to mediawiki syntax,
  2. have subsequently been copyedited and further wikified (ongoing), and
  3. do not yet include all:
    1. images
    2. additional notes (i.e., they currently provide slide text only)
    3. notes from updated slides


What is factor analysis?

  1. A multivariate statistical technique for identifying clusters of inter-correlated variables (or 'factors').
  2. A family of techniques to examine linear correlations amongst variables.
  3. Aim is to identify groups of variables which are relatively homogeneous.
  4. Groups of related variables are called 'factors'.
  5. Involves empirical testing of theoretical data structures.

Purposes

There are two main applications of factor analytic techniques, namely to:

  1. reduce the number of variables, and
  2. detect structure in the relationships between variables, that is, to classify variables.

Factor analysis is commonly used in psychometric instrument development.

History

  1. Invented by Spearman (1904).
  2. Usage was initially hampered by the onerousness of hand calculation.
  3. Since the advent of computers, usage has thrived, especially to develop:
    1. theory, e.g., determining the structure of personality
    2. practice, e.g., the development of 10,000s+ of psychological screening & measurement tests
  1. "Introduced and established by Pearson in 1901 and Spearman three years thereafter, factor analysis is a process by which large clusters and grouping of data are replaced and represented by factors in the equation. As variables are reduced to factors, relationships between the factors begin to define the relationships in the variables they represent (Goldberg & Digman, 1994). In the early stages of the process' development, there was little widespread use due largely in part to the immense amount of hand calculations required to determine accurate results, often spanning periods of several months. Later on a mathematical foundation would be developed aiding in the process and contributing to the later popularity of the methodology. In present day, the power of super computers makes the use of factor analysis a simplistic process compared to the 1900's when only the devoted researchers could use it to accurately attain results (Goldberg & Digman, 1994)." (from http://www.personalityresearch.org/papers/fehringer.html)
  2. Terminology was introduced by Thurstone (1931) - http://www.statsoft.com/textbook/stfacan.html

Conceptual models

Here are some visual ways of conceptualising factor analytic models:

2-d

  1. A simple factor analytic model (2-d), e.g., 12 test items might actually tap only 3 underlying factors.

Hierarchical

  1. Figure 2.1 (DeCoster, 1998) (pdf)

Cluster

Factor analysis uses correlations among many items to search for common clusters. Exploratory factor analysis is a tool to help a researcher ‘throw a hoop’ around clusters of related items, to distinguish between clusters, and to identify and eliminate irrelevant or indistinct (overlapping) items.

3-d

  1. Figure 3 (Clemants & Moore, 2003) (gif/html).

Factor analysis process

  1. FA can be conceived of as a method for examining a matrix of correlations in search of clusters of highly correlated variables.
  2. A major purpose of factor analysis is data reduction, i.e., to reduce complexity in the data, by identifying underlying (latent) clusters of association.
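
For illustration, here is a minimal simulated example in Python (all names here are invented for the sketch): two latent factors generate three noisy items each, and the resulting correlation matrix shows the two clusters of inter-correlated variables that factor analysis searches for. Later sketches on this page reuse this `items` data frame.

```python
# A toy sketch: items built from the same latent factor form a visible
# cluster of high correlations; items from different factors do not.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 300
verbal = rng.normal(size=n)   # hypothetical latent factor 1
maths = rng.normal(size=n)    # hypothetical latent factor 2

items = pd.DataFrame(
    {f"verbal_{i}": verbal + rng.normal(scale=0.6, size=n) for i in range(3)}
    | {f"maths_{i}": maths + rng.normal(scale=0.6, size=n) for i in range(3)}
)

# Within-cluster correlations come out around .7; between-cluster near 0.
print(items.corr().round(2))
```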

Examples of psychological factor structures

Intelligence

IQ – does intelligence consist of separate factors, e.g.,

  1. Verbal
  2. Mathematical
  3. Interpersonal, etc.?

...or is it one global factor (g)?

...or is it hierarchically structured?

Personality

Personality – does it consist of 2, 3, 5, or 16, etc. factors, e.g., the “Big 5”?

  1. Neuroticism
  2. Extraversion
  3. Agreeableness
  4. Openness
  5. Conscientiousness

Essential facial features

Six orthogonal factors represent 76.5% of the total variability in facial recognition (in order of importance):

  1. upper-lip
  2. eyebrow-position
  3. nose-width
  4. eye-position
  5. eye/eyebrow-length
  6. face-width.

Problems

Problems with factor analysis include:

  1. Mathematically complicated
  2. Technical vocabulary
  3. Results usually absorb a dozen or so pages
  4. Students do not ordinarily learn factor analysis
  5. Lay people find the results incomprehensible

“The very power of FA to create apparent order from real chaos contributes to its somewhat tarnished reputation as a scientific tool” - Tabachnick & Fidell (2001)

“It is mathematically complicated and entails diverse and numerous considerations in application. Its technical vocabulary includes strange terms such as eigenvalues, rotate, simple structure, orthogonal, loadings, and communality. Its results usually absorb a dozen or so pages in a given report, leaving little room for a methodological introduction or explanation of terms. Add to this the fact that students do not ordinarily learn factor analysis in their formal training, and the sum is the major cost of factor analysis: most laymen, social scientists, and policy-makers find the nature and significance of the results incomprehensible.” Rummel - http://www.hawaii.edu/powerkills/UFA.HTM

Exploratory vs. Confirmatory Factor Analysis

EFA = Exploratory Factor Analysis

  1. explores & summarises underlying correlational structure for a data set

CFA = Confirmatory Factor Analysis

  1. tests the correlational structure of a data set against a hypothesised structure and rates the 'goodness of fit'

Data reduction

  1. Simplifies data by revealing a smaller number of underlying factors
  2. Helps to eliminate:
    1. redundant variables (e.g., items which are highly correlated are unnecessary)
    2. unclear variables (e.g., items which don’t load cleanly on a single factor)
    3. irrelevant variables (e.g., variables with low loadings)

Steps

  1. Test assumptions
  2. Select type of analysis
    1. Extraction (PC/PAF)
    2. Rotation (Orthogonal/Oblique)
  3. Determine no. of factors
  4. Identify which items belong in each factor
  5. Drop items as necessary and repeat steps 3 to 4
  6. Name and define factors
  7. Examine correlations amongst factors
  8. Analyse internal reliability

Garbage In - Garbage Out

Assumption testing

Sample size

Here are some guidelines about sample size for exploratory factor analysis. Factor analysis requires a reasonable sample size in order to be effective, and typically a larger sample than analyses such as multiple linear regression and ANOVA. Also note that factor analysis is based on correlations amongst the items, so a good estimate of each pair-wise correlation is needed (e.g., check the scatterplots).

Typical guidelines for factor analysis sample size requirements reported in the research method literature are:

  1. A total N > 200 is recommended. Comrey and Lee (1992) provide this guide to the adequacy of total sample sizes for factor analysis:
    1. 50 = very poor
    2. 100 = poor
    3. 200 = fair
    4. 300 = good
    5. 500 = very good
    6. 1000+ = excellent
  2. Minimum/ideal sample size based on the ratio of cases to variables:
    1. Minimum: N > 5 cases per variable (item),
      e.g., with 30 variables, there should be at least 150 cases (a 1:5 variable-to-case ratio)
    2. Ideal: N > 20 cases per variable,
      e.g., with 30 variables, there would ideally be at least 600 cases (1:20)

For more information, see EFA assumptions.
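
The arithmetic behind these guidelines is simple to script. A minimal sketch (the function name and example figures are made up for illustration):

```python
def efa_sample_size_check(n_cases: int, n_items: int) -> None:
    """Print how a design fares against common EFA sample-size guidelines."""
    print("Total N > 200:", n_cases > 200)
    print("Minimum 5 cases per variable:", n_cases >= 5 * n_items)
    print("Ideal 20 cases per variable:", n_cases >= 20 * n_items)

efa_sample_size_check(n_cases=300, n_items=30)  # meets the minimum, not the ideal
```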

Example factor analysis - Classroom behaviour

  1. Based on Francis Section 5.6, which is based on the Victorian Quality Schools Project (google search).
  2. 15 classroom behaviours of high-school children were rated by teachers using a 5-point scale.
  3. Task: Identify groups of variables (behaviours) that are strongly inter-related & represent underlying factors.

Items

  1. Cannot concentrate – can concentrate
  2. Curious & enquiring – little curiosity
  3. Perseveres – lacks perseverance
  4. Irritable – even-tempered
  5. Easily excited – not easily excited
  6. Patient – demanding
  7. Easily upset – contented
  8. Control – no control
  9. Relates warmly to others – provocative, disruptive
  10. Persistent – easily frustrated
  11. Difficult – easy
  12. Restless – relaxed
  13. Lively – settled
  14. Purposeful – aimless
  15. Cooperative – disputes

Assumption testing

LOM (level of measurement)

  1. All variables must be suitable for correlational analysis, i.e., they should be ratio/metric data or at least Likert data with several interval levels.

Normality

  1. FA is robust to violations of the assumption of normality (although if the variables are normally distributed, the solution is enhanced).

Linearity

  1. Because FA is based on correlations between variables, it is important to check there are linear relations amongst the variables (i.e., check scatterplots)

Outliers

  1. FA is sensitive to outlying cases:
    1. bivariate outliers (e.g., check scatterplots)
    2. multivariate outliers (e.g., Mahalanobis' distance)
  2. Identify outliers, then remove or transform them.

Factorability

  1. It is important to check the factorability of the correlation matrix (i.e., how suitable is the data for factor analysis?)
  2. Check the correlation matrix for correlations over .3
  3. Check the anti-image matrix for diagonals over .5
  4. Check the measures of sampling adequacy (MSAs):
    1. Bartlett's test of sphericity
    2. Kaiser-Meyer-Olkin (KMO)

Correlations

  1. The most manual and time-consuming, but thorough and accurate, way to examine the factorability of a correlation matrix is simply to examine each correlation in the correlation matrix.
  2. Take note of whether there are SOME correlations over .30; if not, reconsider doing an FA (remember: garbage in, garbage out).

Anti-image correlation matrix

  1. Anti-image: medium effort, reasonably accurate.
  2. Examine the diagonals of the anti-image correlation matrix to assess the sampling adequacy of each variable.
  3. Variables with diagonal anti-image correlations of less than .5 should be excluded from the analysis; they lack sufficient correlation with the other variables.

Measures of sampling adequacy

  1. Quickest method, but least reliable.
  2. Global diagnostic indicators: the correlation matrix is factorable if:
    1. Bartlett's test of sphericity is significant, and/or
    2. the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is > .5

Summary

  1. Are there several correlations > .3?
  2. Are the anti-image matrix diagonals > .5?
  3. Is Bartlett's test significant?
  4. Is the KMO > .5 to .6? (depends on whose rule of thumb is used)
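
For illustration, here is a minimal Python sketch of these checks, assuming the items are columns of a pandas DataFrame (e.g., the toy `items` from the earlier sketch). The MSA and KMO calculations follow the standard partial-correlation definitions; treat this as a sketch for learning, not a replacement for SPSS output.

```python
import numpy as np
import pandas as pd
from scipy import stats

def factorability_checks(items: pd.DataFrame) -> None:
    R = items.corr().to_numpy()              # item correlation matrix
    n, p = len(items), R.shape[0]

    # 1. Are there several correlations over .3?
    upper = R[np.triu_indices(p, k=1)]
    print(f"Correlations > .3: {np.sum(np.abs(upper) > .3)} of {upper.size}")

    # 2. Bartlett's test of sphericity:
    #    chi2 = -[(n - 1) - (2p + 5)/6] * ln|R|, with p(p - 1)/2 df
    chi2 = -((n - 1) - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    print(f"Bartlett chi2 = {chi2:.1f}, p = {stats.chi2.sf(chi2, df):.4f}")

    # 3. Per-item MSAs (the anti-image correlation diagonals) and the
    #    overall KMO, built from partial correlations via the inverse of R.
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)        # partial correlation matrix
    np.fill_diagonal(partial, 0)
    R0 = R.copy()
    np.fill_diagonal(R0, 0)
    r2 = np.sum(R0 ** 2, axis=0)             # squared zero-order correlations
    pr2 = np.sum(partial ** 2, axis=0)       # squared partial correlations
    msa = r2 / (r2 + pr2)                    # per-item sampling adequacy
    kmo = r2.sum() / (r2.sum() + pr2.sum())  # overall KMO
    print(f"KMO = {kmo:.2f}; items with MSA < .5: {np.sum(msa < .5)}")

factorability_checks(items)   # e.g., the toy `items` from earlier
```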

Extraction method

There are two main approaches to EFA, based on:

  1. analysing only shared variance: Principal Axis Factoring (PAF)
  2. analysing all variance: Principal Components (PC)

Principal components (PC)

  1. More common
  2. More practical
  3. Used to reduce data to a set of factor scores for use in other analyses
  4. Analyses all the variance in each variable

Principal axis factoring (PAF)

  1. Used to uncover the structure of an underlying set of p original variables
  2. More theoretical
  3. Analyses only shared variance (i.e., leaves out unique variance)

Total variance explained

  1. Often there is little difference between the solutions from the two procedures.
  2. It's a good idea to check your solution using both techniques (see the sketch below).
  3. If you get different solutions from the two methods, try to work out why, and decide which solution is more appropriate.
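
As a rough illustration of this comparison, here is a sketch using scikit-learn and the toy `items` from earlier. One caveat: scikit-learn's FactorAnalysis fits the common-factor model by maximum likelihood rather than by principal-axis factoring, but like PAF it analyses only shared variance, whereas PCA analyses all variance.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(items)    # standardise the toy items

pc = PCA(n_components=2).fit(X)              # analyses all variance
fa = FactorAnalysis(n_components=2).fit(X)   # shared variance only (ML, not PAF)

# Loadings: correlations between the items and the components/factors.
pc_loadings = pc.components_.T * np.sqrt(pc.explained_variance_)
fa_loadings = fa.components_.T

print("PC loadings:\n", pc_loadings.round(2))
print("FA loadings:\n", fa_loadings.round(2))
print("PC % variance:", (100 * pc.explained_variance_ratio_).round(1))
```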

Communalities

  1. The proportion of variance in each variable which can be explained by the factors
  2. Communality for a variable = sum of the squared loadings for the variable on each of the factors
  3. Communalities range between 0 and 1
  4. High communalities (> .5) show that the factors extracted explain most of the variance in the variables being analysed
  5. Low communalities (< .5) mean there is considerable variance unexplained by the factors extracted
  6. May then need to extract MORE factors to explain the variance
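
In code, communalities are simply the row sums of squared loadings. A minimal sketch, assuming `fa_loadings` is the items x factors loading matrix from the extraction sketch above:

```python
import numpy as np

communalities = np.sum(fa_loadings ** 2, axis=1)   # h-squared for each item
print("Communalities:", communalities.round(2))
print("Items with low communality (< .5):", np.where(communalities < .5)[0])
```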

Eigenvalues

  1. EV = sum of squared loadings (item-factor correlations) for each factor
  2. EV = overall strength of the relationship between a factor and the variables
  3. Successive EVs have lower values
  4. Eigenvalues over 1 are considered 'stable' (Kaiser's criterion)
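
A minimal sketch of these calculations, reusing the toy `items` from earlier:

```python
import numpy as np

R = items.corr().to_numpy()
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first
print("Eigenvalues:", eigenvalues.round(2))
print("% variance per factor:", (100 * eigenvalues / len(R)).round(1))
print("Kaiser's criterion retains:", int(np.sum(eigenvalues > 1)), "factors")
```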

Explained variance

  1. A good factor solution is one that explains the most variance with the fewest factors
  2. Realistically, one can be happy with 50-75% of the variance explained.

How many factors?

Determining the number of factors is a subjective process. Seek to explain the maximum variance using the fewest factors, considering:

  1. Theory: what is predicted/expected?
  2. Eigenvalues > 1? (Kaiser's criterion)
  3. Scree plot: where does it drop off?
  4. Interpretability of the last factor?
  5. Try several different solutions.
  6. Factors must be able to be meaningfully interpreted and make theoretical sense.

Also:

  1. Aim for 50-75% of variance explained, with 1/4 to 1/3 as many factors as variables/items.
  2. Stop extracting factors when they no longer represent useful/meaningful clusters of variables.
  3. Keep checking/clarifying the meaning of each factor and its items.

Scree plot

  1. A graph of the eigenvalues (see the sketch below).
  2. Depicts the amount of variance explained by each factor.
  3. Look for the point where additional factors fail to add appreciably to the cumulative explained variance.
  4. The 1st factor explains the most variance.
  5. The last factor explains the least variance.
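
A minimal matplotlib sketch of a scree plot, using `eigenvalues` from the previous sketch:

```python
import matplotlib.pyplot as plt

factors = range(1, len(eigenvalues) + 1)
plt.plot(factors, eigenvalues, "o-")
plt.axhline(1, linestyle="--", color="grey")  # Kaiser criterion (EV = 1)
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```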

Initial solution - Unrotated factor structure

  1. Factor loadings (FLs) indicate the relative importance of each item to each factor.
  2. In the initial solution, each factor tries “selfishly” to grab maximum unexplained variance, so all variables tend to load strongly on the 1st factor.
  3. Factors are made up of linear combinations of the variables (max. possible sum of squared rs for each variable).

1st factor extracted:

  1. best possible line of best fit through the original variables
  2. seeks to explain maximum overall variance
  3. a single summary of the main variance in set of items
  4. Each subsequent factor tries to maximise the amount of unexplained variance which it can explain.

Second factor extracted:

  1. orthogonal to the first factor; seeks to maximise its own eigenvalue (i.e., tries to gobble up as much of the remaining unexplained variance as possible)
  2. vectors = lines of best fit

Note that:

  1. A simple unrotated factor structure is seldom seen.
  2. Many variables will load on 2 or more factors.
  3. Some variables may not load highly on any factor.
  4. Until the FLs are rotated, they are difficult to interpret.
  5. Rotation of the FL matrix helps to find a more interpretable factor structure.

Rotation

Two basic types

  1. Orthogonal (Varimax): minimises factor covariation; produces factors which are uncorrelated.
  2. Oblique (Oblimin): allows factors to covary, i.e., allows correlations between factors.
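
For illustration, here is a compact implementation of Kaiser's varimax algorithm (the standard SVD-based formulation), which could be applied to a loading matrix such as `fa_loadings` from the extraction sketch. Recent versions of scikit-learn can also do this via FactorAnalysis(rotation='varimax').

```python
import numpy as np

def varimax(loadings: np.ndarray, max_iter: int = 100, tol: float = 1e-6):
    """Varimax-rotate an items x factors loading matrix (Kaiser's method)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion, solved via SVD.
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):   # converged: criterion stopped growing
            break
        criterion = s.sum()
    return loadings @ rotation

rotated_loadings = varimax(fa_loadings)
```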

Why rotate a factor loading matrix?

  1. After rotation, the vectors (lines of best fit) are rearranged to pass optimally through clusters of shared variance.
  2. The FLs, and the factors they represent, can then be more readily interpreted.
  3. A rotated factor structure is simpler and more easily interpretable:
    1. each variable loads strongly on only one factor
    2. each factor shows at least 3 strong loadings
    3. all loadings are either strong or weak, with no intermediate loadings

Orthogonal versus oblique rotations

  1. Think about purpose of factor analysis
  2. Try both
  3. Consider interpretability
  4. Look at the correlations between factors in the oblique solution; if > .32, go with the oblique rotation (> 10% shared variance between factors).

Factor structure

Factor structure is most interpretable when:

  1. Each variable loads strongly on only one factor
  2. Each factor has three or more strong loadings
  3. Most factor loadings are either high or low with few of intermediate value

(Loadings of > +.40 are generally OK)

How do I eliminate items?

Eliminating items from an EFA is a subjective process, but consider:

  1. Communalities (each ideally > .5)
  2. Size of main loading (bare min > |.4|, preferably > |.5|, ideally > |.6|)
  3. Meaning of item (face validity)
  4. Contribution it makes to the factor (i.e., is a better measure of the latent factor achieved by including or not including this item?)
  5. Number of items already in the factor (i.e., if there are already many items (e.g., > 6) in the factor, then the researcher can be more selective about which ones to include and which ones to drop)
  6. Eliminate 1 variable at a time, then re-run, before deciding which/if any items to eliminate next
  7. Size of cross-loadings (ideally no more than ~ |.3|)

How many items per factor?

  1. Bare min. = 2
  2. Recommended min. = 3
  3. Max. = unlimited
  4. More items
    1. -> greater reliability
    2. -> more 'roundedness'
    3. -> Law of diminishing returns
  5. Typically = 4 to 10 is reasonable

Interpretability

  1. The researcher must be able to understand and interpret a factor if it is going to be extracted.
  2. Be guided by theory and common sense in selecting factor structure.
  3. However, watch out for 'seeing what you want to see' when factor analysis evidence might suggest a different solution.
  4. There may be more than one good solution! e.g.,
    1. 2 factor model of personality
    2. 5 factor model of personality
    3. 16 factor model of personality

Factor loadings & item selection

A factor structure is most interpretable when:

  1. Each variable loads strongly on only one factor (strong is > +.40).
  2. Each factor shows 3 or more strong loadings, more = greater reliability.
  3. Most loadings are either high or low, few intermediate values.
  4. These elements give a 'simple' factor structure.

Factor loading guidelines (Comrey & Lee, 1992)

Loadings:

  1. > .70 - excellent
  2. > .63 - very good
  3. > .55 - good
  4. > .45 - fair
  5. > .32 - poor
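
These cut-offs are easy to encode; a minimal sketch (the function name is made up):

```python
def loading_label(loading: float) -> str:
    """Comrey & Lee (1992) verbal label for a factor loading."""
    bands = [(.70, "excellent"), (.63, "very good"), (.55, "good"),
             (.45, "fair"), (.32, "poor")]
    for cutoff, label in bands:
        if abs(loading) > cutoff:
            return label
    return "negligible"

print(loading_label(.66))   # -> "very good"
```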

Example - Condom use

  1. The Condom Use Self-Efficacy Scale (CUSES) was administered to 447 multicultural college students.
  2. A principal components FA with varimax rotation was conducted.
  3. Three distinct factors were extracted:
    1. 'Appropriation'
    2. 'Sexually Transmitted Diseases'
    3. 'Partners' Disapproval'
  4. Barkley, T. W. Jr., & Burns, J. L. (2000). Factor analysis of the Condom Use Self-Efficacy Scale among multicultural college students. Health Education Research, 15(4), 485-489.
  5. Condom Use Self-Efficacy Scale (CUSES)

Summary

  1. Factor analysis is a family of multivariate correlational data analysis methods for summarising clusters of covariance.
  2. FA analyses and summarises correlations amongst items
  3. These common clusters (the factors) can be used as summary indicators of the underlying construct

Assumptions

  1. 5+ cases per variable (ideally 20+ per variable)
  2. N > 200
  3. Check for outliers
  4. Factorability of the correlation matrix
  5. Normality enhances the solution

Steps

  1. Communalities
  2. Eigen Values & % variance
  3. Scree Plot
  4. Number of factors extracted
  5. Rotated factor loadings
  6. Theoretical underpinning

Type of FA

  1. PC vs. PAF
  2. PC for data reduction e.g., computing composite scores for another analysis (uses all variance)
  3. PAF for theoretical data exploration (uses shared variance)
  4. Choose technique depending on the goal of your analysis.

Rotation

  1. Orthogonal rotation: perpendicular vectors.
  2. Oblique rotation: angled vectors.
  3. Try both ways: are the solutions different?

Factor analysis in practice

To find a good solution, most researchers try out each combination of:

  1. PC-varimax
  2. PC-oblimin
  3. PAF-varimax
  4. PAF-oblimin

These methods would then commonly be tried with a range of plausible numbers of factors, e.g., 2, 3, 4, 5, 6, and 7 factors.

  1. Try different numbers of factors.
  2. Try orthogonal & oblimin solutions.
  3. Try eliminating poor items.
  4. Conduct reliability analysis (e.g., Cronbach's alpha; see the sketch below).
  5. Check the factor structure across sub-groups if there is sufficient data.
  6. You will probably come up with a different solution from someone else!
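
For the reliability-analysis step, here is a minimal sketch of Cronbach's alpha, assuming `factor_items` is a hypothetical DataFrame containing just the items that load on one factor:

```python
import pandas as pd

def cronbachs_alpha(factor_items: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = factor_items.shape[1]
    item_variances = factor_items.var(axis=0, ddof=1).sum()
    total_variance = factor_items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)
```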

No. of factors to extract?

  1. Inspect EVs - look for > 1
  2. % of variance explained
  3. Inspect scree plot
  4. Communalities
  5. Interpretability
  6. Theoretical reason

See also