Exploratory factor analysis

This page summarises key points about the use of exploratory factor analysis particularly for the purposes of psychometric instrument development. For a hands-on tutorial about the steps involved, see EFA tutorial.

Assumed knowledge

Purposes of factor analysis

There are two main purposes or applications of factor analysis:

1. Data reduction

Reduce data to a smaller set of underlying summary variables. For example, psychological questionnaires often aim to measure several psychological constructs, with each construct being measured by responses to several items. Responses to several related items are combined to create a single score for the construct. A measure based on several related items is generally considered to be more reliable and valid than one based on a single item. (A brief code sketch of creating such a composite score appears after this list.)

2. Exploring theoretical structure

Theoretical questions about the underlying structure of psychological phenomena can be explored and empirically tested using factor analysis. For example, is intelligence better understood as a single, general factor, or as consisting of multiple, independent dimensions? Or, how many personality factors are there and what are they?
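
As a minimal sketch of the data reduction idea, the following Python snippet creates a unit-weighted composite score from a few hypothetical questionnaire items (the item names and values are invented for illustration; pandas is assumed to be available):

  import pandas as pd

  # Hypothetical 5-point responses to three items assumed to measure one construct
  responses = pd.DataFrame({
      "item1": [4, 5, 3, 4],
      "item2": [5, 4, 3, 5],
      "item3": [4, 4, 2, 5],
  })

  # Unit-weighted composite: the mean of the items belonging to the factor
  responses["construct_score"] = responses[["item1", "item2", "item3"]].mean(axis=1)
  print(responses)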

History

Factor analysis was initially developed by Charles Spearman in 1904. For more information, see factor analysis history.

Assumptions

There are several requirements for a dataset to be suitable for factor analysis:

  1. Normality: Statistical inference is improved if the variables are multivariate normal[1]
  2. Linear relations between variables - Test by visually examining all or at least some of the bivariate scatterplots:
    1. Is the relationship linear?
    2. Are there bivariate outliers?
    3. Is the spread about the line of best fit homoscedastic (even or cigar-shaped, as opposed to fanning in or out)?
    4. If there are a large number of variables (and bivariate scatterplots), then consider using Matrix Scatterplots to efficiently visualise relations amongst the sets of variables within each factor (e.g., a Matrix Scatterplot for the variables which belong to Factor 1, and another Matrix Scatterplot for the variables which belong to Factor 2 etc.)
  3. Factorability is the assumption that there are at least some correlations amongst the variables, so that coherent factors can be identified. Basically, there should be some degree of collinearity among the variables, but not an extreme degree (singularity). Factorability can be examined via any of the following (a code sketch of these checks appears after this list):
    1. Inter-item correlations (correlation matrix) - are there at least several small to moderate correlations (e.g., r > .3)?
    2. Anti-image correlation matrix diagonals - they should be > ~.5.
    3. Measures of sampling adequacy (MSAs):
      • Kaiser-Meyer-Olkin (KMO) (should be > ~.5 or .6)[2] and
      • Bartlett's test of sphericity (should be significant)
  4. Sample size: The sample size should be large enough to yield reliable estimates of correlations among the variables:
    1. Ideally, there should be a large ratio of N / k (Cases / Items) e.g., > ~20:1
      1. e.g., if there are 20 items in the survey, ideally there would be at least 400 cases
    2. EFA can still be reasonably done with a ratio of > ~5:1
    3. The bare minimum, for pilot study purposes only, is as low as ~3:1.
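
These checks can be run in most statistics packages; as a rough sketch, the following Python snippet illustrates them using the third-party factor_analyzer package, pandas, matplotlib, and a hypothetical file of item responses named survey_items.csv (all assumptions, not part of the original page; SPSS reports the equivalent statistics in its Factor procedure):

  import pandas as pd
  import matplotlib.pyplot as plt
  from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

  df = pd.read_csv("survey_items.csv")   # hypothetical file containing item responses only

  # 1. Linearity: eyeball the bivariate relations with a scatterplot matrix
  pd.plotting.scatter_matrix(df)
  plt.show()

  # 2. Inter-item correlations: look for at least several correlations > .3
  print(df.corr().round(2))

  # 3. Bartlett's test of sphericity: should be significant (p < .05)
  chi_square, p_value = calculate_bartlett_sphericity(df)
  print(f"Bartlett's test: chi2 = {chi_square:.2f}, p = {p_value:.4f}")

  # 4. Kaiser-Meyer-Olkin measure of sampling adequacy: overall KMO should be > ~.5 to .6
  kmo_per_item, kmo_overall = calculate_kmo(df)
  print(f"Overall KMO = {kmo_overall:.2f}")

  # 5. Sample size: cases-to-items ratio (ideally ~20:1, workable at ~5:1)
  print(f"N/k ratio = {len(df) / df.shape[1]:.1f}:1")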

For more information, see these lecture notes.

Types (methods of extraction)

The researcher will need to choose between two main types of extraction (a code sketch contrasting them follows this list):

  1. Principal components (PC): Analyses all variance in the items. This method is usually preferred when the goal is data reduction (i.e., to reduce a set of variables down to a smaller number of factors and to create composite scores for these factors for use in subsequent analysis).
  2. Principal axis factoring (PAF): Analyses shared variance amongst the items. This method is usually preferred when the goal is to undertake theoretical exploration of the underlying factor structure.
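
As a rough illustration of the distinction, the sketch below contrasts the two approaches in Python. It assumes scikit-learn for the principal components solution and the third-party factor_analyzer package (whose method="principal" option provides a principal-factor solution approximating PAF); the data file and the choice of three factors are hypothetical. Items are assumed to share a common response scale; otherwise, standardise them first.

  import pandas as pd
  from factor_analyzer import FactorAnalyzer
  from sklearn.decomposition import PCA

  df = pd.read_csv("survey_items.csv")   # hypothetical item responses

  # Principal components: analyses the total variance in the items (data reduction)
  pca = PCA(n_components=3)
  pca.fit(df)
  print("PCA variance explained:", pca.explained_variance_ratio_.round(3))

  # Principal axis factoring: analyses only the shared (common) variance
  paf = FactorAnalyzer(n_factors=3, method="principal", rotation=None)
  paf.fit(df)
  print(pd.DataFrame(paf.loadings_, index=df.columns).round(2))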

Rotation

The researcher will need to choose between two main types of factor matrix rotation:

  1. Orthogonal (Varimax - in SPSS): Factors are independent (i.e., correlations between factors are less than ~.3)
  2. Oblique (Oblimin - in SPSS): Factors are related (i.e., at least some correlations between factors are greater than ~.3). The extent of correlation between factors can be controlled using delta[3].
    • Negative values "decrease" factor correlations (towards full orthogonality)
    • "0" is the default
    • Positive values (do not go above .8) "permit" higher factor correlations.

If the researcher hypothesises uncorrelated factors, then use orthogonal rotation. If the researcher hypothesises correlated factors, then use oblique rotation.

In practice, researchers will usually try different types of rotation and then decide on the best form based on which rotation produces the "cleanest" model (i.e., the one with the lowest cross-loadings).
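
The comparison can be sketched in Python as follows, again assuming the third-party factor_analyzer package and a hypothetical data file (the rotation argument corresponds to SPSS's Varimax and Oblimin options; the phi_ attribute holds the factor correlation matrix for an oblique solution):

  import pandas as pd
  from factor_analyzer import FactorAnalyzer

  df = pd.read_csv("survey_items.csv")   # hypothetical item responses

  # Orthogonal rotation (Varimax): factors constrained to be uncorrelated
  fa_varimax = FactorAnalyzer(n_factors=3, rotation="varimax")
  fa_varimax.fit(df)

  # Oblique rotation (Oblimin): factors allowed to correlate
  fa_oblimin = FactorAnalyzer(n_factors=3, rotation="oblimin")
  fa_oblimin.fit(df)

  # Factor correlations from the oblique solution: if they are all small (< ~.3),
  # the simpler orthogonal solution may be adequate
  print(pd.DataFrame(fa_oblimin.phi_).round(2))

  # Compare the loading matrices to judge which rotation gives the cleaner structure
  print(pd.DataFrame(fa_varimax.loadings_, index=df.columns).round(2))
  print(pd.DataFrame(fa_oblimin.loadings_, index=df.columns).round(2))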

Determining the number of factors

There is no definitive, simple way to determine the number of factors. The number of factors is a subjective decision made by the researcher. The researcher should be guided by several considerations, including:

  1. Theory: e.g., How many factors were expected? Do the extracted factors make theoretical sense?
  2. Eigenvalues:
    1. Kaiser's criterion: How many factors have eigenvalues over 1? Note, however, that this cut-off is arbitrary, so it is only a general guide and other considerations are also important.
    2. Scree plot: Plots the eigenvalues in descending order. Look for the 'elbow', where there is a notable drop and the plot levels off; the factors before the elbow form the 'cliff' and explain most of the variance, while the rest is 'scree'. Extract the number of factors that make up the cliff. (A code sketch of the eigenvalue-based checks appears after this list.)
    3. Total variance explained: Ideally, try to explain approximately 50 to 75% of the variance using the least number of factors
  3. Interpretability: Are all factors interpretable? (especially the last one?) In other words, can you reasonably name and describe each set of items as being indicative of an underlying factor?
  4. Alternative models: Try several different models with different numbers of factors before deciding on a final model and number of factors. Depending on the eigenvalues and the scree plot, examine, say, 2-, 3-, 4-, 5-, 6- and 7-factor models before deciding.
  5. Remove items that don't belong: Having decided on the number of factors, items which don't seem to belong should be removed, because this can change and clarify the structure/number of factors. Remove items one at a time and then re-run. After removing all items which don't seem to belong, re-check whether you still have a clear factor structure for the targeted number of factors. It may be that a different number of factors (probably one or two fewer) is now more appropriate. For more information, see criteria for selecting items.
  6. Number of items per factor: The more items per factor, the greater the reliability of the factor, although the law of diminishing returns applies. Nevertheless, a factor could, in theory, be indicated by a single item.
  7. Factor correlations - What are the correlations between the factors? If they are too high (e.g., over ~.7), then some of the factors may be too similar (and therefore redundant). Consider merging the two related factors (i.e., run an EFA with one fewer factor).
  8. Check the factor structure across sub-samples - For example, is the factor structure consistent for males and females? (e.g., in SPSS this can be done via Data - Split file - Compare Groups or Organise Output by Groups - Select a categorical variable to split the analyses by (e.g., Gender) - Paste/Run or OK - Then re-run the EFA syntax)
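
Several of the eigenvalue-based considerations above can be sketched in Python, assuming the third-party factor_analyzer package, matplotlib, and a hypothetical data file (the three-factor candidate model is also just an example):

  import pandas as pd
  import matplotlib.pyplot as plt
  from factor_analyzer import FactorAnalyzer

  df = pd.read_csv("survey_items.csv")   # hypothetical item responses

  # Fit an unrotated solution simply to obtain the eigenvalues
  fa = FactorAnalyzer(rotation=None)
  fa.fit(df)
  eigenvalues, _ = fa.get_eigenvalues()

  # Kaiser's criterion: count eigenvalues greater than 1 (a rough guide only)
  print("Factors with eigenvalue > 1:", sum(eigenvalues > 1))

  # Scree plot: look for the 'elbow' where the curve flattens into scree
  plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
  plt.axhline(1, linestyle="--")
  plt.xlabel("Factor number")
  plt.ylabel("Eigenvalue")
  plt.title("Scree plot")
  plt.show()

  # Total variance explained by a candidate model (e.g., 3 factors):
  # aim for roughly 50 to 75% with as few factors as possible
  fa3 = FactorAnalyzer(n_factors=3, rotation="oblimin")
  fa3.fit(df)
  variance, proportion, cumulative = fa3.get_factor_variance()
  print("Cumulative variance explained:", cumulative.round(3))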

Mistakes in factor extraction may consist of extracting too few or too many factors. A comprehensive review of the state of the art and a proposal of criteria for choosing the number of factors are presented in [3][4].

Criteria for selecting items

In general, aim for a simple factor structure (unless there is a particular reason why a complex structure would be preferable). In a simple factor structure each item has a relatively strong loading on one factor (target loading; e.g., > |.5|) and relatively small loadings on other factors (cross-loadings; e.g., < |.3|).

Consider the following criteria to help decide whether to include or remove each item. Remember that these are rules of thumb only – avoid over-reliance on any single indicator. The overarching goal is to include items which contribute to a meaningful measure of an underlying factor and to remove items that weaken measurement of the underlying factor(s). In making these decisions, consider:

  1. Communality - indicates the variance in each item explained by the extracted factors; ideally, above .5 for each item.
  2. Primary (target) factor loading - indicates how strongly each item loads on each factor; should generally be above |.5| for each item; preferably above |.6|.
  3. Cross-loadings - indicate how strongly each item loads on the other (non-target) factors. There should be a gap of at least ~.2 between the primary target loadings and each of the cross-loadings. Cross-loadings above .3 are worrisome.
  4. Meaningful and useful contribution to a factor - read the wording of each item and consider the extent to which it appears to make a meaningful and useful (non-redundant) contribution to the underlying target factor (i.e., assess its face validity).
  5. Reliability - check the internal consistency of the items included for each factor using Cronbach's alpha, and check the "Alpha if item removed" option to determine whether removal of any additional items would improve reliability (a code sketch of several of these checks appears after this list).
  6. See also: How do I eliminate items? (lecture notes)
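
As a rough sketch of how several of these criteria might be checked in Python (assuming the third-party factor_analyzer package, a hypothetical data file, a three-factor model, and an invented assignment of items to Factor 1; Cronbach's alpha is computed from its standard formula rather than from any particular package):

  import numpy as np
  import pandas as pd
  from factor_analyzer import FactorAnalyzer

  df = pd.read_csv("survey_items.csv")   # hypothetical item responses

  fa = FactorAnalyzer(n_factors=3, rotation="oblimin")
  fa.fit(df)

  # 1. Communalities: flag items below ~.5
  communalities = pd.Series(fa.get_communalities(), index=df.columns)
  print(communalities[communalities < .5].round(2))

  # 2 & 3. Primary loadings and cross-loadings: check that the primary loading is
  # > ~|.5| and that it exceeds the largest cross-loading by at least ~.2
  loadings = pd.DataFrame(np.abs(fa.loadings_), index=df.columns)
  primary = loadings.max(axis=1)
  next_best = loadings.apply(lambda row: row.nlargest(2).iloc[-1], axis=1)
  print(pd.DataFrame({"primary": primary, "gap": primary - next_best}).round(2))

  # 5. Cronbach's alpha for the items assigned to one factor
  def cronbach_alpha(items: pd.DataFrame) -> float:
      k = items.shape[1]
      item_vars = items.var(axis=0, ddof=1)
      total_var = items.sum(axis=1).var(ddof=1)
      return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

  factor1_items = ["item1", "item2", "item3"]   # hypothetical item-to-factor assignment
  print("alpha =", round(cronbach_alpha(df[factor1_items]), 2))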

Name and describe the factors

Once the number of factors has been decided and any items which don't belong have been removed, then:

  1. Give each extracted factor a name
    1. Be guided by the items with the highest primary loadings on the factor – what underlying factor do they represent?
    2. If unsure, emphasise the top loading items in naming the factor
  2. Describe each factor
    1. Develop a one sentence definition or description of each factor

Data analysis exercises

Pros and cons

Glossary

Basic terms
  1. Anti-image correlation matrix: Contains the negative partial covariances and correlations. Diagonals are used as a measure of sampling adequacy (MSA). Note: Be careful not to confuse this with the anti-image covariance matrix.
  2. Bartlett's test of sphericity: Statistical test for the overall significance of all correlations within a correlation matrix. Used as a measure of sampling adequacy (MSA).
  3. Common variance: Variance in a variable that is shared with other variables.
  4. Communality: The proportion of a variable's variance explained by the extracted factor structure. Final communality estimates are the sum of squared loadings for a variable in an orthogonal factor matrix.
  5. Complex variable: A variable which has notable loadings (e.g., > .4) on two or more factors.
  6. Correlation: The Pearson or product-moment correlation coefficient.
  7. Composite score: A variable which represents combined responses to multiple other variables. A composite score can be created as unit-weighted or regression-weighted. A composite score is created for each case for each factor.
  8. Correlation matrix: A table showing the linear correlations between all pairs of variables.
  9. Data reduction: Reducing the number of variables (e.g., by using factor analysis to determine a smaller number of factors to represent a larger set of variables).
  10. Eigenvalue: The column sum of squared loadings for a factor. Represents the variance in the variables which is accounted for by a specific factor.
  11. Exploratory factor analysis: A factor analysis technique used to explore the underlying structure of a collection of observed variables.
  12. Extraction: The process for determining the number of factors to retain.
  13. Factor: Linear combination of the original variables. Factors represent the underlying dimensions (constructs) that summarise or account for the original set of observed variables.
  14. Factor analysis: A statistical technique used to estimate factors and/or reduce the dimensionality of a large number of variables to a fewer number of factors.
  15. Factor loading: Correlation between a variable and a factor, and the key to understanding the nature of a particular factor. Squared factor loadings indicate what percentage of the variance in an original variable is explained by a factor.
  16. Factor matrix: Table displaying the factor loadings of all variables on each factor. Factors are presented as columns and the variables are presented as rows.
  17. Factor rotation: A process of adjusting the factor axes to achieve a simpler and pragmatically more meaningful factor solution - the goal is usually a simple factor structure.
  18. Factor score: Composite score created for each observation (case) for each factor, using factor weights in conjunction with the original variable values to calculate each observation's score. Factor scores are standardised (i.e., expressed as z-scores).
  19. Measure of sampling adequacy (MSA): Measures which indicate the appropriateness of applying factor analysis.
  20. Oblique factor rotation: Factor rotation such that the extracted factors are correlated. Rather than arbitrarily constraining the rotation to orthogonal (90-degree) factor axes, the oblique solution allows the factors to be correlated. In SPSS, this is called Oblimin rotation.
  21. Orthogonal factor rotation: Factor rotation such that their axes are maintained at 90 degrees. Each factor is independent of, or orthogonal to, all other factors. In SPSS, this is called Varimax rotation.
  22. Parsimony principle: When two or more theories explain the data equally well, select the simplest theory e.g., if a 2-factor and a 3-factor model explain about the same amount of variance, interpret the 2-factor model.
  23. Principal axis factoring (PAF): A method of factor analysis in which the factors are based on a reduced correlation matrix using a priori communality estimates. That is, communalities are inserted in the diagonal of the correlation matrix, and the extracted factors are based only on the common variance, with unique variance excluded.
  24. Principal component analysis (PC or PCA): The factors are based on the total variance of all items. [5][6]
  25. Scree plot: A line graph of eigenvalues which is helpful for determining the number of factors. The eigenvalues are plotted in descending order. The number of factors is chosen where the plot levels off (or drops) from cliff to scree.
  26. Simple structure: A pattern of factor loading results such that each variable loads highly onto one and only one factor.
  27. Unique variance: The proportion of a variable's variance that is not shared with a factor structure. Unique variance is composed of specific and error variance.
Other
  1. Common factor: A factor on which two or more variables load.
  2. Common factor analysis: A statistical technique which uses the correlations between observed variables to estimate common factors and the structural relationships linking factors to observed variables.
  3. Error variance: Unreliable and inexplicable variation in a variable. Error variance is assumed to be independent of common variance, and a component of the unique variance of a variable.
  4. Image of a variable: The component of a variable which is predicted from other variables. Antonym: anti-image of a variable.
  5. Indeterminacy: Because an infinite number of factor structures can produce the same correlation matrix, there are more unknowns than equations in the common factor model and population factor structures cannot be estimated exactly; the factor structure is therefore said to be indeterminate.
  6. Latent factor: A theoretical underlying factor hypothesised to influence a number of observed variables. Common factor analysis assumes latent variables are linearly related to observed variables.
  7. Specific variance: (1) Variance of each variable unique to that variable and not explained or associated with other variables in the factor analysis. [7] (2) The component of unique variance which is reliable but not explained by common factors. [8]


References

  1. Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299.
  2. Tabachnick, B. G. & Fidell, L. S. (2001). Principal components and factor analysis. In Using multivariate statistics (4th ed., pp. 582–633). Needham Heights, MA: Allyn & Bacon.
  3. Iantovics, L. B., Rotar, C., & Morar, F. (2019). Survey on establishing the optimal number of factors in exploratory factor analysis applied to data mining. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(2), e1294.

See also

Wikiversity
Wikipedia & Wikibooks