Factor analysis

Factor analysis reduces dimensionality by combining multiple related variables into a smaller number of factors, much like principal component analysis (PCA); some authors consider PCA a type of factor analysis. The basic commands below use the fa() function of the psych package. The data should contain only numeric values, so the birthwt dataset (from the MASS package) is used here, without its first column:

Code:

> library(MASS)    # provides the birthwt dataset
> library(psych)
> ff = fa(birthwt[-1], 4)   # drop the first column (low); 4 is the number of factors to extract
> ff
Factor Analysis using method =  minres
Call: fa(r = birthwt[-1], nfactors = 4)
Standardized loadings (pattern matrix) based upon correlation matrix
        MR1   MR2   MR3   MR4   h2    u2 com
age   -0.05 -0.01  0.04  0.65 0.41 0.588 1.0
lwt    0.01  0.25 -0.30  0.23 0.24 0.759 2.8
race  -0.39  0.01  0.28 -0.23 0.29 0.708 2.5
smoke  0.99  0.00  0.03 -0.02 1.00 0.005 1.0
ptl    0.12 -0.03  0.37  0.17 0.16 0.838 1.6
ht     0.00  1.00  0.01 -0.01 1.00 0.005 1.0
ui    -0.02 -0.12  0.48  0.05 0.23 0.771 1.2
ftv   -0.02 -0.07 -0.05  0.32 0.12 0.882 1.1
bwt   -0.08 -0.12 -0.60  0.01 0.40 0.600 1.1

                       MR1  MR2  MR3  MR4
SS loadings           1.17 1.09 0.91 0.67
Proportion Var        0.13 0.12 0.10 0.07
Cumulative Var        0.13 0.25 0.35 0.43
Proportion Explained  0.30 0.28 0.24 0.18
Cumulative Proportion 0.30 0.59 0.82 1.00

 With factor correlations of 
     MR1  MR2   MR3   MR4
MR1 1.00 0.02  0.15  0.02
MR2 0.02 1.00  0.03  0.00
MR3 0.15 0.03  1.00 -0.28
MR4 0.02 0.00 -0.28  1.00

Mean item complexity =  1.5
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  0.77 with Chi Square of  142.28
The degrees of freedom for the model are 6  and the objective function was  0.05 

The root mean square of the residuals (RMSR) is  0.03 
The df corrected root mean square of the residuals is  0.07 

The harmonic number of observations is  189 with the empirical chi square  10.23  with prob <  0.12 
The total number of observations was  189  with MLE Chi Square =  9.05  with prob <  0.17 

Tucker Lewis Index of factoring reliability =  0.824
RMSEA index =  0.055  and the 90 % confidence intervals are  NA 0.116
BIC =  -22.4
Fit based upon off diagonal values = 0.96
Measures of factor score adequacy             
                                                MR1  MR2  MR3  MR4
Correlation of scores with factors             1.00 1.00 0.76 0.72
Multiple R square of scores with factors       0.99 0.99 0.58 0.52
Minimum correlation of possible factor scores  0.99 0.99 0.17 0.04
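
The summary columns in this output can be checked by hand. The sketch below retypes the printed (rounded) loadings and factor correlations; the object names L, Phi and ss are illustrative and not part of psych. It assumes the usual relation for a solution with correlated factors, h2 = diag(L Phi L') with u2 = 1 - h2, and it divides the SS loadings by the number of variables (9) to obtain Proportion Var. Because the inputs are rounded to two decimals, the results agree with the printout only approximately.

```r
# Pattern loadings retyped from the printed output (rounded to 2 decimals)
vars <- c("age", "lwt", "race", "smoke", "ptl", "ht", "ui", "ftv", "bwt")
L <- matrix(c(-0.05, -0.01,  0.04,  0.65,
               0.01,  0.25, -0.30,  0.23,
              -0.39,  0.01,  0.28, -0.23,
               0.99,  0.00,  0.03, -0.02,
               0.12, -0.03,  0.37,  0.17,
               0.00,  1.00,  0.01, -0.01,
              -0.02, -0.12,  0.48,  0.05,
              -0.02, -0.07, -0.05,  0.32,
              -0.08, -0.12, -0.60,  0.01),
            ncol = 4, byrow = TRUE,
            dimnames = list(vars, paste0("MR", 1:4)))

# Factor correlations retyped from the printed output
Phi <- matrix(c(1.00, 0.02,  0.15,  0.02,
                0.02, 1.00,  0.03,  0.00,
                0.15, 0.03,  1.00, -0.28,
                0.02, 0.00, -0.28,  1.00),
              ncol = 4, byrow = TRUE)

# Communality (h2) and uniqueness (u2) for each variable
h2 <- diag(L %*% Phi %*% t(L))
u2 <- 1 - h2
print(round(cbind(h2, u2), 2))

# Variance summary rebuilt from the printed SS loadings
ss <- c(MR1 = 1.17, MR2 = 1.09, MR3 = 0.91, MR4 = 0.67)
prop_var <- ss / nrow(L)            # share of total variance per factor
print(round(rbind(prop_var,
                  cum_var = cumsum(prop_var),
                  prop_explained = ss / sum(ss)), 2))
```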

The function fa.diagram() of psych package plots a path diagram structure of factors:

Code:

> fa.diagram(ff)

Output graph: [path diagram linking the variables to the factors, with the loadings as labels]

The diagram shows that a common factor (MR3 in the output above) lies behind bwt, lwt, ptl and ui. The variables ui and ptl load on it with the opposite sign to bwt and lwt, so the two pairs are inversely related. The values on the paths are the loadings of the variables on the factor, placed in descending order of importance for the factor.
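
This grouping can be read off the MR3 column of the loadings table. A minimal sketch, with the rounded MR3 loadings retyped by hand: the 0.3 cutoff is an assumption (a common convention, and believed to match the default cut argument of fa.diagram(); check ?fa.diagram), and mr3, cutoff and shown are illustrative names.

```r
# MR3 loadings retyped from the printed output (rounded to 2 decimals)
mr3 <- c(age = 0.04, lwt = -0.30, race = 0.28, smoke = 0.03, ptl = 0.37,
         ht = 0.01, ui = 0.48, ftv = -0.05, bwt = -0.60)

cutoff <- 0.3                                   # assumed display threshold
shown <- mr3[abs(mr3) >= cutoff]                # variables that reach the cutoff
shown <- shown[order(abs(shown), decreasing = TRUE)]   # strongest loading first
print(shown)

# Group the variables by the sign of their loading
print(split(names(shown), sign(shown)))
```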

Number of factors to be extracted in factor analysis

Choosing the number of factors is a recurring problem. The nFactors package provides several functions for it, which can be combined into one helper function:

Code:

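> myfn_nfactors = function(ddf){
    library(nFactors)
    ev = eigen(cor(ddf))   # eigenvalues of the correlation matrix
    ap = parallel(subject=nrow(ddf), var=ncol(ddf), rep=100, cent=.05)   # parallel analysis on random data
    nS = nScree(x=ev$values, aparallel=ap$eigen$qevpea)   # scree test solutions
    nn = sum(nS$Analysis[1] > 1)   # number of eigenvalues greater than 1
    plotnScree(nS)   # draw the scree plot
    nn
}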
> myfn_nfactors(birthwt[-1])
[1] 4

Hence, this rule suggests extracting 4 factors. The function counts the eigenvalues greater than 1, the threshold above which factors are generally retained (the Kaiser criterion). A graph for this scree test is also plotted.
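
The count returned by myfn_nfactors() is the number of eigenvalues of the correlation matrix that exceed 1. A minimal sketch of that step alone, using only base R plus the MASS package for the data (ddf follows the function's argument name; ev and n_kaiser are illustrative names). It skips the parallel analysis and the plot, so it is a cross-check rather than a replacement for the function.

```r
library(MASS)                   # provides the birthwt dataset

ddf <- birthwt[-1]              # drop the first column, as in the text
ev <- eigen(cor(ddf))$values    # eigenvalues of the correlation matrix
n_kaiser <- sum(ev > 1)         # eigenvalues above 1

print(round(ev, 2))
print(n_kaiser)
```

The eigenvalues of a correlation matrix sum to the number of variables, so an eigenvalue above 1 marks a factor that accounts for more variance than a single standardized variable.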

Output graph: [scree plot drawn by plotnScree()]

References:
Revelle, W. (2014). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois, USA. Version 1.4.8. http://CRAN.R-project.org/package=psych
 

