rPCA: Generate a Correlation Matrix from a Truncated PCA Loadings...

View source: R/rPCA.R

rPCAR Documentation

Generate a Correlation Matrix from a Truncated PCA Loadings Matrix.

Description

This function generates a random (or possibly unique) correlation matrix (R) from an unrotated or orthogonally rotated PCA loadings matrix via a modified alternating projections algorithm.

Usage

rPCA(
  F,
  epsMax = 1e-18,
  maxit = 2000,
  Seed = NULL,
  InitP2 = 2,
  Eigs = NULL,
  PrintLevel = 1
)

Arguments

F

(Matrix) A p (variables) by k (components) PCA loadings matrix. F can equal either an unrotated or an orthogonally rotated loadings matrix.

epsMax

(Scalar) A small number used to evaluate function convergence. Default (epsMax = 1E-18).

maxit

(Integer) An integer that specifies the maximum number of iterations of the modified alternating projections algorithm (APA).

Seed

(Integer) A user-defined starting seed for the random number generator. If Seed = NULL then rPCA will generate a random starting seed. Setting Seed to a positive integer will generate reproducible results. Default (Seed = NULL)

InitP2

(Integer) The method used to initiate the remaining columns of the truncated principal components solution. If InitP2 = 1 then the starting P2 will be a random semi-orthogonal matrix. If If InitP2 = 2 then the starting P2 will be a semi-orthogonal matrix that is in the left null space of P1. Default (InitP2 = 2). Of the two options, InitP2 = 2 generally converges to a single feasible solution in less time. InitP2 = 1 can be used to generate different solutions from different starting seeds.

Eigs

(Vector) Under some conditions, rPCA can generate (or reproduce) a unique correlation matrix with known (i.e., user-specified) eigenvalues from a truncated PC loadings matrix, F, even when the rank of F is less than p (the number of observed variables). Eigs is an optional p-length vector of eigenvalues for R. Default (Eigs = NULL).

PrintLevel

(Integer) If PrintLevel = 0 no output will be printed (choose this option for Monte Carlo simulations). If PrintLevel = 1 the program will print the APA convergence status and the number of iterations used to achieve convergence. If PrintLevel = 2 then rPCA will print the iteration convergence history of the modified APA algorithm. Default (PrintLevel = 1).

Value

  • R (Matrix) A p by p correlation matrix that generates the desired PCA loadings.

  • Tmat (Matrix) A k by k orthogonal rotation matrix that will rotate the unrotated PCA loadings matrix, P1, to F (if F is an orthogonally rotated loadings matrix).

  • P1 (Matrix) The p by k unrotated PCA loadings matrix that is associated with F.

  • Fhat (Matrix) The p by k estimated (and possibly rotated) PCA loadings matrix from the simulated matrix R.

  • error (Logical) A logical that indicates whether F is a legitimate PCA loadings matrix.

  • Lambda (Vector) The sorted eigenvalues of R.

  • iterHx (Vector) Criterion (i.e., fit) values for for each iteration of the modified APA algorithm.

  • converged (Logical) A logical that signifies function convergence.

  • Seed (Integer) Either a user-defined or function generated starting seed for the random number generator.

Author(s)

Niels G. Waller (nwaller@umn.edu)

References

Escalante, R. and Raydan, M. (2011). Alternating projection methods. Society for Industrial and Applied Mathematics.

ten Berge, J. M. and Kiers, H. A. (1999). Retrieving the correlation matrix from a truncated PCA solution: The inverse principal component problem. Psychometrika, 64(3), 317–324.

Examples


# External PCA function ---
# used to check results
 
PCA <- function(R, k = NULL){
  if(is.null(k)) k <- ncol(R)
  VLV <- eigen(R)
 V <- VLV$vectors
 L <- VLV$values

 
 if( k > 1){
   P <-  V[, 1:k] %*% diag(L[1:k]^.5)
 }
 else{
   P <- as.matrix(V[, 1], drop=False) * L[1]^.5
 }
  Psign <- sign(apply(P, 2, sum))
  if(k > 1) Psign = diag(Psign)
  P <- P %*%  Psign
 P
}#END PCA  

  
## Generate Desired Population rotated PCA loadings matrix
## Example = 1
 k = 2
 F <- matrix(0, 8, 2) 
 F[1:4, 1] <- seq(.75, .72, length= 4)  
 F[5:8, 2] <- seq(.65, .62, length= 4)  
 F[1,2] <- .1234
 F[8,1] <- .4321
 colnames(F) <-   paste0("F", 1:k) 
 (F)
 
 ## Run Example 1
 pout <- rPCA(F, 
              maxit = 5000, 
              Seed = 1, 
              epsMax = 1E-18,
              PrintLevel = 1)
pout$converged
eigen(pout$R)$values
if(pout$error == FALSE & pout$converged){ 
    Fhat <- pout$Fhat
    cat("\nPCA Loadings\n")
    ( round( cbind(F,Fhat ), 5) )
 }
 
 ## Example = 2      
 ## Single component example from Widaman 2018

 k = 1
 F <- matrix(rep(c(.8,.6, .4), each = 3 ), nrow = 9, ncol = 1)
 colnames(F) <-   paste0("F", 1:k) 
 (F)

 ## Run Example 2
 pout <- rPCA(F, 
              maxit = 5000, 
              Seed = 1, 
              epsMax = 1E-18,
              PrintLevel = 1)
 pout$converged
 pout$Fhat
 eigen(pout$R)$values
if(pout$error == FALSE & pout$converged){ 
    Fhat <- pout$Fhat
    cat("\nPCA Loadings\n")
    ( round( cbind(F,Fhat ), 5) )
 }
  
## Example 3 ----
## 2 Component example from Goldberg and Velicer (2006).
 k = 2
 F = matrix(c( .18, .75,
               .65, .19,
               .12, .69,
               .74, .06,
               .19, .80,
               .80, .14,
              -.05, .65,
               .71, .02), 8, 2, byrow=TRUE)
 colnames(F) <-   paste0("F", 1:k) 
 (F)

## Run Example 3
pout <- rPCA(F, 
            maxit = 5000, 
            Seed = 1, 
            epsMax = 1E-18,
            PrintLevel = 1)
pout$converged

eigen(pout$R)$values

if(pout$error == FALSE & pout$converged){ 
  Fhat <- pout$Fhat
  cat("\nPCA Loadings\n")
  ( round( cbind(F,Fhat ), 5) )
#
#
## Example 4
#
SEED = 4321
set.seed(SEED)
k= 3
## Generate eigenvalues for example R matrix
L7 <- eigGen(nDimensions = 7,
             nMaj = 3,
             PrcntMajor = .85,
             threshold = .8)

## Scree Plot
plot(1:7, L7, 
    type = "b", 
    ylim = c(0,4),
    main = "Scree Plot for R",
    ylab = "Eigenvalues",
    xlab = "Dimensions")

## Generate R
R <- rGivens(eigs=L7, Seed = SEED)$R
print( R, digits = 4)

#Extract loadings for 3 principal components
F <- PCA(R, k = k)

# rotate loadings with varimax to examine underlying structure
print( round(varimax(F)$loadings[], 3) )

## run rPCA with user-defined eigenvalues
rout <- rPCA(F,
            epsMax = 1e-20, 
            maxit = 25000, 
            Seed = SEED,   
            InitP2 = 1,
            Eigs = L7,
            PrintLevel = 1) 

## Compute PCA on generated R

Fhat <- PCA(rout$R, k = 3)
#
## align factors
Fhat <- fungible::faAlign(F, Fhat)$F2

## Compare solutions
print( round( cbind(F, Fhat), 5) )

## Compare Eigenvalues
print( cbind(L7, eigen(rout$R)$values ), digits=8) 
#
## Compare R matrices: 8 digit accuracy
print( round(R - rout$R, 8) )

}

fungible documentation built on May 29, 2024, 8:28 a.m.