ESA: Estimate Latent Factor Matrix With Known Number of Factors


View source: R/ESA_BCV.R


Estimate the latent factor matrix and noise variance using early stopping alternation (ESA) given the number of factors.


ESA(Y, r, X = NULL, center = F, niter = 3, svd.method = "fast")



Y: observed data matrix of dimension c(n, p), where n is the sample size and p is the number of variables.


r: the number of factors to use.


X: the known predictors, a matrix of size c(n, k), where k is the number of known covariates. Default is NULL (no known predictors).


center: logical, whether to add an intercept term to the model. Default is FALSE.


niter: the number of iterations for ESA. Default is 3.


svd.method: either "fast", "propack" or "standard". "fast" uses the fast.svd function in package corpcor to compute the SVD, "propack" uses propack.svd, and "standard" uses the svd function in the base package. Because of PROPACK issues, "propack" fails for some matrices; when that happens, the function uses "fast" to compute the SVD of that matrix instead. Default is "fast".
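
The fallback behaviour described for svd.method can be pictured with a small sketch. The helper svd_with_fallback below is hypothetical and is not part of esaBcv; it only illustrates the "try one SVD routine, switch to another on error" pattern, and it uses base::svd as a stand-in fallback so that the snippet runs without corpcor or PROPACK installed.

```r
# Hypothetical helper (NOT the esaBcv implementation): try a preferred SVD
# routine and fall back to another one if the first throws an error.
svd_with_fallback <- function(M, primary, fallback = base::svd) {
  tryCatch(primary(M), error = function(e) fallback(M))
}

# Stand-in for an SVD routine that fails on some matrices.
failing_svd <- function(M) stop("simulated PROPACK failure")

M <- matrix(c(2, 0, 0, 1), 2, 2)
res <- svd_with_fallback(M, primary = failing_svd)
res$d  # singular values computed by the fallback routine
```

Passing the routines as function arguments keeps the sketch independent of any particular SVD package; in the package itself the fallback is the "fast" method, as stated above.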


The model used is

Y = 1 μ' + X β + n^{1/2}U D V' + E Σ^{1/2}

where D and Σ are diagonal matrices, U and V are orthogonal, and μ' and V' denote the transposes of μ and V, respectively. The entries of E are assumed to be i.i.d. standard Gaussian. The model allows heteroscedastic noise and works especially well for high-dimensional data. The method is based on Owen and Wang (2015).

When a non-null X is given or centering is requested (which is essentially adding a known all-ones covariate), identifiability requires <X, U> = 0 or <1, U> = 0, respectively. The method first rotates the data matrix to remove the known predictors or centers, and then uses the last n - k (or n - k - 1 if centering is requested) rotated samples to estimate the latent factors.
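
As an illustration of the model and of the rotation step, the base-R sketch below simulates Y from the model with one known covariate and an intercept, and then removes both with an orthogonal rotation. All sizes and parameter values are made up for illustration, and the QR-based rotation is one possible way to realise the idea; it is not taken from the package source.

```r
set.seed(1)
n <- 50; p <- 20; r <- 2; k <- 1

# Known covariate X (n x k), its coefficients beta (k x p), and centers mu.
X    <- matrix(rnorm(n * k), n, k)
beta <- matrix(rnorm(k * p), k, p)
mu   <- rnorm(p)

# U (n x r): orthonormal columns, orthogonal to both 1 and X (identifiability).
Z <- cbind(1, X)
G <- matrix(rnorm(n * r), n, r)
G <- G - Z %*% solve(crossprod(Z), crossprod(Z, G))  # project out span(1, X)
U <- qr.Q(qr(G))

# V (p x r): orthonormal columns; d holds the diagonal of D.
V <- qr.Q(qr(matrix(rnorm(p * r), p, r)))
d <- c(3, 1.5)

# Heteroscedastic noise: E has i.i.d. N(0, 1) entries, sigma2 = diag(Sigma).
sigma2 <- runif(p, 0.5, 2)
E <- matrix(rnorm(n * p), n, p)

Y <- rep(1, n) %*% t(mu) + X %*% beta +
  sqrt(n) * U %*% diag(d, r) %*% t(V) + E %*% diag(sqrt(sigma2))

# Rotation: Q is n x n orthogonal and its first k + 1 columns span (1, X).
Q  <- qr.Q(qr(Z), complete = TRUE)
Yr <- crossprod(Q, Y)                          # rotated data, t(Q) %*% Y
Y_rest <- Yr[(ncol(Z) + 1):n, , drop = FALSE]  # last n - k - 1 rotated samples
dim(Y_rest)
```

The rows kept in Y_rest no longer contain any contribution from 1 μ' or X β, so only the factor term and the noise remain in them.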


The returned value is a list with components


the diagonal entries of the estimated Σ, a vector of length p


the estimated U. Dimension is c(n, r)


the estimated diagonal entries of D, a vector of length r


the estimated V. Dimension is c(p, r)


the estimated β, a matrix of size c(k, p). NULL if the argument X is NULL.


the estimated signal (factor) matrix S where

S = 1 μ' + X β + n^{1/2}U D V'


the sample centers of the variables, a vector of length p. It is an estimate of μ. NULL if the argument center is FALSE.
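
The formula for S above can be checked numerically with a short base-R sketch. The matrices below are arbitrary stand-ins with the documented dimensions, not output of ESA, and the variable names (U, d, V, beta, mu) are chosen for this illustration only.

```r
set.seed(2)
n <- 6; p <- 4; r <- 2; k <- 1

# Stand-in components with the dimensions listed above.
U    <- qr.Q(qr(matrix(rnorm(n * r), n, r)))  # c(n, r)
V    <- qr.Q(qr(matrix(rnorm(p * r), p, r)))  # c(p, r)
d    <- c(2, 1)                               # diagonal of D, length r
X    <- matrix(rnorm(n * k), n, k)            # c(n, k)
beta <- matrix(rnorm(k * p), k, p)            # c(k, p)
mu   <- rnorm(p)                              # length p

# S = 1 mu' + X beta + n^{1/2} U D V'.
# d * t(V) scales row i of t(V) by d[i], i.e. it equals diag(d) %*% t(V).
S <- rep(1, n) %*% t(mu) + X %*% beta + sqrt(n) * U %*% (d * t(V))
dim(S)
```

Multiplying by d directly avoids forming the r x r diagonal matrix, which is a small convenience rather than a requirement.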


Art B. Owen and Jingshu Wang (2015), Bi-cross-validation for factor analysis.


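library(esaBcv)
set.seed(1)  # make the simulated data reproducible
Y <- matrix(rnorm(100), nrow = 10) + 3 * rnorm(10) %*% t(rep(1, 10))  # noise plus one rank-1 factor
ESA(Y, 1)    # fit with r = 1 factor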

esaBcv documentation built on May 30, 2017, 4:09 a.m.