jaws.pca: The Jackstraw Weighted Shrinkage Estimation Method for Sparse...

Description

Estimates sparse/shrunken loadings of principal component analysis. Based on the statistical significance of association between variables and principal components, the sample loadings of principal components are shrunken towards zero, which improves their accuracy. The only required inputs are the data matrix dat and the number of principal components r whose loadings you would like to estimate.

Usage

jaws.pca(dat, p = NULL, r = NULL, s = NULL, B = NULL,
  stat.shrinkage = "F-statistics", extra.shrinkage = NULL, verbose = TRUE,
  seed = NULL, save.all = TRUE)

Arguments

dat

a data matrix with m rows as variables and n columns as observations.

p

an m * r matrix of p-values for association tests between variables and r principal components, generally computed from the jackstraw method. If p is not given, jackstraw.PCA is automatically applied.

r

a number (a positive integer) of significant principal components.

s

a number (a positive integer) of “synthetic” null variables (optional).

B

a number (a positive integer) of resampling iterations (optional).

stat.shrinkage

PNV shrinkage may be applied to "F-statistics" or "loadings" (default: F-statistics).

extra.shrinkage

extra shrinkage methods may be used; see details below (optional).

verbose

a logical specifying whether to print the progress (default: TRUE).

seed

a seed for the random number generator (optional).

save.all

a logical specifying whether to save all objects, including a large SVD object (default: TRUE, as in the usage above).

Details

By default, jaws.pca computes two canonical jackstraw weighted shrinkage estimators, namely PIP and PNV. Additional shrinkage techniques may also be applied: setting extra.shrinkage="PIPhard" combines the two canonical estimators, and setting extra.shrinkage to numerical threshold values between 0 and 1 applies soft-thresholding to the local fdr. In the latter case, provide r threshold values, one for each of the r principal components.
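The two extra.shrinkage options described above might be requested as in the following sketch. It assumes the jaws package is installed (the calls are skipped otherwise), uses only arguments listed under Arguments, and picks 0.5 as an arbitrary illustrative threshold rather than a recommended value. The names B1, B2, thresholds, out.hard and out.soft are hypothetical.

```r
# Hypothetical usage sketch of the extra.shrinkage options.
# Assumptions: the jaws package is installed; the threshold 0.5 is an
# arbitrary illustrative value, not a recommendation.
set.seed(1234)
m <- 1000; n <- 20; r <- 2

# Simulate two latent variables, each driving 100 of the m variables
B1 <- c(rep(1, 100), rep(0, 900))
B2 <- c(rep(0, 100), rep(1, 100), rep(0, 800))
dat <- B1 %*% t(rnorm(n)) + B2 %*% t(rnorm(n)) +
  matrix(rnorm(m * n), nrow = m)
dat <- t(scale(t(dat), center = TRUE, scale = FALSE))  # center each row

thresholds <- rep(0.5, r)  # one threshold per principal component

if (requireNamespace("jaws", quietly = TRUE)) {
  # Request the estimator that combines PIP and PNV
  out.hard <- jaws::jaws.pca(dat, r = r, extra.shrinkage = "PIPhard")
  # Request soft-thresholding, with one threshold for each of the r PCs
  out.soft <- jaws::jaws.pca(dat, r = r, extra.shrinkage = thresholds)
}
```

According to the Value section, the results of these calls would be expected to carry the additional PIPhard and PIPsoft components, respectively.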

It is strongly advised that you take a careful look at your data and use appropriate graphical and statistical criteria to determine the number of significant PCs, r. For example, see the contributed R package ‘nFactors’. If r is not specified, it is estimated by permutation Parallel Analysis (Buja and Eyuboglu, 1992) via the function permutationPA, with a very liberal threshold.
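To give a feel for how permutation Parallel Analysis chooses r, here is a minimal base R sketch. It is a simplified stand-in written for illustration, not the permutationPA code, and its details are assumptions: the helper pve, the number of permutations n.perm, and the 0.05 cutoff are all hypothetical choices.

```r
# Minimal sketch of permutation parallel analysis in base R.
# Assumption: a simplified illustration, not the permutationPA function.
set.seed(1)
m <- 200; n <- 20

# Simulate one latent variable affecting the first 40 variables
loadings <- c(rep(1, 20), rep(-1, 20), rep(0, m - 40))
dat <- loadings %*% t(rnorm(n)) + matrix(rnorm(m * n), nrow = m)
dat <- dat - rowMeans(dat)  # center each variable (row)

# Proportion of variance explained by each principal component
pve <- function(x) {
  d <- svd(x, nu = 0, nv = 0)$d
  d^2 / sum(d^2)
}

obs <- pve(dat)
n.perm <- 50
# Permute observations within each row to break the shared structure,
# then recompute the variance explained; one column per permutation
null <- replicate(n.perm, pve(t(apply(dat, 1, sample))))

# Empirical p-value per PC: share of permutations whose variance
# explained is at least as large as the observed one
p.pc <- rowMeans(null >= obs)

# Number of leading PCs that pass the (hypothetical) 0.05 cutoff
r.hat <- sum(cumprod(p.pc <= 0.05))
r.hat
```

An estimate obtained this way is best treated as a starting point and checked against a scree plot or other criteria, in line with the advice above.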

If s is not supplied, s is set to about 10% of m variables. If B is not supplied, B is set to m*10/s.
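As a rough illustration of these defaults, the sketch below computes s and B for a data matrix with m = 1000 variables. The use of round() is an assumption; the package may round differently.

```r
# Sketch of the default choices for s and B described above.
# Assumption: "about 10%" is implemented with round(); the package
# itself may use a different rounding rule.
m <- 1000               # number of variables (rows of dat)
s <- round(m / 10)      # about 10% of the m variables
B <- round(m * 10 / s)  # number of resampling iterations
c(s = s, B = B)
```

With these defaults the product s * B is about 10 * m, so the resampling produces roughly ten synthetic null statistics for every variable in dat.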

Value

jaws.pca returns a list consisting of

p

p-values for association tests between variables and each of r principal components

pi0

proportion of variables not associated with r principal components, individually

svd

SVD object from decomposing dat

PIP

a list of outputs derived from the posterior inclusion probabilities method (including pr, u, var, PVE)

PNV

a list of outputs derived from the proportion of null variables method (including pi0, u, var, PVE)

With appropriate extra.shrinkage options (for details, see the Supplementary Information of Chung and Storey (2013)), the output may also include

PIPhard

a list of outputs from hard-thresholding the PIP loadings (including u, var, PVE)

PIPsoft

a list of outputs from soft-thresholding the PIP loadings (including pr, u, var, PVE)

Detailed explanation of the output contained in PIP and PNV

pr

a matrix of posterior inclusion probabilities (equivalent to 1-lfdr) for m coefficients and r PCs.

pi0

a vector of estimated proportion of null variables for r PCs.

u

an m*r matrix of shrunken loadings.

var

a vector of shrunken variances explained by r PCs.

PVE

a vector of shrunken percent variances explained by r PCs.

Author(s)

Neo Chung nchchung@gmail.com

References

Chung and Storey (2015) Forthcoming

See Also

jackstraw.PCA jaws.cov

Examples

set.seed(1234)
## simulate data from a latent variable model: Y = BX + E
B = c(rep(1,50),rep(-1,50), rep(0,900))
X = rnorm(20)
E = matrix(rnorm(1000*20), nrow=1000)
dat = B %*% t(X) + E
dat = t(scale(t(dat), center=TRUE, scale=FALSE))

## estimate sparse loadings in PCA
jaws.pca.out = jaws.pca(dat, r=1)
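Once the example has run, the components listed under Value can be inspected as in the sketch below. It repeats the simulation so that it runs on its own, assumes the jaws package is installed (the inspection is skipped otherwise), and relies only on the component names documented above.

```r
# Sketch: inspecting the components documented under Value.
# Assumption: the jaws package is installed; otherwise the block is skipped.
set.seed(1234)
B <- c(rep(1, 50), rep(-1, 50), rep(0, 900))
X <- rnorm(20)
E <- matrix(rnorm(1000 * 20), nrow = 1000)
dat <- B %*% t(X) + E
dat <- t(scale(t(dat), center = TRUE, scale = FALSE))  # center each row

if (requireNamespace("jaws", quietly = TRUE)) {
  out <- jaws::jaws.pca(dat, r = 1)
  print(names(out))       # top-level components of the returned list
  print(summary(out$p))   # p-values for association with the PC
  print(out$pi0)          # estimated proportion of null variables
  str(out$PIP)            # documented to include pr, u, var, PVE
  str(out$PNV)            # documented to include pi0, u, var, PVE
}
```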

ncchung/jaws documentation built on May 23, 2019, 1:05 p.m.