jaws.pca: The Jackstraw Weighted Shrinkage Estimation Method for Sparse...
In ncchung/jaws: Jackstraw Weighted Shrinkage Methods


Estimates sparse/shrunken loadings in principal component analysis (PCA). Based on the statistical significance of association between variables and principal components, the sample loadings of the principal components are shrunk toward zero, which improves their accuracy. The only required inputs are the data matrix dat and the number of principal components r whose loadings you would like to estimate.

jaws.pca(dat, p = NULL, r = NULL, s = NULL, B = NULL,
  stat.shrinkage = "F-statistics", extra.shrinkage = NULL, verbose = TRUE,
  seed = NULL, save.all = TRUE)

`dat`	a data matrix with `m` rows as variables and `n` columns as observations.
`p`	an `m * r` matrix of p-values for association tests between variables and `r` principal components, generally computed from the jackstraw method. If `p` is not given, `jackstraw.PCA` is automatically applied.
`r`	the number (a positive integer) of significant principal components.
`s`	a number (a positive integer) of “synthetic” null variables (optional).
`B`	a number (a positive integer) of resampling iterations (optional).
`stat.shrinkage`	PNV shrinkage may be applied to "F-statistics" or "loadings" (default: F-statistics).
`extra.shrinkage`	extra shrinkage methods may be used; see details below (optional).
`verbose`	a logical specifying to print the progress (default: TRUE).
`seed`	a seed for the random number generator (optional).
`save.all`	a logical specifying to save all objects, including a large SVD object (default: FALSE).

By default, jaws.pca computes two canonical jackstraw weighted shrinkage estimators, namely PIP and PNV. Extra shrinkage techniques may also be applied: set extra.shrinkage="PIPhard" to combine the two canonical estimators, or set extra.shrinkage to numerical threshold values between 0 and 1 to apply soft-thresholding to the local fdr. In the latter case, provide r threshold values, one for each of the r principal components.
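To make the idea of weighting and thresholding concrete, here is a minimal base R sketch. It is not the package's internal code: the toy vectors `u` and `pr`, the value of `threshold`, and the exact form of both shrinkage rules are assumptions made for illustration only.

```r
# Hypothetical toy inputs: 6 variables, 1 principal component.
u  <- c(0.60, -0.55, 0.40, 0.05, -0.03, 0.02)  # sample loadings
pr <- c(0.99,  0.95, 0.80, 0.10,  0.05, 0.02)  # posterior inclusion probabilities (1 - lfdr)

# Assumed PIP-style shrinkage: weight each loading by its inclusion probability.
u_pip <- pr * u

# Assumed soft-thresholding of the inclusion probabilities:
# values below the threshold become 0, the rest are reduced by the threshold.
threshold <- 0.5  # hypothetical value between 0 and 1
pr_soft <- pmax(pr - threshold, 0)
u_soft  <- pr_soft * u

# Compare original, weighted, and thresholded loadings side by side.
print(cbind(u, u_pip, u_soft))
```

In this sketch, variables with low inclusion probability are pulled close to zero by the weighting and set exactly to zero by the thresholding, which is what makes the resulting loadings sparse.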

It is strongly advised that you take a careful look at your data and use appropriate graphical and statistical criteria to determine the number of significant PCs, r. For example, see the contributed R package 'nFactors'. If r is not specified, it is estimated by permutation Parallel Analysis (Buja and Eyuboglu, 1992) via the function permutationPA, with a very liberal threshold.
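For intuition about what a permutation-based parallel analysis does, the following base R sketch may help. It is a simplified illustration under stated assumptions, not the implementation of permutationPA: the toy dimensions, the helper `pve`, the number of permutations `n_perm`, and the 0.05 cutoff are all hypothetical choices.

```r
set.seed(1)
m <- 200; n <- 20
# Toy data with one latent factor: Y = BX + E
B <- c(rep(1, 20), rep(-1, 20), rep(0, m - 40))
X <- rnorm(n)
dat <- B %*% t(X) + matrix(rnorm(m * n), nrow = m)
dat <- t(scale(t(dat), center = TRUE, scale = FALSE))

# Proportion of variance explained by each principal component
pve <- function(x) {
  d <- svd(x, nu = 0, nv = 0)$d
  d^2 / sum(d^2)
}

obs <- pve(dat)
n_perm <- 50  # hypothetical number of permutations
null <- replicate(n_perm, {
  # Permute each row independently to break dependence between variables
  perm <- t(apply(dat, 1, sample))
  pve(perm)
})

# Permutation p-value per component: how often the null matches or beats the observed value
p_vals <- rowMeans(null >= obs)
# Count leading components that are significant, stopping at the first that is not
r_hat <- sum(cumprod(p_vals <= 0.05))
r_hat
```

An estimate obtained this way is only a starting point; the advice above to inspect the data with graphical and statistical criteria still applies.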

If s is not supplied, s is set to about 10% of m variables. If B is not supplied, B is set to m*10/s.
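As a worked example of these defaults, consider a data matrix with 1000 variables. The rounding used below is an assumption, since the text only says "about 10%".

```r
m <- 1000               # number of variables (rows of dat)
s <- round(m / 10)      # about 10% of m; the exact rounding is an assumption
B <- round(m * 10 / s)  # documented rule B = m*10/s; rounding is an assumption
c(s = s, B = B, total = s * B)  # s * B is roughly 10 * m
```

Under these assumptions, the product s * B stays close to 10 * m regardless of how s is chosen.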

jaws.pca returns a list consisting of

`p`	p-values for association tests between variables and each of `r` principal components
`pi0`	proportion of variables not associated with `r` principal components, individually
`svd`	SVD object from decomposing `dat`
`PIP`	a list of outputs derived from the posterior inclusion probabilities method (including `pr`, `u`, `var`, `PVE`)
`PNV`	a list of outputs derived from the proportion of null variables method (including `pi0`, `u`, `var`, `PVE`)

With appropriate extra.shrinkage options (for details, see the Supplementary Information of Chung and Storey (2013)), the output may also include

`PIPhard`	a list of outputs from hard-thresholding the `PIP` loadings (including `u`, `var`, `PVE`)
`PIPsoft`	a list of outputs from soft-thresholding the `PIP` loadings (including `pr`, `u`, `var`, `PVE`)

Detailed explanation of the output contained in `PIP` and `PNV`

pr: a matrix of posterior inclusion probabilities (equivalent to 1-lfdr) for m coefficients and r PCs.
pi0: a vector of estimated proportion of null variables for r PCs.
u: an m*r matrix of shrunken loadings.
var: a vector of shrunken variances explained by r PCs.
PVE: a vector of shrunken percent variances explained by r PCs.
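To show the shapes involved, the base R sketch below computes the unshrunken counterparts of `u`, `var`, and `PVE` directly from an SVD. It assumes that loadings correspond to left singular vectors (variables in rows) and that variance explained is taken from squared singular values; how jaws.pca scales or reports its shrunken versions is not shown here, and the names `var_r` and `pve_r` are hypothetical.

```r
set.seed(1)
# Toy data: 50 variables (rows), 10 observations (columns), row-centered
dat <- matrix(rnorm(50 * 10), nrow = 50)
dat <- t(scale(t(dat), center = TRUE, scale = FALSE))

r  <- 2
sv <- svd(dat)
u     <- sv$u[, 1:r, drop = FALSE]  # m x r matrix of sample loadings
var_r <- sv$d[1:r]^2                # variance captured by each PC, up to a constant factor
pve_r <- var_r / sum(sv$d^2)        # proportion of total variance per PC

dim(u)
pve_r
```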

Neo Chung nchchung@gmail.com

Chung and Storey (2015) Forthcoming

jackstraw.PCA jaws.cov

set.seed(1234)
## simulate data from a latent variable model: Y = BX + E
B = c(rep(1,50),rep(-1,50), rep(0,900))
X = rnorm(20)
E = matrix(rnorm(1000*20), nrow=1000)
dat = B %*% t(X) + E
dat = t(scale(t(dat), center=TRUE, scale=FALSE))

## estimate sparse loadings in PCA
jaws.pca.out = jaws.pca(dat, r=1)