null_eigval: Eigenvalue estimation for null Gaussian based testing...
In pkimes/sigclust2: sigclust2: Statistical Significance of Clustering


Computes the eigenvalues of the null Gaussian distribution used by significance of clustering testing procedures that rely on a null Gaussian factor model assumption. When the number of observations is substantially greater than the number of features, the sample covariance matrix should be used.

null_eigval(x, n, p, icovest = 1, bkgd_pca = FALSE)

`x`	a matrix of size n by p containing the original data.
`n`	an integer number of samples.
`p`	an integer number of features/covariates.
`icovest`	an integer between 1 and 3 corresponding to the covariance estimation procedure to use. See details for more information on the possible estimation procedures. (default = 1)
`bkgd_pca`	a logical value specifying whether to use scaled PCA scores instead of the raw data to estimate the background noise. When FALSE, the raw estimate is used; when TRUE, the minimum of the PCA and raw estimates is used. (default = FALSE)

The following options are available for null covariance estimation, selected through `icovest`:

1. soft thresholding: recommended approach, described in Huang et al. 2014
2. sample: uses the sample covariance matrix; equivalent to the soft and hard options when n > p, but produces conservative results (i.e. less significant p-values) when p > n
3. hard thresholding: approach described in Liu et al. 2008; no longer recommended and retained for historical purposes
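
To illustrate how the sample and hard thresholding options differ, the following is a minimal base R sketch. It is not the package's implementation: the helper `sketch_null_eigval` is hypothetical, the background noise is assumed to be the squared MAD of all data entries, and hard thresholding is assumed to raise every sample eigenvalue below that level up to it. Soft thresholding is omitted because it requires the additional threshold selection described in Huang et al. 2014.

```r
# Hypothetical helper for illustration only; not part of sigclust2.
sketch_null_eigval <- function(x, method = c("sample", "hard")) {
  method <- match.arg(method)

  # Eigenvalues of the p x p sample covariance matrix.
  eigval_dat <- eigen(cov(x), symmetric = TRUE, only.values = TRUE)$values
  eigval_dat <- pmax(eigval_dat, 0)  # clip tiny negative values from round-off

  # Assumption: background noise variance taken as the squared MAD
  # of all entries of the data matrix.
  backvar <- mad(as.vector(x))^2

  # Sample option: use the eigenvalues as they are.
  # Hard option (assumed form): floor the eigenvalues at backvar.
  eigval_sim <- switch(method,
    sample = eigval_dat,
    hard   = pmax(eigval_dat, backvar)
  )

  list(eigval_dat = eigval_dat, backvar = backvar, eigval_sim = eigval_sim)
}

set.seed(1)
x <- matrix(rnorm(20 * 50), nrow = 20, ncol = 50)  # n = 20 samples, p = 50 features

est_sample <- sketch_null_eigval(x, "sample")
est_hard   <- sketch_null_eigval(x, "hard")
```

With p > n as here, the sample covariance matrix has rank at most n - 1, so the trailing entries of `est_sample$eigval_sim` are essentially zero, while `est_hard$eigval_sim` lifts them to `backvar`.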

The function returns a list of estimated parameters for the null Gaussian distribution used in significance of clustering testing. The list includes:

eigval_dat: eigenvalues for sample covariance matrix
backvar: background noise variance, sigma_b^2
eigval_sim: eigenvalues to be used for simulation
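
A call might look like the sketch below. It assumes sigclust2 is installed from the pkimes/sigclust2 repository and runs only in that case; the simulated data and the variable names `x` and `est` are illustrative. Because this page does not state whether `null_eigval` is exported, the function is looked up with `utils::getFromNamespace`, which works either way.

```r
est <- NULL

# Run only if sigclust2 is available.
if (requireNamespace("sigclust2", quietly = TRUE)) {
  # Look the function up by name so the call works whether or not it is exported.
  null_eigval <- utils::getFromNamespace("null_eigval", "sigclust2")

  set.seed(1)
  x <- matrix(rnorm(30 * 100), nrow = 30, ncol = 100)  # n = 30 samples, p = 100 features

  # Soft thresholding (icovest = 1) with the raw background noise estimate.
  est <- null_eigval(x, n = nrow(x), p = ncol(x), icovest = 1, bkgd_pca = FALSE)

  # Inspect the three documented elements of the returned list.
  str(est[c("eigval_dat", "backvar", "eigval_sim")])
}
```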

Patrick Kimes

Huang, H., Liu, Y., Yuan, M., and Marron, J. S. (2014). Statistical Significance of Clustering using Soft Thresholding. Journal of Computational and Graphical Statistics, preprint.
Liu, Y., Hayes, D. N., Nobel, A. B., and Marron, J. S. (2008). Statistical Significance of Clustering for High-Dimension, Low-Sample Size Data. Journal of the American Statistical Association, 103(483):1281-1293.

pkimes/sigclust2 documentation built on May 25, 2019, 8:20 a.m.