select.nspike: Finding Distant Spikes


View source: R/pca_functions.R

Description

Estimates the number of distant spikes in the population under the Generalized Spiked Population model. A finite upper bound (n.spikes.max) on the number of distant spikes must be provided.

Usage

select.nspike(samp.eval, p, n, n.spikes.max, evals.out = FALSE, smooth = TRUE)

Arguments

samp.eval

Numeric vector containing the sample eigenvalues. The vector must have length n or n-1; it may be unordered.

p

The number of features.

n

The number of samples.

n.spikes.max

Upper bound on the number of distant spikes in the population.

evals.out

Logical. If TRUE, the estimated spikes and non-spikes are returned. Default is FALSE.

smooth

Logical. If TRUE, kernel smoothing will be performed on the estimated population eigenvalue spectrum. Default is TRUE.

Details

The function searches over values from 0 to n.spikes.max to estimate the number of distant spikes in the population. It also estimates both the non-spiked and the spiked eigenvalues using the λ-estimation method.

The argument smooth is useful when the user assumes the population spectral distribution to be continuous.
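
As an illustration of the samp.eval argument described above, the following minimal sketch builds a vector of sample eigenvalues of length n-1 from simulated data using base R only. The simulated population (two large variances, the rest equal to 1), the variable names pop.sd, X and Xc, and the n-1 divisor of the sample covariance are assumptions made for this example; they are not taken from the package.

```r
# Sketch only: simulate high-dimensional data and compute sample eigenvalues.
set.seed(1)
n <- 50    # number of samples
p <- 200   # number of features (p > n)

# Hypothetical population: two large "spike" variances, all others equal to 1.
pop.sd <- sqrt(c(50, 30, rep(1, p - 2)))
X <- matrix(rnorm(n * p), n, p) %*% diag(pop.sd)

# Centre each column; the centred n x p matrix has rank at most n - 1.
Xc <- scale(X, center = TRUE, scale = FALSE)

# Eigenvalues of the sample covariance matrix, obtained from the singular
# values of the centred data (assumed divisor: n - 1).
samp.eval <- svd(Xc, nu = 0, nv = 0)$d^2 / (n - 1)

# Drop the last eigenvalue, which is numerically zero after centring.
samp.eval <- samp.eval[seq_len(n - 1)]

length(samp.eval)   # n - 1
```

A vector built this way, together with p and n, has the shape expected for the samp.eval argument, for example select.nspike(samp.eval, p, n, n.spikes.max = 10).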

Value

n.spikes

Estimated number of distant spikes.

spikes

If evals.out=TRUE, estimated distant spikes are returned.

nonspikes

If evals.out=TRUE, estimated non-spikes are returned.

loss

If evals.out=TRUE, the value of the L-infinity loss function for the spectrum estimation is returned.

Author(s)

Rounak Dey, deyrnk@umich.edu

References

Dey, R. and Lee, S. (2019). Asymptotic properties of principal component analysis and shrinkage-bias adjustment under the generalized spiked population model. Journal of Multivariate Analysis, Vol 173, 145-164.

See Also

hdpc_est, pc_adjust

Examples

data(hapmap)
# n = 198, p = 75435 for this data

####################################################
## Not run: 
# Sample eigenvalues and dimensions from the hapmap data
train.eval <- hapmap$train.eval
n <- hapmap$nSamp
p <- hapmap$nSNP

# If you just want the estimated number of spikes
select.nspike(train.eval, p, n, n.spikes.max = 10, evals.out = FALSE)

# If you want the estimated spikes and non-spikes as well
out <- select.nspike(train.eval, p, n, n.spikes.max = 10, evals.out = TRUE)

## End(Not run)

hdpca documentation built on Jan. 16, 2021, 5:33 p.m.
