cv_method_corr: Formula-based method to calculate the PCC of a CV-based...


Description

Determines the probability of correct classification (PCC) for a high-dimensional classification study that employs a cross-validation (CV) based classifier. This is similar to cv_method, but the generated features are correlated.

Usage

	cv_method_corr(mu0, p, m, n, alpha_list, nrep, p1 = 0.5, ss = F, pcorr,
	               chol.rho, sampling.p = 0.5)

Arguments

mu0

The effect size of the important features.

p

The total number of features.

m

The number of important features.

n

The total sample size for the two groups.

alpha_list

The search grid for the p-value threshold.
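The Examples section notes that alpha_list should be a dense grid of p-value cut-offs. A minimal sketch of one way to build such a grid is shown here; the log spacing, the endpoints (1e-7 and 1e-1) and the grid size of 25 are illustrative assumptions, not values prescribed by the package.

```r
# Hypothetical dense search grid for the p-value threshold.
# Assumption: log-spaced values between 1e-7 and 1e-1; neither the
# spacing nor the endpoints come from the package documentation.
alpha_list <- 10^seq(-7, -1, length.out = 25)

range(alpha_list)    # smallest and largest threshold in the grid
length(alpha_list)   # number of thresholds searched
```

Log spacing places more grid points among the very small thresholds; whether that suits a particular design is a judgment call.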

nrep

The number of simulation replicates employed to compute the expected PCC and/or sensitivity and specificity.

p1

The prevalence of group 1 in the population; defaults to 0.5.

ss

Boolean variable; defaults to FALSE. Setting it to TRUE instructs the program to also compute the sensitivity and the specificity of the classifier.

pcorr

Number of correlated features.

chol.rho

Cholesky decomposition of the covariance of the pcorr features that are correlated. It is assumed that the m important features are part of the pcorr correlated features.
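As a minimal sketch of what a chol.rho argument could look like, the snippet below builds a compound-symmetry correlation block and factors it with base R's chol(). The values pcorr = 10 and a correlation of 0.8 mirror the Examples section. The final simulation step only illustrates what a Cholesky factor does; how cv_method_corr uses the factor internally is not described on this page.

```r
pcorr  <- 10     # number of correlated features (value from the Examples)
rho.cs <- 0.8    # common correlation within the block (value from the Examples)

# Compound-symmetry block: 1 on the diagonal, rho.cs everywhere else.
Sigma <- matrix(rho.cs, nrow = pcorr, ncol = pcorr)
diag(Sigma) <- 1

# chol() returns the upper-triangular factor R with t(R) %*% R equal to Sigma.
chol.rho <- chol(Sigma)
stopifnot(isTRUE(all.equal(crossprod(chol.rho), Sigma)))

# Illustration only (an assumption about typical use, not package internals):
# independent standard normal draws times R have covariance Sigma.
set.seed(1)
z <- matrix(rnorm(5000 * pcorr), ncol = pcorr)
x <- z %*% chol.rho
cor(x)[1, 2]     # sample correlation, expected to be close to rho.cs
```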

sampling.p

The assumed proportion of group 1 samples in the training data; default of 0.5 assumes groups are equally represented regardless of p1.

Details

Refer to Sanchez, Wu, Song, Wang 2015, Section 3 and Supplementary materials.

Value

If ss=FALSE, the function returns the expected PCC. If ss=TRUE, the function returns a vector containing the expected PCC, sensitivity and specificity.
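The three values reported in the Examples section (0.6689385, 0.6806896, 0.6513119, obtained with p1 = 0.6) are consistent with the PCC being a prevalence-weighted average of sensitivity and specificity. The sketch below checks that arithmetic. The element names are hypothetical labels added for readability, and reading sensitivity as the accuracy within group 1 is an assumption that matches these numbers rather than a documented guarantee.

```r
# Values copied from the Examples section (ss = TRUE, p1 = 0.6).
# The names are hypothetical labels; the function may return an unnamed vector.
res <- c(PCC = 0.6689385, sensitivity = 0.6806896, specificity = 0.6513119)
p1  <- 0.6

# Prevalence-weighted average of the two group-specific accuracies.
weighted <- p1 * res[["sensitivity"]] + (1 - p1) * res[["specificity"]]

# Agreement with the reported PCC up to the printed precision.
all.equal(weighted, res[["PCC"]], tolerance = 1e-6)
```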

Author(s)

Meihua Wu <meihuawu@umich.edu>, Brisa N. Sanchez <brisa@umich.edu>, Peter X.K. Song <pxsong@umich.edu>, Raymond Luu <raluu@umich.edu>, Wen Wang <wangwen@umich.edu>

References

Sanchez, B.N., Wu, M., Song, P.X.K., and Wang, W. (2016). "Study design in high-dimensional classification analysis." Biostatistics, in press.

Examples

	library(HDDesign)
	## Sigma_1 in the paper:
	## the first diagonal block is pcorr x pcorr with compound symmetry,
	## the other diagonal block is the identity, and the off-diagonal blocks are 0
	pcorr = 10
	p = 500
	rho.cs = 0.8
	# build the full p x p matrix; only its first block is correlated
	rho = diag(c((1 - rho.cs) * rep(1, pcorr), rep(1, p - pcorr))) +
	  matrix(c(rho.cs * rep(1, pcorr), rep(0, p - pcorr)), ncol = 1) %*%
	  c(rep(1, pcorr), rep(0, p - pcorr))
	# Cholesky factor of the correlated block, passed as chol.rho
	chol.rho1.500 = chol(rho[1:pcorr, 1:pcorr])
	# largest eigenvalue of the full matrix
	lmax = max(eigen(rho)$values)
	print(lmax)
	set.seed(1)
	cv_method_corr(mu0 = 0.4, p = 500, m = 10, n = 80,
	  alpha_list = c(0.0000001, 0.0001, 0.01), nrep = 10, p1 = 0.6, ss = TRUE,
	  pcorr = pcorr, chol.rho = chol.rho1.500, sampling.p = 0.5)
	# returns 0.6689385 0.6806896 0.6513119
	# alpha_list should be a dense grid of p-value cut-offs;
	# three values are used here to keep the example simple

HDDesign documentation built on May 2, 2019, 6:41 a.m.