ds_method: Estimate PCC by DS Method


Description

Determines the probability of correct classification (PCC) for studies that use high-dimensional features for classification. The p-value threshold for feature selection is chosen by the method of Dobbin and Simon (2007).

Usage

ds_method(mu0, p, m, n, p1=0.5, lmax=1, ss=F, sampling.p)

Arguments

mu0

The effect size of the important features.

p

The total number of features.

m

The number of important features.

n

The total sample size for the two groups.

p1

The prevalence of group 1 in the population. Defaults to 0.5.

lmax

The maximum eigenvalue of the variance-covariance matrix of the p features. Defaults to 1, which corresponds to assuming the features are i.i.d.

ss

Logical; defaults to FALSE. If TRUE, the function also computes the sensitivity and specificity of the classifier.

sampling.p

The assumed proportion of group 1 samples in the training data. The default of 0.5 assumes the two groups are equally represented in the training data, regardless of p1.

Details

See Dobbin and Simon (2007) for details of the method.

Value

If ss=FALSE, the function returns the expected PCC. If ss=TRUE, the function returns a vector containing the expected PCC, sensitivity and specificity.
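
The vector returned with ss=TRUE can be labelled for readability. The following is a minimal sketch, assuming the HDDesign package is installed and that the three elements come back in the order listed above (PCC, sensitivity, specificity). The labels in `value_labels` are illustrative and are not names set by the package; `sampling.p` is passed explicitly at its documented default.

```r
# Illustrative labels (our own, not defined by HDDesign); the order is
# assumed from the Value description: PCC, sensitivity, specificity.
value_labels <- c("pcc", "sensitivity", "specificity")

# Only runs when HDDesign is installed; otherwise the call is skipped.
if (requireNamespace("HDDesign", quietly = TRUE)) {
  res <- HDDesign::ds_method(mu0 = 0.6, p = 500, m = 10, n = 38,
                             p1 = 0.5, lmax = 1, ss = TRUE,
                             sampling.p = 0.5)
  names(res) <- value_labels
  print(res)
}
```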

Author(s)

Meihua Wu <meihuawu@umich.edu>, Brisa N. Sanchez <brisa@umich.edu>, Peter X.K. Song <pxsong@umich.edu>, Raymond Luu <raluu@umich.edu>, Wen Wang <wangwen@umich.edu>

References

Dobbin, K.K., and Simon, R.M. (2007). "Sample Size Planning for Developing Classifiers Using High-dimensional DNA Microarray Data." Biostatistics 8 (1): 101-117.

Examples

ds_method(mu0=0.6, p=500, m=10, n=38, p1=0.5, lmax=1, ss=TRUE)
#[1] 0.9252471 0.9252471 0.9252471

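Because the function returns the expected PCC for a given total sample size, it can be evaluated over several candidate values of n when planning a study. The following is a hedged sketch rather than a recommended workflow: it assumes the HDDesign package is installed, the grid `n_grid` is a hypothetical choice made for illustration, and the other argument values are simply reused from the example above with `sampling.p` passed explicitly.

```r
# Hypothetical grid of candidate total sample sizes (illustration only).
n_grid <- seq(20, 100, by = 20)

# Only runs when HDDesign is installed; otherwise the sweep is skipped.
if (requireNamespace("HDDesign", quietly = TRUE)) {
  # With ss = FALSE the function is documented to return the expected PCC.
  pcc_by_n <- sapply(n_grid, function(n) {
    HDDesign::ds_method(mu0 = 0.6, p = 500, m = 10, n = n,
                        p1 = 0.5, lmax = 1, ss = FALSE,
                        sampling.p = 0.5)
  })
  print(data.frame(n = n_grid, pcc = pcc_by_n))
}
```

The smallest n whose expected PCC reaches the planned target can then be read off the printed table.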

HDDesign documentation built on May 2, 2019, 6:41 a.m.