Given a type I error upper bound alpha and a violation upper bound delta, npc calculates the Neyman-Pearson classifier, which controls the type I error under alpha with probability at least 1-delta.
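The guarantee above depends on how the score threshold is chosen from held-out class 0 scores. A minimal sketch, assuming the order-statistic rule described in the referenced paper (Tong, Feng and Li, 2018): with n held-out class 0 scores, taking the k-th smallest score as the threshold gives a violation probability of at most P(Binomial(n, 1 - alpha) >= k), and the smallest k keeping this at or below delta is chosen. The name np_order is hypothetical and not part of the package, whose internal implementation may differ.

```r
# Hypothetical helper (not part of nproc): find the smallest order k such that
# using the k-th smallest held-out class 0 score as the threshold keeps the
# probability of a type I error violation at or below delta.
np_order <- function(n, alpha = 0.05, delta = 0.05) {
  k <- seq_len(n)
  # Upper bound on the violation probability for each k: P(Binomial(n, 1 - alpha) >= k)
  violation <- pbinom(k - 1, size = n, prob = 1 - alpha, lower.tail = FALSE)
  ok <- which(violation <= delta)
  # With too few class 0 observations, no order statistic meets the bound
  if (length(ok) == 0) return(NA_integer_)
  ok[1]
}

np_order(59)  # 59
np_order(58)  # NA
```

Under this rule, alpha = delta = 0.05 requires at least 59 held-out class 0 observations, since 0.95^59 is the first power of 0.95 to fall below 0.05.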
x | n * p observation matrix: n observations, p covariates.
y | n 0/1 observations.
method | base classification method.
alpha | the desired upper bound on the type I error. Default = 0.05.
delta | the violation rate of the type I error. Default = 0.05.
split | the number of splits for the class 0 sample. Default = 1. For the ensemble version, choose split > 1.
split.ratio | the proportion of the class 0 sample used to train the base classifier. The rest is used to estimate the threshold. Can also be set to "adaptive", in which case the ratio is determined using a data-driven method.
n.cores | number of cores used for parallel computing. Default = 1. WARNING: Windows machines are not supported.
band | whether to generate both lower and upper bounds of the type II error. Default = FALSE.
nfolds | number of folds used for adaptive split ratio selection. Default = 10.
randSeed | the random seed used in the algorithm.
warning | whether to show warnings. Default = TRUE.
... | additional arguments.
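To make the role of split.ratio concrete, here is a minimal sketch of one split of the class 0 sample into a part for training the base classifier and a part for estimating the threshold. The function name split_class0, the use of floor() for rounding, and the seeding are assumptions for illustration only; the package's own splitting code may differ.

```r
# Hypothetical helper (not part of nproc): split the class 0 indices once.
split_class0 <- function(y, split.ratio = 0.5, seed = 1) {
  set.seed(seed)                        # note: this resets the global RNG state
  ind0 <- which(y == 0)                 # positions of class 0 observations
  n0 <- length(ind0)
  n.train <- floor(n0 * split.ratio)    # assumed rounding rule
  train <- ind0[sample.int(n0, n.train)]
  list(train = sort(train),             # class 0 rows for training the base classifier
       threshold = setdiff(ind0, train)) # remaining class 0 rows for the threshold
}

y <- c(rep(0, 100), rep(1, 50))
parts <- split_class0(y, split.ratio = 0.5)
length(parts$train)      # 50
length(parts$threshold)  # 50
```

With split > 1, a split of this kind would be repeated with different random draws, giving one fit per split.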
An object with S3 class npc.
fits | a list of length max(1,split), containing the fit from each split.
method | the base classification method.
split | the number of splits used.
Xin Tong, Yang Feng, and Jingyi Jessica Li (2018), Neyman-Pearson (NP) classification algorithms and NP receiver operating characteristic (NP-ROC), Science Advances, 4, 2, eaao1659.
nproc and predict.npc
set.seed(1)
n = 1000
x = matrix(rnorm(n*2),n,2)
c = 1+3*x[,1]
y = rbinom(n,1,1/(1+exp(-c)))
xtest = matrix(rnorm(n*2),n,2)
ctest = 1+3*xtest[,1]
ytest = rbinom(n,1,1/(1+exp(-ctest)))
##Use lda classifier and the default type I error control with alpha=0.05, delta=0.05
fit = npc(x, y, method = 'lda')
pred = predict(fit,xtest)
fit.score = predict(fit,x)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ', accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')
## Not run:
##Ensembled lda classifier with split = 11, alpha=0.05, delta=0.05
fit = npc(x, y, method = 'lda', split = 11)
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ', accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')
##Now, change the method to logistic regression and change alpha to 0.1
fit = npc(x, y, method = 'logistic', alpha = 0.1)
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ', accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')
##Now, change the method to adaboost
fit = npc(x, y, method = 'ada', alpha = 0.1)
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ', accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')
##Now, try the adaptive splitting ratio
fit = npc(x, y, method = 'ada', alpha = 0.1, split.ratio = 'adaptive')
pred = predict(fit,xtest)
accuracy = mean(pred$pred.label==ytest)
cat('Overall Accuracy: ', accuracy,'\n')
ind0 = which(ytest==0)
typeI = mean(pred$pred.label[ind0]!=ytest[ind0]) #type I error on test set
cat('Type I error: ', typeI, '\n')
cat('Splitting ratio:', fit$split.ratio)
## End(Not run)
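The examples above repeat the same accuracy and type I error calculation after every fit. As a sketch, that calculation can be wrapped in a small helper. The name np_eval is hypothetical and not part of the package, and the type II error is added here for completeness.

```r
# Hypothetical helper (not part of nproc): summarise 0/1 predictions
# against the true 0/1 labels.
np_eval <- function(pred.label, y) {
  ind0 <- which(y == 0)
  ind1 <- which(y == 1)
  c(accuracy = mean(pred.label == y),
    typeI    = mean(pred.label[ind0] != y[ind0]),  # class 0 predicted as 1
    typeII   = mean(pred.label[ind1] != y[ind1]))  # class 1 predicted as 0
}

y.true <- c(0, 0, 0, 0, 1, 1, 1, 1)
y.pred <- c(0, 0, 0, 1, 1, 1, 0, 0)
np_eval(y.pred, y.true)
# accuracy 0.625, typeI 0.25, typeII 0.5
```

With a fitted object from the examples, np_eval(predict(fit, xtest)$pred.label, ytest) would return the same quantities that the cat() calls print.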