cv.PO.EN: Cross-validation function of PO-EN model


View source: R/cross-validation.R

Description

Performs k-fold cross-validation for PO-EN and returns the pair of lambda and prevalence parameter values that gives the best fit.

Usage

cv.PO.EN(X, Y, alpha = 0.5, o.iter = 5, i.iter = 20,
         epsilon = 1e-4, nfolds = 10, type.measure = 'deviance',
         depth = 100, input.pi = 0.5, a = sqrt(0.5), seed = 1)

Arguments

X

Input design matrix. Should not include the intercept vector.

Y

Response variable. Should be a binary vector.

alpha

The elastic net mixing parameter, with 0 ≤ alpha ≤ 1.

o.iter

Number of outer-loop iterations.

i.iter

Number of inner-loop iterations.

epsilon

The threshold for stopping the coordinate descent algorithm.

nfolds

The number of folds used for cross-validation. The default is 10. The number of presence observations must be a multiple of nfolds.

type.measure

The loss function to use for tuning lambda. The default is type.measure='deviance'. Other choices include AUROC (type.measure='auc') and F measure (type.measure='F.measure').

depth

The ratio between the largest lambda and the smallest lambda of the candidate sequence of lambda.

input.pi

The user-supplied sequence of candidate prevalence values.

a

The parameter of the F measure used for tuning the true prevalence. The default is sqrt(0.5).

seed

A single value used to seed the random number generator.
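The constraint on nfolds can be checked before calling the function. The following is a minimal sketch, not the package's internal code: it assumes presences are coded as 1 in Y, and the helper name assign_presence_folds is hypothetical.

```r
# Hypothetical helper (not part of PO.EN): check the nfolds constraint and
# assign each presence observation to one of nfolds equally sized folds.
assign_presence_folds <- function(Y, nfolds = 10, seed = 1) {
  presence <- which(Y == 1)            # assumption: presences are coded as 1
  if (length(presence) %% nfolds != 0) {
    stop("number of presence observations must be a multiple of nfolds")
  }
  set.seed(seed)
  # Shuffle balanced fold labels so each fold gets the same number of presences.
  fold_id <- sample(rep(seq_len(nfolds), length.out = length(presence)))
  data.frame(index = presence, fold = fold_id)
}

Y <- rep(c(1, 0), times = c(20, 30))   # toy data: 20 presences, 30 other points
folds <- assign_presence_folds(Y, nfolds = 10)
table(folds$fold)                      # every fold holds 2 presences
```

With 20 presences and nfolds = 10 the check passes; with nfolds = 3 the helper stops with an error, which mirrors the requirement stated for nfolds.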

Details

The function runs nfolds-fold cross-validation to select an optimal pair of lambda and the prevalence parameter. The default is 10-fold cross-validation. The candidate sequence of lambda is generated automatically by the function based on a warm start. The values of input.pi should be supplied by the user.
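To illustrate how depth and input.pi define the candidates, here is a sketch that assumes a log-spaced lambda grid, as is common for elastic net paths. The starting value lambda_max, the grid length n_lambda, the log spacing, and the exhaustive pairing of lambda with prevalence values are all assumptions for illustration; the package may build and search its candidates differently.

```r
lambda_max <- 1        # assumed largest lambda (illustrative value)
depth      <- 100      # ratio of the largest to the smallest lambda
n_lambda   <- 5        # assumed grid length (illustrative value)

# Log-spaced grid from lambda_max down to lambda_max / depth.
lambda_seq <- exp(seq(log(lambda_max), log(lambda_max / depth),
                      length.out = n_lambda))

# Candidate prevalence values, in the form passed as input.pi.
input_pi <- seq(0.01, 0.4, length.out = 4)

# Every (lambda, pi) pair that an exhaustive grid search would evaluate.
grid <- expand.grid(lambda = lambda_seq, pi = input_pi)
nrow(grid)             # 20 candidate pairs
```

A larger depth stretches the grid toward smaller penalties, so more weakly regularized fits are considered.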

Value

lambda.min

The value of lambda that gives the minimum (or maximum, depending on type.measure) mean cross-validated error.

lambda.1se

The largest value of lambda such that the error is within 1 standard error of the minimum.

pi

The value of the prevalence parameter that gives the maximum F measure.
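The difference between lambda.min and lambda.1se can be seen in a small worked example of the one-standard-error rule. The numbers below are made up for illustration and are not output from cv.PO.EN; the sketch assumes an error that is minimized, such as deviance.

```r
# Illustrative numbers only; not output from cv.PO.EN.
lambda  <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cv_mean <- c(0.90, 0.62, 0.55, 0.52, 0.54)   # mean CV error per lambda
cv_se   <- c(0.05, 0.04, 0.04, 0.04, 0.05)   # standard error per lambda

# lambda.min: the lambda with the smallest mean CV error.
i_min      <- which.min(cv_mean)
lambda_min <- lambda[i_min]

# lambda.1se: the largest lambda whose mean error is within one
# standard error of the minimum.
threshold  <- cv_mean[i_min] + cv_se[i_min]
lambda_1se <- max(lambda[cv_mean <= threshold])
```

Here lambda_min is 0.10 and lambda_1se is 0.25: the larger penalty gives a sparser model whose error is statistically close to the best one.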

Examples

library(PO.EN)
data(example.data)  # example datasets: a training set and a testing set
train_data <- example.data$train.data
# Response vector and design matrix of the training data
y_train <- train_data$response
x_train <- train_data[, -1]

# Cross-validate over lambda and four candidate prevalence values
PO.EN.cv <- cv.PO.EN(x_train, y_train, input.pi = seq(0.01, 0.4, length.out = 4))

# Refit PO-EN with the selected lambda and prevalence
PO.EN.beta <- PO.EN(x_train, y_train, lambda = PO.EN.cv$lambda.min,
                    true.prob = PO.EN.cv$pi, beta_start = rep(0, ncol(x_train) + 1))

PO.EN documentation built on Aug. 19, 2020, 9:06 a.m.