gcv: Estimate EPE Using Delete-d Cross-Validation
In gencve: General Cross Validation Engine

Description

This is a general-purpose function for estimating the expected prediction error (EPE) under a specified cost function in regression and classification problems. For regression, the default cost function is the mean-square error; for classification, it is the misclassification rate. The package includes direct support for elastic penalty regression, LASSO, PCR, PLSR, nearest neighbour and Random Forest regression. For classification, built-in support functions are provided for LDA, QDA, Naive Bayes, kNN, CART, C5.0, Random Forest and SVM. The vignette section provides examples for SCAD, MCP and best subset regression. Illustrative example datasets and data generation models are also provided.
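The delete-d procedure described above can be sketched in base R. This is only an illustration of the idea, not the gencve implementation: the helper name delete_d_cv is hypothetical, the predictor is fixed to lm, and reporting sd_epe as the standard error of the per-iteration costs is an assumption about one plausible definition.

```r
# Minimal delete-d cross-validation sketch in base R.
# delete_d_cv() is a hypothetical helper, not part of gencve.
delete_d_cv <- function(X, y, MaxIter = 100, d = ceiling(length(y) / 10)) {
  n <- length(y)
  df <- data.frame(X, y = y)
  costs <- numeric(MaxIter)
  for (i in seq_len(MaxIter)) {
    test_idx <- sample(n, d)  # hold out d observations at random
    fit <- lm(y ~ ., data = df[-test_idx, , drop = FALSE])
    pred <- predict(fit, newdata = df[test_idx, , drop = FALSE])
    costs[i] <- mean((df$y[test_idx] - pred)^2)  # mean-square error on the hold-out
  }
  # Assumption: sd_epe is taken here as the standard error of the mean cost.
  c(epe = mean(costs), sd_epe = sd(costs) / sqrt(MaxIter))
}

set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
y <- 1 + 2 * X[, 1] - X[, 2] + rnorm(100)
res <- delete_d_cv(X, y, MaxIter = 50, d = 10)
print(res)
```

Each iteration draws a fresh random hold-out sample of size d, so, unlike K-fold cross-validation, the hold-out samples may overlap between iterations.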

Usage

gcv(X, y, MaxIter = 1000, d = ceiling(length(y)/10), NCores = 1,
  cost = mse,  yhat = yhat_lm, libs = character(0), seed = "default",
  ...)

Arguments

`X`	Inputs: a matrix or data frame.
`y`	Output vector.
`MaxIter`	Number of iterations of the CV procedure.
`d`	Number of observations in the hold-out sample.
`NCores`	Number of cores to use. The default, 1, does not use the parallel package. Otherwise, set it to the number of cores available. If unsure, just experiment!
`cost`	Function computing the average cost. See examples mse, mae, mape.
`yhat`	In general, a function with arguments dfTrain and dfTest. See examples below.
`libs`	Libraries required by the predictor.
`seed`	The default uses R's default seed, which is based on the current time. Otherwise, set it to an integer value. See Details.
`...`	Additional arguments passed to yhat.
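As a rough illustration of the yhat and cost arguments, the sketch below defines a baseline predictor and a cost function. Both names (yhat_mean, rmse) are hypothetical and not part of gencve. Two assumptions are made about the interface: that the response is the last column of dfTrain, and that a cost function takes the observed values followed by the predictions.

```r
# Hypothetical predictor with the dfTrain/dfTest interface described above.
# Assumption: the response is the last column of dfTrain.
yhat_mean <- function(dfTrain, dfTest) {
  ytrain <- dfTrain[[ncol(dfTrain)]]
  # Baseline: predict the training mean for every test observation.
  rep(mean(ytrain), nrow(dfTest))
}

# Hypothetical cost function.
# Assumption: the signature is cost(observed, predicted).
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))

# Check both functions directly on a small data frame, without gencve.
df <- data.frame(x = 1:10, y = seq(2, 20, by = 2))
train <- df[1:8, ]
test <- df[9:10, ]
pred <- yhat_mean(train, test)
err <- rmse(test$y, pred)
print(pred)
print(err)
```

If these assumptions match the package's interface, the two functions could be supplied as yhat = yhat_mean and cost = rmse in a call to gcv.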

Details

If only serial evaluation had been implemented, I would have used set.seed to control the random number generation. However, I have included seed as an argument because it can also be used to set the seed for the parallel random number generator. This is sometimes useful for replicating simulations. If the seed argument is supplied, it also sets the seed when only serial computation is done.
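The difference between serial and parallel seeding can be sketched with base R and the parallel package. This shows one standard mechanism, clusterSetRNGStream; whether gcv uses exactly this mechanism internally is an assumption, and the two-worker cluster is chosen only for illustration.

```r
library(parallel)

# Serial case: set.seed alone makes random hold-out samples repeatable.
set.seed(123)
a_serial <- sample(100, 5)
set.seed(123)
b_serial <- sample(100, 5)
same_serial <- identical(a_serial, b_serial)

# Parallel case: each worker has its own random number stream, so
# set.seed in the main session does not control the workers.
# clusterSetRNGStream seeds the worker streams reproducibly.
cl <- makeCluster(2)
clusterSetRNGStream(cl, iseed = 123)
a_par <- parSapply(cl, 1:4, function(i) runif(1))
clusterSetRNGStream(cl, iseed = 123)
b_par <- parSapply(cl, 1:4, function(i) runif(1))
stopCluster(cl)
same_parallel <- identical(a_par, b_par)

print(c(serial = same_serial, parallel = same_parallel))
```

Reproducibility in the parallel case also depends on using the same number of workers in both runs, since the tasks are split across the worker streams.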

Value

A matrix with one row and four columns: epe, sd_epe, snr, pcorr. These are, respectively, the estimated EPE, the standard deviation of this estimate, an out-of-sample estimate of the signal-to-noise ratio (SNR), and an out-of-sample estimate of the correlation between the prediction and the true value.

Note

The statistical distribution of the EPEs obtained when the argument outAllQ is set to TRUE is often positively skewed. This may be of interest in applications.

Author(s)

A. I. McLeod

References

ESL

See Also

mse, mae, mape, misclassificationrate, logloss, yhat_lm, yhat_nn, yhat_lars, yhat_plus, yhat_gel, yhat_step, yh_lda, yh_qda, yh_svm, yh_NB, yh_RF, yh_CART, yh_C50, yh_kNN, featureSelect, cv.glm

Examples

# Simple example; in general, MaxIter >= 1000 is recommended.
# ShaoReg() generates example regression data: inputs in columns 1-8, output in column 9.
Xy <- ShaoReg()
gcv(Xy[, 1:8], Xy[, 9], MaxIter = 25, d = 5)

gencve documentation built on May 2, 2019, 6:08 a.m.
