This is a general-purpose function to estimate the expected prediction error (EPE) of a specified cost function in regression and classification problems. For regression, the default cost function is the mean-square error; for classification, it is the misclassification rate. The package includes direct support for elastic penalty regression, LASSO, PCR, PLSR, nearest neighbour and Random Forest regression. For classification, built-in support functions are provided for LDA, QDA, Naive Bayes, kNN, CART, C5.0, Random Forest and SVM. Examples in the vignettes cover SCAD, MCP and best subset regression. Illustrative example datasets and data generation models are also provided.
X | Inputs, a matrix or data frame.
y | Output vector.
MaxIter | Number of iterations of the CV procedure.
d | Number of observations in the hold-out sample.
NCores | Number of cores to use. The default, 1, does not use the parallel package. Otherwise, set it to the number of cores available. If unsure, experiment.
cost | Cost function giving the average cost. See the examples mse, mae, mape.
yhat | Prediction function. In general it must be a function with arguments dfTrain and dfTest. See the examples.
libs | Libraries required by the predictor.
seed | The default is to use R's default seed, which is based on the current time. Otherwise, set it to an integer value. See Details.
... | Additional arguments that are passed to yhat.
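The roles of cost, yhat, MaxIter and d can be illustrated with a minimal base-R sketch of delete-d cross-validation. This is not the package's implementation: the names mse_sketch, yhat_lm_sketch and cv_sketch are hypothetical, the convention that the response is the last column of dfTrain and dfTest is an assumption, and the standard error used for sd_epe is only one reading of "standard deviation of this estimate".

```r
# Hypothetical cost function: mean-square error of the predictions.
mse_sketch <- function(y, yHat) mean((y - yHat)^2)

# Hypothetical predictor with the dfTrain/dfTest argument convention.
# Assumption: the response is the last column of both data frames.
yhat_lm_sketch <- function(dfTrain, dfTest) {
  p <- ncol(dfTrain)
  names(dfTrain)[p] <- "y"
  names(dfTest)[p] <- "y"
  fit <- lm(y ~ ., data = dfTrain)
  as.vector(predict(fit, newdata = dfTest))
}

# Delete-d cross-validation: hold out d random observations, MaxIter times,
# and average the cost over the iterations.
cv_sketch <- function(X, y, MaxIter = 100, d = 10,
                      cost = mse_sketch, yhat = yhat_lm_sketch, ...) {
  df <- data.frame(X, y = y)
  n <- nrow(df)
  costs <- numeric(MaxIter)
  for (i in seq_len(MaxIter)) {
    test <- sample.int(n, d)
    pred <- yhat(df[-test, , drop = FALSE], df[test, , drop = FALSE], ...)
    costs[i] <- cost(df$y[test], pred)
  }
  # sd_epe here is the standard error of the mean cost (an assumption).
  c(epe = mean(costs), sd_epe = sd(costs) / sqrt(MaxIter))
}

# Simulated regression data with unit error variance.
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)
y <- 1 + X[, 1] - 2 * X[, 2] + rnorm(100)
res <- cv_sketch(X, y, MaxIter = 50, d = 10)
print(res)
```

Any other predictor can be swapped in by passing a different function with the same two arguments as yhat.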
If only serial evaluation were implemented, I would have used set.seed to control the random number generation. But I have included seed as an argument since it can also be used to set the parallel random number generator seed. This is sometimes useful for replicating the simulations. If the argument seed is used, it will also set the seed when only serial computation is done.
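The seeding behaviour described here could be sketched as follows, using only base R and the parallel package. The helper names draw_one_holdout and draw_holdouts are hypothetical and the package's internal code may differ. In the serial case the seed is passed to set.seed; in the parallel case it is passed to clusterSetRNGStream, which gives each worker its own reproducible random number stream.

```r
library(parallel)

# Hypothetical task: draw one random hold-out index set of size d from 1..n.
draw_one_holdout <- function(i, n, d) sample.int(n, d)

# Hypothetical helper: draw MaxIter hold-out sets, serially or in parallel.
draw_holdouts <- function(n, d, MaxIter, NCores = 1, seed = NULL) {
  if (NCores == 1) {
    # Serial case: the seed argument simply calls set.seed.
    if (!is.null(seed)) set.seed(seed)
    return(lapply(seq_len(MaxIter), draw_one_holdout, n = n, d = d))
  }
  cl <- makeCluster(NCores)
  on.exit(stopCluster(cl))
  # Parallel case: seed the per-worker random number streams.
  if (!is.null(seed)) clusterSetRNGStream(cl, iseed = seed)
  parLapply(cl, seq_len(MaxIter), draw_one_holdout, n = n, d = d)
}

# Serial runs with the same seed give the same hold-out sets.
a <- draw_holdouts(n = 20, d = 5, MaxIter = 3, seed = 42)
b <- draw_holdouts(n = 20, d = 5, MaxIter = 3, seed = 42)
print(identical(a, b))
```

With NCores greater than 1, results are reproducible only when the same seed and the same number of workers are used, because the tasks are divided among the workers in a fixed way.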
Matrix with one row and four columns: epe, sd_epe, snr, pcorr. These are respectively the estimated EPE, standard deviation of this estimate, an estimate of the snr (signal-to-noise ratio) out-of-sample and an out-of-sample estimate of the correlation between the prediction and the true value.
When the argument outAllQ is set to TRUE, the statistical distribution of the EPE values is often positively skewed. This may be of interest in applications.
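One way to examine this skewness is to keep the cost from every iteration and compute a sample skewness coefficient. The sketch below is a base-R illustration on simulated data and does not call the package; the variable names and the moment-based skewness formula are choices made for this example.

```r
# Simulated simple linear regression data.
set.seed(2)
n <- 100
d <- 5
MaxIter <- 200
x <- rnorm(n)
y <- 2 * x + rnorm(n)
df <- data.frame(x = x, y = y)

# Mean-square error on a random hold-out set of size d, repeated MaxIter times.
costs <- replicate(MaxIter, {
  test <- sample.int(n, d)
  fit <- lm(y ~ x, data = df[-test, ])
  mean((df$y[test] - predict(fit, newdata = df[test, ]))^2)
})

# Moment-based sample skewness; values above zero indicate a long right tail.
skew <- mean((costs - mean(costs))^3) / sd(costs)^3
print(skew)
```

A histogram of costs, for example hist(costs), shows the same feature graphically.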
A. I. McLeod
ESL
mse, mae, mape, misclassificationrate, logloss, yhat_lm, yhat_nn, yhat_lars, yhat_plus, yhat_gel, yhat_step, yh_lda, yh_qda, yh_svm, yh_NB, yh_RF, yh_CART, yh_C50, yh_kNN, featureSelect, cv.glm