Cross-Validation of Logistic Regression Model

View source: R/LogisticRegression.R

cv.lr  R Documentation



Implementation of cross-validation for an "lr" object, calculating the prediction error across a number of subsets of the input data set.

cv.lr(
  lrfit,
  metric = "mse",
  leave_out = nrow(lrfit$data)/10,
  verbose = TRUE,
  seed = 1
)


lrfit

an object of class "lr", the output of lr.

metric

which metric to calculate; one of "mse", "auc", "log" or "all". See 'Details'.

leave_out

number of points to leave out in each cross-validation fold.

verbose

logical; whether to print the number of iterations completed.

seed

optional; a number passed to set.seed before shuffling the data set.


cv.lr performs k-fold cross-validation, where the number of points held out in each fold, k, is given by the leave_out argument. This can be used to judge the out-of-sample predictive power of the model: the original data set is split into two partitions; the model is fitted on the (usually larger) partition, and its predictions are tested on the (usually smaller) held-out partition. The positions of the k points separated from the data set are selected uniformly at random.
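The partitioning described above can be sketched as follows. This is a minimal illustration of the idea, not the package's internal code; `make_folds` is a hypothetical helper, and the shuffle-then-split scheme is an assumption consistent with the Details text.

```r
# Sketch: shuffle row positions uniformly at random, then split them into
# folds of `leave_out` held-out points each.
make_folds <- function(n, leave_out, seed = 1) {
  set.seed(seed)                               # reproducible shuffle
  idx <- sample(n)                             # random permutation of 1:n
  split(idx, ceiling(seq_along(idx) / leave_out))
}

folds <- make_folds(n = 100, leave_out = 10)
length(folds)   # 10 folds, each holding out 10 points
```

Each fold's indices mark the held-out partition; the remaining rows form the fitting partition for that iteration.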

The error metrics available are mean squared error, AUC, and log score, selected by setting the metric argument to "mse", "auc", "log" or "all". If metric is "all", a vector containing all three metrics is returned.
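The three metrics can be sketched on a single held-out fold as below, assuming predicted probabilities `p` and binary responses `y`. These are the standard textbook definitions, not code lifted from the package source.

```r
# Hedged sketches of the error metrics on held-out points.
mse_metric <- function(y, p) mean((y - p)^2)                       # mean squared error
log_metric <- function(y, p) -mean(y * log(p) + (1 - y) * log(1 - p))  # log score
auc_metric <- function(y, p) {                                     # rank (Wilcoxon) form of AUC
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  r  <- rank(p)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

y <- c(0, 0, 1, 1); p <- c(0.1, 0.2, 0.8, 0.9)
mse_metric(y, p)   # 0.025
auc_metric(y, p)   # 1: every positive is ranked above every negative
```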

Note that the output from metric = "auc" is non-deterministic due to the shuffling of the data set. To mitigate this, pass a fixed number to the seed argument.


An error value, or a vector of error values, giving the average of the chosen metric across the cross-validation folds.

dannyjameswilliams/danielR documentation built on Aug. 20, 2023, 3:25 a.m.