cv_performance_glm: cv_performance_glm

View source: R/dCVnet_utilities.R

cv_performance_glm    R Documentation

cv_performance_glm

Description

Cross-validated estimates of model performance by repeated k-fold cross-validation.

Usage

cv_performance_glm(
  y,
  data,
  f = "~.",
  folds = NULL,
  k = 10,
  nrep = 2,
  family = "binomial",
  opt.ystratify = TRUE,
  opt.uniquefolds = FALSE,
  return_summary = TRUE,
  offset = NULL,
  ...
)

Arguments

y

outcome vector (numeric or factor)

data

predictors in a data.frame

f

a formula to apply to data

folds

A list where each element is an integer vector of length n_cases. The integer for each case labels it as belonging to a fold in 1:n_folds. If supplied, this argument overrides both k and nrep.
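A minimal base-R sketch of a list in the format described for folds, assuming one list element per repeat (the description implies this but does not state it outright). The sizes n_cases, k and nrep and the element names are illustrative only.

```r
set.seed(42)  # reproducible fold assignment
n_cases <- 20
k <- 5
nrep <- 2

# One integer vector per repeat; each entry labels a case with a fold in 1:k.
folds <- lapply(seq_len(nrep), function(r) {
  sample(rep_len(seq_len(k), n_cases))  # balanced labels, shuffled
})
names(folds) <- paste0("Rep", seq_len(nrep))  # hypothetical element names

str(folds)
```

Each vector has length n_cases, and every fold label appears equally often because n_cases is a multiple of k here.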

k

the number of folds for k-fold cross-validation.

nrep

the number of repetitions

family

Either a character string naming one of the built-in families, or else a glm() family object. For more information, see the glmnet documentation.

opt.ystratify

Boolean. Outer and inner sampling is stratified by outcome. This is implemented with createFolds.
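To illustrate what stratifying by outcome means, here is a minimal base-R sketch. It is a stand-in for the idea, not the createFolds implementation: fold labels are dealt out separately within each outcome class, so every fold receives a similar class mix. The helper name stratified_folds is hypothetical.

```r
# Assign fold labels within each outcome class (hypothetical helper).
stratified_folds <- function(y, k) {
  y <- as.factor(y)
  fold <- integer(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    # balanced labels for this class, in random order
    fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  fold
}

set.seed(1)
y <- factor(rep(c("a", "b"), times = c(30, 10)))
fold <- stratified_folds(y, k = 5)

# Cross-tabulate fold membership against outcome class.
tab <- table(fold, y)
print(tab)
```

With 30 cases of class "a" and 10 of class "b" split over 5 folds, each fold holds 6 "a" cases and 2 "b" cases.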

opt.uniquefolds

Boolean. In most circumstances folds will be unique. This requests that random folds are checked for uniqueness in inner and outer loops. Currently it warns if non-unique values are found.
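One way such a uniqueness check could work is sketched below in base R. This is an assumption about the general approach, not the package's actual check: each fold vector is relabelled by order of first appearance, so two vectors that describe the same partition compare equal even when their labels differ. The helper names canonical and has_duplicate_folds are hypothetical.

```r
# Relabel folds by order of first appearance so that label permutations
# of the same partition map to an identical vector.
canonical <- function(fold) match(fold, unique(fold))

# TRUE if any two fold vectors in the list describe the same partition.
has_duplicate_folds <- function(folds) {
  anyDuplicated(lapply(folds, canonical)) > 0
}

a <- c(1L, 2L, 1L, 3L)
b <- c(2L, 3L, 2L, 1L)  # same partition as a, labels permuted
d <- c(1L, 1L, 2L, 3L)  # a different partition

dup_abd <- has_duplicate_folds(list(a, b, d))
dup_ad <- has_duplicate_folds(list(a, d))
print(c(dup_abd = dup_abd, dup_ad = dup_ad))
```

A caller could issue a warning when the check returns TRUE, which matches the behaviour described for this option.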

return_summary

Boolean. Return summarised performance (default), or performance objects for further analysis (set to FALSE).

offset

optional model offset (see glmnet)

...

other arguments

Details

This function is nothing revolutionary. The idea is to extend boot::cv.glm with an interface that better matches the other functions in this package.

The additions are:

  • Repeated k-fold rather than single k-fold

  • Option to provide the fold membership

  • Default use of stratified sampling by outcome class

  • Performance assessed with summary.performance
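The repeated k-fold idea listed above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: it uses simulated data, unstratified random folds, and plain classification accuracy in place of summary.performance. The function name cv_accuracy and the column name outcome are hypothetical.

```r
set.seed(1)
n <- 100
data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, 1, plogis(1.5 * data$x1))  # simulated binary outcome

# Returns one cross-validated accuracy per repeat (hypothetical helper).
cv_accuracy <- function(y, data, k = 10, nrep = 2) {
  df <- data.frame(outcome = y, data)
  sapply(seq_len(nrep), function(r) {
    fold <- sample(rep_len(seq_len(k), nrow(df)))  # new random folds each repeat
    pred <- numeric(nrow(df))
    for (i in seq_len(k)) {
      test <- fold == i
      # fit on the k - 1 training folds, predict the held-out fold
      fit <- glm(outcome ~ ., data = df[!test, , drop = FALSE],
                 family = binomial())
      pred[test] <- predict(fit, newdata = df[test, , drop = FALSE],
                            type = "response")
    }
    mean((pred > 0.5) == (y == 1))  # accuracy over all held-out predictions
  })
}

acc <- cv_accuracy(y, data, k = 5, nrep = 3)
print(acc)
```

Every case is predicted exactly once per repeat by a model that did not see it during fitting, and repeating with fresh folds shows how much the estimate varies with the partition.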

Value

A list containing the following:

  • glm.performance - summary(performance(x)) for the model fitted without cross-validation

  • cv.performance - report_performance_summary(cv.fits) for the cross-validated model

  • folds - the folds used in cross-validation

  • call - the function call

See Also

cv.glm, performance


AndrewLawrence/dCVnet documentation built on Sept. 24, 2024, 5:24 a.m.