mod_cv: Compare models with k-fold cross validation
In ProjectMOSAIC/mosaicModel: Create, Visualize, and Predict with Models

Compare models with k-fold cross validation

mod_cv(..., k = 10, ntrials = 5, error_type = c("default", "mse", "sse", "mad", "LL", "mLL", "dev", "class_error"), blockwise = FALSE)

`...`	one or more models on which to perform the cross validation
`k`	the k in k-fold. Cross-validation will use (k-1)/k of the data for training and the remaining 1/k for testing.
`ntrials`	how many random partitions to make. Each partition will be one case in the output of the function
`error_type`	The kind of output to produce from each cross-validation. See `mod_error` for details.
`blockwise`	When TRUE, carries out the trials in a blockwise mode, suitable for time series.
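
As a rough illustration of how the arguments above fit together, the following sketch compares two linear models. The models and the choice of the built-in `mtcars` data are assumptions made for this example only; the call uses the argument names from the usage line above and is skipped if mosaicModel is not installed.

```r
# Two candidate regression models fit to the same built-in data set
# (mod1, mod2 and the use of mtcars are illustrative choices)
mod1 <- lm(mpg ~ hp, data = mtcars)
mod2 <- lm(mpg ~ hp + wt, data = mtcars)

# Assumes the mosaicModel package is installed; skipped otherwise
if (requireNamespace("mosaicModel", quietly = TRUE)) {
  cv_results <- mosaicModel::mod_cv(mod1, mod2, k = 10, ntrials = 5,
                                    error_type = "mse")
  print(cv_results)
}
```

With `ntrials = 5`, each model is cross-validated on five different random partitions, so the spread of the results across trials gives a sense of how much the comparison depends on the particular partition.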

The purpose of cross-validation is to provide "new" data on which to test a model's performance. In k-fold cross-validation, the data set used to train the model is split into new training and testing sets: most of the data is used for training, and the remainder is reserved for evaluating (testing) the model. Rather than training a single model, k models are trained, each with its own testing set. The k testing sets are arranged so that together they cover the whole data set.

On each of the k testing sets, a performance measure is calculated. Which measure is most appropriate depends on the kind of model: regression model or classifier. The most basic measure is the mean square error: the average of the squared differences between the actual response variable in the testing data and the output of the model when presented with inputs from the testing data. This is appropriate for many regression models.
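
The procedure described above can be sketched by hand in base R. This is a minimal illustration of one k-fold trial with mean square error, not the code `mod_cv` runs internally; the model formula, the `mtcars` data, and all variable names are assumptions made for the example.

```r
# One k-fold cross-validation trial, written out by hand in base R
set.seed(1)                      # make the random partition reproducible
k <- 5
n <- nrow(mtcars)

# Assign every row to exactly one of k folds, in random order
fold <- sample(rep(seq_len(k), length.out = n))

fold_mse <- numeric(k)
for (i in seq_len(k)) {
  train <- mtcars[fold != i, ]   # (k-1)/k of the rows
  test  <- mtcars[fold == i, ]   # the held-out 1/k of the rows
  fit   <- lm(mpg ~ hp + wt, data = train)
  pred  <- predict(fit, newdata = test)
  # Mean square error: average squared difference on the held-out rows
  fold_mse[i] <- mean((test$mpg - pred)^2)
}

# Summarize the trial by averaging over the k testing sets
cv_mse <- mean(fold_mse)
```

Repeating this with a different random `fold` assignment corresponds to one more trial in the sense of the `ntrials` argument.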

The blockwise mode is intended for time-series-like data, where successive rows may be correlated. Rather than picking each of the k blocks as a random sample of rows, adjacent rows will be used for model evaluation.
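
The difference between the two ways of forming testing sets can be seen by comparing fold assignments directly. The sketch below is an assumption about what contiguous blocks look like in general, not a description of how `mod_cv` builds them; `n`, `k`, and the variable names are hypothetical.

```r
# Contrast random fold assignment with contiguous (blockwise) assignment
set.seed(1)
n <- 20   # number of rows, in time order
k <- 4    # number of folds

# Random mode: each fold is a random sample of rows
random_fold <- sample(rep(seq_len(k), length.out = n))

# Blockwise mode (illustrative): each fold is a run of adjacent rows
block_fold <- sort(rep(seq_len(k), length.out = n))

# Rows held out when fold 2 is the testing set
random_test_rows <- which(random_fold == 2)
block_test_rows  <- which(block_fold == 2)
```

In the blockwise assignment the held-out rows form a single consecutive run, so a testing set is not interleaved with closely correlated neighbouring rows from the training set.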

ProjectMOSAIC/mosaicModel documentation built on May 13, 2019, 1:35 a.m.