model_crossval: Train all insuRglm models on CV data
In realgabon/insuRglm: Tools for GLM Modeling in Insurance Context

model_crossval

R Documentation

Train all insuRglm models on CV data

Description

Trains the current (last) model and any saved insuRglm models on cross-validation (CV) data. Predictions are stored for later use. Uses parallel processing when future::plan(multiprocess) is declared beforehand.
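To illustrate what "train on CV data and store predictions" means in general, here is a minimal base R sketch of out-of-fold prediction. It is not insuRglm's internal code: the toy data, the `agecat` and `sev` columns, and the use of `glm()` with a log-link Gamma family are all assumptions made for the example.

```r
# Illustrative only: out-of-fold predictions with base R's glm().
# The data set and column names below are made up for this sketch.
set.seed(1)
n <- 500
toy <- data.frame(agecat = factor(sample(1:4, n, replace = TRUE)))
# Gamma-distributed severities whose mean grows with the age category.
toy$sev <- rgamma(n, shape = 2, rate = 2 / (1000 * as.integer(toy$agecat)))

cv_folds <- 5
# Randomly assign each row to one of the folds (sizes differ by at most one).
fold_id <- sample(rep_len(seq_len(cv_folds), n))

cv_pred <- numeric(n)
for (k in seq_len(cv_folds)) {
  held_out <- fold_id == k
  # Fit on all folds except k, then predict the held-out fold.
  fit <- glm(sev ~ agecat, family = Gamma(link = "log"), data = toy[!held_out, ])
  cv_pred[held_out] <- predict(fit, newdata = toy[held_out, ], type = "response")
}

head(cv_pred)
```

Each row ends up with exactly one prediction, made by a model that never saw that row. Predictions of this kind are what a call such as model_lift(data = 'crossval') would evaluate.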

Usage

model_crossval(setup, cv_folds = 10, stratified = FALSE, seed = NULL)

Arguments

`setup`	Setup object. Created at the start of the workflow. Usually piped in from previous step.
`cv_folds`	Integer scalar. Number of random CV folds to be used.
`stratified`	Boolean scalar. Whether to stratify losses and non-losses. This helps create more representative cross-validation folds for datasets that contain very few non-zero losses.
`seed`	Numeric scalar. Seed for reproducible random number generation, e.g. for creating CV folds. Will override seed created during setup.
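The effect of `cv_folds`, `stratified` and `seed` can be sketched with a small base R helper. `assign_folds()` is a hypothetical function written for this illustration. It is not part of insuRglm, and the package may build its folds differently.

```r
# Hypothetical helper, not part of insuRglm: assign each record to one of
# `cv_folds` folds, optionally balancing zero and non-zero losses.
assign_folds <- function(loss, cv_folds = 10, stratified = FALSE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Shuffled fold labels 1..cv_folds; fold sizes differ by at most one.
  random_labels <- function(m) sample(rep_len(seq_len(cv_folds), m))
  if (!stratified) {
    return(random_labels(length(loss)))
  }
  # Stratified: spread losses and non-losses over the folds separately.
  folds <- integer(length(loss))
  is_loss <- loss > 0
  folds[is_loss] <- random_labels(sum(is_loss))
  folds[!is_loss] <- random_labels(sum(!is_loss))
  folds
}

# Toy data: 1000 records, of which only 50 carry a non-zero loss.
loss <- c(rep(0, 950), rep(100, 50))
folds <- assign_folds(loss, cv_folds = 10, stratified = TRUE, seed = 42)
table(folds, has_loss = loss > 0)
```

With stratification, every fold in this toy example receives the same number of non-zero losses. Without it, a purely random split could leave some folds with far fewer losses than others.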

Value

Setup object with updated attributes.

Examples

library(insuRglm)
library(dplyr) # for the pipe operator
data('sev_train')

setup <- setup(
  data_train = sev_train,
  target = 'sev',
  weight = 'numclaims',
  family = 'gamma',
  keep_cols = c('pol_nbr', 'exposure', 'premium')
)

modeling <- setup %>%
  factor_add(pol_yr) %>%
  factor_add(agecat) %>%
  model_fit()

modeling_cv <- modeling %>%
  model_crossval()

modeling_cv %>%
  model_lift(data = 'crossval')

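# let's do more folds and use parallel processing for that
# (`multiprocess` is deprecated in recent versions of future; `multisession` is the replacement there)
library(future) # provides plan() and multiprocess
plan(multiprocess)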

modeling_cv <- modeling %>%
  model_crossval(cv_folds = 100)

modeling_cv %>%
  model_lift(data = 'crossval')

realgabon/insuRglm documentation built on Jan. 2, 2023, 2:51 a.m.