| setup | R Documentation |
Creates a setup object that is the basis for any insuRglm modeling workflow. This object is subsequently used as a main input in most functions in the package.
setup(
data_train,
data_test = NULL,
target,
weight = NULL,
offset = NULL,
family = c("poisson", "gamma", "tweedie"),
tweedie_p = NULL,
simple_factors = NULL,
keep_cols = NULL,
glm_backend = c("speedglm", "stats"),
folder = getwd(),
load_file_nm = NULL,
save_file_nm = NULL,
seed = NULL
)
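The vector defaults shown for family and glm_backend follow a common R idiom in which the first element is used when the argument is omitted and any other value must match one of the listed choices. A minimal base R sketch of that idiom is below; pick_family is a hypothetical helper, and whether setup resolves its arguments with match.arg() is an assumption rather than something this page states.

```r
# Hypothetical helper illustrating the vector-default idiom.
# This is NOT part of insuRglm.
pick_family <- function(family = c("poisson", "gamma", "tweedie")) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for values outside the listed choices.
  match.arg(family)
}

default_family <- pick_family()         # argument omitted
chosen_family  <- pick_family("gamma")  # explicit choice

# An unsupported value raises an error, captured here for inspection.
bad_family <- tryCatch(pick_family("normal"), error = function(e) "error")
```

Under this idiom, passing family explicitly (as the examples on this page do) keeps the call self-documenting even though a default exists.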
data_train |
Dataframe. Training data |
data_test |
Dataframe. Test data |
target |
Character scalar. Name of the target variable |
weight |
Character scalar. Name of the weight variable |
family |
Character scalar. Name of distribution family. One of 'poisson', 'gamma', 'tweedie'. |
tweedie_p |
Numeric scalar. Tweedie variance power, if family is 'tweedie'. |
simple_factors |
Character vector. Names of potential predictors. These predictors need to be |
keep_cols |
Character vector. Names of columns that are not potential predictors, but should be kept in data. |
glm_backend |
Character scalar. Either 'speedglm' or 'stats'. Choosing 'speedglm' results in using
|
folder |
Character scalar. Path to an existing folder where setup/model files will be stored. |
load_file_nm |
Character scalar. Filename of an existing setup object created by running setup.
Must be within the folder specified by folder. |
save_file_nm |
Character scalar. Filename of a setup object saved during this run of the setup function.
Will be saved within the folder specified by folder. |
seed |
Numeric scalar. Seed for reproducible random number generation, e.g. for creating CV folds. |
offset |
Character scalar. Name of the offset variable, applicable for |
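The seed argument is described as making random steps such as CV fold creation reproducible. The base R sketch below shows the general idea only; assign_folds is a hypothetical helper and does not reflect how insuRglm actually builds its folds.

```r
# Hypothetical helper, NOT part of insuRglm: assigns each row to one of
# n_folds folds, optionally fixing the random number generator seed first.
assign_folds <- function(n_rows, n_folds = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Build a balanced vector of fold ids, then shuffle it.
  sample(rep_len(seq_len(n_folds), n_rows))
}

folds_a <- assign_folds(100, seed = 42)
folds_b <- assign_folds(100, seed = 42)

# The same seed yields the same fold assignment.
same_folds <- identical(folds_a, folds_b)
```

Without a seed, repeated calls would generally produce different assignments, which makes cross-validation results harder to compare across runs.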
List of class setup. Contains attributes and objects used by other functions in the package.
A short summary of the train/test datasets is written to the console.
require(dplyr) # for the pipe operator
# poisson distribution target
data('freq_train')
setup <- setup(
data_train = freq_train,
target = 'freq',
offset = 'exposure',
family = 'poisson',
keep_cols = c('pol_nbr', 'premium')
)
# gamma distribution target
data('sev_train')
setup <- setup(
data_train = sev_train,
target = 'sev',
weight = 'numclaims',
family = 'gamma',
keep_cols = c('pol_nbr', 'exposure', 'premium')
)
# tweedie distribution - burning cost
data('bc_train')
setup <- setup(
data_train = bc_train,
target = 'bc',
weight = 'exposure',
family = 'tweedie',
tweedie_p = 1.75, # use tweedie::tweedie.profile to determine the best value
keep_cols = c('pol_nbr', 'premium')
)
# tweedie distribution - loss ratio
data('lr_train')
setup <- setup(
data_train = lr_train,
target = 'lr',
weight = 'premium',
family = 'tweedie',
tweedie_p = 1.75, # use tweedie::tweedie.profile to determine the best value
keep_cols = c('pol_nbr', 'exposure')
)
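The comments in the Tweedie examples point to tweedie::tweedie.profile for choosing tweedie_p. The sketch below shows one way such a profile might be run. It uses simulated data and an intercept-only formula purely for illustration; the sample size, the grid of candidate powers, and the simulation parameters are all assumptions, and a real analysis would profile the actual target with its predictors and weights.

```r
library(tweedie)

set.seed(123)

# Simulated Tweedie responses (illustrative values, not package data).
y <- rtweedie(500, power = 1.5, mu = 2, phi = 1)

# Profile the likelihood over a grid of candidate variance powers.
# Plotting and confidence intervals are switched off to keep the run short.
prof <- tweedie.profile(
  y ~ 1,
  p.vec = seq(1.2, 1.8, by = 0.1),
  do.plot = FALSE,
  do.ci = FALSE
)

# Estimated variance power, a candidate value for tweedie_p.
p_hat <- prof$p.max
```

The resulting estimate could then be passed as tweedie_p in a call to setup with family = 'tweedie'.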