knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.height = 3.5, fig.path = "man/figures/README-" )
Generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution. The underlying model is still linear, but it is related to the response variable via a link function.
This type of model is implemented in many software packages, including R base glm
function. However, in insurance setting, the methodology for building GLM models is sometimes different from the usual use. While there's certainly an opportunity for automation to narrow down the predictor space, the core predictors are usually selected very carefully with almost manual approach. It is considered important to inspect every predictor from multiple angles and make sure it has both statistical and business significance.
This approach is usually carried out through the use of specialized commercial software products, which offer graphical user interface and limit the possibility of any automation or customization. The insuRglm package aims to provide an opensource and transparent alternative, which can be integrated into existing R workflows, thus allowing bigger degree of automation and customization.
The tools provided by this package are:
# Make sure you have devtools installed install.packages("devtools") # Then you can install from this repository devtools::install_github("dgabris/insuRglm", build_vignettes = TRUE)
The vignette can be also found here
# Each function is documented ?setup ?model_visualize # See more in-depth usage examples in the main vignette vignette('insuRglm')
First step of using the package should always be the one-time setup
function.
data('sev_train') library(insuRglm) library(magrittr) # for the %>% setup <- setup( data_train = sev_train, target = 'sev', weight = 'numclaims', family = 'gamma', keep_cols = c('pol_nbr', 'exposure', 'premium') )
You can explore the target, data and correlations using functions explore_target
, explore_data
and explore_corr
.
setup %>% explore_target(lower_quantile = 0.05, upper_quantile = 0.95) setup %>% explore_data(type = 'tabular', factors = c('pol_yr', 'agecat'))
Add or remove predictors to model formula with factor_add
and factor_remove
setup %>% factor_add(pol_yr) %>% factor_add(agecat)
Fit and visualize current model's predictors, as well as unfitted variables
modeling <- setup %>% factor_add(pol_yr) %>% factor_add(agecat) %>% model_fit() modeling %>% model_visualize(factors = 'fitted') modeling %>% model_visualize(factors = 'unfitted')
file.remove("sev_gamma_model.rds") file.remove("sev_gamma_setup.rds")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.