knitr::opts_chunk$set(echo = TRUE)
library(ReSurv)
library(dplyr)    # provides %>% and mutate(), used in the plotting code
library(ggplot2)
In this vignette we show how to simulate the individual data used in the simulation study of @hiabu23. The simulations are based on the `SynthETIC` package and can be used to replicate our results.
In the manuscript, we named the five scenarios Alpha, Beta, Gamma, Delta and Epsilon. All five scenarios share the data features described in the following table; they differ in the specific characteristics described in the coming sections.
| Covariates | Description |
|--------------------------------------------------|--------------------|
| `claim_number` | Policy identifier. |
| `claim_type` $\in \left\{0, 1 \right\}$ | Type of claim. |
| `AP` | Accident month. |
| `RP` | Reporting month. |
For each scenario we will show whether it satisfies the chain ladder assumptions (CL) and the proportionality assumption in @cox72 (PROP), and whether interactions are present (INT). Details on the simulation mechanism and the simulation parameters can be found in the manuscript.
Scenario Alpha is a mix of `claim_type` 0 and `claim_type` 1 with the same number of claims in each accident month (i.e. a constant claims volume).
# Input data
input_data_0 <- data_generator(
  random_seed = 1964,
  scenario = "alpha",
  time_unit = 1 / 360,
  years = 4,
  period_exposure = 200
)

input_data_0 %>%
  as.data.frame() %>%
  mutate(claim_type = as.factor(claim_type)) %>%
  ggplot(aes(x = RT - AT, color = claim_type)) +
  stat_ecdf(size = 1) +
  labs(title = "Empirical distribution of simulated notification delays",
       x = "Notification delay (in days)",
       y = "Cumulative Density") +
  xlim(0, 1500) +
  scale_color_manual(
    values = c("royalblue", "#a71429"),
    labels = c("Claim type 0", "Claim type 1")
  ) +
  scale_linetype_manual(values = c(1, 3),
                        labels = c("Claim type 0", "Claim type 1")) +
  guides(
    color = guide_legend(
      title = "Claim type",
      override.aes = list(color = c("royalblue", "#a71429"), size = 2)
    ),
    linetype = guide_legend(
      title = "Claim type",
      override.aes = list(linetype = c(1, 3), size = 0.7)
    )
  ) +
  theme_bw()
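As a quick check of the constant claims volume, one can tabulate the number of claims per accident month and claim type. The sketch below is only an illustration: it uses a small hand-made data frame (`toy_claims`, hypothetical values) with the column names `AP` and `claim_type` from the covariates table, not the simulated data itself.

```r
# Toy stand-in for the simulated data (hypothetical values), using the
# column names described in the covariates table.
toy_claims <- data.frame(
  claim_number = 1:12,
  claim_type   = rep(c(0, 1), times = 6),
  AP           = rep(1:3, each = 4)
)

# Number of claims per accident month (rows) and claim type (columns).
volume_by_month <- table(AP = toy_claims$AP,
                         claim_type = toy_claims$claim_type)
print(volume_by_month)

# A constant volume means that, for each claim type, every accident month
# has the same count.
constant_volume <- all(
  apply(volume_by_month, 2, function(x) length(unique(x)) == 1)
)
print(constant_volume)
```

With ReSurv installed, the same `table()` call could presumably be applied to `input_data_0`. For simulated data the counts may only be approximately equal, so inspecting the table is likely more informative than the strict equality check.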
Scenario Beta is similar to scenario Alpha, but the volume of `claim_type` 1 decreases in the most recent accident dates. When the longer-tailed bodily injury claims have a decreasing claim volume, aggregated chain ladder methods will overestimate reserves, see @ajne94.
input_data_1 <- data_generator(
  random_seed = 1964,
  scenario = 1,
  time_unit = 1 / 360,
  years = 4,
  period_exposure = 200
)

input_data_1 %>%
  as.data.frame() %>%
  mutate(claim_type = as.factor(claim_type)) %>%
  ggplot(aes(x = RT - AT, color = claim_type)) +
  stat_ecdf(size = 1) +
  labs(title = "Empirical distribution of simulated notification delays",
       x = "Notification delay (in days)",
       y = "Cumulative Density") +
  xlim(0, 1500) +
  scale_color_manual(
    values = c("royalblue", "#a71429"),
    labels = c("Claim type 0", "Claim type 1")
  ) +
  scale_linetype_manual(values = c(1, 3),
                        labels = c("Claim type 0", "Claim type 1")) +
  guides(
    color = guide_legend(
      title = "Claim type",
      override.aes = list(color = c("royalblue", "#a71429"), size = 2)
    ),
    linetype = guide_legend(
      title = "Claim type",
      override.aes = list(linetype = c(1, 3), size = 0.7)
    )
  ) +
  theme_bw()
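Why a shrinking volume of the long-tailed claim type leads aggregated chain ladder to overestimate can be seen in a small numerical example. The sketch below is a toy calculation with two accident periods and two development periods; the reporting patterns and volumes are made up for illustration and are not the parameters of the simulation.

```r
# Hypothetical cumulative reporting proportions by development period
# (rows: claim type, columns: development periods 1 and 2).
pattern <- rbind(type0 = c(0.8, 1.0),   # short tail
                 type1 = c(0.2, 1.0))   # long tail

# Hypothetical ultimate claim counts (rows: claim type, columns: accident
# period). The volume of type 1 drops in the most recent accident period.
volume <- rbind(type0 = c(100, 100),
                type1 = c(100, 20))

# Aggregated cumulative counts: rows are accident periods, columns are
# development periods.
cum_counts <- t(volume) %*% pattern

# Chain ladder development factor, estimated from the first accident
# period, which is the only fully developed one.
dev_factor <- cum_counts[1, 2] / cum_counts[1, 1]

# Chain ladder forecast of the ultimate count of the second accident
# period, based on the claims reported after one development period.
cl_ultimate   <- cum_counts[2, 1] * dev_factor
true_ultimate <- sum(volume[, 2])

print(c(chain_ladder = cl_ultimate, truth = true_ultimate))
# chain_ladder = 168, truth = 120
```

The development factor of 2 reflects the old mix, in which half of the claims are long tailed. In the recent accident period most of the 84 reported claims are short tailed, so far fewer late reports are still to come than the factor suggests.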
In scenario Gamma, an interaction between `claim_type` 1 and accident period affects the claims occurrence. One could imagine a scenario where a change in consumer behavior or company policies results in different reporting patterns over time. For the last simulated accident month, the two reporting delay distributions are identical.
# Input data
input_data_2 <- data_generator(
  random_seed = 1964,
  scenario = 2,
  time_unit = 1 / 360,
  years = 4,
  period_exposure = 200
)

input_data_2 %>%
  as.data.frame() %>%
  mutate(claim_type = as.factor(claim_type)) %>%
  ggplot(aes(x = RT - AT, color = claim_type)) +
  stat_ecdf(size = 1) +
  labs(title = "Empirical distribution of simulated notification delays",
       x = "Notification delay (in days)",
       y = "Cumulative Density") +
  xlim(0, 1500) +
  scale_color_manual(
    values = c("royalblue", "#a71429"),
    labels = c("Claim type 0", "Claim type 1")
  ) +
  scale_linetype_manual(values = c(1, 3),
                        labels = c("Claim type 0", "Claim type 1")) +
  guides(
    color = guide_legend(
      title = "Claim type",
      override.aes = list(color = c("royalblue", "#a71429"), size = 2)
    ),
    linetype = guide_legend(
      title = "Claim type",
      override.aes = list(linetype = c(1, 3), size = 0.7)
    )
  ) +
  theme_bw()
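An interaction between claim type and accident period means that the difference between the two claim types depends on when the accident happened. One simple way to look for it is to compare the delays of the two types separately for early and late accident months. The sketch below does this on a hand-made data frame; `toy_claims`, `median_delay()` and the cut-off `split_at` are hypothetical names and values introduced for illustration, and only the columns `AP`, `AT`, `RT` and `claim_type` are taken from the code above.

```r
# Toy stand-in for the simulated data (hypothetical values). In the early
# accident month the two claim types differ; in the late one they do not.
toy_claims <- data.frame(
  claim_type = rep(c(0, 0, 1, 1), times = 2),
  AP         = rep(c(1, 2), each = 4),
  AT         = c(10, 20, 10, 20, 40, 50, 40, 50),
  RT         = c(15, 35, 60, 90, 45, 65, 45, 65)
)

# Median notification delay by block of accident months and claim type.
median_delay <- function(claims, split_at) {
  delay  <- claims$RT - claims$AT
  period <- ifelse(claims$AP <= split_at, "early", "late")
  tapply(delay,
         list(period = period, claim_type = claims$claim_type),
         median)
}

delay_table <- median_delay(toy_claims, split_at = 1)
print(delay_table)

# Difference between the claim types within each block of accident months.
type_gap <- delay_table[, "1"] - delay_table[, "0"]
print(type_gap)
# early = 50, late = 0
```

A gap that changes between the early and the late block, as in this toy example, is what an interaction looks like in such a summary. On the simulated data the pattern would be noisier, and a finer split of the accident months may be needed.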
In scenario Delta, a seasonality effect that depends on the accident month is present for both `claim_type` 0 and `claim_type` 1. This could occur in a real-world setting with an increased workload during winter for certain claim types, or a reduced workforce during the summer holidays.
input_data_3 <- data_generator(
  random_seed = 1964,
  scenario = 3,
  time_unit = 1 / 360,
  years = 4,
  period_exposure = 200
)

input_data_3 %>%
  as.data.frame() %>%
  mutate(claim_type = as.factor(claim_type)) %>%
  ggplot(aes(x = RT - AT, color = claim_type)) +
  stat_ecdf(size = 1) +
  labs(title = "Empirical distribution of simulated notification delays",
       x = "Notification delay (in days)",
       y = "Cumulative Density") +
  xlim(0, 1500) +
  scale_color_manual(
    values = c("royalblue", "#a71429"),
    labels = c("Claim type 0", "Claim type 1")
  ) +
  scale_linetype_manual(values = c(1, 3),
                        labels = c("Claim type 0", "Claim type 1")) +
  guides(
    color = guide_legend(
      title = "Claim type",
      override.aes = list(color = c("royalblue", "#a71429"), size = 2)
    ),
    linetype = guide_legend(
      title = "Claim type",
      override.aes = list(linetype = c(1, 3), size = 0.7)
    )
  ) +
  theme_bw()
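A seasonal effect is hidden in the pooled empirical distribution, but it becomes visible when the delays are averaged by calendar month of the accident. The sketch below illustrates the idea on a hand-made data frame. Two things are assumptions made for illustration only: that accident month 1 corresponds to the first calendar month of a year, and the cosine shape of the toy seasonal effect, which is not the specification used in the simulation.

```r
# Toy stand-in for the simulated data (hypothetical values): one claim per
# claim type in each of 24 accident months, with a 12-month cycle in the
# notification delay.
toy_claims <- data.frame(
  AP         = rep(1:24, times = 2),
  claim_type = rep(c(0, 1), each = 24)
)
toy_claims$AT <- (toy_claims$AP - 1) * 30 + 15
toy_claims$RT <- toy_claims$AT + 30 + 10 * cos(2 * pi * toy_claims$AP / 12)

# Calendar month of the accident (1 to 12), assuming that accident month 1
# is the first month of a year.
calendar_month <- (toy_claims$AP - 1) %% 12 + 1

# Average notification delay per calendar month.
mean_delay <- tapply(toy_claims$RT - toy_claims$AT, calendar_month, mean)
print(round(mean_delay, 1))
```

In this toy example the average delay peaks in calendar month 12 (about 40 days) and is lowest in month 6 (about 20 days). The same `tapply()` summary could be computed per claim type to see whether the seasonal pattern differs between types.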
In scenario Epsilon, the data generating process violates the proportionality assumption of @cox72. We generate the data assuming that a) the covariates affect the baseline and b) the proportionality assumption does not hold.
# Input data
input_data_4 <- data_generator(
  random_seed = 1964,
  scenario = 4,
  time_unit = 1 / 360,
  years = 4,
  period_exposure = 200
)

input_data_4 %>%
  as.data.frame() %>%
  mutate(claim_type = as.factor(claim_type)) %>%
  ggplot(aes(x = RT - AT, color = claim_type)) +
  stat_ecdf(size = 1) +
  labs(title = "Empirical distribution of simulated notification delays",
       x = "Notification delay (in days)",
       y = "Cumulative Density") +
  xlim(0, 1500) +
  scale_color_manual(
    values = c("royalblue", "#a71429"),
    labels = c("Claim type 0", "Claim type 1")
  ) +
  scale_linetype_manual(values = c(1, 3),
                        labels = c("Claim type 0", "Claim type 1")) +
  guides(
    color = guide_legend(
      title = "Claim type",
      override.aes = list(color = c("royalblue", "#a71429"), size = 2)
    ),
    linetype = guide_legend(
      title = "Claim type",
      override.aes = list(linetype = c(1, 3), size = 0.7)
    )
  ) +
  theme_bw()
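A violation of proportionality can also be examined formally. One common diagnostic, which is not necessarily the one used in the manuscript, is the test based on scaled Schoenfeld residuals implemented in `cox.zph()` from the `survival` package. The sketch below applies it to hypothetical delays drawn from Weibull distributions with a different shape per claim type, so that the hazard ratio changes over time. The data frame `toy_delays` and all parameter values are assumptions for illustration, and the sketch ignores any truncation or censoring that reserving data would have.

```r
library(survival)

set.seed(1964)
n <- 2000

# Hypothetical notification delays: Weibull with a different shape for each
# claim type, which makes the hazards non-proportional.
toy_delays <- data.frame(
  claim_type = factor(rep(c(0, 1), each = n)),
  delay      = c(rweibull(n, shape = 0.7, scale = 100),
                 rweibull(n, shape = 2.0, scale = 100))
)

# Cox model with claim type as the only covariate; every delay is treated
# as fully observed.
cox_fit <- coxph(Surv(delay) ~ claim_type, data = toy_delays)

# Test of the proportional hazards assumption; a small p-value speaks
# against proportionality.
ph_test <- cox.zph(cox_fit)
print(ph_test)
```

With shapes this different the test should reject proportionality clearly. When the two shapes are equal, the hazards are proportional and large p-values are expected instead.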