cm | R Documentation |
Fit the following credibility models: Bühlmann, Bühlmann-Straub, hierarchical, regression (Hachemeister) or linear Bayes.
cm(formula, data, ratios, weights, subset,
regformula = NULL, regdata, adj.intercept = FALSE,
method = c("Buhlmann-Gisler", "Ohlsson", "iterative"),
likelihood, ...,
tol = sqrt(.Machine$double.eps), maxit = 100, echo = FALSE)
## S3 method for class 'cm'
print(x, ...)
## S3 method for class 'cm'
predict(object, levels = NULL, newdata, ...)
## S3 method for class 'cm'
summary(object, levels = NULL, newdata, ...)
## S3 method for class 'summary.cm'
print(x, ...)
formula |
character string |
data |
a matrix or a data frame containing the portfolio structure, the ratios or claim amounts and their associated weights, if any. |
ratios |
expression indicating the columns of |
weights |
expression indicating the columns of |
subset |
an optional logical expression indicating a subset of observations to be used in the modeling process. All observations are included by default. |
regformula |
an object of class |
regdata |
an optional data frame, list or environment (or object
coercible by |
adj.intercept |
if |
method |
estimation method for the variance components of the model; see Details. |
likelihood |
a character string giving the name of the likelihood function in one of the supported linear Bayes cases; see Details. |
tol |
tolerance level for the stopping criteria for iterative estimation method. |
maxit |
maximum number of iterations in iterative estimation method. |
echo |
logical; whether to echo the iterative procedure or not. |
x, object |
an object of class |
levels |
character vector indicating the levels to predict or to
include in the summary; if |
newdata |
data frame containing the variables used to predict credibility regression models. |
... |
parameters of the prior distribution for |
cm
is the unified front end for credibility models fitting. The
function supports hierarchical models with any number of levels (with
Bühlmann and Bühlmann-Straub models as
special cases) and the regression model of Hachemeister. Usage of
cm
is similar to lm
for these cases.
cm
can also fit linear Bayes models, in which case usage is
much simplified; see the section on linear Bayes below.
When not "bayes"
, the formula
argument symbolically
describes the structure of the portfolio in the form ~ terms
.
Each term is an interaction between risk factors contributing to the
total variance of the portfolio data. Terms are separated by +
operators and interactions within each term by :
. For a
portfolio divided first into sectors, then units and finally
contracts, formula
would be ~ sector + sector:unit +
sector:unit:contract
, where sector
, unit
and
contract
are column names in data
. In general, the
formula should be of the form ~ a + a:b + a:b:c + a:b:c:d +
...
.
If argument regformula
is not NULL
, the regression model
of Hachemeister is fit to the data. The response is usually time. By
default, the intercept of the model is located at time origin. If
argument adj.intercept
is TRUE
, the intercept is moved
to the (collective) barycenter of time, by orthogonalization of the
design matrix. Note that the regression coefficients may be difficult
to interpret in this case.
Arguments ratios
, weights
and subset
are used
like arguments select
, select
and subset
,
respectively, of function subset
.
Data does not have to be sorted by level. Nodes with no data (complete
lines of NA
except for the portfolio structure) are allowed,
with the restriction mentioned above.
Function cm
computes the structure parameters estimators of the
model specified in formula
. The value returned is an object of
class cm
.
An object of class "cm"
is a list with at least the following
components:
means |
a list containing, for each level, the vector of linearly sufficient statistics. |
weights |
a list containing, for each level, the vector of total weights. |
unbiased |
a vector containing the unbiased variance components
estimators, or |
iterative |
a vector containing the iterative variance components
estimators, or |
cred |
for multi-level hierarchical models: a list containing, the vector of credibility factors for each level. For one-level models: an array or vector of credibility factors. |
nodes |
a list containing, for each level, the vector of the number of nodes in the level. |
classification |
the columns of |
ordering |
a list containing, for each level, the affiliation of a node to the node of the level above. |
Regression fits have in addition the following components:
adj.models |
a list containing, for each node, the credibility
adjusted regression model as obtained with
|
transition |
if |
terms |
the |
The method of predict
for objects of class "cm"
computes
the credibility premiums for the nodes of every level included in
argument levels
(all by default). Result is a list the same
length as levels
or the number of levels in formula
, or
an atomic vector for one-level models.
The credibility premium at one level is a convex combination between
the linearly sufficient statistic of a node and the credibility
premium of the level above. (For the first level, the complement of
credibility is given to the collective premium.) The linearly
sufficient statistic of a node is the credibility weighted average of
the data of the node, except at the last level, where natural weights
are used. The credibility factor of node i
is equal to
\frac{w_i}{w_i + a/b},
where w_i
is the weight of the node used in the linearly
sufficient statistic, a
is the average within node variance and
b
is the average between node variance.
The credibility premium of node i
is equal to
y^\prime b_i^a,
where y
is a matrix created from newdata
and
b_i^a
is the vector of credibility adjusted regression
coefficients of node i
. The latter is given by
b_i^a = Z_i b_i + (I - Z_I) m,
where b_i
is the vector of regression coefficients based
on data of node i
only, m
is the vector of collective
regression coefficients, Z_i
is the credibility matrix and
I
is the identity matrix. The credibility matrix of node i
is equal to
A^{-1} (A + s^2 S_i),
where S_i
is the unscaled regression covariance matrix of
the node, s^2
is the average within node variance and
A
is the within node covariance matrix.
If the intercept is positioned at the barycenter of time, matrices
S_i
and A
(and hence Z_i
) are diagonal.
This amounts to use Bühlmann-Straub models for each
regression coefficient.
Argument newdata
provides the “future” value of the
regressors for prediction purposes. It should be given as specified in
predict.lm
.
For hierarchical models, two sets of estimators of the variance components (other than the within node variance) are available: unbiased estimators and iterative estimators.
Unbiased estimators are based on sums of squares of the form
B_i = \sum_j w_{ij} (X_{ij} - \bar{X}_i)^2 - (J - 1) a
and constants of the form
c_i = w_i - \sum_j \frac{w_{ij}^2}{w_i},
where X_{ij}
is the linearly sufficient statistic of
level (ij)
; \bar{X_{i}}
is the weighted average of
the latter using weights w_{ij}
; w_i = \sum_j
w_{ij}
; J
is the effective number of
nodes at level (ij)
; a
is the within variance of this
level. Weights w_{ij}
are the natural weights at the
lowest level, the sum of the natural weights the next level and the
sum of the credibility factors for all upper levels.
The Bühlmann-Gisler estimators (method =
"Buhlmann-Gisler"
) are given by
b = \frac{1}{I} \sum_i \max \left( \frac{B_i}{c_i}, 0
\right),
that is the average of the per node variance estimators truncated at 0.
The Ohlsson estimators (method = "Ohlsson"
) are given by
b = \frac{\sum_i B_i}{\sum_i c_i},
that is the weighted average of the per node variance estimators without any truncation. Note that negative estimates will be truncated to zero for credibility factor calculations.
In the Bühlmann-Straub model, these estimators are equivalent.
Iterative estimators method = "iterative"
are pseudo-estimators
of the form
b = \frac{1}{d} \sum_i w_i (X_i - \bar{X})^2,
where X_i
is the linearly sufficient statistic of one
level, \bar{X}
is the linearly sufficient statistic of
the level above and d
is the effective number of nodes at one
level minus the effective number of nodes of the level above. The
Ohlsson estimators are used as starting values.
For regression models, with the intercept at time origin, only
iterative estimators are available. If method
is different from
"iterative"
, a warning is issued. With the intercept at the
barycenter of time, the choice of estimators is the same as in the
Bühlmann-Straub model.
When formula
is "bayes"
, the function computes pure
Bayesian premiums for the following combinations of distributions
where they are linear credibility premiums:
X|\Theta = \theta \sim \mathrm{Poisson}(\theta)
and
\Theta \sim \mathrm{Gamma}(\alpha, \lambda)
;
X|\Theta = \theta \sim \mathrm{Exponential}(\theta)
and
\Theta \sim \mathrm{Gamma}(\alpha, \lambda)
;
X|\Theta = \theta \sim \mathrm{Gamma}(\tau, \theta)
and
\Theta \sim \mathrm{Gamma}(\alpha, \lambda)
;
X|\Theta = \theta \sim \mathrm{Normal}(\theta, \sigma_2^2)
and
\Theta \sim \mathrm{Normal}(\mu, \sigma_1^2)
;
X|\Theta = \theta \sim \mathrm{Bernoulli}(\theta)
and
\Theta \sim \mathrm{Beta}(a, b)
;
X|\Theta = \theta \sim \mathrm{Binomial}(\nu, \theta)
and
\Theta \sim \mathrm{Beta}(a, b)
;
X|\Theta = \theta \sim \mathrm{Geometric}(\theta)
and
\Theta \sim \mathrm{Beta}(a, b)
.
X|\Theta = \theta \sim \mathrm{Negative~Binomial}(r, \theta)
and
\Theta \sim \mathrm{Beta}(a, b)
.
The following combination is also supported:
X|\Theta = \theta \sim \mathrm{Single~Parameter~Pareto}(\theta)
and \Theta \sim \mathrm{Gamma}(\alpha, \lambda)
. In this case, the Bayesian estimator not of
the risk premium, but rather of parameter \theta
is linear with
a “credibility” factor that is not restricted to (0, 1)
.
Argument likelihood
identifies the distribution of X|\Theta
= \theta
as one of
"poisson"
,
"exponential"
,
"gamma"
,
"normal"
,
"bernoulli"
,
"binomial"
,
"geometric"
,
"negative binomial"
or
"pareto"
.
The parameters of the distributions of X|\Theta = \theta
(when
needed) and \Theta
are set in ...
using the argument
names (and default values) of dgamma
,
dnorm
, dbeta
,
dbinom
, dnbinom
or
dpareto1
, as appropriate. For the Gamma/Gamma case, use
shape.lik
for the shape parameter \tau
of the Gamma
likelihood. For the Normal/Normal case, use sd.lik
for the
standard error \sigma_2
of the Normal likelihood.
Data for the linear Bayes case may be a matrix or data frame as usual;
an atomic vector to fit the model to a single contract; missing or
NULL
to fit the prior model. Arguments ratios
,
weights
and subset
are ignored.
Vincent Goulet vincent.goulet@act.ulaval.ca, Xavier Milhaud, Tommy Ouellet, Louis-Philippe Pouliot
Bühlmann, H. and Gisler, A. (2005), A Course in Credibility Theory and its Applications, Springer.
Belhadj, H., Goulet, V. and Ouellet, T. (2009), On parameter estimation in hierarchical credibility, Astin Bulletin 39.
Goulet, V. (1998), Principles and application of credibility theory, Journal of Actuarial Practice 6, ISSN 1064-6647.
Goovaerts, M. J. and Hoogstad, W. J. (1987), Credibility Theory, Surveys of Actuarial Studies, No. 4, Nationale-Nederlanden N.V.
subset
, formula
,
lm
, predict.lm
.
data(hachemeister)
## Buhlmann-Straub model
fit <- cm(~state, hachemeister,
ratios = ratio.1:ratio.12, weights = weight.1:weight.12)
fit # print method
predict(fit) # credibility premiums
summary(fit) # more details
## Two-level hierarchical model. Notice that data does not have
## to be sorted by level
X <- data.frame(unit = c("A", "B", "A", "B", "B"), hachemeister)
fit <- cm(~unit + unit:state, X, ratio.1:ratio.12, weight.1:weight.12)
predict(fit)
predict(fit, levels = "unit") # unit credibility premiums only
summary(fit)
summary(fit, levels = "unit") # unit summaries only
## Regression model with intercept at time origin
fit <- cm(~state, hachemeister,
regformula = ~time, regdata = data.frame(time = 12:1),
ratios = ratio.1:ratio.12, weights = weight.1:weight.12)
fit
predict(fit, newdata = data.frame(time = 0))
summary(fit, newdata = data.frame(time = 0))
## Same regression model, with intercept at barycenter of time
fit <- cm(~state, hachemeister, adj.intercept = TRUE,
regformula = ~time, regdata = data.frame(time = 12:1),
ratios = ratio.1:ratio.12, weights = weight.1:weight.12)
fit
predict(fit, newdata = data.frame(time = 0))
summary(fit, newdata = data.frame(time = 0))
## Poisson/Gamma pure Bayesian model
fit <- cm("bayes", data = c(5, 3, 0, 1, 1),
likelihood = "poisson", shape = 3, rate = 3)
fit
predict(fit)
summary(fit)
## Normal/Normal pure Bayesian model
cm("bayes", data = c(5, 3, 0, 1, 1),
likelihood = "normal", sd.lik = 2,
mean = 2, sd = 1)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.