This function fits the sparse Tweedie model on multi-source datasets along a sequence of regularization parameters lambda. The optimization is done by a Fortran95 routine.
x |
Either (1) a data frame containing the predictors, the responses (identifying the sources either by different columns in the simultaneous case or via an additional index column) and, optionally, the observation weights, or (2) a list of matrices containing only the predictors (mostly used internally for cross-validation). |
y |
Either (1) a single integer identifying the column of |
w |
(Optional) Either (1) a single integer identifying the column of |
source |
When |
rho |
Power used for the mean-variance relation of the Tweedie distribution. Possible range is [1,2], default is 1.5. |
nlambda |
The length of the regularization path. Disregarded if |
lambda.min |
The fraction of the first regularization parameter (which is computed to be the smallest such that no predictors are included) defining the last regularization parameter. Disregarded if |
lambda |
(Optional) User-specified sequence of positive regularization parameters. When omitted, the sequence is computed starting from the smallest value excluding all predictors from the model and decreasing to a fraction |
x.normalize |
Logical flag for standardization of the predictors prior to fitting the model. If |
eps |
Convergence threshold. Default is 1e-3. |
sr |
Logical flag for using the strong rule in the fit. Default is |
kktstop |
Logical flag for using the KKT conditions to stop the fit before the end of the regularization parameter sequence. Default is |
reg |
Either |
alpha |
Parameter controlling the balance between across-feature and within-feature sparsity in the penalty term (1-α)||β||_q + α||β||_1. Possible range is [0,1], default is 0. |
dfmax |
Maximum number of variables included in the model at a single time. Default is |
pmax |
Limits the number of features that can ever be nonzero. The difference with |
pf |
Per-feature weights in the penalty term. Mostly used internally when the Adaptive Lasso is used in cross-validation. Expects a vector of length |
maxit |
Maximum number of inner-loop iterations. Default is 10,000. |
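How nlambda and lambda.min jointly define the regularization path can be sketched in base R. This is only an illustration under stated assumptions: the spacing is taken to be log-linear, as is common in regularization-path software, but the page does not say how the package spaces the values. The names lambda_max and lambda_path and all numeric values are hypothetical, not package defaults.

```r
# Hypothetical values for illustration only (not the package defaults).
lambda_max <- 2.5    # smallest lambda excluding all predictors (assumed known)
nlambda    <- 5      # length of the regularization path
lambda_min <- 1e-3   # fraction of lambda_max defining the last value

# Assumption: log-linear spacing; the package may space the path differently.
lambda_path <- exp(seq(log(lambda_max), log(lambda_min * lambda_max),
                       length.out = nlambda))
print(lambda_path)
```

The path starts at lambda_max and decreases to lambda_min * lambda_max, so a larger lambda.min or a smaller nlambda shortens the range or the number of models to fit.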
The sequence of regularization parameters implies a sequence of models fitted by the IRLS-BSUM algorithm described in the reference. For each value of the parameter, this function yields a model optimizing the penalized Tweedie log-likelihood of multi-source data. The type of sparsity can be controlled by the arguments reg and alpha.
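To make the role of alpha concrete, the following base R sketch evaluates the penalty term (1-α)||β||_q + α||β||_1 for the coefficients of a single feature across sources. It is not the package's internal code: the helper feature_penalty is hypothetical, and q = 2 and q = Inf are assumed choices of norm, since the values accepted by reg are not listed here.

```r
# Penalty for one feature's coefficient vector across sources:
# (1 - alpha) * ||beta||_q + alpha * ||beta||_1
# q = 2 and q = Inf are assumed choices of the q-norm.
feature_penalty <- function(beta, alpha = 0, q = 2) {
  stopifnot(alpha >= 0, alpha <= 1)
  norm_q <- if (is.infinite(q)) max(abs(beta)) else sum(abs(beta)^q)^(1 / q)
  (1 - alpha) * norm_q + alpha * sum(abs(beta))
}

beta <- c(3, -4, 0)  # hypothetical coefficients of one feature in three sources

p_group <- feature_penalty(beta, alpha = 0)           # q-norm only
p_l1    <- feature_penalty(beta, alpha = 1)           # L1 norm only
p_mix   <- feature_penalty(beta, alpha = 0.5)         # equal mix of both
p_inf   <- feature_penalty(beta, alpha = 0, q = Inf)  # max-norm variant
print(c(p_group, p_l1, p_mix, p_inf))
```

With alpha = 0 only the q-norm of the whole coefficient vector is penalized, so the feature is treated as a unit across sources; increasing alpha adds an L1 term that can also zero out individual source-specific coefficients.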
The computation time is influenced by the arguments eps, nlambda, lambda.min (or lambda) and maxit. Consider adjusting these parameters to speed up computation. Small values of the regularization parameter often take the longest to fit; the kktstop argument can stop the algorithm before the end of the sequence if convergence is judged sufficient in terms of the KKT conditions.
To pass sources that lack features present in other sources, add a column of zeros in place of each missing feature.
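The zero-padding step can be sketched in base R for the list-of-matrices form of x. All object names and data below are hypothetical, and the snippet assumes that features are matched across sources by column name; it only prepares the predictors and does not call the fitting function.

```r
set.seed(1)
# Hypothetical predictors: source A has three features, source B lacks "x3".
x_a <- matrix(rnorm(12), nrow = 4, dimnames = list(NULL, c("x1", "x2", "x3")))
x_b <- matrix(rnorm(6),  nrow = 3, dimnames = list(NULL, c("x1", "x2")))

# Add a zero column for every feature that source B is missing,
# then reorder the columns to match source A.
missing_cols <- setdiff(colnames(x_a), colnames(x_b))
zeros <- matrix(0, nrow = nrow(x_b), ncol = length(missing_cols),
                dimnames = list(NULL, missing_cols))
x_b_padded <- cbind(x_b, zeros)[, colnames(x_a), drop = FALSE]

x_list <- list(x_a, x_b_padded)
print(sapply(x_list, ncol))  # both sources now have the same number of columns
```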
An object with S3 class MSTweedie:
beta0 |
A |
beta |
A list of length |
df |
The number of included variables along the regularization path. |
lambda |
The sequence of regularization parameters. |
npasses |
The number of inner-loop iterations. |
idvars |
The index of the variables in order of inclusion in the model. |
dim |
The dimensions of the model ( |
call |
The original call that produced this object. |
pf |
The penalty factors for the features. |
eps |
The convergence threshold used in the algorithm. |
kkt |
A |
norm |
A |
reg |
The type of regularization used in the algorithm. |
alpha |
The value of the argument |
y |
A list of length |
x |
A list of length |
w |
A list of length |
rho |
The power of the mean-variance relation used in the algorithm. |
M |
A |
time |
Computing time. |
Simon Fontaine, Yi Yang, Bo Fan, Wei Qian and Yuwen Gu.
Maintainer: Simon Fontaine fontaines@dms.umontreal.ca
Fontaine, S., Yang, Y., Fan, B., Qian, W. and Gu, Y. (2018). "A Unified Approach to Sparse Tweedie Model with Big Data Applications to Multi-Source Insurance Claim Data Analysis," to be submitted.
MSTweedie, coef.MSTweedie, print.MSTweedie, plot.MSTweedie, kkt.check, predict.MSTweedie