The package provides some predefined GAMLSS families, e.g. NBinomialLSS. Objects of the class families provide a convenient way to specify GAMLSS distributions to be fitted by one of the boosting algorithms implemented in this package. By using the function Families, a new object of the class families can be generated.
############################################################
# Families for continuous response
# Gaussian distribution
GaussianLSS(mu = NULL, sigma = NULL,
            stabilization = c("none", "MAD", "L2"))
# Student's t-distribution
StudentTLSS(mu = NULL, sigma = NULL, df = NULL,
            stabilization = c("none", "MAD", "L2"))
############################################################
# Families for continuous non-negative response
# Gamma distribution
GammaLSS(mu = NULL, sigma = NULL,
         stabilization = c("none", "MAD", "L2"))
############################################################
# Families for fractions and bounded continuous response
# Beta distribution
BetaLSS(mu = NULL, phi = NULL,
        stabilization = c("none", "MAD", "L2"))
############################################################
# Families for count data
# Negative binomial distribution
NBinomialLSS(mu = NULL, sigma = NULL,
             stabilization = c("none", "MAD", "L2"))
# Zero-inflated Poisson distribution
ZIPoLSS(mu = NULL, sigma = NULL,
        stabilization = c("none", "MAD", "L2"))
# Zero-inflated negative binomial distribution
ZINBLSS(mu = NULL, sigma = NULL, nu = NULL,
        stabilization = c("none", "MAD", "L2"))
############################################################
# Families for survival models (accelerated failure time
# models) for data with right censoring
# Log-normal distribution
LogNormalLSS(mu = NULL, sigma = NULL,
             stabilization = c("none", "MAD", "L2"))
# Log-logistic distribution
LogLogLSS(mu = NULL, sigma = NULL,
          stabilization = c("none", "MAD", "L2"))
# Weibull distribution
WeibullLSS(mu = NULL, sigma = NULL,
           stabilization = c("none", "MAD", "L2"))
############################################################
# Constructor function for new GAMLSS distributions
Families(..., qfun = NULL, name = NULL)

... 
subfamilies to be passed to constructor. 
qfun 
quantile function. This function can for example be used
to compute (marginal) prediction intervals. See

name 
name of the families. 
mu 
offset value for mu. 
sigma 
offset value for sigma. 
phi 
offset value for phi. 
df 
offset value for df. 
nu 
offset value for nu. 
stabilization 
governs whether the negative gradient is standardized in each boosting step. It can be either "none", "MAD" or "L2". See also Details below. 
The arguments of the families are the offsets for each distribution parameter. Offsets can be either scalar, a vector with length equal to the number of observations or NULL (default). In the latter case, a scalar offset for this component is computed by minimizing the risk function w.r.t. the corresponding distribution parameter (keeping the other parameters fixed).
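The default offset computation can be illustrated with a small sketch in base R. It is not the package's internal code: the Gaussian loss, the search interval and the helper names gaussian_loss and offset_mu are assumptions made for illustration only. With sigma held fixed, the sketch searches for the scalar mu that minimizes the empirical risk.

```r
# Hypothetical illustration (not gamboostLSS internals): a scalar offset for mu
# obtained by minimizing the Gaussian negative log-likelihood with sigma fixed.
gaussian_loss <- function(y, mu, sigma) {
    -dnorm(y, mean = mu, sd = sigma, log = TRUE)
}

offset_mu <- function(y, sigma = 1, w = rep(1, length(y))) {
    # risk is the weighted sum of the loss, viewed as a function of mu only
    risk <- function(mu) sum(w * gaussian_loss(y, mu, sigma))
    # one-dimensional search over the observed range of y
    optimize(risk, interval = range(y))$minimum
}

set.seed(1)
y <- rnorm(200, mean = 3, sd = 2)
offset_mu(y)  # should be close to mean(y), the minimizer of the Gaussian risk
```

For distributions without a closed-form minimizer, the same one-dimensional search applies; only the loss function changes.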
Note that gamboostLSS is not restricted to three components but can handle an arbitrary number of components (which, of course, depends on the GAMLSS distribution). However, it is important that the names (for the offsets, in the subfamilies etc.) are chosen consistently.
The ZIPoLSS families can be used to fit zero-inflated Poisson models. Here, mu and sigma refer to the location parameter of the Poisson component (with log link) and the mean of the zero-generating process (with logit link), respectively.
Similarly, ZINBLSS can be used to fit zero-inflated negative binomial models. Here, mu and sigma refer to the location and scale parameters (with log link) of the negative binomial component of the model. The zero-generating process (with logit link) is represented by nu.
The Families function can be used to implement a new GAMLSS distribution which can be used for fitting by mboostLSS. To this end, the function builds a list of subfamilies, one for each distribution parameter. The subfamilies themselves are objects of the class boost_family, and can be constructed via the function Family of the mboost package.
Arguments to be passed to Family: The loss for every distribution parameter (contained in objects of class boost_family) is the negative log-likelihood of the corresponding distribution. The ngradient is the negative partial derivative of the loss function with respect to the distribution parameter. For a two-parameter distribution (e.g. mu and sigma), the user therefore has to specify two subfamilies with Family. The loss is basically the same function for both parameters, only ngradient differs. Both subfamilies are passed to the Families constructor, which returns an object of the class families.
To (potentially) stabilize the model estimation by standardizing the negative gradients one can use the argument stabilization of the families. If stabilization = "MAD", the negative gradient is divided by its (weighted) median absolute deviation
median_i(|u_{k,i} - median_j(u_{k,j})|)
in each boosting step. See Hofner et al. (2016) for details. An alternative is stabilization = "L2", where the gradient is divided by its (weighted) mean L2 norm. This results in negative gradient vectors (and hence also updates) of similar size for each distribution parameter, but also for every boosting iteration.
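The effect of the two stabilization options can be sketched in base R. The helper stabilize_ngradient below is hypothetical and not part of gamboostLSS; it ignores observation weights for simplicity and reads "mean L2 norm" as the root mean squared gradient, which is an assumption.

```r
# Hypothetical helper (not part of gamboostLSS): standardize a negative
# gradient vector u. Observation weights are ignored for simplicity.
stabilize_ngradient <- function(u, stabilization = c("none", "MAD", "L2")) {
    stabilization <- match.arg(stabilization)
    switch(stabilization,
        none = u,
        # divide by the (unweighted) median absolute deviation
        MAD = u / median(abs(u - median(u))),
        # divide by the root mean squared gradient (assumed reading of
        # "mean L2 norm")
        L2 = u / sqrt(mean(u^2))
    )
}

u <- c(-4, -1, 0, 2, 8)
stabilize_ngradient(u, "MAD")
stabilize_ngradient(u, "L2")
```

After either standardization the gradient vectors of different distribution parameters are on a comparable scale, which is what keeps the updates of similar size across components and iterations.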
An object of class families.
BetaLSS for boosting beta regression was implemented by Florian Wickler.
B. Hofner, A. Mayr, M. Schmid (2016). gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework. Journal of Statistical Software, 74(1), 1-31.
Available as vignette("gamboostLSS_Tutorial").
Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012): Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society, Series C (Applied Statistics) 61(3): 403-427.
Rigby, R. A. and D. M. Stasinopoulos (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507-554.
as.families for applying GAMLSS distributions provided in the framework of the gamlss package.
The functions gamboostLSS and glmboostLSS can be used for model fitting.
See also the corresponding constructor function Family in mboost.
## Example to define a new distribution:
## Student's t-distribution with two parameters, df and mu:
library("gamboostLSS")  ## gamboostLSS depends on mboost, which provides Family()
## sub-Family for mu
## -> generate object of the class family from the package mboost
newStudentTMu <- function(mu, df){
    # loss is negative log-likelihood, f is the parameter to be fitted with
    # id link -> f = mu
    loss <- function(df, y, f) {
        -1 * (lgamma((df + 1)/2) - lgamma(1/2) -
              lgamma(df/2) - 0.5 * log(df) -
              (df + 1)/2 * log(1 + (y - f)^2/(df )))
    }
    # risk is sum of loss
    risk <- function(y, f, w = 1) {
        sum(w * loss(y = y, f = f, df = df))
    }
    # ngradient is the negative derivative w.r.t. mu (=f)
    ngradient <- function(y, f, w = 1) {
        (df + 1) * (y - f)/(df + (y - f)^2)
    }
    # use the Family constructor of mboost
    Family(ngradient = ngradient, risk = risk, loss = loss,
           response = function(f) f,
           name = "new Student's t-distribution: mu (id link)")
}
## sub-Family for df
newStudentTDf <- function(mu, df){
    # loss is negative log-likelihood, f is the parameter to be fitted with
    # log-link: exp(f) = df
    loss <- function( mu, y, f) {
        -1 * (lgamma((exp(f) + 1)/2) - lgamma(1/2) -
              lgamma(exp(f)/2) - 0.5 * f -
              (exp(f) + 1)/2 * log(1 + (y - mu)^2/(exp(f) )))
    }
    # risk is sum of loss
    risk <- function(y, f, w = 1) {
        sum(w * loss(y = y, f = f, mu = mu))
    }
    # ngradient is the negative derivative of the loss w.r.t. f
    # in this case, just the derivative of the log-likelihood
    ngradient <- function(y, f, w = 1) {
        exp(f)/2 * (digamma((exp(f) + 1)/2) - digamma(exp(f)/2)) -
            0.5 - (exp(f)/2 * log(1 + (y - mu)^2 / (exp(f) )) -
                   (y - mu)^2 / (1 + (y - mu)^2 / exp(f)) * (exp(-f) + 1)/2)
    }
    # use the Family constructor of mboost
    Family(ngradient = ngradient, risk = risk, loss = loss,
           response = function(f) exp(f),
           name = "Student's t-distribution: df (log link)")
}
## families object for new distribution
newStudentT <- Families(mu = newStudentTMu(mu = mu, df = df),
                        df = newStudentTDf(mu = mu, df = df))
### Do not test the following code per default on CRAN as it takes some time to run:
### usage of the new Student's t distribution:
library(gamlss) ## required for rTF
set.seed(1907)
n <- 5000
x1 <- runif(n)
x2 <- runif(n)
mu <- 2 - 1*x1 - 3*x2
df <- exp(1 + 0.5*x1)
y <- rTF(n = n, mu = mu, nu = df)
## model fitting
model <- glmboostLSS(y ~ x1 + x2, families = newStudentT,
                     control = boost_control(mstop = 100),
                     center = TRUE)
## shrunken effect estimates
coef(model, off2int = TRUE)
## compare to pre-defined three-parameter t-distribution:
model2 <- glmboostLSS(y ~ x1 + x2, families = StudentTLSS(),
                      control = boost_control(mstop = 100),
                      center = TRUE)
coef(model2, off2int = TRUE)
## with effect on sigma:
sigma <- 3 + 1*x2
y <- rTF(n = n, mu = mu, nu = df, sigma = sigma)
model3 <- glmboostLSS(y ~ x1 + x2, families = StudentTLSS(),
                      control = boost_control(mstop = 100),
                      center = TRUE)
coef(model3, off2int = TRUE)
