Fit BAMLSS with Backfitting
Description
This optimizer function is a generic tool for fitting BAMLSS using a backfitting algorithm. The backfitting procedure is based on iteratively weighted least squares (IWLS) for finding posterior mode estimates, however, the updating methods for model terms can be more general, see the details section. In addition, the default IWLS updating scheme implements optimum smoothing variance selection based on information criteria using a stepwise approach.
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13  ## Optimizer function:
bfit(x, y, family, start = NULL, weights = NULL, offset = NULL,
update = "iwls", criterion = c("AICc", "BIC", "AIC"),
eps = .Machine$double.eps^0.25, maxit = 400,
outer = FALSE, inner = FALSE, mgcv = FALSE,
verbose = TRUE, digits = 4, flush = TRUE, ...)
## Model term updating functions:
bfit_iwls(x, family, y, eta, id, weights, criterion, ...)
bfit_iwls_Matrix(x, family, y, eta, id, weights, criterion, ...)
bfit_iwls_spam(x, family, y, eta, id, weights, criterion, ...)
bfit_lm(x, family, y, eta, id, weights, criterion, ...)
bfit_optim(x, family, y, eta, id, weights, criterion, ...)

Arguments
x 
For function 
y 
The model response, as returned from function 
family 
A bamlss family object, see 
start 
A named numeric vector containing possible starting values, the names are based on
function 
weights 
Prior weights on the data, as returned from function 
offset 
Can be used to supply model offsets for use in fitting,
returned from function 
update 
Sets the updating function for model terms, e.g. for a term 
criterion 
Set the information criterion that should be used, e.g., for smoothing
variance selection. Options are the corrected AIC 
eps 
The relative convergence tolerance of the backfitting algorithm. 
maxit 
The maximum number of iterations for the backfitting algorithm 
outer 
Should the current working observations and weights be computed in one outer iteration, otherwise the working observations are computed anew for each model term updating step. 
inner 
Should the model terms for one parameter of the modeled distribution be fully updated until convergence in an inner iteration, i.e., the algorithm waits until coefficients for the current distribution parameter do not change anymore before updating the next parameter. 
mgcv 
Should the mgcv 
verbose 
Print information during runtime of the algorithm. 
digits 
Set the digits for printing when 
flush 
use 
eta 
The current value of the predictors, provided as a named list, one list entry
for each parameter. The names correspond to the parameter names in the family object,
see 
id 
Character, the name of the current parameter for which the model term should be updated. 
... 
For function 
Details
This algorithm is based on iteratively weighted least squares (IWLS) for BAMLSS, i.e., a NewtonRaphson or Fisher scoring algorithm is applied, similar to Rigby~and~Stasinopoulos~(2005). The algorithm utilizes the chain rule for computing derivatives of the logposterior w.r.t. regression coefficients, therefore, to compute the working observations and weights only the derivatives of the loglikelihood w.r.t. the linear predictors are required.
It is assumed that the provided family
object holds functions for computing the first
and second order derivatives of the loglikelihood w.r.t. the linear predictors. These Functions
are provided within the named lists "score"
and "hess"
within the family
object. See the documentation of family.bamlss
and the code of the provided
families, e.g. gaussian_bamlss
, for examples of the required structure.
The algorithm either updates each model term over all distributional parameters sequentially,
or does a full update until convergence for model terms for one distributional parameter before
updating the next parameter, see argument inner
. Additionally, working observations and
weights can be computed only once in an outer
iteration.
Starting values of regression coefficients and smoothing variances can be supplied, moreover,
if a family object holds functions for initializing the distributional parameters, see also
family.bamlss
, starting values are based on the initialize functions.
The default updating function for model terms is based on IWLS, which is assigned by function
bamlss.engine.setup
, however, special updating functions can be used.
This is achieved by providing an updating function to argument
update
, which should be used for all model terms. Another option is to set the updating
function within the xt
argument of the mgcv smooth term constructor functions, see
e.g. function s
. If the xt
list then holds an element named "update"
,
which is a valid updating function, this updating function is used for the corresponding model
term. This way it is possible to call different (special) updating functions for specific terms,
e.g., that do not fit in the IWLS scheme. See the examples below. Note that this does not work if
mgcv = TRUE
, since the gam
function assumes a strict linear
representation of smooth terms.
A model term updating function has the following arguments:
update(x, family, y, eta, id, weights, criterion, ...)
Here x
is an object as returned from function smooth.construct
or smoothCon
. The x
object is preprocessed by function
bamlss.engine.setup
, i.e., an element called "state"
is assigned. The state
element represents the current state of the model term holding the current values of the
parameters with corresponding fitted values, as well as equivalent degrees of freedom, see
also the values that are returned by such functions below. The backfitting algorithm uses the
state of a model term for generating updates of the parameters. Note that for special model
terms the state list should already be provided within the call to the corresponding
smooth constructor function, see the growth curve example below.
In addition, for special model terms the fitted values may not be computed by a linear combination
of the design matrix and the coefficients. Therefore, the x
object should hold an element
named "fit.fun"
which is a function for computing the fitted values.
See also smooth.construct.bamlss.frame
and predict.bamlss
that use
this setup. The arguments of fitting functions are
fit.fun(X, b, ...)
where X
is the design matrix and b
is the vector of coefficients. Hence, for
usual IWLS updating the fitted values are computed by X %*% b
. For special terms like
nonlinear growth curves this may not be the case, see the example below. The fitting functions
are assigned by bamlss.engine.setup
, unless the function is already provided
after calling the constructor function smooth.construct
or
smoothCon
. Note that the dots argument is usually not needed by the user.
The default updating function is bfit_iwls()
. Functions bfit_iwls_Matrix()
and
bfit_iwls_spam()
use the sparse matrix infrastructures of package Matrix and
spam. The Matrix package and bfit_iwls_Matrix()
is used for model terms where
the maximum number of nonzero entries in the design matrix is less than half of the total number
of columns, if an additional argument force.spam
is set to TRUE
in the bfit()
call. Similarly the spam package and bfit_iwls_spam()
is used if
force.spam = TRUE
is supplied in the bfit()
call.
The IWLS updating functions find optimum smoothing variances according to an information criterion using a stepwise approach, i.e., in each iteration and for each model term update the updating functions try to find a better smoothing variance to control the tradeoff between oversmoothing and nonlinear functional estimation. The search interval is centered around the current state of the smoothing variances, hence, in each iteration only a slight improvement is achieved. This algorithm is based on Belitz~and~Lang~(2008) and can also be viewed as a boosting approach for optimization.
Value
For function bfit()
a list containing the following objects:
fitted.values 
A named list of the fitted values of the modeled parameters of the selected distribution. 
parameters 
The estimated set regression coefficients and smoothing variances. 
edf 
The equivalent degrees of freedom used to fit the model. 
logLik 
The value of the loglikelihood. 
logPost 
The value of the logposterior. 
IC 
The value of the information criterion. 
converged 
Logical, indicating convergence of the backfitting algorithm. 
For updating functions a list providing the current state 

fitted.values 
The resulting fitted values after updating. 
parameters 
The resulting named numeric vector of updated model term parameters.
Coefficients should be named with 
edf 
The equivalent degrees of freedom used to produce the fitted values. 
hessian 
Optional, the coefficient Hessian information 
log.prior 
Optional, the value of the logprior of the model term. 
References
Belitz C, Lang S (2008). Simultaneous Selection of Variables and Smoothing Parameters in Structured Additive Regression Models. Computational Statistics \& Data Analysis, 53, pp 6181.
Umlauf N, Klein N, Zeileis A (2016). Bayesian Additive Models for Location Scale and Shape (and Beyond). (to appear)
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape, (with discussion), Appl. Statist., 54, part 3, pp 507554.
See Also
bamlss
, bamlss.frame
,
bamlss.engine.setup
, set.starting.values
, s2
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92  ## Not run: ## Simulated data example illustrating
## how to call the optimizer function.
## This is done internally within
## the setup of function bamlss().
d < GAMart(n = 200)
f < num ~ s(x1) + s(x2) + s(x3)
bf < bamlss.frame(f, data = d, family = "gaussian")
opt < with(bf, bfit(x, y, family))
print(str(opt))
## Same with bamlss().
b < bamlss(f, data = d, family = "gaussian", sampler = FALSE)
plot(b)
summary(b)
## Use of different updating function.
b < bamlss(f, data = d, family = "gaussian",
sampler = FALSE, update = bfit_lm)
plot(b)
## Use mgcv gam() function for updating.
b < bamlss(f, data = d, family = "gaussian",
sampler = FALSE, mgcv = TRUE)
plot(b)
## Special smooth constructor including updating/sampler
## function for nonlinear Gompertz curves.
## Note: element special.npar is needed here since this
## function has 3 parameters but the design matrix only
## one column!
smooth.construct.gc.smooth.spec < function(object, data, knots)
{
object$X < matrix(as.numeric(data[[object$term]]), ncol = 1)
center < if(!is.null(object$xt$center)) {
object$xt$center
} else FALSE
object$by.done < TRUE
if(object$by != "NA")
stop("by variables not supported!")
object$fit.fun < function(X, b, ...) {
f < b[1] * exp(b[2] * exp(b[3] * drop(X)))
if(center)
f < f  mean(f)
f
}
object$update < bfit_optim
object$propose < GMCMC_slice
object$prior < function(b) { sum(dnorm(b, sd = 1000, log = TRUE)) }
object$fixed < TRUE
object$state$parameters < c("b1" = 0, "b2" = 0.5, "b3" = 0.1)
object$state$fitted.values < rep(0, length(object$X))
object$state$edf < 3
object$special.npar < 3 ## Important!
class(object) < c("gc.smooth", "no.mgcv", "special")
object
}
## Work around for the "prediction matrix" of a growth curve.
Predict.matrix.gc.smooth < function(object, data, knots)
{
X < matrix(as.numeric(data[[object$term]]), ncol = 1)
X
}
## Heteroscedastic growth curve data example.
set.seed(123)
d < data.frame("time" = 1:30)
d$y < 2 + 1 / (1 + exp(0.5 * (15  d$time))) +
rnorm(30, sd = exp(4 + 0.1 * d$time))
## Special model terms must be called with s2()!
f < list(
"mu" = y ~ s2(time, bs = "gc"),
"sigma" = ~ s(time)
)
## Fit model with special model term.
b < bamlss(f, data = d,
optimizer = bfit, sampler = GMCMC)
## Plot the fitted curves.
plot(b)
## Predict with special model term.
nd < data.frame("time" = seq(1, 30, length = 100))
p < predict(b, newdata = nd, model = "mu", FUN = c95)
plot(d, ylim = range(c(d$y, p)))
matplot(nd$time, p, type = "l",
lty = c(2, 1, 2), col = "black", add = TRUE)
## End(Not run)
