FDboostLSS: Model-based Gradient Boosting for Functional GAMLSS


Description

Fits GAMLSS (generalized additive models for location, scale and shape) with functional data using component-wise gradient boosting. For details see Brockhaus et al. (2015a).

Usage

FDboostLSS(formula, timeformula, data = list(), families = GaussianLSS(),
  control = boost_control(), weights = NULL, ...)

Arguments

formula

a symbolic description of the model to be fit. If formula is a single formula, the same formula is used for all distribution parameters. formula can also be a (named) list, where each list element corresponds to one distribution parameter of the GAMLSS distribution. The names must match the parameter names in families.

timeformula

one-sided formula for the expansion over the index of the response. For a functional response Y_i(t), typically use ~bbs(t) to obtain a smooth expansion of the effects along t. In the limiting case that Y_i is a scalar response, use ~bols(1), which sets up a base-learner for the scalar 1. Alternatively, set timeformula = NULL, in which case the response is treated as scalar. Analogously to formula, timeformula can be either a one-sided formula or a named list of one-sided formulas.

data

a data frame or list containing the variables in the model.

families

an object of class families. It can be either one of the pre-defined distributions that come with the package gamboostLSS or a new distribution specified by the user (see Families for details). By default, the two-parameter GaussianLSS family is used.

control

a list of parameters controlling the algorithm. For more details see boost_control.

weights

currently not supported; this argument does not work.

...

additional arguments passed to FDboost, including family and control.
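
As a minimal sketch of the named-list form of formula described above, the following builds one formula per distribution parameter of the default GaussianLSS family (mu and sigma). The variable names y, z, x and s and the object dat_list are assumptions borrowed from the Examples section. Formulas are not evaluated until the model is fitted, so the uncommented part runs in base R.

```r
## one formula per distribution parameter; the list names must match
## the parameter names of the family (mu and sigma for GaussianLSS)
frm <- list(
  mu    = y ~ bols(z, df = 2) + bsignal(x, s, df = 2, knots = 16),
  sigma = y ~ bols(z, df = 2)
)
names(frm)            ## names used to match the distribution parameters
all.vars(frm$sigma)   ## variables that must be present in data

## Not run:
## scalar response, hence timeformula = NULL; assumes dat_list as
## constructed in the Examples section
## m <- FDboostLSS(frm, timeformula = NULL, data = dat_list)
## End(Not run)
```

Passing a single formula instead of the list would apply the same linear predictor to every distribution parameter.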

Details

For details on the theory of GAMLSS see Rigby and Stasinopoulos (2005). FDboostLSS uses FDboost to fit the distribution parameters of a GAMLSS: a functional boosting model is fitted for each parameter. See FDboost for details on boosting functional regression models as introduced by Brockhaus et al. (2015b). See mboostLSS for details on boosting of GAMLSS for scalar variables as introduced by Mayr et al. (2012).

Value

An object of class FDboostLSS that inherits from mboostLSS. The FDboostLSS-object is a named list containing one list entry per distribution parameter and some attributes. The list is named like the parameters, e.g. mu and sigma, if the parameters mu and sigma are modelled. Each list-element is an object of class FDboost.
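
The structure of the returned object can be inspected as in the following sketch. It fits a small Gaussian location-scale model on simulated data that loosely mirrors the Examples section. The sample sizes, the simulated data and the object names dat and fit are illustrative assumptions, and the packages FDboost and gamboostLSS are assumed to be installed.

```r
library(FDboost)

## small simulated scalar-on-function data set (illustrative only)
set.seed(1)
n <- 100                               ## number of observations
G <- 30                                ## grid points of the functional covariate
z <- runif(n) - 0.5                    ## scalar covariate
s <- seq(0, 1, length.out = G)         ## index of the functional covariate
x <- matrix(rnorm(n * G), nrow = n, ncol = G)
x <- scale(x, center = TRUE, scale = FALSE)
y <- rnorm(n, mean = 2 + 0.5 * z, sd = exp(0.5 * z))
dat <- list(y = y, z = z, x = I(x), s = s)

fit <- FDboostLSS(list(mu = y ~ bols(z, df = 2) + bsignal(x, s, df = 2, knots = 16),
                       sigma = y ~ bols(z, df = 2) + bsignal(x, s, df = 2, knots = 16)),
                  timeformula = NULL, data = dat)

names(fit)       ## one list entry per distribution parameter
class(fit$mu)    ## each entry is a fitted FDboost model
mstop(fit)       ## current number of boosting iterations per parameter
```

Each entry can then be handled like a single FDboost fit, for example with coef(fit$mu) or plot(fit$sigma).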

Author(s)

Sarah Brockhaus

References

Brockhaus, S., Fuest, A., Mayr, A. and Greven, S. (2015a): Functional regression models for location, scale and shape applied to stock returns. In: Friedl H. and Wagner H. (eds), Proceedings of the 30th International Workshop on Statistical Modelling: 117-122.

Brockhaus, S., Scheipl, F., Hothorn, T. and Greven, S. (2015b): The functional linear array model. Statistical Modelling, 15(3), 279-300.

Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012): Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society: Series C (Applied Statistics), 61(3), 403-427.

Rigby, R. A. and Stasinopoulos, D. M. (2005): Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(3), 507-554.

See Also

Note that FDboostLSS calls FDboost directly.

Examples

########### simulate Gaussian scalar-on-function data
n <- 500 ## number of observations
G <- 120 ## number of observations per functional covariate
set.seed(123) ## ensure reproducibility
z <- runif(n) ## scalar covariate
z <- z - mean(z)
s <- seq(0, 1, length.out = G) ## index of functional covariate
## generate functional covariate
if(require(splines)){
  x <- t(replicate(n, drop(bs(s, df = 5, intercept = TRUE) %*% runif(5, min = -1, max = 1))))
}else{
  x <- matrix(rnorm(n*G), ncol = G, nrow = n)
}
x <- scale(x, center = TRUE, scale = FALSE) ## center x per observation point

mu <- 2 + 0.5*z + (1/G*x) %*% sin(s*pi)*5 ## true functions for expectation
sigma <- exp(0.5*z - (1/G*x) %*% cos(s*pi)*2) ## for standard deviation

y <- rnorm(mean = mu, sd = sigma, n = n) ## draw response y_i ~ N(mu_i, sigma_i)

## save data as list containing s as well
dat_list <- list(y = y, z = z, x = I(x), s = s)

## model fit assuming Gaussian location scale model
m_boost <- FDboostLSS(list(mu = y ~ bols(z, df = 2) + bsignal(x, s, df = 2, knots = 16),
                           sigma = y ~ bols(z, df = 2) + bsignal(x, s, df = 2, knots = 16)),
                           timeformula = NULL, data = dat_list)
summary(m_boost)

## Not run: 
 if(require(gamboostLSS)){
  ## find optimal number of boosting iterations on a grid in [1, 500]
  ## using bootstrap with B = 5 resamples
  grid <-  make.grid(c(mu = 500, sigma = 500), length.out = 10)
  ## takes some time, easy to parallelize on Linux
  set.seed(123)
  cvr <- cvrisk(m_boost, folds = cv(model.weights(m_boost[[1]]), B = 5),
                grid = grid, trace = FALSE)
  ## use model at optimal stopping iterations
  m_boost <- m_boost[mstop(cvr)] ## [c(172, 63)]

  ## plot smooth effects of functional covariates
  par(mfrow = c(1,2))
  plot(m_boost$mu, which = 2, ylim = c(0,5))
  lines(s, sin(s*pi)*5, col = 3, lwd = 2)
  plot(m_boost$sigma, which = 2, ylim = c(-2.5,2.5))
  lines(s, -cos(s*pi)*2, col = 3, lwd = 2)
 }

## End(Not run)

FDboost documentation built on May 2, 2019, 6:48 p.m.
