Distribution Families in bamlss

Description

Family objects in bamlss specify the information needed by the (possibly different) model fitting engines, e.g., the parameter names and corresponding link functions, the density function, derivatives of the log-likelihood w.r.t. the linear predictors, and so forth. Because bamlss simply passes the family object through, the optimizer or sampler function that is called must itself know which elements of the family object it needs in order to fit the model. Family objects are also used for computing post-modeling statistics, e.g., for residual diagnostics or random number generation. See the details and examples.

Usage

## Family objects in bamlss:
beta_bamlss(...)
binomial_bamlss(...)
cnorm_bamlss(...)
cox_bamlss(...)
gaussian_bamlss(...)
gamma_bamlss(...)
multinomial_bamlss(...)
mvnorm_bamlss(k = 2, ...)
poisson_bamlss(...)

## Extractor functions:
## S3 method for class 'bamlss'
family(object, ...)
## S3 method for class 'bamlss.frame'
family(object, ...)

Arguments

object

An object of class "bamlss" or "bamlss.frame", see function bamlss and bamlss.frame.

k

The dimension of the multivariate normal distribution. Note that for k = 1, function gaussian_bamlss() is called instead.

...

Arguments passed to functions that are called within the family object.

Details

The following lists the minimum requirements on a bamlss family object to be used with bamlss and bamlss.frame:

  • The family object must return a list of class "family.bamlss".

  • The object must contain the family name as a character string.

  • The object must contain the names of the parameters as a character vector, as well as the corresponding link functions as a character vector of the same length.
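The three requirements above can be sketched as a small check. The helper check_family() is hypothetical and not part of bamlss; the element names "family", "names" and "links" follow the example family shown in the Examples section.

```r
## Hypothetical helper (not part of bamlss): checks the three minimum
## requirements listed above on a candidate family object.
check_family <- function(f) {
  stopifnot(
    inherits(f, "family.bamlss"),
    is.character(f$family), length(f$family) == 1L,
    is.character(f$names), is.character(f$links),
    length(f$names) == length(f$links)
  )
  invisible(TRUE)
}

## Smallest object that satisfies the requirements.
f <- list(
  "family" = "normal",
  "names" = c("mu", "sigma"),
  "links" = c("identity", "log")
)
class(f) <- "family.bamlss"
check_family(f)
```

Such an object is only a skeleton: most fitting engines additionally need at least the density function described next.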

For most optimizer and sampling functions at least the density function, including a log argument, should be provided. When using generic model fitting engines like bfit or GMCMC, as well as for computing post-modeling statistics with function samplestats, and others, it is assumed that the density function in a family object has the following arguments:

d(y, par, log = FALSE, ...)

where argument y is the response (possibly a matrix) and par is a named list holding the evaluated parameters of the distribution, e.g., for a normal distribution par has two elements, par$mu for the mean and par$sigma for the standard deviation. The dots argument is for passing special internally used objects; depending on the type of model, this feature is usually not needed.
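As a minimal sketch of this calling convention, the following uses only base R. The function name d_normal and the data are made up for illustration.

```r
## Density in the d(y, par, log = FALSE, ...) form described above.
d_normal <- function(y, par, log = FALSE, ...) {
  dnorm(y, mean = par$mu, sd = par$sigma, log = log)
}

## One evaluated parameter value per observation.
y <- c(-0.5, 0.2, 1.3)
par <- list(mu = c(0, 0, 1), sigma = c(1, 0.5, 2))

## Log-likelihood of the three observations.
loglik <- sum(d_normal(y, par, log = TRUE))
```

Supporting the log argument directly avoids taking the logarithm of very small density values afterwards.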

Similarly, for derivative-based algorithms, e.g., iteratively weighted least squares (IWLS, see function bfit), the family object holds functions that evaluate derivatives of the log-likelihood w.r.t. the linear predictors (or expectations of derivatives). For each parameter, these functions take the following arguments:

score(y, par, ...)

for computing the first derivative of the log-likelihood w.r.t. a linear predictor and

hess(y, par, ...)

for computing the negative second derivatives. Within the family object these functions are organized in a named list, see the examples below.
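Because the derivatives are taken w.r.t. the linear predictors, the link function enters through the chain rule. The sketch below checks the normal score functions against central finite differences of the log-density, assuming an identity link for mu and a log link for sigma; the helper ll() and the test values are illustrative only.

```r
## Score functions organized in a named list, one entry per parameter.
score <- list(
  "mu" = function(y, par, ...) (y - par$mu) / par$sigma^2,
  "sigma" = function(y, par, ...) -1 + (y - par$mu)^2 / par$sigma^2
)

## Log-density as a function of the linear predictors
## (identity link for mu, log link for sigma).
ll <- function(y, eta) {
  dnorm(y, mean = eta$mu, sd = exp(eta$sigma), log = TRUE)
}

y <- 1.7
eta <- list(mu = 0.4, sigma = log(1.5))
par <- list(mu = eta$mu, sigma = exp(eta$sigma))
h <- 1e-6

## Central finite differences w.r.t. each linear predictor.
num_mu <- (ll(y, list(mu = eta$mu + h, sigma = eta$sigma)) -
  ll(y, list(mu = eta$mu - h, sigma = eta$sigma))) / (2 * h)
num_sigma <- (ll(y, list(mu = eta$mu, sigma = eta$sigma + h)) -
  ll(y, list(mu = eta$mu, sigma = eta$sigma - h))) / (2 * h)

ok_mu <- isTRUE(all.equal(num_mu, score$mu(y, par), tolerance = 1e-6))
ok_sigma <- isTRUE(all.equal(num_sigma, score$sigma(y, par), tolerance = 1e-6))
```

The same kind of check can be applied to hess by differencing the score, keeping in mind that hess may return expected rather than observed negative second derivatives.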

In addition, the same structure is assumed for the cumulative distribution function (p(y, par, ...)), for the quantile function (q(y, par, ...)) and for creating random numbers (r(y, par, ...)). See, e.g., the code of function gaussian_bamlss().
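A rough sketch of such functions for the normal distribution follows. The names of the first arguments (a probability p for the quantile function, a count n for the random number generator) are assumptions chosen for readability; the code of the bamlss family functions is the reference for the exact signatures.

```r
## Assumed layout: the same (first argument, par, ...) pattern as d().
p_normal <- function(y, par, ...) pnorm(y, mean = par$mu, sd = par$sigma, ...)
q_normal <- function(p, par, ...) qnorm(p, mean = par$mu, sd = par$sigma, ...)
r_normal <- function(n, par, ...) rnorm(n, mean = par$mu, sd = par$sigma)

par <- list(mu = 2, sigma = 0.5)

## The quantile function inverts the distribution function.
roundtrip <- q_normal(p_normal(2.3, par), par)

## Draw five random numbers from the fitted distribution.
set.seed(1)
x <- r_normal(5, par)
```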

Some model fitting engines can initialize the distributional parameters, which often leads to much faster convergence. The initialize functions are again organized in a named list, one entry for each parameter, similar to the score and hess functions; see, e.g., the code of family object gaussian_bamlss.
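The following is one possible layout for such a list. The element name initialize, the function signature (y, ...) and the particular starting values are assumptions made for illustration, not necessarily what bamlss uses.

```r
## Assumed layout: one starting-value function per parameter.
initialize <- list(
  ## Shrink each observation halfway towards the overall mean.
  "mu" = function(y, ...) (y + mean(y)) / 2,
  ## Constant starting value: the sample standard deviation.
  "sigma" = function(y, ...) rep(sd(y), length(y))
)

y <- c(1.2, 0.7, 2.9, 1.8)
start <- lapply(initialize, function(fun) fun(y))
```

Each function returns one starting value per observation on the parameter scale, so a fitting engine would still have to apply the link function to obtain starting values for the linear predictors.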

When using functions bamlss, residuals.bamlss and predict.bamlss, the family object may also specify the transform(), optimizer(), sampler(), samplestats(), results(), residuals() and predict() functions that should be used with this family. See, for example, the setup of cox_bamlss.

For specialized estimation engines like JAGS, it is recommended to supply any extra arguments needed by those engines in an additional list entry within the family object, e.g., when using gaussian_bamlss() with JAGS the family object holds special details in an element named "bugs".

The examples below illustrate this setup. See also the code of the bamlss family functions.

See Also

bamlss, bamlss.frame

Examples

## New family object for the normal distribution,
## can be used by function bfit() and GMCMC().
normal_bamlss <- function(...) {
  f <- list(
    "family" = "normal",
    "names" = c("mu", "sigma"),
    "links" = c("identity", "log"),
    "d" = function(y, par, log = FALSE, ...) {
      dnorm(y, mean = par$mu, sd = par$sigma, log = log)
    },
    "score" = list(
      "mu" = function(y, par, ...) {
        drop((y - par$mu) / (par$sigma^2))
      },
      "sigma" = function(y, par, ...) {
        drop(-1 + (y - par$mu)^2 / (par$sigma^2))
      }
    ),
    "hess" = list(
      "mu" = function(y, par, ...) {
        drop(1 / (par$sigma^2))
      },
      "sigma" = function(y, par, ...) { 
        rep(2, length(y))
      }
    )
  )
  class(f) <- "family.bamlss"
  return(f)
}

## Not run: 
## Test on simulated data.
d <- GAMart()
b <- bamlss(num ~ s(x1) + s(x2) + s(x3),
  data = d, family = "normal")
plot(b)

## Compute the log-likelihood using the family object.
f <- family(b)
sum(f$d(y = d$num, par = f$map2par(fitted(b)), log = TRUE))

## For using JAGS() more details are needed.
norm4JAGS_bamlss <- function(...) {
  f <- normal_bamlss()
  f$bugs <- list(
    "dist" = "dnorm",
    "eta" = BUGSeta,
    "model" = BUGSmodel,
    "reparam" = c(sigma = "1 / sqrt(sigma)")
  )
  return(f)
}

## Now with bfit() and JAGS().
b <- bamlss(num ~ s(x1) + s(x2) + s(x3), data = d,
  optimizer = bfit, sampler = JAGS, family = "norm4JAGS")
plot(b)

## End(Not run)
