family.bamlss: Distribution Families in 'bamlss'

family.bamlss    R Documentation

Distribution Families in bamlss

Description

Family objects in bamlss specify the information that is needed for using (different) model fitting engines, e.g., the parameter names and corresponding link functions, the density function, derivatives of the log-likelihood w.r.t. the predictors, and so forth. Because bamlss simply passes family objects through to the optimizer and sampler functions it calls, those functions must know which of these components they need in order to fit the model. Family objects are also used for computing post-modeling statistics, e.g., for residual diagnostics or random number generation. See the details and examples.

Usage

## Family objects in bamlss:
ALD_bamlss(..., tau = 0.5, eps = 0.01)
beta_bamlss(...)
binomial_bamlss(link = "logit", ...)
cnorm_bamlss(...)
cox_bamlss(...)
dw_bamlss(...)
DGP_bamlss(...)
dirichlet_bamlss(...)
ELF_bamlss(..., tau = 0.5)
gaussian_bamlss(...)
gaussian2_bamlss(...)
Gaussian_bamlss(...)
gamma_bamlss(...)
logNN_bamlss(...)
multinomial_bamlss(...)
mvnorm_bamlss(k = 2, ...)
mvnormAR1_bamlss(k = 2, ...)
poisson_bamlss(...)
gpareto_bamlss(...)
glogis_bamlss(...)
AR1_bamlss(...)
beta1_bamlss(ar.start, ...)
nbinom_bamlss(...)
ztnbinom_bamlss(...)
lognormal_bamlss(...)
weibull_bamlss(...)
Sichel_bamlss(...)
GEV_bamlss(...)
gumbel_bamlss(...)
mix_bamlss(f1, f2, ...)
ZANBI_bamlss(...)

## Extractor functions:
## S3 method for class 'bamlss'
family(object, ...)
## S3 method for class 'bamlss.frame'
family(object, ...)

Arguments

object

An object of class "bamlss" or "bamlss.frame", see function bamlss and bamlss.frame.

k

The dimension of the multivariate normal. Note that if k = 1, function gaussian_bamlss() is called.

ar.start

Logical vector of length equal to the number of rows of the full data set used for modeling. Entries must be TRUE at the start of each section of a time series. If ar.start = NULL, lagged residuals are computed by simple shifting. See also bam in package mgcv.

link

Possible link functions.

tau

The quantile that should be fitted.

eps

Constant to be used for the approximation of the absolute value function.

f1, f2

A family of class "gamlss.family", see package gamlss.dist.

...

Arguments passed to functions that are called within the family object.

Details

The following lists the minimum requirements on a bamlss family object to be used with bamlss and bamlss.frame:

  • The family object must return a list of class "family.bamlss".

  • The object must contain the family name as a character string.

  • The object must contain the names of the parameters as a character string, as well as the corresponding link functions as character string.
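As a minimal sketch of these three requirements, using only base R: the constructor name toy_bamlss and the family name "toy" are hypothetical and not part of the package. An object like this carries no density function yet, so most fitting engines could not use it as is.

```r
## Hypothetical minimal family constructor (illustration only,
## not part of bamlss).
toy_bamlss <- function(...) {
  f <- list(
    "family" = "toy",                # family name as a character string
    "names" = c("mu", "sigma"),      # parameter names
    "links" = c("identity", "log")   # one link function per parameter
  )
  class(f) <- "family.bamlss"
  f
}

f <- toy_bamlss()

## Basic sanity checks on the structure.
inherits(f, "family.bamlss")
length(f$names) == length(f$links)
```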

For most optimizer and sampling functions at least the density function, including a log argument, should be provided. When using generic model fitting engines like opt_bfit or sam_GMCMC, as well as for computing post-modeling statistics with function samplestats, and others, it is assumed that the density function in a family object has the following arguments:

d(y, par, log = FALSE, ...)

where argument y is the response (possibly a matrix) and par is a named list holding the evaluated parameters of the distribution, e.g., for a normal distribution par has two elements, one for the mean, par$mu, and one for the standard deviation, par$sigma. The dots argument is for passing special internally used objects; depending on the type of model, this feature is usually not needed.
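To illustrate the par convention, here is a small base R sketch of a normal density with this signature, evaluated on made-up numbers. The function name d_normal and the data are assumptions for illustration only.

```r
## Density with the signature d(y, par, log = FALSE, ...).
## d_normal is a hypothetical name used only in this sketch.
d_normal <- function(y, par, log = FALSE, ...) {
  dnorm(y, mean = par$mu, sd = par$sigma, log = log)
}

## Evaluated distribution parameters, one value per observation.
y <- c(-0.5, 0.1, 1.2)
par <- list(mu = c(0, 0, 1), sigma = c(1, 1, 0.5))

## Log-likelihood of the three observations.
ll <- sum(d_normal(y, par, log = TRUE))
```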

Similarly, for derivative based algorithms, e.g., using iteratively weighted least squares (IWLS, see function opt_bfit), the family object holds derivative functions evaluating derivatives of the log-likelihood w.r.t. the predictors (or expectations of derivatives). For each parameter, these functions take the following arguments:

score(y, par, ...)

for computing the first derivative of the log-likelihood w.r.t. a predictor and

hess(y, par, ...)

for computing the negative second derivatives. Within the family object these functions are organized in a named list, see the examples below.
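Because the derivatives are taken w.r.t. the predictors rather than the parameters, the link function enters through the chain rule. The following base R sketch checks score functions for a normal distribution against central finite differences, assuming an identity link for mu and a log link for sigma. The helper loglik_eta and the data are hypothetical.

```r
## Log-density as a function of the predictors (eta), assuming
## an identity link for mu and a log link for sigma.
loglik_eta <- function(y, eta_mu, eta_sigma) {
  dnorm(y, mean = eta_mu, sd = exp(eta_sigma), log = TRUE)
}

## Score functions organized in a named list, one per parameter.
score <- list(
  "mu" = function(y, par, ...) (y - par$mu) / par$sigma^2,
  "sigma" = function(y, par, ...) -1 + (y - par$mu)^2 / par$sigma^2
)

y <- c(-0.5, 0.1, 1.2)
eta <- list(mu = c(0, 0.2, 1), sigma = log(c(1, 2, 0.5)))
par <- list(mu = eta$mu, sigma = exp(eta$sigma))

## Central finite differences w.r.t. each predictor.
h <- 1e-6
num_mu <- (loglik_eta(y, eta$mu + h, eta$sigma) -
  loglik_eta(y, eta$mu - h, eta$sigma)) / (2 * h)
num_sigma <- (loglik_eta(y, eta$mu, eta$sigma + h) -
  loglik_eta(y, eta$mu, eta$sigma - h)) / (2 * h)

## Both differences should be close to zero.
err_mu <- max(abs(num_mu - score$mu(y, par)))
err_sigma <- max(abs(num_sigma - score$sigma(y, par)))
```

Under the same links, the negative second derivative w.r.t. the predictor of sigma is 2 * (y - mu)^2 / sigma^2, whose expectation is 2. This is why a hess function for sigma may return a constant, as an instance of the "expectations of derivatives" mentioned above.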

In addition, for the cumulative distribution function (p(y, par, ...)), for the quantile function (q(y, par, ...)) or for creating random numbers (r(n, par, ...)) the same structure is assumed. See, e.g., the code of function gaussian_bamlss().
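A base R sketch of what such entries might look like for a normal distribution follows. The list name normal_pqr is hypothetical, and naming the first argument of the quantile function p (for probabilities) is an assumption made here for readability, since the signature above writes that slot as y.

```r
## Hypothetical p, q and r entries for a normal family
## (illustration only, written with base R functions).
normal_pqr <- list(
  "p" = function(y, par, ...) pnorm(y, mean = par$mu, sd = par$sigma),
  ## First argument named p here by assumption (probabilities).
  "q" = function(p, par, ...) qnorm(p, mean = par$mu, sd = par$sigma),
  "r" = function(n, par, ...) rnorm(n, mean = par$mu, sd = par$sigma)
)

par <- list(mu = 1, sigma = 2)

## Round trip: the quantile function inverts the distribution function.
u <- normal_pqr$p(0.3, par)
y_back <- normal_pqr$q(u, par)

## Draw five random numbers from the distribution.
set.seed(1)
x <- normal_pqr$r(5, par)
```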

Some model fitting engines can initialize the distributional parameters, which oftentimes leads to much faster convergence. The initialize functions are again organized within a named list, one entry for each parameter, similar to the score and hess functions, e.g., see the code of family object gaussian_bamlss.
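As a rough sketch of such a list for a normal distribution: the signature function(y, ...) and the choice of starting values are assumptions made for illustration, and the functions are assumed to return one starting value per observation on the parameter scale. The code of the package family functions shows the actual convention.

```r
## Hypothetical initialize functions, one entry per parameter.
## Signature and starting values are assumptions of this sketch.
initialize <- list(
  ## Shrink each response halfway towards the overall mean.
  "mu" = function(y, ...) (y + mean(y)) / 2,
  ## Constant starting value: the sample standard deviation.
  "sigma" = function(y, ...) rep(sd(y), length(y))
)

y <- c(1, 2, 4, 5)

## Evaluate all initialize functions on the response.
start <- lapply(initialize, function(fun) fun(y))
```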

For use with bamlss, residuals.bamlss and predict.bamlss, a family object may also specify the transform(), optimizer(), sampler(), samplestats(), results(), residuals() and predict() functions that should be used with this family. See, for example, the setup of cox_bamlss.

For using specialized estimation engines like sam_JAGS, it is recommended to supply any extra arguments needed by those engines with an additional list entry within the family object, e.g., when using gaussian_bamlss() with sam_JAGS the family object holds special details in an element named "bugs".

The examples below illustrate this setup. See also the code of the bamlss family functions.

See Also

bamlss, bamlss.frame

Examples

## New family object for the normal distribution,
## can be used by function opt_bfit() and sam_GMCMC().
normal_bamlss <- function(...) {
  f <- list(
    "family" = "normal",
    "names" = c("mu", "sigma"),
    "links" = c("identity", "log"),
    "d" = function(y, par, log = FALSE, ...) {
      dnorm(y, mean = par$mu, sd = par$sigma, log = log)
    },
    "score" = list(
      "mu" = function(y, par, ...) {
        drop((y - par$mu) / (par$sigma^2))
      },
      "sigma" = function(y, par, ...) {
        drop(-1 + (y - par$mu)^2 / (par$sigma^2))
      }
    ),
    "hess" = list(
      "mu" = function(y, par, ...) {
        drop(1 / (par$sigma^2))
      },
      "sigma" = function(y, par, ...) { 
        rep(2, length(y))
      }
    )
  )
  class(f) <- "family.bamlss"
  return(f)
}

## Not run: 
## Test on simulated data.
d <- GAMart()
b <- bamlss(num ~ s(x1) + s(x2) + s(x3),
  data = d, family = "normal")
plot(b)

## Compute the log-likelihood using the family object.
f <- family(b)
sum(f$d(y = d$num, par = f$map2par(fitted(b)), log = TRUE))

## For using sam_JAGS() more details are needed.
norm4JAGS_bamlss <- function(...) {
  f <- normal_bamlss()
  f$bugs <- list(
    "dist" = "dnorm",
    "eta" = BUGSeta,
    "model" = BUGSmodel,
    "reparam" = c(sigma = "1 / sqrt(sigma)")
  )
  return(f)
}

## Now with opt_bfit() and sam_JAGS().
b <- bamlss(num ~ s(x1) + s(x2) + s(x3), data = d,
  optimizer = opt_bfit, sampler = sam_JAGS, family = "norm4JAGS")
plot(b)

## End(Not run)

bamlss documentation built on Oct. 11, 2024, 5:07 p.m.