fitnvmix: Fitting Multivariate Normal Variance Mixtures
In nvmix: Multivariate Normal Variance Mixtures

fitnvmix

R Documentation

Fitting Multivariate Normal Variance Mixtures

Description

Functionalities for fitting multivariate normal variance mixtures (in particular also Multivariate t distributions) via an ECME algorithm.

Usage

fitnvmix(x, qmix, mix.param.bounds, nu.init = NA, loc = NULL, scale = NULL,
         init.size.subsample = min(n, 100), size.subsample = n,
         control = list(), verbose = TRUE)

fitStudent(x, loc = NULL, scale = NULL, mix.param.bounds = c(1e-3, 1e2), ...)
fitNorm(x)

Arguments

`x`	`(n, d)`-data `matrix`.
`qmix`	specification of the mixing variable `W`; see McNeil et al. (2015, Chapter 6). Supported are the following types of specification (see also the examples below): `character`: `character` string specifying a supported distribution; currently available are `"constant"` (in which case `W = 1` and thus the multivariate normal distribution with mean vector `loc` and covariance matrix `scale` results), `"inverse.gamma"` (in which case `W` is inverse gamma distributed with shape and rate parameters `df`/2 and thus the multivariate Student t distribution with `df` degrees of freedom results) and `"pareto"` (in which case `W` is Pareto distributed with scale equal to unity and shape equal to `alpha`). `function`: `function` interpreted as the quantile function of the mixing variable `W`. In this case, `qmix` must have the form `qmix = function(u, nu)`, where the argument `nu` corresponds to the parameter (vector) specifying the distribution of the mixing variable.
`mix.param.bounds`	either `numeric(2)` or a `matrix` with two columns. The first/second column corresponds to the lower/upper bound of `nu_i`, the ith component of the parameter vector `nu` of the mixing variable `W`. All elements need to be finite, numeric values. Note: The algorithm tends to converge quicker if the parameter ranges supplied are smaller.
`nu.init`	either `NA` or an initial guess for the parameter (vector) `nu`. In the former case an initial estimate is calculated by the algorithm. If `nu.init` is provided, the algorithm often converges faster; the better the starting value, the faster convergence.
`loc`	`d`-`vector`; if provided, taken as the 'true' location vector in which case `loc` is not estimated.
`scale`	positive definite `(d, d)`-`matrix`; if provided, taken as the 'true' scale matrix in which case `scale` is not estimated.
`init.size.subsample`	`numeric`, non-negative, giving the sub-samplesize used to get an initial estimate for `nu`. Only used if `is.na(nu.init)`, otherwise ignored.
`size.subsample`	`numeric`, non-negative, specifying the size of the subsample that is being used in the ECME iterations to optimize the log-likelihood over `nu`. Defaults to `n`, so that the full sample is being used. Decreasing this number can lead to faster run-times (as fewer log-densities need to be estimated) but also to an increase in bias and variance.
`control`	`list` specifying algorithm specific parameters; see below under 'Details' and `get_set_param()`.
`verbose`	`numeric` or `logical` (in which case it is converted to `numeric`) specifying the amount of tracing to be done. If `0` or `FALSE`, neither tracing nor warnigns are communicated; if `1`, only warnigns are communicated, if `2` or `3`, warnings and (shorter or longer) tracing information is provided.
`...`	additional arguments passed to the underlying `fitnvmix()`.

Details

The function fitnvmix() uses an ECME algorithm to approximate the MLEs of the parameters nu, loc and scale of a normal variance mixture specified by qmix. The underlying procedure successively estimates nu (with given loc and scale) by maximizing the likelihood which is approximated by dnvmix() (unless qmix is a character string, in which case analytical formulas for the log-densities are used) and scale and loc (given nu) using weights (which again need to be approximated) related to the posterior distribution, details can be found in the first reference below.

It should be highlighted that (unless unless qmix is a character string), every log-likelihood and every weight needed in the estimation is numerically approximated via RQMC methods. For large dimensions and sample sizes this procedure can therefore be slow.

Various tolerances and convergence criteria can be changed by the user via the control argument. For more details, see get_set_param().

Value

The function fitnvmix() returns an S3 object of class "fitnvmix", basically a list which contains, among others, the components

nu: Estimated mixing parameter (vector) nu.
loc: Estimated or provided loc vector.
scale: Estimated or provided scale matrix.
max.ll: Estimated log-likelihood at reported estimates.
x: Input data matrix x.

The methods print(), summary() and plot() are defined for the class "fitnvmix".

fitStudent() is a wrapper to fitnvmix() for parameter estimation of multivariate Student t distributions; it also returns an S3 object of class "fitnvmix" where the fitted degrees of freedom are called "df" instead of "nu" (to be consistent with the other wrappers for the Student t distributions).

fitNorm() just returns a list with components loc (columnwise sample means) and scale (sample covariance matrix).

Author(s)

Erik Hintz, Marius Hofert and Christiane Lemieux

References

Hintz, E., Hofert, M. and Lemieux, C. (2021), Normal variance mixtures: Distribution, density and parameter estimation. Computational Statistics and Data Analysis 157C, 107175.

Hintz, E., Hofert, M. and Lemieux, C. (2022), Multivariate Normal Variance Mixtures in R: The R Package nvmix. Journal of Statistical Software, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v102.i02")}.

McNeil, A. J., Frey, R. and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press.

Liu, C. and Rubin, D. (1994). The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81(4), 633–648.

Examples

## Sampling parameters
set.seed(274) # for reproducibility
nu               <- 2.8 # parameter used to sample data
d                <- 4 # dimension
n                <- 75 # small sample size to have examples run fast
loc              <- rep(0, d) # location vector
A                <- matrix(runif(d * d), ncol = d)
diag_vars        <- diag(runif(d, min = 2, max = 5))
scale            <- diag_vars %*% cov2cor(A %*% t(A)) %*% diag_vars # scale matrix
mix.param.bounds <- c(1, 5) # nu in [1, 5]

### Example 1: Fitting a multivariate t distribution ###########################

if(FALSE){
    ## Define 'qmix' as the quantile function of an IG(nu/2, nu/2) distribution
    qmix <- function(u, nu) 1 / qgamma(1 - u, shape = nu/2, rate = nu/2)
    ## Sample data using 'rnvmix'
    x <- rnvmix(n, qmix = qmix, nu = nu, loc = loc, scale = scale)
    ## Call 'fitvnmix' with 'qmix' as a function (so all densities/weights are estimated)
    (MyFit11 <- fitnvmix(x, qmix = qmix, mix.param.bounds = mix.param.bounds))
    ## Call 'fitnvmix' with 'qmix = "inverse.gamma"' in which case analytical formulas
    ## for weights and densities are used:
    (MyFit12 <- fitnvmix(x, qmix = "inverse.gamma",
                         mix.param.bounds = mix.param.bounds))
    ## Alternatively, use the wrapper 'fitStudent()'
    (MyFit13 <- fitStudent(x))
    ## Check
    stopifnot(all.equal(MyFit11$nu, MyFit12$nu, tol = 5e-2),
              all.equal(MyFit11$nu, MyFit13$nu, tol = 5e-2))
    ## Can also provide 'loc' and 'scale' in which case only 'nu' is estimated
    (MyFit13 <- fitnvmix(x, qmix = "inverse.gamma", mix.param.bounds = mix.param.bounds,
                         loc = loc, scale = scale))
    (MyFit14 <- fitStudent(x, loc = loc, scale = scale))
    stopifnot(all.equal(MyFit13$nu, MyFit14$df, tol = 1e-6))
}

### Example 2: Fitting a Pareto mixture ########################################

## Define 'qmix' as the quantile function of a Par(nu, 1) distribution
qmix <- function(u, nu) (1-u)^(-1/nu)
## Sample data using 'rnvmix':
x <- rnvmix(n, qmix = qmix, nu = nu, loc = loc, scale = scale)
## Call 'fitvnmix' with 'qmix' as function (=> densities/weights estimated)
(MyFit21 <- fitnvmix(x, qmix = qmix, mix.param.bounds = mix.param.bounds))
## Call 'fitnvmix' with 'qmix = "pareto"' in which case an analytical formula
## for the density is used
(MyFit22 <- fitnvmix(x, qmix = "pareto", mix.param.bounds = mix.param.bounds))
stopifnot(all.equal(MyFit21$nu, MyFit22$nu, tol = 5e-2))

nvmix documentation built on March 19, 2024, 3:07 a.m.