# theta-utils: Utilities for the parameter vector of Lambert W × F... In LambertW: Probabilistic Models to Analyze and Gaussianize Heavy-Tailed, Skewed Data

 theta-utils R Documentation

## Utilities for the parameter vector of Lambert W × F distributions

### Description

These functions work with θ = (β, γ, δ, α), which fully parametrizes Lambert W × F distributions.

See Details for more background information on some functions.

check_theta checks whether θ = (β, γ, δ, α) describes a well-defined Lambert W × F distribution.

complete_theta fills in missing values in a parameter list so users don't have to specify every entry. Parameters that are not supplied default to alpha = 1, gamma = 0, and delta = 0.
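
The default-filling behaviour described here can be sketched in base R. This is only an illustration of the idea, not the package's implementation: fill_theta_defaults is a hypothetical helper, and it does not handle the LambertW.input argument.

```r
# Hypothetical helper (not part of LambertW): fill in missing entries of a
# parameter list with the defaults alpha = 1, gamma = 0, delta = 0.
fill_theta_defaults <- function(theta = list()) {
  defaults <- list(alpha = 1, gamma = 0, delta = 0)
  # modifyList() keeps every default whose name is absent from `theta`
  # and overrides the others with the user-supplied values.
  modifyList(defaults, theta)
}

params <- fill_theta_defaults(list(beta = c(2, 1), delta = 0.3))
str(params)
```

User-supplied entries (here beta and delta) are kept as given; only alpha and gamma are taken from the defaults.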

flatten_theta and unflatten_theta convert between the list theta and its flattened, vector-style representation. The flattened version is required by several optimization routines, since they optimize over numeric vectors – not lists.
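
The list-to-vector round trip can be illustrated with base R alone. The sketch below uses unlist() and utils::relist() as stand-ins; it is an assumption that this resembles what flatten_theta and unflatten_theta do, and the names of the flattened elements may differ from the package's.

```r
# A parameter list in the layout described above (values are illustrative).
theta <- list(beta = c(mu = 0, sigma = 1), gamma = 0, delta = 0.2, alpha = 1)

# Flatten: optimizers such as nlm() or optim() expect one numeric vector.
theta.flat <- unlist(theta)
theta.flat

# Unflatten: use the original list as a skeleton to restore the structure.
theta.back <- utils::relist(theta.flat, skeleton = theta)
str(theta.back)
```

Because the skeleton carries the structure, a vector returned by an optimizer can be mapped back to a list of the same shape.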

get_initial_theta provides initial estimates for α, β, γ, and δ, which are then used in maximum likelihood (ML) estimation (MLE_LambertW).

get_theta_bounds returns lower and upper bounds for θ (necessary for optimization such as MLE_LambertW).

theta2tau converts θ to the transformation vector τ = (μ_x, σ_x, γ, δ, α).

theta2unbounded transforms θ from the bounded space to an unrestricted space (by log-transformation of σ_x, δ, and α; note that this restricts γ ≥ 0, δ ≥ 0, and α ≥ 0).

### Usage

check_theta(theta, distname)

complete_theta(theta = list(), LambertW.input = NULL)

flatten_theta(theta)

get_initial_theta(
  y,
  distname,
  type = c("h", "hh", "s"),
  theta.fixed = list(alpha = 1),
  method = c("Taylor", "IGMM"),
  use.mean.variance = TRUE
)

get_theta_bounds(
  distname,
  beta,
  type = c("s", "h", "hh"),
  not.negative = FALSE
)

theta2tau(theta = list(beta = c(0, 1)), distname, use.mean.variance = TRUE)

theta2unbounded(theta, distname, type = c("h", "hh", "s"), inverse = FALSE)

unflatten_theta(theta.flattened, distname, type)


### Arguments

| Argument | Description |
|---|---|
| theta | list; a (possibly incomplete) list of parameters alpha, beta, gamma, delta. complete_theta fills in default values for missing entries. |
| distname | character; name of input distribution; see get_distnames. |
| LambertW.input | optional; if beta is missing in theta, LambertW.input (which has a beta element) must be specified. |
| y | a numeric vector of real values (the observed data). |
| type | type of Lambert W × F distribution: skewed "s"; heavy-tail "h"; or skewed heavy-tail "hh". |
| theta.fixed | list; fixed parameters for the optimization; default: alpha = 1. |
| method | character; should a fast "Taylor" (default) approximation be used (delta_Taylor or gamma_Taylor) to estimate δ or γ, or should "IGMM" (IGMM) estimates be used. Use "Taylor" as initial values for IGMM; IGMM improves upon it and should be used for MLE_LambertW. Do not use "IGMM" as initial values for IGMM – this will run IGMM twice. |
| use.mean.variance | logical; if TRUE it uses mean and variance implied by β to do the transformation (Goerg 2011). If FALSE, it uses the alternative definition from Goerg (2016) with location and scale parameter. |
| beta | numeric vector (deprecated); parameter β of the input distribution. See check_beta on how to specify beta for each distribution. |
| not.negative | logical; if TRUE it sets the lower bounds for alpha and delta to 0. Default: FALSE. |
| inverse | logical; if TRUE, it transforms the unbounded theta back to the original, bounded space. Default: FALSE. |
| theta.flattened | named vector; flattened version of list theta. |

### Details

get_initial_theta obtains a quick initial estimate of θ by first finding the (approximate) input $\widehat{\boldsymbol{x}}_{\widehat{\theta}}$ by IGMM, and then estimating β for this input data $\widehat{\boldsymbol{x}}_{\widehat{\theta}} \sim F_X(x \mid \boldsymbol{\beta})$ (see estimate_beta).

Converting theta to an unbounded space is especially useful for optimization routines (like nlm), which can then be run over an unconstrained space. The obtained optimum can be converted back to the original space with the inverse transformation (setting inverse = TRUE transforms it back via exp), which guarantees that the estimate satisfies the non-negativity constraints (if required). The main advantage is that this avoids optimization routines with boundary constraints, which are much slower than unconstrained optimization.
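
The log/exp reparametrization can be illustrated without the package. The sketch below is an assumption-laden toy: it fits a plain Normal distribution (no Lambert W parameters) with nlm, optimizing over (mu, log(sigma)) and mapping the result back with exp. The function name negloglik_unbounded is hypothetical.

```r
set.seed(1)
x <- rnorm(500, mean = 2, sd = 3)

# Negative log-likelihood of a Normal, written in (mu, log(sigma)) so that
# every real-valued parameter vector is a valid argument.
negloglik_unbounded <- function(par, x) {
  mu <- par[1]
  sigma <- exp(par[2])  # back-transform guarantees sigma > 0
  -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

# Unconstrained optimization; no box constraints are needed.
fit <- nlm(negloglik_unbounded, p = c(0, 0), x = x)

# Map the optimum back to the original (bounded) parametrization.
estimate <- c(mu = fit$estimate[1], sigma = exp(fit$estimate[2]))
estimate
```

Whatever value the optimizer returns for the second coordinate, exp() maps it to a strictly positive scale, which is the guarantee described above.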

### Value

check_theta throws an error if list theta does not define a proper Lambert W × F distribution; does nothing otherwise.

complete_theta returns a list containing:

| Element | Description |
|---|---|
| alpha | heavy tail exponent(s), |
| beta | named vector β of the input distribution, |
| gamma | skewness parameter, |
| delta | heavy-tail parameter(s). |

get_initial_theta returns a list containing:

| Element | Description |
|---|---|
| alpha | heavy tail exponent; default: 1, |
| beta | named vector β of the input distribution; estimated from the recovered input data $\widehat{\boldsymbol{x}}_{\widehat{\tau}}$, |
| gamma | skewness parameter; if type is "h" or "hh" gamma = 0; estimated from IGMM, |
| delta | heavy-tail parameter; estimated from IGMM. If type = "s", then delta = 0. |

get_theta_bounds returns a list containing two vectors:

| Element | Description |
|---|---|
| lower | flattened vector of lower bounds for valid θ, |
| upper | flattened vector of upper bounds for valid θ. |
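
A pair of lower and upper vectors of this kind is what box-constrained optimizers consume. The sketch below shows the pattern with optim and method = "L-BFGS-B"; the bound values and the toy objective are assumptions made for illustration and are not the output of get_theta_bounds.

```r
# Illustrative bounds for a flattened (mu, sigma, delta) vector; these values
# are assumptions, not what get_theta_bounds() returns.
bounds <- list(
  lower = c(mu = -Inf, sigma = 1e-6, delta = 0),
  upper = c(mu = Inf, sigma = Inf, delta = 10)
)

# Toy objective whose unconstrained minimum has delta = -0.5, i.e. outside
# the feasible region, so the lower bound on delta becomes active.
objective <- function(par) sum((par - c(1, 2, -0.5))^2)

fit <- optim(
  par = c(mu = 0, sigma = 1, delta = 0.1),
  fn = objective,
  method = "L-BFGS-B",
  lower = bounds$lower,
  upper = bounds$upper
)
fit$par
```

The bounds must be in the same order as the flattened parameter vector, which is why the bounds are returned in flattened form.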

### See Also

check_beta

estimate_beta, get_initial_tau

beta2tau

### Examples


## Not run:
check_theta(theta = list(beta =  c(1, 1, -1)), distname = "t")

## End(Not run)

check_theta(theta = list(beta =  c(1, 1)), distname = "normal") # ok

params <- list(beta = c(2, 1), delta = 0.3) # alpha and gamma are missing
complete_theta(params) # added default values

params <- list(beta = c(2, 1), delta = 0.3, alpha = c(1, 2))
params <- complete_theta(params)
check_theta(params, distname = 'normal')

###
x <- rnorm(1000)
get_initial_theta(x, distname = "normal", type = "h")
get_initial_theta(x, distname = "normal", type = "s")

# starting values for the skewed version of an exponential
y <- rLambertW(n = 1000, distname = "exp", beta = 2, gamma = 0.1)
get_initial_theta(y, distname = "exp", type = "s")

# starting values for the heavy-tailed version of a Normal = Tukey's h
y <- rLambertW(n = 1000, beta = c(2, 1), distname = "normal", delta = 0.2)
get_initial_theta(y, distname = "normal", type = "h")

###
get_theta_bounds(type = "hh", distname = "normal", beta = c(0, 1))

###
theta.restr <- theta2unbounded(list(beta = c(-1, 0.1),
                                    delta = c(0.2, 0.2)),
                               distname = "normal")
theta.restr
# returns again the beta and delta from above
theta2unbounded(theta.restr, inverse = TRUE, distname = "normal")



LambertW documentation built on Sept. 22, 2022, 5:07 p.m.