LambertW-utils: Utilities for Lambert W \times F Random Variables In LambertW: Probabilistic Models to Analyze and Gaussianize Heavy-Tailed, Skewed Data

 LambertW-utils R Documentation

Utilities for Lambert W \times F Random Variables

Description

Density, distribution, quantile function and random number generation for a Lambert W \times F_X(x \mid \boldsymbol β) random variable with parameter θ = (α, \boldsymbol β, γ, δ).

Following the usual R dqpr family of functions (e.g., rnorm, dnorm, ...) the Lambert W \times F utility functions work as expected: dLambertW evaluates the pdf at y, pLambertW evaluates the cdf at y, qLambertW is the quantile function, and rLambertW generates random samples from a Lambert W \times F_X(x \mid \boldsymbol β) distribution.

mLambertW computes the first 4 central/standardized moments of a Lambert W \times F. Works only for Gaussian distribution.

qqLambertW computes and plots the sample quantiles of the data y versus the theoretical Lambert W \times F theoretical quantiles given θ.

Usage

dLambertW(
y,
distname = NULL,
theta = NULL,
beta = NULL,
gamma = 0,
delta = 0,
alpha = 1,
input.u = NULL,
tau = NULL,
use.mean.variance = TRUE,
log = FALSE
)

mLambertW(
theta = NULL,
distname = c("normal"),
beta,
gamma = 0,
delta = 0,
alpha = 1
)

pLambertW(
q,
distname,
theta = NULL,
beta = NULL,
gamma = 0,
delta = 0,
alpha = 1,
input.u = NULL,
tau = NULL,
log = FALSE,
lower.tail = FALSE,
use.mean.variance = TRUE
)

qLambertW(
p,
distname = NULL,
theta = NULL,
beta = NULL,
gamma = 0,
delta = 0,
alpha = 1,
input.u = NULL,
tau = NULL,
is.non.negative = FALSE,
use.mean.variance = TRUE
)

qqLambertW(
y,
distname,
theta = NULL,
beta = NULL,
gamma = 0,
delta = 0,
alpha = 1,
plot.it = TRUE,
use.mean.variance = TRUE,
...
)

rLambertW(
n,
distname,
theta = NULL,
beta = NULL,
gamma = 0,
delta = 0,
alpha = 1,
return.x = FALSE,
input.u = NULL,
tau = NULL,
use.mean.variance = TRUE
)


Arguments

 y, q vector of quantiles. distname character; name of input distribution; see get_distnames. theta list; a (possibly incomplete) list of parameters alpha, beta, gamma, delta. complete_theta fills in default values for missing entries. beta numeric vector (deprecated); parameter \boldsymbol β of the input distribution. See check_beta on how to specify beta for each distribution. gamma scalar (deprecated); skewness parameter; default: 0. delta scalar or vector (length 2) (deprecated); heavy-tail parameter(s); default: 0. alpha scalar or vector (length 2) (deprecated); heavy tail exponent(s); default: 1. input.u users can supply their own version of U (either a vector of simulated values or a function defining the pdf/cdf/quanitle function of U); default: NULL. If not NULL, tau must be specified as well. tau optional; if input.u = TRUE, then tau must be specified. Note that \boldsymbol β is still taken from theta, but "mu_x", "sigma_x", and the other parameters (α, γ, δ) are all taken from tau. This is usually only used by the create_LambertW_output function; users usually don't need to supply this argument directly. use.mean.variance logical; if TRUE it uses mean and variance implied by \boldsymbol β to do the transformation (Goerg 2011). If FALSE, it uses the alternative definition from Goerg (2016) with location and scale parameter. log logical; if TRUE, probabilities p are given as log(p). lower.tail logical; if TRUE (default), probabilities are P(X ≤q x) otherwise, P(X > x). p vector of probability levels is.non.negative logical; by default it is set to TRUE if the distribution is not a location but a scale family. plot.it logical; should the result be plotted? Default: TRUE. ... further arguments passed to or from other methods. n number of observations return.x logical; if TRUE not only the simulated Lambert W \times F sample y, but also the corresponding simulated input x will be returned. Default FALSE. Note: if TRUE then rLambertW does not return a vector of length n, but a list of two vectors (each of length n).

Details

All functions here have an optional input.u argument where users can supply their own version corresponding to zero-mean, unit variance input U. This function usually depends on the input parameter \boldsymbol β; e.g., users can pass their own density function dmydist <- function(u, beta) {...} as dLambertW(..., input.u = dmydist). dLambertW will then use this function to evaluate the pdf of the Lambert W x 'mydist' distribution.

Important: Make sure that all input.u in dLambertW, pLambertW, ... are supplied correctly and return correct values – there are no unit-tests or sanity checks for user-defined functions.

See the references for the analytic expressions of the pdf and cdf. For "h" or "hh" types and for scale-families of type = "s" quantiles can be computed analytically. For location (-scale) families of type = "s" quantiles need to be computed numerically.

Value

mLambertW returns a list with the 4 theoretical (central/standardized) moments of Y implied by \boldsymbol θ and distname (currrently, this only works for distname = "normal"):

 mean mean, sd standard deviation, skewness skewness, kurtosis kurtosis (not excess kurtosis, i.e., 3 for a Gaussian).

rLambertW returns a vector of length n. If return.input = TRUE, then it returns a list of two vectors (each of length n):

 x simulated input, y Lambert W random sample (transformed from x - see References and get_output).

qqLambertW returns a list of 2 vectors (analogous to qqnorm):

 x theoretical quantiles (sorted), y empirical quantiles (sorted).

Examples


###############################
######### mLambertW ###########
mLambertW(theta = list(beta = c(0, 1), gamma = 0.1))
mLambertW(theta = list(beta = c(1, 1), gamma = 0.1)) # mean shifted by 1
mLambertW(theta = list(beta = c(0, 1), gamma = 0)) # N(0, 1)

###############################
######### rLambertW ###########
set.seed(1)
# same as rnorm(1000)
x <- rLambertW(n=100, theta = list(beta=c(0, 1)), distname = "normal")
skewness(x) # very small skewness
medcouple_estimator(x) # also close to zero

y <- rLambertW(n=100, theta = list(beta = c(1, 3), gamma = 0.1),
distname = "normal")
skewness(y) # high positive skewness (in theory equal to 3.70)
medcouple_estimator(y) # also the robust measure gives a high value

par(mfrow = c(2, 2), mar = c(2, 4, 3, 1))
plot(x)
hist(x, prob=TRUE, 15)
lines(density(x))

plot(y)
hist(y, prob=TRUE, 15)
lines(density(y))
par(op)
###############################
######### dLambertW ###########
beta.s <- c(0, 1)
gamma.s <- 0.1

# x11(width=10, height=5)
par(mfrow = c(1, 2), mar = c(3, 3, 3, 1))
curve(dLambertW(x, theta = list(beta = beta.s, gamma = gamma.s),
distname = "normal"),
-3.5, 5, ylab = "",  main="Density function")
plot(dnorm, -3.5, 5, add = TRUE, lty = 2)
legend("topright" , c("Lambert W x Gaussian" , "Gaussian"), lty = 1:2)
abline(h=0)

###############################
######### pLambertW ###########

curve(pLambertW(x, theta = list(beta = beta.s, gamma = gamma.s),
distname = "normal"),
-3.5, 3.5, ylab = "", main = "Distribution function")
plot(pnorm, -3.5,3.5, add = TRUE, lty = 2)
legend("topleft" , c("Lambert W x Gaussian" , "Gaussian"), lty = 1:2)
par(op)

######## Animation
## Not run:
gamma.v <- seq(-0.15, 0.15, length = 31) # typical, empirical range of gamma
b <- get_support(gamma_01(min(gamma.v)))[2]*1.1
a <- get_support(gamma_01(max(gamma.v)))[1]*1.1

for (ii in seq_along(gamma.v)) {
curve(dLambertW(x, beta = gamma_01(gamma.v[ii])[c("mu_x", "sigma_x")],
gamma = gamma.v[ii], distname="normal"),
a, b, ylab="", lty = 2, col = 2, lwd = 2, main = "pdf",
ylim = c(0, 0.45))
plot(dnorm, a, b, add = TRUE, lty = 1, lwd = 2)
legend("topright" , c("Lambert W x Gaussian" , "Gaussian"),
lty = 2:1, lwd = 2, col = 2:1)
abline(h=0)
legend("topleft", cex = 1.3,
c(as.expression(bquote(gamma == .(round(gamma.v[ii],3))))))
Sys.sleep(0.04)
}

## End(Not run)

###############################
######### qLambertW ###########

p.v <- c(0.01, 0.05, 0.5, 0.9, 0.95,0.99)
qnorm(p.v)
# same as above except for rounding errors
qLambertW(p.v, theta = list(beta = c(0, 1), gamma = 0), distname = "normal")
# positively skewed data -> quantiles are higher
qLambertW(p.v, theta = list(beta = c(0, 1), gamma = 0.1),
distname = "normal")

###############################
######### qqLambertW ##########
## Not run:
y <- rLambertW(n=500, distname="normal",
theta = list(beta = c(0,1), gamma = 0.1))

layout(matrix(1:2, ncol = 2))
qqnorm(y)
qqline(y)
qqLambertW(y, theta = list(beta = c(0, 1), gamma = 0.1),
distname = "normal")

## End(Not run)


LambertW documentation built on Sept. 22, 2022, 5:07 p.m.