dat2mvt: Fit multivariate skew-T distribution to a numeric dataset
In akcochrane/ACmisc: Aaron Cochrane's miscellaneous functions

dat2mvt

R Documentation

Description

Fit multivariate skew-T distribution to a numeric dataset

Usage

dat2mvt(d, nRanGen = 50000)

Arguments

d

Input data. Must be numeric; the function will likely fail if any column has fewer than three unique values. In general, the function is designed for data that plausibly follow skew-T distributions.

nRanGen

Confidence intervals are estimated using random variates of the multivariate skew-T distribution. This argument defines how many random samples to generate in the estimation.

Details

This function treats a dataset as multivariate skew-T distributed (see dmst) and fits that distribution. Under this distributional assumption, several quantities can be derived. The main output is the central tendency and CI; the central tendency is the mean of the middle 1 percent of the generated random variates, calculated using mean(., trim = .495). The output also carries various attributes, including the correlation matrix (rob_correl) and the fitted multivariate skew-T parameters (mvt_coef).
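As a rough illustration of the trimmed-mean calculation, the sketch below uses base R only. The exponential draws are an assumed stand-in for the random variates that dat2mvt generates internally; they are not output of the function, and the variable names are hypothetical.

```r
# Stand-in for generated random variates (assumption: a skewed univariate sample)
set.seed(1)
variates <- rexp(50000)

# trim = .495 drops 49.5 percent of observations from each tail,
# so the mean is taken over roughly the middle 1 percent
central <- mean(variates, trim = .495)

# Equivalent manual calculation, using the same index rule as mean.default
n <- length(variates)
lo <- floor(n * .495) + 1
hi <- n + 1 - lo
central_manual <- mean(sort(variates)[lo:hi])

# A 95 percent interval estimated from the same variates
ci <- quantile(variates, c(.025, .975))
```

With this much trimming the result sits very close to the median, but it averages over a narrow band of central values rather than relying on one or two order statistics.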

Because distributional quantiles (CI) are estimated from randomly generated values of the multivariate distribution, increasing nRanGen makes the estimates more stable. Conversely, smaller values of nRanGen make the estimates less stable but speed up execution somewhat.
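The trade-off between the number of random draws and the stability of a quantile estimate can be sketched in base R. This is an illustration only: rexp() is an assumed stand-in for draws from a fitted skew-T distribution, and upper_quantile() is a hypothetical helper, not part of ACmisc.

```r
set.seed(2)

# Hypothetical helper: estimate the 97.5th percentile from n random draws.
# rexp() is a stand-in for draws from a fitted skew-T distribution.
upper_quantile <- function(n) unname(quantile(rexp(n), .975))

# Repeat the estimate 200 times at two sample sizes
est_small <- replicate(200, upper_quantile(500))
est_large <- replicate(200, upper_quantile(50000))

# Spread of the estimates across repetitions; smaller means more stable
sd_small <- sd(est_small)
sd_large <- sd(est_large)
```

Comparing sd_small with sd_large shows how much the upper CI bound varies from run to run at each sample size.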

Multivariate distribution estimation becomes very slow as the number of variables increases, and with inappropriate variables (e.g., binary). If the function is taking very long to run, it is recommended to start with only a handful of variables and add more gradually.
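One way to screen variables before fitting is sketched below in base R. screen_columns() is a hypothetical helper, not part of ACmisc, and the three-unique-values threshold is taken from the description of the d argument.

```r
# Hypothetical helper (not part of ACmisc): keep the names of columns that
# are numeric and have at least three unique non-missing values
screen_columns <- function(d) {
  ok <- vapply(d, function(col) {
    is.numeric(col) && length(unique(col[!is.na(col)])) >= 3
  }, logical(1))
  names(d)[ok]
}

set.seed(3)
dat <- data.frame(
  x = rexp(200),
  y = exp(rnorm(200)),
  flag = rbinom(200, 1, .5),                         # binary: likely to cause problems
  group = sample(letters[1:3], 200, replace = TRUE)  # not numeric
)

usable <- screen_columns(dat)

# Start with a small subset of usable columns and add more gradually
dat_start <- dat[, usable[seq_len(min(2, length(usable)))], drop = FALSE]
```

A subset such as dat_start could then be passed to dat2mvt(), with further columns added once the fit runs in acceptable time.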

Relies on the 'sn' package.

Examples

# simulate three skewed variables
dat <- data.frame(x = rexp(200), y = exp(rnorm(200)), z = log(rnorm(200, 5)))
got_mvt <- dat2mvt(dat)

# compare the fitted central tendency and CI with the empirical quantiles
got_mvt
apply(dat, 2, quantile, c(.025, .5, .975))

# to generate new data using this fitted multivariate skew-T distribution:
simulated_dat <- sn::rmst(50, dp = attr(got_mvt, 'mvt_dp'))

akcochrane/ACmisc documentation built on Nov. 24, 2024, 11:22 a.m.