Distribution.df: Data Frame Summarizing Available Probability Distributions...

Distribution.df    R Documentation

Data Frame Summarizing Available Probability Distributions and Estimation Methods

Description

Data frame summarizing information about available probability distributions in R and the EnvStats package, and which distributions have associated functions for estimating distribution parameters.

Usage

Distribution.df

Format

A data frame with 35 rows corresponding to 35 different available probability distributions, and 25 columns containing information associated with these probability distributions.

Name

a character vector containing the name of the probability distribution (see the column labeled Name in the table below).

Type

a character vector indicating the type of distribution (see the column labeled Type in the table below). Possible values are "Finite Discrete", "Discrete", "Continuous", and "Mixed".

Support.Min

a character vector indicating the minimum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have a lower bound that depends on the value of a distribution parameter. For example, the minimum value for a Uniform distribution is given by the value of the parameter min.

Support.Max

a character vector indicating the maximum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have an upper bound that depends on the value of a distribution parameter. For example, the maximum value for a Uniform distribution is given by the value of the parameter max.

Estimation.Method(s)

a character vector indicating the names of the methods available to estimate the distribution parameter(s) (see the column labeled Estimation Method(s) in the table below). Possible values include "mle" (maximum likelihood), "mme" (method of moments), "mmue" (method of moments based on the unbiased estimate of variance), "mvue" (minimum variance unbiased), "qmle" (quasi-mle), etc., or some combination of these. When a single estimator is of more than one kind, a slash (/) joins all of the methods that estimator covers. For example, for the Binomial distribution, the sample proportion is the maximum likelihood, method of moments, and minimum variance unbiased estimator, so this method is denoted as "mle/mme/mvue". See the help files for the specific function listed under Estimating Distribution Parameters for an explanation of each of these estimation methods.

Quantile.Estimation.Method(s)

a character vector indicating the names of the methods available to estimate the distribution quantiles. For many distributions, these are the same as Estimation.Method(s). See the help files for the specific function listed under Estimating Distribution Quantiles for an explanation of each of these estimation methods.

Prediction.Interval.Method(s)

a character vector indicating the names of the methods available to create prediction intervals. See the help files for the specific function listed under Prediction Intervals for an explanation of each of these estimation methods.

Singly.Censored.Estimation.Method(s)

a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I singly-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.

Multiply.Censored.Estimation.Method(s)

a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I multiply-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.

Number.parameters

a numeric vector indicating the number of parameters associated with the distribution (see the column labeled Parameters in the table below).

Parameter.1

the columns labeled Parameter.1, Parameter.2, ..., Parameter.5 are character vectors containing the names of the distribution parameters (see the column labeled Parameters in the table below). If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1, ..., Parameter.5 are empty. For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3, Parameter.4, and Parameter.5 are empty.

Parameter.2

see Parameter.1

Parameter.3

see Parameter.1

Parameter.4

see Parameter.1

Parameter.5

see Parameter.1

Parameter.1.Min

the columns labeled Parameter.1.Min, Parameter.2.Min, ..., Parameter.5.Min are character vectors containing the minimum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).

These are character vectors rather than numeric vectors for two reasons. First, some parameters have a lower bound of 0 but must be strictly greater than 0 (e.g., the parameter sd for the Normal distribution), in which case the lower bound is .Machine$double.eps, which may vary from machine to machine. Second, some parameters have a lower bound that depends on the value of another parameter. For example, the parameter max for a Uniform distribution is bounded below by the value of the parameter min.

If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1.Min, ..., Parameter.5.Min have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Min, Parameter.4.Min, and Parameter.5.Min have NAs in them.

Parameter.2.Min

see Parameter.1.Min

Parameter.3.Min

see Parameter.1.Min

Parameter.4.Min

see Parameter.1.Min

Parameter.5.Min

see Parameter.1.Min

Parameter.1.Max

the columns labeled Parameter.1.Max, Parameter.2.Max, ..., Parameter.5.Max are character vectors containing the maximum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).

These are character vectors rather than numeric vectors because some parameters have an upper bound that depends on the value of another parameter. For example, the parameter min for a Uniform distribution is bounded above by the value of the parameter max.

If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1.Max, ..., Parameter.5.Max have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Max, Parameter.4.Max, and Parameter.5.Max have NAs in them.

Parameter.2.Max

see Parameter.1.Max

Parameter.3.Max

see Parameter.1.Max

Parameter.4.Max

see Parameter.1.Max

Parameter.5.Max

see Parameter.1.Max
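The column layout described above can be explored with ordinary data frame operations. The following is a minimal sketch that uses a small hand-built excerpt (dist_excerpt, a hypothetical name) in place of the real data frame so that it runs without EnvStats; the exact strings stored in Support.Min and Support.Max are assumptions inferred from the tables below. With EnvStats attached, the same expressions should apply to Distribution.df itself.

```r
# Hand-copied excerpt of three rows (illustrative only; cell contents are
# assumed from the tables in this help file, not read from EnvStats).
dist_excerpt <- data.frame(
  Name = c("Binomial", "Normal", "Uniform"),
  Type = c("Finite Discrete", "Continuous", "Continuous"),
  Support.Min = c("0", "-Inf", "min"),
  Support.Max = c("size", "Inf", "max"),
  `Estimation.Method(s)` = c("mle/mme/mvue", "mle/mme, mvue", "mle, mme, mmue"),
  Number.parameters = c(2, 2, 2),
  check.names = FALSE,        # keep the parentheses in the column name
  stringsAsFactors = FALSE
)

# Names of the continuous distributions in the excerpt.
continuous_names <- dist_excerpt$Name[dist_excerpt$Type == "Continuous"]

# Split an Estimation.Method(s) entry: commas separate distinct estimators,
# and a slash joins the methods that a single estimator satisfies.
split_methods <- function(x) {
  estimators <- strsplit(x, ",\\s*")[[1]]
  lapply(estimators, function(e) strsplit(e, "/", fixed = TRUE)[[1]])
}
methods_by_dist <- lapply(
  setNames(dist_excerpt$`Estimation.Method(s)`, dist_excerpt$Name),
  split_methods
)

# Support bounds are character: parameter-dependent bounds such as "min"
# have no numeric value, so coercion yields NA for them.
numeric_min <- suppressWarnings(as.numeric(dist_excerpt$Support.Min))
```

Here methods_by_dist$Binomial holds one estimator covering three methods, while methods_by_dist$Normal holds two estimators, which illustrates why the slash and the comma carry different meanings in this column.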

Details

The table below summarizes the probability distributions available in R and EnvStats. For each distribution, there are four associated functions for computing density values, cumulative probabilities, quantiles, and random numbers. The names of these functions have the form dabb, pabb, qabb, and rabb, where abb is the abbreviated name of the distribution (see table below). These functions are described in the help file with the name of the distribution (see the first column of the table below). For example, the help file for Beta describes the behavior of dbeta, pbeta, qbeta, and rbeta.
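As an illustration of the naming scheme, the sketch below calls the four Beta functions from base R. The shape values 2 and 5 and the evaluation point 0.3 are arbitrary choices made for this example.

```r
shape1 <- 2
shape2 <- 5

dens     <- dbeta(0.3, shape1, shape2)   # density at x = 0.3
cum_prob <- pbeta(0.3, shape1, shape2)   # P(X <= 0.3)
q90      <- qbeta(0.9, shape1, shape2)   # 0.9 quantile (90th percentile)

set.seed(1)                              # reproducible random draws
draws <- rbeta(5, shape1, shape2)        # five random numbers

# The p and q functions are inverses of each other.
round_trip <- pbeta(q90, shape1, shape2)
```

The same pattern applies to any abbreviation in the table, for example dnorm, pnorm, qnorm, and rnorm for the Normal distribution.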

For most distributions, there is also an associated function for estimating the distribution parameters, and the form of the names of these functions is eabb, where abb is the abbreviated name of the distribution (see table below). All of these functions are listed in the help file Estimating Distribution Parameters. For example, the function ebeta estimates the shape parameters of a Beta distribution based on a random sample of observations from this distribution.
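A minimal sketch of the eabb pattern, using ebeta on simulated data. It assumes the EnvStats package is installed and is skipped otherwise; the sample size, seed, and true shape values are arbitrary choices for illustration.

```r
if (requireNamespace("EnvStats", quietly = TRUE)) {
  set.seed(47)
  x <- rbeta(200, shape1 = 2, shape2 = 5)    # simulated sample

  # "mle" is one of the methods listed for the Beta distribution.
  fit <- EnvStats::ebeta(x, method = "mle")

  # Estimated shape parameters, stored in the returned object.
  print(fit$parameters)
} else {
  message("EnvStats is not installed; skipping the ebeta() sketch.")
}
```

The other methods listed for a distribution can be requested through the same method argument; see the help file for each estimating function for the values it accepts.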

For some distributions, there are functions to estimate distribution parameters based on Type I censored data. The form of the names of these functions is eabbSinglyCensored for singly censored data and eabbMultiplyCensored for multiply censored data. All of these functions are listed under the heading Estimating Distribution Parameters in the help file Censored Data.

Table 1a. Available Distributions: Name, Abbreviation, Type, and Range

Name                                           Abbreviation   Type             Range
Beta                                           beta           Continuous       [0, 1]
Binomial                                       binom          Finite Discrete  [0, size] (integer)
Cauchy                                         cauchy         Continuous       (-\infty, \infty)
Chi                                            chi            Continuous       [0, \infty)
Chi-square                                     chisq          Continuous       [0, \infty)
Exponential                                    exp            Continuous       [0, \infty)
Extreme Value                                  evd            Continuous       (-\infty, \infty)
F                                              f              Continuous       [0, \infty)
Gamma                                          gamma          Continuous       [0, \infty)
Gamma (Alternative)                            gammaAlt       Continuous       [0, \infty)
Generalized Extreme Value                      gevd           Continuous       (-\infty, \infty) for shape = 0
                                                                               (-\infty, location + \frac{scale}{shape}] for shape > 0
                                                                               [location + \frac{scale}{shape}, \infty) for shape < 0
Geometric                                      geom           Discrete         [0, \infty) (integer)
Hypergeometric                                 hyper          Finite Discrete  [0, min(k,m)] (integer)
Logistic                                       logis          Continuous       (-\infty, \infty)
Lognormal                                      lnorm          Continuous       [0, \infty)
Lognormal (Alternative)                        lnormAlt       Continuous       [0, \infty)
Lognormal Mixture                              lnormMix       Continuous       [0, \infty)
Lognormal Mixture (Alternative)                lnormMixAlt    Continuous       [0, \infty)
Three-Parameter Lognormal                      lnorm3         Continuous       [threshold, \infty)
Truncated Lognormal                            lnormTrunc     Continuous       [min, max]
Truncated Lognormal (Alternative)              lnormTruncAlt  Continuous       [min, max]
Negative Binomial                              nbinom         Discrete         [0, \infty) (integer)
Normal                                         norm           Continuous       (-\infty, \infty)
Normal Mixture                                 normMix        Continuous       (-\infty, \infty)
Truncated Normal                               normTrunc      Continuous       [min, max]
Pareto                                         pareto         Continuous       [location, \infty)
Poisson                                        pois           Discrete         [0, \infty) (integer)
Student's t                                    t              Continuous       (-\infty, \infty)
Triangular                                     tri            Continuous       [min, max]
Uniform                                        unif           Continuous       [min, max]
Weibull                                        weibull        Continuous       [0, \infty)
Wilcoxon Rank Sum                              wilcox         Finite Discrete  [0, m n] (integer)
Zero-Modified Lognormal (Delta)                zmlnorm        Mixed            [0, \infty)
Zero-Modified Lognormal (Delta) (Alternative)  zmlnormAlt     Mixed            [0, \infty)
Zero-Modified Normal                           zmnorm         Mixed            (-\infty, \infty)

Table 1b. Available Distributions: Name, Parameters, Parameter Default Values, Parameter Ranges, Estimation Method(s)

Name                                           Parameter(s)  Default Value(s)  Parameter Range(s)  Estimation Method(s)
Beta                                           shape1                          (0, \infty)         mle, mme, mmue
                                               shape2                          (0, \infty)
                                               ncp           0                 (0, \infty)
Binomial                                       size                            [0, \infty)         mle/mme/mvue
                                               prob                            [0, 1]
Cauchy                                         location      0                 (-\infty, \infty)
                                               scale         1                 (0, \infty)
Chi                                            df                              (0, \infty)
Chi-square                                     df                              (0, \infty)
                                               ncp           0                 (-\infty, \infty)
Exponential                                    rate          1                 (0, \infty)         mle/mme
Extreme Value                                  location      0                 (-\infty, \infty)   mle, mme, mmue, pwme
                                               scale         1                 (0, \infty)
F                                              df1                             (0, \infty)
                                               df2                             (0, \infty)
                                               ncp           0                 (0, \infty)
Gamma                                          shape                           (0, \infty)         mle, bcmle, mme, mmue
                                               scale         1                 (0, \infty)
Gamma (Alternative)                            mean                            (0, \infty)         mle, bcmle, mme, mmue
                                               cv            1                 (0, \infty)
Generalized Extreme Value                      location      0                 (-\infty, \infty)   mle, pwme, tsoe
                                               scale         1                 (0, \infty)
                                               shape         0                 (-\infty, \infty)
Geometric                                      prob                            (0, 1)              mle/mme, mvue
Hypergeometric                                 m                               [0, \infty)         mle, mvue
                                               n                               [0, \infty)
                                               k                               [1, m+n]
Logistic                                       location      0                 (-\infty, \infty)   mle, mme, mmue
                                               scale         1                 (0, \infty)
Lognormal                                      meanlog       0                 (-\infty, \infty)   mle/mme, mvue
                                               sdlog         1                 (0, \infty)
Lognormal (Alternative)                        mean          exp(1/2)          (0, \infty)         mle, mme, mmue, mvue, qmle
                                               cv            sqrt(exp(1)-1)    (0, \infty)
Lognormal Mixture                              meanlog1      0                 (-\infty, \infty)
                                               sdlog1        1                 (0, \infty)
                                               meanlog2      0                 (-\infty, \infty)
                                               sdlog2        1                 (0, \infty)
                                               p.mix         0.5               [0, 1]
Lognormal Mixture (Alternative)                mean1         exp(1/2)          (0, \infty)
                                               cv1           sqrt(exp(1)-1)    (0, \infty)
                                               mean2         exp(1/2)          (0, \infty)
                                               cv2           sqrt(exp(1)-1)    (0, \infty)
                                               p.mix         0.5               [0, 1]
Three-Parameter Lognormal                      meanlog       0                 (-\infty, \infty)   lmle, mme, mmue, mmme, royston.skew, zero.skew
                                               sdlog         1                 (0, \infty)
                                               threshold     0                 (-\infty, \infty)
Truncated Lognormal                            meanlog       0                 (-\infty, \infty)
                                               sdlog         1                 (0, \infty)
                                               min           0                 [0, max)
                                               max           Inf               (min, \infty)
Truncated Lognormal (Alternative)              mean          exp(1/2)          (0, \infty)
                                               cv            sqrt(exp(1)-1)    (0, \infty)
                                               min           0                 [0, max)
                                               max           Inf               (min, \infty)
Negative Binomial                              size                            [1, \infty)         mle/mme, mvue
                                               prob                            (0, 1]
                                               mu                              (0, \infty)
Normal                                         mean          0                 (-\infty, \infty)   mle/mme, mvue
                                               sd            1                 (0, \infty)
Normal Mixture                                 mean1         0                 (-\infty, \infty)
                                               sd1           1                 (0, \infty)
                                               mean2         0                 (-\infty, \infty)
                                               sd2           1                 (0, \infty)
                                               p.mix         0.5               [0, 1]
Truncated Normal                               mean          0                 (-\infty, \infty)
                                               sd            1                 (0, \infty)
                                               min           -Inf              (-\infty, max)
                                               max           Inf               (min, \infty)
Pareto                                         location                        (0, \infty)         lse, mle
                                               shape         1                 (0, \infty)
Poisson                                        lambda                          (0, \infty)         mle/mme/mvue
Student's t                                    df                              (0, \infty)
                                               ncp           0                 (-\infty, \infty)
Triangular                                     min           0                 (-\infty, max)
                                               max           1                 (min, \infty)
                                               mode          0.5               (min, max)
Uniform                                        min           0                 (-\infty, max)      mle, mme, mmue
                                               max           1                 (min, \infty)
Weibull                                        shape                           (0, \infty)         mle, mme, mmue
                                               scale         1                 (0, \infty)
Wilcoxon Rank Sum                              m                               [1, \infty)
                                               n                               [1, \infty)
Zero-Modified Lognormal (Delta)                meanlog       0                 (-\infty, \infty)   mvue
                                               sdlog         1                 (0, \infty)
                                               p.zero        0.5               [0, 1]
Zero-Modified Lognormal (Delta) (Alternative)  mean          exp(1/2)          (0, \infty)         mvue
                                               cv            sqrt(exp(1)-1)    (0, \infty)
                                               p.zero        0.5               [0, 1]
Zero-Modified Normal                           mean          0                 (-\infty, \infty)   mvue
                                               sd            1                 (0, \infty)
                                               p.zero        0.5               [0, 1]
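
The default values exp(1/2) and sqrt(exp(1)-1) listed for the alternative lognormal parameterizations appear to be the mean and coefficient of variation implied by the defaults meanlog = 0 and sdlog = 1. The sketch below checks this numerically with base R only; it is an illustrative verification rather than code taken from EnvStats, and the helper name lognormal_moment is hypothetical.

```r
meanlog <- 0
sdlog   <- 1

# Closed-form mean and coefficient of variation of a lognormal distribution.
mean_closed <- exp(meanlog + sdlog^2 / 2)   # equals exp(1/2) here
cv_closed   <- sqrt(exp(sdlog^2) - 1)       # equals sqrt(exp(1) - 1) here

# Numerical check: if log(X) is Normal(meanlog, sdlog), then
# E[X^k] is the integral of exp(k * y) * dnorm(y, meanlog, sdlog) over y.
lognormal_moment <- function(k) {
  integrate(function(y) exp(k * y) * dnorm(y, meanlog, sdlog),
            lower = -Inf, upper = Inf)$value
}
m1 <- lognormal_moment(1)                   # first moment (mean)
m2 <- lognormal_moment(2)                   # second moment
cv_numeric <- sqrt(m2 - m1^2) / m1          # standard deviation / mean
```

Under this reading, the default and alternative parameterizations of the Lognormal, Lognormal Mixture, Truncated Lognormal, and Zero-Modified Lognormal rows describe the same default distribution in two different sets of parameters.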

Source

The EnvStats package.

References

Millard, S.P. (2013). EnvStats: An R Package for Environmental Statistics. Springer, New York. https://link.springer.com/book/10.1007/978-1-4614-8456-1.


EnvStats documentation built on Aug. 22, 2023, 5:09 p.m.