# Categorical: Categorical Distribution Class In distr6: The Complete R6 Probability Distributions Interface

## Categorical Distribution Class

### Description

Mathematical and statistical functions for the Categorical distribution, which is commonly used in supervised classification.

### Details

The Categorical distribution, parameterised by a support set x_1,...,x_k and respective probabilities p_1,...,p_k, is defined by the pmf

f(x_i) = p_i

for i = 1,…,k, where p_i ≥ 0 and ∑ p_i = 1.

Sampling from this distribution is performed with the `sample` function, with the elements given as the support set and the probabilities taken from the `probs` parameter. The cdf and quantile assume that the elements are supplied in an indexed order; otherwise the results are meaningless.

The number of points in the distribution cannot be changed after construction.
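The behaviour described above can be sketched in base R. This is a minimal illustration, not distr6's implementation: `elements`, `probs`, `pmf`, `cdf_cat` and `quantile_cat` are hypothetical names, and the quantile rule used here (the first element whose cumulative probability reaches `p`) is an assumption.

```r
# Hypothetical support set and probabilities, chosen only for illustration
elements <- c("a", "b", "c")
probs <- c(0.2, 0.3, 0.5)

# pmf: f(x_i) = p_i; values outside the support get probability 0
pmf <- function(x) {
  idx <- match(x, elements)
  ifelse(is.na(idx), 0, probs[idx])
}

# cdf: cumulative probability in the order the elements were supplied
cdf_cat <- function(x) cumsum(probs)[match(x, elements)]

# quantile (assumed rule): first element whose cumulative probability reaches p
quantile_cat <- function(p) elements[which(cumsum(probs) >= p)[1]]

# Sampling with base R's sample(), weighting each element by its probability
set.seed(1)
draws <- sample(elements, size = 10, replace = TRUE, prob = probs)
```

Reordering `elements` changes `cdf_cat` and `quantile_cat` but not `pmf`, which is why the cdf and quantile are only meaningful for an ordered support.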

### Value

Returns an R6 object inheriting from class SDistribution.

### Distribution support

The distribution is supported on x_1,...,x_k.

### Default Parameterisation

Cat(elements = 1, probs = 1)


### Super classes

`distr6::Distribution` -> `distr6::SDistribution` -> `Categorical`

### Public fields

`name`

Full name of distribution.

`short_name`

Short name of distribution for printing.

`description`

Brief description of the distribution.

### Active bindings

`properties`

Returns distribution properties, including skewness type and symmetry.

### Methods

#### Public methods

#### Method `new()`

Creates a new instance of this R6 class.

##### Usage
`Categorical$new(elements = NULL, probs = NULL, decorators = NULL)`
##### Arguments
`elements`

`list()`
Categories in the distribution, see examples.

`probs`

`numeric()`
Probabilities of respective categories occurring.

`decorators`

`(character())`
Decorators to add to the distribution during construction.

##### Examples
```
library(distr6)

# Note probabilities are automatically normalised (if not vectorised)
x <- Categorical$new(elements = list("Bapple", "Banana", 2), probs = c(0.2, 0.4, 1))

# Length of elements and probabilities cannot be changed after construction
x$setParameterValue(probs = c(0.1, 0.2, 0.7))

# d/p/q/r
x$pdf(c("Bapple", "Carrot", 1, 2))
x$cdf("Banana") # Assumes ordered in construction
x$quantile(0.42) # Assumes ordered in construction
x$rand(10)

# Statistics
x$mode()

summary(x)
```
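The first comment in the example says that probabilities are normalised automatically. Assuming this means dividing each weight by the sum of all weights (an assumption; the text does not spell out the rule), the weights `c(0.2, 0.4, 1)` would be rescaled as in this base R sketch:

```r
# Unnormalised weights taken from the example above
probs <- c(0.2, 0.4, 1)

# Assumed normalisation rule: divide by the total so the result sums to 1
normalised <- probs / sum(probs)
normalised  # 0.125 0.250 0.625
```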

#### Method `mean()`

The arithmetic mean of a (discrete) probability distribution X is the expectation

E_X(X) = ∑ p_X(x)*x

with an integration analogue for continuous distributions.

##### Usage
`Categorical$mean(...)`
##### Arguments
`...`

Unused.
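As a worked illustration of the expectation formula, the following base R sketch assumes a numeric support. The vectors `x` and `p` are hypothetical, and the code shows the formula only, not what `Categorical$mean()` returns.

```r
# Hypothetical numeric support and probabilities
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

# E_X(X) = sum of p_X(x) * x
mean_x <- sum(p * x)
mean_x  # 0.2 + 0.6 + 1.5 = 2.3
```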

#### Method `mode()`

The mode of a probability distribution is the point at which the pdf is a local maximum. A distribution can be unimodal (one maximum) or multimodal (several maxima).

##### Usage
`Categorical$mode(which = "all")`
##### Arguments
`which`

`(character(1) | numeric(1))`
Ignored if the distribution is unimodal. Otherwise, `"all"` returns all modes and a number specifies which mode to return.
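For a categorical distribution the modes are the categories with the largest probability. The sketch below shows one way to mimic the `which` argument in base R. `pick_mode` and the data are hypothetical, and the selection rule is an assumption rather than the package's code.

```r
# Hypothetical bimodal example: "b" and "c" share the highest probability
elements <- c("a", "b", "c", "d")
probs <- c(0.1, 0.4, 0.4, 0.1)

# All categories whose probability equals the maximum
modes <- elements[probs == max(probs)]

# "all" returns every mode; a number selects one of them by position
pick_mode <- function(which = "all") {
  if (identical(which, "all")) modes else modes[which]
}

pick_mode()   # "b" "c"
pick_mode(2)  # "c"
```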

#### Method `variance()`

The variance of a distribution is defined by the formula

var_X = E_X[X^2] - E_X[X]^2

where E_X is the expectation of distribution X. If the distribution is multivariate, the covariance matrix is returned.

##### Usage
`Categorical$variance(...)`
##### Arguments
`...`

Unused.
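The variance formula can be checked by hand on a small numeric support. This base R sketch uses hypothetical vectors `x` and `p` and illustrates the formula only.

```r
# Hypothetical numeric support and probabilities
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

# var_X = E[X^2] - E[X]^2
second_moment <- sum(p * x^2)  # 0.2 + 1.2 + 4.5 = 5.9
mean_x <- sum(p * x)           # 2.3
var_x <- second_moment - mean_x^2
var_x  # 5.9 - 5.29 = 0.61
```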

#### Method `skewness()`

The skewness of a distribution is defined by the third standardised moment,

sk_X = E_X[((x - μ)/σ)^3]

where E_X is the expectation of distribution X, μ is the mean of the distribution and σ is the standard deviation of the distribution.

##### Usage
`Categorical$skewness(...)`
##### Arguments
`...`

Unused.
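A short base R sketch of the third standardised moment, assuming a numeric support. The vectors `x` and `p` are hypothetical. Most of the mass sits at the largest value, so the left tail is longer and the skewness comes out negative.

```r
# Hypothetical numeric support and probabilities
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

mu <- sum(p * x)                    # mean
sigma <- sqrt(sum(p * (x - mu)^2))  # standard deviation

# sk_X = E[((x - mu) / sigma)^3]
skew <- sum(p * ((x - mu) / sigma)^3)
```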

#### Method `kurtosis()`

The kurtosis of a distribution is defined by the fourth standardised moment,

k_X = E_X[((x - μ)/σ)^4]

where E_X is the expectation of distribution X, μ is the mean of the distribution and σ is the standard deviation of the distribution. Excess Kurtosis is Kurtosis - 3.

##### Usage
`Categorical$kurtosis(excess = TRUE, ...)`
##### Arguments
`excess`

`(logical(1))`
If `TRUE` (default), excess kurtosis is returned.

`...`

Unused.
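To make the difference between kurtosis and excess kurtosis concrete, this base R sketch evaluates both for a hypothetical two-point distribution with equal probabilities. It illustrates the formula and does not reproduce the method's implementation.

```r
# Hypothetical symmetric two-point distribution
x <- c(0, 1)
p <- c(0.5, 0.5)

mu <- sum(p * x)                    # 0.5
sigma <- sqrt(sum(p * (x - mu)^2))  # 0.5

# k_X = E[((x - mu) / sigma)^4]; every standardised value is -1 or 1
kurt <- sum(p * ((x - mu) / sigma)^4)  # 1
excess <- kurt - 3                     # -2
```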

#### Method `entropy()`

The entropy of a (discrete) distribution is defined by

- ∑ (f_X)log(f_X)

where f_X is the pdf of distribution X, with an integration analogue for continuous distributions.

##### Usage
`Categorical$entropy(base = 2, ...)`
##### Arguments
`base`

`(integer(1))`
Base of the entropy logarithm, default = 2 (Shannon entropy).

`...`

Unused.
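The entropy formula depends only on the probabilities, not on the category labels. A minimal base R sketch with a hypothetical probability vector `p` (all entries strictly positive, so `log` is well defined):

```r
# Hypothetical probabilities for three categories
p <- c(0.5, 0.25, 0.25)

# Entropy in bits: -sum(f * log2(f))
h <- -sum(p * log(p, base = 2))
h  # 0.5 * 1 + 0.25 * 2 + 0.25 * 2 = 1.5
```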

#### Method `mgf()`

The moment generating function is defined by

mgf_X(t) = E_X[exp(xt)]

where X is the distribution and E_X is the expectation of the distribution X.

##### Usage
`Categorical$mgf(t, ...)`
##### Arguments
`t`

`(integer(1))`
Integer at which to evaluate the function.

`...`

Unused.
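For a finite numeric support the expectation in the mgf is a plain weighted sum. The base R sketch below uses hypothetical `x` and `p`, and checks two standard properties: the mgf equals 1 at `t = 0`, and its derivative at 0 (approximated here by a central difference) recovers the mean.

```r
# Hypothetical numeric support and probabilities
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

# mgf_X(t) = E[exp(x * t)]
mgf <- function(t) sum(p * exp(x * t))

mgf(0)  # 1, because the probabilities sum to 1

# Central-difference approximation of the derivative at t = 0
h <- 1e-6
approx_mean <- (mgf(h) - mgf(-h)) / (2 * h)  # close to the mean, 2.3
```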

#### Method `cf()`

The characteristic function is defined by

cf_X(t) = E_X[exp(xti)]

where X is the distribution and E_X is the expectation of the distribution X.

##### Usage
`Categorical$cf(t, ...)`
##### Arguments
`t`

`(integer(1))`
Integer at which to evaluate the function.

`...`

Unused.
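The characteristic function has the same weighted-sum form as the mgf, with a complex exponent. This is a base R sketch with hypothetical `x` and `p`, using R's built-in complex arithmetic.

```r
# Hypothetical numeric support and probabilities
x <- c(1, 2, 3)
p <- c(0.2, 0.3, 0.5)

# cf_X(t) = E[exp(i * x * t)]
cf <- function(t) sum(p * exp(1i * x * t))

cf(0)         # 1+0i
Mod(cf(1.7))  # the modulus never exceeds 1
```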

#### Method `pgf()`

The probability generating function is defined by

pgf_X(z) = E_X[z^x]

where X is the distribution and E_X is the expectation of the distribution X.

##### Usage
`Categorical$pgf(z, ...)`
##### Arguments
`z`

`(integer(1))`
Integer at which to evaluate the probability generating function.

`...`

Unused.
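Using the standard definition of the probability generating function, E_X[z^x], a base R sketch for a hypothetical support of non-negative integers looks like this. The names `x`, `p` and `pgf` are illustrative only.

```r
# Hypothetical support of non-negative integers and probabilities
x <- c(0, 1, 2)
p <- c(0.2, 0.3, 0.5)

# pgf_X(z) = E[z^x]
pgf <- function(z) sum(p * z^x)

pgf(1)  # 1, because the probabilities sum to 1
pgf(0)  # 0.2, the probability of x = 0 (R evaluates 0^0 as 1)
```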

#### Method `clone()`

The objects of this class are cloneable with this method.

##### Usage
`Categorical$clone(deep = FALSE)`
##### Arguments
`deep`

Whether to make a deep clone.

### References

McLaughlin, M. P. (2001). A compendium of common probability distributions (pp. 2014-01). Michael P. McLaughlin.

### See Also

Other discrete distributions: `Bernoulli`, `Binomial`, `Degenerate`, `DiscreteUniform`, `EmpiricalMV`, `Empirical`, `Geometric`, `Hypergeometric`, `Logarithmic`, `Matdist`, `Multinomial`, `NegativeBinomial`, `WeightedDiscrete`

Other univariate distributions: `Arcsine`, `Bernoulli`, `BetaNoncentral`, `Beta`, `Binomial`, `Cauchy`, `ChiSquaredNoncentral`, `ChiSquared`, `Degenerate`, `DiscreteUniform`, `Empirical`, `Erlang`, `Exponential`, `FDistributionNoncentral`, `FDistribution`, `Frechet`, `Gamma`, `Geometric`, `Gompertz`, `Gumbel`, `Hypergeometric`, `InverseGamma`, `Laplace`, `Logarithmic`, `Logistic`, `Loglogistic`, `Lognormal`, `Matdist`, `NegativeBinomial`, `Normal`, `Pareto`, `Poisson`, `Rayleigh`, `ShiftedLoglogistic`, `StudentTNoncentral`, `StudentT`, `Triangular`, `Uniform`, `Wald`, `Weibull`, `WeightedDiscrete`

distr6 documentation built on March 28, 2022, 1:05 a.m.