Cat: Cat Distribution
In joker: Probability Distributions and Parameter Estimation

Cat	R Documentation

Description

The Categorical distribution is a discrete probability distribution that describes the probability of a single trial resulting in one of k possible categories. It is a generalization of the Bernoulli distribution and a special case of the multinomial distribution with n = 1.
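The relation to the multinomial and Bernoulli distributions can be illustrated with a minimal base R sketch. It deliberately uses only `rmultinom()` and `sample()` from base R rather than any joker function, and the probability values are arbitrary illustration values.

```r
# Base R only; no joker functions are used here.
set.seed(1)
p <- c(0.1, 0.2, 0.7)   # illustrative category probabilities (k = 3)

# One multinomial trial (size = 1) yields an indicator vector of length k
onehot <- rmultinom(1, size = 1, prob = p)[, 1]

# The position of the single 1 is the categorical outcome in 1..k
category <- which(onehot == 1)

# With k = 2 the categorical reduces to a Bernoulli on the labels {1, 2}
bern <- sample(1:2, size = 10, replace = TRUE, prob = c(0.4, 0.6))
```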

Usage

Cat(prob = c(0.5, 0.5))

dcat(x, prob, log = FALSE)

rcat(n, prob)

## S4 method for signature 'Cat,numeric'
d(distr, x, log = FALSE)

## S4 method for signature 'Cat,numeric'
r(distr, n)

## S4 method for signature 'Cat'
mean(x)

## S4 method for signature 'Cat'
mode(x)

## S4 method for signature 'Cat'
var(x)

## S4 method for signature 'Cat'
entro(x)

## S4 method for signature 'Cat'
finf(x)

llcat(x, prob)

## S4 method for signature 'Cat,numeric'
ll(distr, x)

ecat(x, type = "mle", ...)

## S4 method for signature 'Cat,numeric'
mle(distr, x, dim = NULL, na.rm = FALSE)

## S4 method for signature 'Cat,numeric'
me(distr, x, dim = NULL, na.rm = FALSE)

vcat(prob, type = "mle")

## S4 method for signature 'Cat'
avar_mle(distr)

## S4 method for signature 'Cat'
avar_me(distr)

Arguments

`prob`	numeric. Vector of probabilities, one for each category.
`x`	For the density function, `x` is a numeric vector of quantiles. For the moments functions, `x` is an object of class `Cat`. For the log-likelihood and the estimation functions, `x` is the sample of observations.
`log`	logical. Should the logarithm of the probability be returned?
`n`	number of observations. If `length(n) > 1`, the length is taken to be the number required.
`distr`	an object of class `Cat`.
`type`	character, case ignored. The estimator type (mle or me).
`...`	extra arguments.
`dim`	numeric. The probability vector dimension. See Details.
`na.rm`	logical. Should the `NA` values be removed?

Details

The probability mass function (PMF) of the categorical distribution is given by:

f(x; p) = \prod_{i=1}^k p_i^{x_i},

subject to \sum_{i=1}^{k} x_i = 1, where x_i = 1 if category i is observed and x_i = 0 otherwise.
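For a single trial, the observation can be written as an indicator vector with exactly one entry equal to 1, so the product collapses to the probability of the observed category. The sketch below makes that explicit in base R. `dcat_sketch` is a hypothetical helper written for illustration, not the package's `dcat()`, and it assumes observations are coded as integers 1..k.

```r
# Hypothetical helper, not part of joker: PMF of a categorical variable
# evaluated at category indices x in 1..k.
dcat_sketch <- function(x, prob, log = FALSE) {
  k <- length(prob)
  out <- vapply(x, function(xi) {
    # values outside 1..k (or non-integers) have probability 0
    if (xi < 1 || xi > k || xi != round(xi)) return(0)
    onehot <- as.numeric(seq_len(k) == xi)  # indicator vector (x_1, ..., x_k)
    prod(prob^onehot)                       # prod_i p_i^{x_i}
  }, numeric(1))
  if (log) log(out) else out
}

p <- c(0.1, 0.2, 0.7)
dcat_sketch(c(1, 2, 3), p)  # one probability per queried category
dcat_sketch(4, p)           # outside the support
```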

By default, estimating prob from a sample returns a probability for each category that appeared in the sample. The sample alone cannot reveal the parameter dimension, however, so the dimension has to be provided separately through the dim argument. If dim is not supplied, it is retrieved from the distr argument. Categories that did not appear in the sample are assigned probability 0, appended to the end of the prob vector.
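The role of the dimension can be seen in a small base R sketch of the relative-frequency estimate, which is the maximum likelihood estimate for a categorical sample. `ecat_sketch` is a hypothetical helper, not the package's `ecat()`, and it assumes observations are coded as integers 1..k.

```r
# Hypothetical helper, not part of joker: relative-frequency estimate of prob
# for observations coded as integers 1..dim.
ecat_sketch <- function(x, dim = max(x)) {
  counts <- tabulate(x, nbins = dim)  # unseen categories get a count of 0
  counts / length(x)
}

x <- c(1, 2, 2, 1, 2)     # category 3 never appears in this sample
ecat_sketch(x)            # dimension inferred from the sample: length 2
ecat_sketch(x, dim = 3)   # dimension supplied: length 3, last entry 0
```

Without the extra argument, the estimate has no way to know that a third category exists, which is why the dimension is passed in rather than inferred.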

Note that the probability vector has only k-1 free parameters, since its entries sum to 1. Therefore, the Fisher information matrix and the asymptotic variance-covariance matrix of the estimators are of dimension (k-1)x(k-1).
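As a sketch of where the (k-1)x(k-1) dimension comes from, the code below builds the textbook Fisher information of a single categorical trial with the last probability written as 1 minus the sum of the others. `finf_sketch` is a hypothetical helper, and the choice of dropping the last category is an assumption; the parameterization used by `finf()` and `avar_mle()` in the package may differ.

```r
# Hypothetical helper, not part of joker: Fisher information for the free
# parameters p_1, ..., p_{k-1}, with p_k = 1 - sum(p_1, ..., p_{k-1}).
finf_sketch <- function(prob) {
  k <- length(prob)
  free <- prob[-k]
  # entry (i, j) is 1/p_i on the diagonal plus 1/p_k everywhere
  diag(1 / free, nrow = k - 1) + 1 / prob[k]
}

p <- c(0.1, 0.2, 0.7)
info <- finf_sketch(p)   # 2 x 2 matrix for k = 3
avar <- solve(info)      # inverse information

# The inverse has the closed form diag(p) - p p' on the free parameters
closed_form <- diag(p[-3]) - tcrossprod(p[-3])
```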

Value

Each type of function returns a different type of object:

Distribution Functions: When supplied with one argument (distr), the d(), p(), q(), r(), ll() functions return the density, cumulative probability, quantile, random sample generator, and log-likelihood functions, respectively. When supplied with both arguments (distr and x), they evaluate the aforementioned functions directly.
Moments: Returns a numeric, either vector or matrix depending on the moment and the distribution. The moments() function returns a list with all the available methods.
Estimation: Returns a list, the estimators of the unknown parameters. Note that in distribution families like the binomial, multinomial, and negative binomial, the size is not returned, since it is considered known.
Variance: Returns a named matrix. The asymptotic covariance matrix of the estimator.
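The two calling conventions described for the distribution functions can be mimicked with a plain closure. This is only an illustration of the idea: `d_sketch` is a hypothetical function taking a probability vector, whereas the package works with `Cat` objects and the S4 methods listed under Usage.

```r
# Hypothetical stand-in for the two calling conventions; it is not how
# joker implements d().
d_sketch <- function(prob, x) {
  f <- function(x) prob[x]      # PMF at category indices 1..k
  if (missing(x)) f else f(x)   # no x: return the function itself
}

p <- c(0.1, 0.2, 0.7)
d_sketch(p, 2)      # both arguments: evaluates directly
df <- d_sketch(p)   # one argument: returns a function
df(c(1, 3))         # the returned function can be reused on new data
```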

Examples

# -----------------------------------------------------
# Categorical Distribution Example
# -----------------------------------------------------

# Create the distribution
p <- c(0.1, 0.2, 0.7)
D <- Cat(p)

# ------------------
# dpqr Functions
# ------------------

d(D, 2) # density function
x <- r(D, 100) # random generator function

# alternative way to use the function
df <- d(D) ; df(x) # df is a function itself

# ------------------
# Moments
# ------------------

mean(D) # Expectation
mode(D) # Mode
var(D) # Variance
entro(D) # Entropy
finf(D) # Fisher Information Matrix

# List of all available moments
mom <- moments(D)
mom$mean # expectation

# ------------------
# Point Estimation
# ------------------

ll(D, x)
llcat(x, p)

ecat(x, dim = 3, type = "mle")
ecat(x, dim = 3, type = "me")

mle(D, x)
me(D, x)
e(D, x, type = "mle")

mle("cat", dim = 3, x) # the distr argument can be a character

# ------------------
# Estimator Variance
# ------------------

vcat(p, type = "mle")
vcat(p, type = "me")

avar_mle(D)
avar_me(D)

v(D, type = "mle")

joker documentation built on June 8, 2025, 12:12 p.m.