View source: R/dist_categorical.R
| dist_categorical | R Documentation |
Categorical distributions are used to represent events with multiple
outcomes, such as the number that appears on the roll of a die. This is also
referred to as the 'generalised Bernoulli' or 'multinoulli' distribution.
The Categorical distribution is a special case of the Multinomial()
distribution with n = 1.
dist_categorical(prob, outcomes = NULL)
prob |
A list of probabilities of observing each outcome category. |
outcomes |
A list of vectors, where each value labels the corresponding outcome category. |
We recommend reading this documentation on pkgdown which renders math nicely. https://pkg.mitchelloharawild.com/distributional/reference/dist_categorical.html
In the following, let X be a Categorical random variable with
probability parameters prob = \{p_1, p_2, \ldots, p_k\}.
The Categorical probability distribution is widely used to model
events with multiple possible outcomes. A simple example is the roll of a die, where
prob = \{1/6, 1/6, 1/6, 1/6, 1/6, 1/6\} gives an equal chance of observing
each number on a six-sided die.
Support: \{1, \ldots, k\}
Mean: Not defined for unordered categories. For ordered categories with
integer outcomes \{1, 2, \ldots, k\}, the mean is:
E(X) = \sum_{i=1}^{k} i \cdot p_i
Variance: Not defined for unordered categories. For ordered categories
with integer outcomes \{1, 2, \ldots, k\}, the variance is:
\text{Var}(X) = \sum_{i=1}^{k} i^2 \cdot p_i - \left(\sum_{i=1}^{k} i \cdot p_i\right)^2
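As a worked check of the mean and variance formulas, the following sketch evaluates them for a fair six-sided die. It uses base R only and does not call the package, so it should be read as an illustration of the formulas rather than of how the package computes them; the names k, p, i, mean_x and var_x are chosen here for illustration.

```r
# Fair six-sided die: integer outcomes 1..6, each with probability 1/6.
k <- 6
p <- rep(1 / 6, k)
i <- seq_len(k)

# E(X) = sum_i i * p_i
mean_x <- sum(i * p)

# Var(X) = sum_i i^2 * p_i - (sum_i i * p_i)^2
var_x <- sum(i^2 * p) - mean_x^2

mean_x  # 3.5
var_x   # 35/12, approximately 2.9167
```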
Probability mass function (p.m.f):
P(X = i) = p_i
Cumulative distribution function (c.d.f):
The c.d.f is undefined for unordered categories. For ordered categories
with outcomes x_1 < x_2 < \ldots < x_k, the c.d.f is:
P(X \le x_j) = \sum_{i=1}^{j} p_i
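For ordered outcomes the c.d.f is the running total of the probabilities. The sketch below shows this with base R's cumsum(); the outcome labels and probabilities are illustrative values, and the code is not the package's implementation.

```r
# Illustrative probabilities for five ordered outcomes a < b < c < d < e.
outcomes <- letters[1:5]
p <- c(0.05, 0.5, 0.15, 0.2, 0.1)

# P(X <= x_j) = sum_{i <= j} p_i is the cumulative sum of p.
cdf_x <- cumsum(p)
names(cdf_x) <- outcomes

cdf_x[["c"]]  # P(X <= "c"), approximately 0.7
```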
Moment generating function (m.g.f):
E(e^{tX}) = \sum_{i=1}^{k} e^{tx_i} \cdot p_i
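One way to sanity-check the m.g.f is to confirm that it equals 1 at t = 0 and that its slope at t = 0 recovers the mean. The sketch below does this for a fair six-sided die in base R, using a central finite difference; the function name mgf and the step size h are assumptions made for this illustration, not part of the package.

```r
# Fair six-sided die with integer outcomes x_i = 1..6.
x <- 1:6
p <- rep(1 / 6, 6)

# M(t) = E(e^{tX}) = sum_i exp(t * x_i) * p_i
mgf <- function(t) sum(exp(t * x) * p)

# M(0) equals 1 (up to floating-point rounding) because the p_i sum to 1.
mgf_at_zero <- mgf(0)

# The derivative of M at t = 0 is E(X); approximate it numerically.
h <- 1e-5
mgf_slope <- (mgf(h) - mgf(-h)) / (2 * h)

mgf_slope  # approximately 3.5, the mean of a fair die
```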
Skewness: Approximated numerically for ordered categories.
Kurtosis: Approximated numerically for ordered categories.
stats::Multinomial
dist <- dist_categorical(prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)))
dist
generate(dist, 10)
density(dist, 2)
density(dist, 2, log = TRUE)
# The outcomes aren't ordered, so many statistics are not applicable.
cdf(dist, 0.6)
quantile(dist, 0.7)
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)
# Some of these statistics are meaningful for ordered outcomes
dist <- dist_categorical(list(rpois(26, 3)), list(ordered(letters)))
dist
cdf(dist, "m")
quantile(dist, 0.5)
dist <- dist_categorical(
prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)),
outcomes = list(letters[1:5], letters[24:26])
)
generate(dist, 10)
density(dist, "a")
density(dist, "z", log = TRUE)