# dirmn: The Dirichlet Multinomial Distribution In MGLM: Multivariate Response Generalized Linear Models

 rdirmn R Documentation

## The Dirichlet Multinomial Distribution

### Description

`ddirmn` computes the log of the Dirichlet multinomial probability mass function. `rdirmn` generates random count vectors from the Dirichlet multinomial distribution.

### Usage

```r
rdirmn(n, size, alpha)

ddirmn(Y, alpha)
```

### Arguments

| Argument | Description |
|----------|-------------|
| `n` | Number of random vectors to generate. When `size` is a scalar and `alpha` is a vector, `n` must be specified. When `size` is a vector and `alpha` is a matrix, `n` is optional and defaults to the length of `size`; if given, it should equal the length of `size`. |
| `size` | A number or vector specifying the total number of objects that are put into d categories in the Dirichlet multinomial distribution. |
| `alpha` | The parameter of the Dirichlet multinomial distribution. Can be a positive numeric vector or matrix. For `ddirmn`, `alpha` has to match the size of `Y`; if `alpha` is a vector, it will be replicated n times to match the dimension of `Y`. For `rdirmn`, if `alpha` is a vector, `size` must be a scalar, and all the random vectors will be drawn from the same `alpha` and `size`. If `alpha` is a matrix, the number of rows should match the length of `size`, and each random vector will be drawn from the corresponding row of `alpha` and the corresponding element of `size`. See Details below. |
| `Y` | The multivariate count matrix with dimensions n×d, where n = 1, 2, … is the number of observations and d = 2, 3, … is the number of categories. |

### Details

When multivariate count data exhibit over-dispersion, the traditional multinomial model is insufficient. The Dirichlet multinomial distribution instead treats the category probabilities as random, drawn from a Dirichlet distribution. Given the parameter vector α = (α_1, …, α_d), α_j > 0, the probability mass of a d-category count vector y = (y_1, …, y_d), d ≥ 2, under the Dirichlet multinomial distribution is
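The two-stage construction described above can be sketched in base R. This is an illustrative sketch only, not the MGLM implementation: `rdirmn_sketch` is a hypothetical helper, and it handles only the case of a scalar `size` and a vector `alpha`. It draws category probabilities from a Dirichlet distribution (normalized independent gamma variates), then draws counts from a multinomial with those probabilities.

```r
# Hypothetical helper, not part of MGLM: scalar size, vector alpha only.
rdirmn_sketch <- function(n, size, alpha) {
  d <- length(alpha)
  # A Dirichlet draw is a vector of independent Gamma(alpha_j, 1) variates
  # divided by their sum. Column j of g uses shape alpha[j].
  g <- matrix(rgamma(n * d, shape = rep(alpha, each = n)), nrow = n, ncol = d)
  p <- g / rowSums(g)
  # One multinomial draw per row of p; transpose to get an n x d matrix.
  t(apply(p, 1, function(prob) rmultinom(1, size = size, prob = prob)))
}

set.seed(1)
Y <- rdirmn_sketch(n = 1000, size = 20, alpha = c(0.1, 0.2))
dim(Y)                   # n rows, d columns
all(rowSums(Y) == 20)    # every row sums to size

p1 <- 0.1 / 0.3          # mean probability of the first category
var(Y[, 1])              # empirical variance of the first count
20 * p1 * (1 - p1)       # variance of a plain multinomial with the same mean
```

Comparing the last two values gives a rough picture of the over-dispersion: the empirical variance should be noticeably larger than the multinomial variance.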

$$
P(y \mid \alpha) = C_{y_1, \ldots, y_d}^{m} \, \frac{\Gamma\left(\sum_{j=1}^d \alpha_j\right)}{\Gamma\left(\sum_{j=1}^d \alpha_j + \sum_{j=1}^d y_j\right)} \prod_{j=1}^d \frac{\Gamma(\alpha_j + y_j)}{\Gamma(\alpha_j)},
$$

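where m = sum_{j=1}^d y_j. Here, C_k^n, often read as "n choose k", refers to the number of k-combinations from a set of n elements; its multinomial counterpart is C_{y_1, …, y_d}^{m} = m! / (y_1! ⋯ y_d!), the number of ways to distribute m objects into d categories with counts y_1, …, y_d.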
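As a rough check of the formula, the log of the probability mass can be evaluated in base R with `lgamma`. This is a minimal sketch for a single count vector, not the MGLM implementation; `dirmn_logpmf` is a hypothetical name introduced here for illustration.

```r
# Hypothetical helper, not part of MGLM: log pmf for one count vector.
dirmn_logpmf <- function(y, alpha) {
  stopifnot(length(y) == length(alpha), all(alpha > 0), all(y >= 0))
  m <- sum(y)
  # log multinomial coefficient: log(m!) - sum_j log(y_j!)
  log_coef <- lgamma(m + 1) - sum(lgamma(y + 1))
  log_coef +
    sum(lgamma(alpha + y) - lgamma(alpha)) +
    lgamma(sum(alpha)) - lgamma(sum(alpha) + m)
}

# With d = 2 and alpha = (1, 1), every outcome with total m has
# probability 1 / (m + 1); here m = 20.
exp(dirmn_logpmf(c(3, 17), c(1, 1)))   # 1/21, about 0.0476
```

Working on the log scale with `lgamma` avoids overflow in the gamma functions when the counts or parameters are large.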

The parameter α can be a vector of length d, such as the estimate from a distribution fit. α can also be a matrix with n rows, one per observation, such as the inverse link exp(Xβ) evaluated at the regression parameter estimates.
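The matrix case can be illustrated with a small base R sketch. The design matrix `X` and coefficient matrix `B` below are made-up values for illustration, and the shapes (n×p and p×d) are an assumption about how exp(Xβ) yields one row of α per observation.

```r
set.seed(2)
n <- 5; p <- 2; d <- 3
X <- cbind(1, rnorm(n))              # hypothetical n x p design matrix
B <- matrix(c(0.5, -0.2,             # hypothetical p x d coefficients,
              0.1,  0.3,             # one column per line
              -0.4, 0.2), nrow = p, ncol = d)
alpha_mat <- exp(X %*% B)            # n x d matrix, strictly positive
dim(alpha_mat)
```

Each row of `alpha_mat` then plays the role of the parameter vector α for the corresponding observation.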

### Value

For each count vector and each corresponding parameter vector α, the function `ddirmn` returns the value log P(y|α). When `Y` is a matrix of n rows, `ddirmn` returns a vector of length n.

`rdirmn` returns an n×d matrix of the generated random observations.

### Examples

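```r
library(MGLM)
m <- 20                                          # total count per random vector
alpha <- c(0.1, 0.2)                             # parameter for d = 2 categories
dm.Y <- rdirmn(n = 10, size = m, alpha = alpha)  # 10 x 2 matrix of counts
pdfln <- ddirmn(dm.Y, alpha)                     # log probability of each row
```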

MGLM documentation built on April 14, 2022, 1:07 a.m.