gdirmn: The Generalized Dirichlet Multinomial Distribution
In MGLM: Multivariate Response Generalized Linear Models

rgdirmn

R Documentation


Description

rgdirmn generates random observations from the generalized Dirichlet multinomial distribution. dgdirmn computes the log of the generalized Dirichlet multinomial probability mass function.

Usage

rgdirmn(n, size, alpha, beta)

dgdirmn(Y, alpha, beta)

Arguments

`n`	the number of random vectors to generate. When `size` is a scalar and `alpha` is a vector, `n` must be specified. When `size` is a vector and `alpha` is a matrix, `n` is optional and defaults to the length of `size`; if given, it should equal the length of `size`.
`size`	a number or vector specifying the total number of objects that are put into d categories in the generalized Dirichlet multinomial distribution.
`alpha`	the parameter of the generalized Dirichlet multinomial distribution. `alpha` is a positive numeric vector or matrix. For `dgdirmn`, `alpha` should be compatible with `Y`: it has d-1 entries per observation (see Details), and if it is a vector, it will be replicated n times to match the rows of `Y`. For `rgdirmn`, if `alpha` is a vector, `size` must be a scalar, and all random vectors will be drawn with the same `alpha` and `size`. If `alpha` is a matrix, its number of rows should match the length of `size`, and each random vector will be drawn using the corresponding row of `alpha` and the corresponding element of `size`.
`beta`	the parameter of the generalized Dirichlet multinomial distribution. `beta` should have the same dimension as `alpha`. For `rgdirmn`, if `beta` is a vector, `size` must be a scalar, and all random vectors will be drawn with the same `beta` and `size`. If `beta` is a matrix, its number of rows should match the length of `size`, and each random vector will be drawn using the corresponding row of `beta` and the corresponding element of `size`.
`Y`	the multivariate count matrix with dimensions n × d, where n = 1, 2, … is the number of observations and d = 3, 4, … is the number of categories.
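
The replication of a vector-valued `alpha` described above can be pictured with a small base R sketch. It only illustrates the resulting shape; how MGLM performs the replication internally is not shown on this page, and `alpha_mat` is a name introduced here for illustration.

```r
# Illustration only: expand a parameter vector so it has one row per observation.
n <- 3                      # number of observations (rows of Y)
alpha <- c(0.2, 0.5)        # d - 1 = 2 parameters, so d = 3 categories

# byrow = TRUE repeats the whole vector in every row
alpha_mat <- matrix(alpha, nrow = n, ncol = length(alpha), byrow = TRUE)
dim(alpha_mat)
```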

Details

Y=(y_1, …, y_d) are the d category count vectors. Given the parameter vector α = (α_1, …, α_{d-1}), α_j>0, and β=(β_1, …, β_{d-1}), β_j>0, the generalized Dirichlet multinomial probability mass function is

P(y|α,β) =C_{y_1, …, y_d}^{m} prod_{j=1}^{d-1} {Gamma(α_j+y_j)Gamma(β_j+z_{j+1})Gamma(α_j+β_j)} / {Gamma(α_j)Gamma(β_j)Gamma(α_j+β_j+z_j)},

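where z_j = sum_{k=j}^d y_k and m = sum_{j=1}^d y_j. Here C_{y_1, …, y_d}^{m} = m! / (y_1! ⋯ y_d!) is the multinomial coefficient, the number of ways to allocate m objects to d categories with counts y_1, …, y_d. It generalizes the binomial coefficient C_k^n, often read as "n choose k", which is the number of k-combinations from a set of n elements.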
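
As a check on the notation, the formula above can be transcribed directly into base R with `lgamma`. This is a minimal sketch for a single count vector, not the MGLM implementation; `gdm_logpmf_sketch` is a hypothetical helper name.

```r
# Hypothetical helper (not part of MGLM): log P(y | alpha, beta) for one count vector.
gdm_logpmf_sketch <- function(y, alpha, beta) {
  d <- length(y)
  stopifnot(length(alpha) == d - 1, length(beta) == d - 1)
  m <- sum(y)
  # z[j] = y[j] + ... + y[d], so z[1] equals m
  z <- rev(cumsum(rev(y)))
  j <- seq_len(d - 1)
  # log of the multinomial coefficient m! / (y_1! ... y_d!)
  log_coef <- lgamma(m + 1) - sum(lgamma(y + 1))
  # sum over j = 1, ..., d - 1 of the log Gamma ratios in the formula
  log_coef + sum(
    lgamma(alpha + y[j]) + lgamma(beta + z[j + 1]) + lgamma(alpha + beta) -
      lgamma(alpha) - lgamma(beta) - lgamma(alpha + beta + z[j])
  )
}

gdm_logpmf_sketch(c(3, 1, 0), alpha = c(0.2, 0.5), beta = c(0.7, 0.4))
```

Working on the log scale with `lgamma` avoids overflow in the Gamma functions when the counts are large.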

The α and β parameters can be vectors, such as the estimates returned by the distribution fitting function, or matrices with n rows, such as exp(Xα) and exp(Xβ) computed from the regression estimates and the covariate matrix X.
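
The matrix form can be sketched in base R as follows. The covariate matrix `X` and the coefficient matrices `coef_alpha` and `coef_beta` are made-up values for illustration, not output from an MGLM fit; the point is only that the exponentiated linear predictor gives one positive parameter row per observation.

```r
set.seed(1)
n <- 5   # observations
p <- 2   # covariates, including the intercept
d <- 4   # categories, so each parameter row has d - 1 = 3 entries

X <- cbind(1, rnorm(n))                              # n x p covariate matrix
coef_alpha <- matrix(rnorm(p * (d - 1)), p, d - 1)   # assumed p x (d - 1) coefficients
coef_beta  <- matrix(rnorm(p * (d - 1)), p, d - 1)

# Row i holds the parameters for observation i; exp() keeps them positive
alpha_mat <- exp(X %*% coef_alpha)
beta_mat  <- exp(X %*% coef_beta)
dim(alpha_mat)
```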

Value

dgdirmn returns the value of logP(y|α, β). When Y is a matrix of n rows, the function dgdirmn returns a vector of length n.

rgdirmn returns an n × d matrix of the generated random observations.
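
To make the shape of the returned matrix concrete, the sketch below draws counts by a sequence of beta-binomial steps, which is one standard way to sample from the probability mass function given in Details. It is an assumption-laden illustration for vector parameters and scalar `size`, not necessarily how rgdirmn is implemented; `rgdm_sketch` is a hypothetical name.

```r
# Hypothetical helper (not part of MGLM): draw n count vectors, one per row.
rgdm_sketch <- function(n, size, alpha, beta) {
  d <- length(alpha) + 1
  Y <- matrix(0, n, d)
  remaining <- rep(size, n)            # objects not yet allocated, per row
  for (j in seq_len(d - 1)) {
    p <- rbeta(n, alpha[j], beta[j])   # share of the remaining objects for category j
    Y[, j] <- rbinom(n, remaining, p)
    remaining <- remaining - Y[, j]
  }
  Y[, d] <- remaining                  # the last category takes whatever is left
  Y
}

set.seed(1)
Y_sketch <- rgdm_sketch(10, 20, alpha = c(0.2, 0.5), beta = c(0.7, 0.4))
dim(Y_sketch)       # n rows, d columns
rowSums(Y_sketch)   # every row sums to size
```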

Examples

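# example 1: one parameter vector shared by all draws
library(MGLM)
m <- 20
alpha <- c(0.2, 0.5)
beta <- c(0.7, 0.4)
Y <- rgdirmn(10, m, alpha, beta)  # 10 x 3 count matrix; each row sums to m
dgdirmn(Y, alpha, beta)           # log-probabilities, one per row of Y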

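# example 2: one row of alpha and beta, and one element of size, per draw
set.seed(100)
alpha <- matrix(abs(rnorm(40)), 10, 4)
beta <- matrix(abs(rnorm(40)), 10, 4)
size <- rbinom(10, 10, 0.5)
# n is omitted here, so it defaults to length(size)
GDM.rdm <- rgdirmn(size=size, alpha=alpha, beta=beta)
# vector parameters with a scalar size: n must be given
GDM.rdm1 <- rgdirmn(n=20, size=10, alpha=abs(rnorm(4)), beta=abs(rnorm(4)))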

MGLM documentation built on April 14, 2022, 1:07 a.m.