gdirmn: The Generalized Dirichlet Multinomial Distribution

rgdirmnR Documentation

The Generalized Dirichlet Multinomial Distribution

Description

rgdirmn generates random observations from the generalized Dirichlet multinomial distribution. dgdirmn computes the log of the generalized Dirichlet multinomial probability mass function.

Usage

rgdirmn(n, size, alpha, beta)

dgdirmn(Y, alpha, beta)

Arguments

n

the number of random vectors to generate. When size is a scalar and alpha is a vector, must specify n. When size is a vector and alpha is a matrix, n is optional. The default value of n is the length of size. If given, n should be equal to the length of size.

size

a number or vector specifying the total number of objects that are put into d categories in the generalized Dirichlet multinomial distribution.

alpha

the parameter of the generalized Dirichlet multinomial distribution. alpha is a numerical positive vector or matrix.

For gdirmn, alpha should match the size of Y. If alpha is a vector, it will be replicated n times to match the dimension of Y.

For rdirmn, if alpha is a vector, size must be a scalar. All the random vectors will be drawn from the same alpha and size. If alpha is a matrix, the number of rows should match the length of size. Each random vector will be drawn from the corresponding row of alpha and the corresponding element of size.

beta

the parameter of the generalized Dirichlet multinomial distribution. beta should have the same dimension as alpha.

For rdirm, if beta is a vector, size must be a scalar. All the random samples will be drawn from the same beta and size. If beta is a matrix, the number of rows should match the length of size. Each random vector will be drawn from the corresponding row of beta and the corresponding element of size.

Y

the multivariate count matrix with dimensions nxd, where n = 1,2, … is the number of observations and d=3,4,… is the number of categories.

Details

Y=(y_1, …, y_d) are the d category count vectors. Given the parameter vector α = (α_1, …, α_{d-1}), α_j>0, and β=(β_1, …, β_{d-1}), β_j>0, the generalized Dirichlet multinomial probability mass function is

P(y|α,β) =C_{y_1, …, y_d}^{m} prod_{j=1}^{d-1} {Gamma(α_j+y_j)Gamma(β_j+z_{j+1})Gamma(α_j+β_j)} / {Gamma(α_j)Gamma(β_j)Gamma(α_j+β_j+z_j)},

where z_j = sum_{k=j}^d y_k and m = sum_{j=1}^d y_j. Here, C_k^n, often read as "n choose k", refers the number of k combinations from a set of n elements.

The α and β parameters can be vectors, like the results from the distribution fitting function, or they can be matrices with n rows, like the estimate from the regression function multiplied by the covariate matrix exp(Xα) and exp(Xβ)

Value

dgdirmn returns the value of logP(y|α, β). When Y is a matrix of n rows, the function dgdirmn returns a vector of length n.

rgdirmn returns a nxd matrix of the generated random observations.

Examples

# example 1
m <- 20
alpha <- c(0.2, 0.5)
beta <- c(0.7, 0.4)
Y <- rgdirmn(10, m, alpha, beta)
dgdirmn(Y, alpha, beta)

# example 2 
set.seed(100)
alpha <- matrix(abs(rnorm(40)), 10, 4)
beta <- matrix(abs(rnorm(40)), 10, 4)
size <- rbinom(10, 10, 0.5)
GDM.rdm <- rgdirmn(size=size, alpha=alpha, beta=beta)
GDM.rdm1 <- rgdirmn(n=20, size=10, alpha=abs(rnorm(4)), beta=abs(rnorm(4)))

MGLM documentation built on April 14, 2022, 1:07 a.m.