# C.alpha.multinomial: C(alpha) - Optimal Test for Assessing Multinomial Goodness of... In HMP: Hypothesis Testing and Power Calculations for Comparing Metagenomic Samples from HMP

## Description

Computes the C(α)-optimal test statistic of Kim and Margolin (1992) for evaluating the goodness of fit of a multinomial distribution (null hypothesis) against a Dirichlet-Multinomial distribution (alternative hypothesis).

## Usage

    C.alpha.multinomial(data)

## Arguments

data: A matrix of taxonomic counts (columns) for each sample (rows).

## Details

To test whether a set of ranked-abundance distributions (RADs) from microbiome samples is better modeled by a multinomial or a Dirichlet-Multinomial (DM) distribution, we test the hypothesis $\mathrm{H}_0: \theta = 0$ versus $\mathrm{H}_A: \theta \ne 0$, where θ is the overdispersion parameter, the null hypothesis implies a multinomial distribution, and the alternative hypothesis implies a DM distribution. Kim and Margolin (1992) proposed a C(α)-optimal test statistic, given by

$$T = \sum_{j=1}^{K} \sum_{i=1}^{P} \frac{1}{\sum_{i=1}^{P} x_{ij}} \left( x_{ij} - \frac{N_{i} \sum_{i=1}^{P} x_{ij}}{N_{\mathrm{g}}} \right)^2$$

where $K$ is the number of taxa, $P$ is the number of samples, $x_{ij}$ is the count of taxon $j$ ($j = 1, \ldots, K$) in sample $i$ ($i = 1, \ldots, P$), $N_{i}$ is the number of reads in sample $i$, and $N_{\mathrm{g}}$ is the total number of reads across all samples.
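The formula can be evaluated directly in base R. The sketch below is an illustration only, not the package's implementation: the helper name `t_statistic` and the toy count matrix are made up for this example.

```r
# Hypothetical helper (not part of HMP): evaluates the T formula term by term.
t_statistic <- function(x) {
  x   <- as.matrix(x)
  n_i <- rowSums(x)                  # N_i: reads per sample (rows)
  x_j <- colSums(x)                  # sum over samples of x_ij, one value per taxon
  n_g <- sum(x)                      # N_g: total reads across all samples
  expected <- outer(n_i, x_j) / n_g  # N_i * sum_i x_ij / N_g for every cell
  # Divide each column's squared deviations by that taxon's total, then sum.
  sum(sweep((x - expected)^2, 2, x_j, "/"))
}

# Made-up counts: 3 samples (rows) by 3 taxa (columns)
toy <- matrix(c(10, 20, 30,
                12, 18, 30,
                 8, 22, 30),
              nrow = 3, byrow = TRUE)
t_value <- t_statistic(toy)
t_value
```

Here every sample has the same number of reads, so the expected count for each taxon is simply its mean across samples.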

As the number of reads increases, the distribution of the T statistic converges to a Chi-square distribution with (P-1)(K-1) degrees of freedom when the number of sequence reads is the same in all samples. When the number of reads differs between samples, the distribution becomes a weighted Chi-square with modified degrees of freedom (see Kim and Margolin, 1992, for details).
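For the equal-reads case, the statistic simplifies in a way that may help build intuition. The short derivation below assumes $N_{i} = N$ for every sample and introduces $\bar{x}_{j}$ (the mean count of taxon $j$ across samples) as notation for this illustration only.

```latex
% Assumption: N_i = N for all i, so N_g = P N.
\frac{N_{i} \sum_{i=1}^{P} x_{ij}}{N_{\mathrm{g}}}
  = \frac{N \sum_{i=1}^{P} x_{ij}}{P N}
  = \frac{1}{P} \sum_{i=1}^{P} x_{ij}
  =: \bar{x}_{j},
\qquad
T = \sum_{j=1}^{K} \frac{1}{P\,\bar{x}_{j}}
    \sum_{i=1}^{P} \left( x_{ij} - \bar{x}_{j} \right)^2 .
```

In this form, T grows with the spread of each taxon's counts around its across-sample mean, which is the kind of extra variability the DM alternative is meant to capture.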

Note: Each taxon in data should be present in at least one sample; a column containing only zeros may result in errors and/or invalid results.
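One way to satisfy this requirement is to drop all-zero columns before calling the function. The following is a minimal base R sketch; the matrix `counts` and its sample and taxon names are made up for illustration.

```r
# Made-up counts: taxC is absent from every sample.
counts <- matrix(c(5, 3, 0,
                   2, 7, 0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("s1", "s2"), c("taxA", "taxB", "taxC")))

# Keep only taxa observed in at least one sample.
keep     <- colSums(counts) > 0
filtered <- counts[, keep, drop = FALSE]  # drop = FALSE keeps the matrix shape
filtered
```

The filtered matrix can then be passed as `data`.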

## Value

A list containing the C(α)-optimal test statistic and p-value.
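Judging by the example output shown on this page, the list elements appear to be named `T statistics` and `p value`. Assuming those names, individual elements can be extracted with `[[`. The sketch below uses a mock list that mirrors the printed output rather than calling the package, so it runs without HMP installed.

```r
# Mock result mirroring the printed example output (an assumption about the
# element names; check names(calpha) on a real result).
calpha_mock <- list("T statistics" = 1876.092, "p value" = 0)

t_stat <- calpha_mock[["T statistics"]]
p_val  <- calpha_mock[["p value"]]

# A small p-value is evidence against the multinomial model.
reject_multinomial <- p_val < 0.05
reject_multinomial
```

Because the names contain spaces, `$` access needs backticks, for example ``calpha_mock$`p value` ``.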

## References

Kim, B. S., and Margolin, B. H. (1992). Testing Goodness of Fit of a Multinomial Model Against Overdispersed Alternatives. Biometrics 48, 711-719.

## Examples

    library(HMP)   # provides C.alpha.multinomial and the saliva data set
    data(saliva)

    calpha <- C.alpha.multinomial(saliva)
    calpha

### Example output

    Loading required package: dirmult

    Attaching package: 'HMP'

    The following object is masked from 'package:dirmult':

        weirMoM

    $`T statistics`
    [1] 1876.092

    $`p value`
    [1] 0


HMP documentation built on Aug. 31, 2019, 5:05 p.m.