Fit Negative Binomial Regression Model and Test for a Regression Coefficient
Description
For each row of the input data matrix, nb.glm.test
fits an NB loglinear regression model and performs
largesample tests for a onedimensional regression
coefficient.
Usage
1 2 3 4  nb.glm.test(counts, x, beta0, lib.sizes = colSums(counts),
normalization.method = "AH2010", dispersion.model = "NBQ",
tests = c("HOA", "LR", "Wald"), alternative = "two.sided",
subset = 1:dim(counts)[1])

Arguments
counts 
an m by n matrix of RNASeq read counts with rows corresponding to gene features and columns corresponding to independent biological samples. 
x 
an n by p design matrix specifying the treatment structure. 
beta0 
a pvector specifying the null hypothesis. NonNA components specify the parameters to test and their null values. 
lib.sizes 
a pvector of observed library sizes, usually (and by default) estimated by column totals. 
normalization.method 
a character string specifying
the method for estimating the normalization factors, can
be 
dispersion.model 
a character string specifying the dispersion model, and can be one of "NB2", "NBP", "NBQ" (default), "NBS" or "step". 
tests 
a character string vector specifying the
tests to be performed, can be any subset of 
alternative 
a character string specifying the
alternative hypothesis, must be one of 
subset 
specify a subset of rows to perform the test on 
Details
nbp.glm.test
provides a simple, onestop interface
to performing a series of core tasks in regression analysis
of RNASeq data: it calls
estimate.norm.factors
to estimate
normalization factors; it calls
prepare.nb.data
to create an NB data
structure; it calls estimate.dispersion
to
estimate the NB dispersion; and it calls
test.coefficient
to test the regression
coefficient.
To keep the interface simple, nbp.glm.test
provides
limited options for fine tuning models/parameters in each
individual step. For more control over individual steps,
advanced users can call
estimate.norm.factors
,
prepare.nb.data
,
estimate.dispersion
, and
test.coefficient
directly, or even substitute
one or more of them with their own versions.
Value
A list containing the following components:
data 
a
list containing the input data matrix with additional
summary quantities, output from

dispersion 
dispersion estimates and models, output
from 
test 
test
results, output from 
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  ## Load Arabidopsis data
data(arab);
## Specify treatment structure
grp.ids = as.factor(c(1, 1, 1, 2, 2, 2));
x = model.matrix(~grp.ids);
## Specify the null hypothesis
## The null hypothesis is beta[1]=0 (beta[1] is the log fold change).
beta0 = c(NA, 0);
## Fit NB regression model and perform large sample tests.
## The step can take long if the number of genes is large
fit = nb.glm.test(arab, x, beta0, subset=1:50);
## The result contains the data, the dispersion estimates and the test results
print(str(fit));
## Show HOA test results for top ten genes
subset = order(fit$test.results$HOA$p.values)[1:10];
cbind(fit$data$counts[subset,], fit$test.results$HOA[subset,]);
## Show LR test results
subset = order(fit$test.results$LR$p.values)[1:10];
cbind(fit$data$counts[subset,], fit$test.results$LR[subset,]);
