nb.gof.v: Main Function of Implementing Simulation-based...

Description Usage Arguments Details Value Author(s) References Examples

View source: R/nb.gof.v.R

Description

This function tests goodness-of-fit of an NB2 or NBP negative binomial regression model with a known design matrix. Estimation method for the NB2 model fitting is maximum likelihood (ML). Estimation methods for NBP model fitting include both ML and adjusted profile likelihood (APL). This function can also test goodness-of-fit of a Poisson regression model.

Usage

1
nb.gof.v(y, x, lib.sizes=NULL, sim=999, model = "NB2", method="ML", seed=1, ncores=NULL)

Arguments

y

an n-by-1 vector of non-negative integers. For a typical RNA-Seq experiment, this may represent the read counts for a single gene across n samples.

x

an n-by-p design matrix.

lib.sizes

library sizes of a RNA-Seq experiment. Default is 1 for all samples.

sim

number of simulations performed.

model

a string of characters specifying the model (negative binomial or Poisson) used to fit the data. Currently the following models are available to be checked for goodness-of-fit:

  • NB2: conventional NB2 regression model (NB2)

  • NBP: NBP regression model (NBP)

  • Poisson: Poisson regression model (Poisson)

Users should specify exactly the same characters as shown in paratheses above for testing one of the regression models.

method

specify either "ML" or "APL" for the maximum likelihood and adjusted profile likelihood methods used, respectively, for the NBP model estimations. ML is used for the NB2 model estimations.

seed

initial random number seed for reproducibility of re-simulations.

ncores

number of CPU cores to use. If unspecified, use the total number of CPUs minus 1.

Details

When the response is a vector of counts, we can use this function to test the goodness-of-fit of a specified negative binomial or Poisson regression model. It returns an object with the test results, which can be further summarized and visualized using appropriate methods, e.g. EPPlot.

This function calls model.nb2.v to fit the NB2 model, calls model.nbp.v to fit the NBP model, and calls model.poi.v to fit the Poisson model.

Value

An object of class "gofv" to which other methods can be applied.

Author(s)

Gu Mi <neo.migu@gmail.com>, Yanming Di, Daniel Schafer

References

Mi, G, Di, Y, & Schafer, DW (2015). Goodness-of-Fit Tests and Model Diagnostics for Negative Binomial Regression of RNA Sequencing Data. PLOS ONE, 10(3).

Di Y, Schafer DW, Cumbie JS, and Chang JH (2011): "The NBP Negative Binomial Model for Assessing Differential Gene Expression from RNA-Seq", Statistical Applications in Genetics and Molecular Biology, 10 (1).

See https://github.com/gu-mi/NBGOF/wiki/ for more details.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
## load package into session:
library(NBGOF)

## basic set-up of the model:
seed = 539
n = 100
beta.v = c(1, -3)

## specify a design matrix:
X = cbind(rep(1,n), seq(1.5, 3.5, length=n))
s = rep(1e6, n)
mu = s * exp(X %*% beta.v)
pi = mu/s  # relative frequency
sim = 999

## simulate an n-by-1 vector for the read counts of a single gene:
# the data is simulated from NB1
set.seed(seed)
alpha1 = -1  # NB1 data
phi0 = 0.0001
alpha0 = log(phi0); 
phi.nb1 = phi0 * pi^alpha1
y.nb1 = rnbinom(n, size=1/phi.nb1, mu=mu)

## implement the GOF test for testing NB and Poisson model adequacy:
# pdf("gofv-result.pdf", width=14, height=7)

# NB2 model fit using MLE:
gof.nb1.nb2 = nb.gof.v(y.nb1, X, s, sim=sim, model="NB2", seed=1)
p1 = EPPlot(gof.nb1.nb2, envelope=0.95, data.note="NB1")

# NBP model fit using MLE:
gof.nb1.nbp = nb.gof.v(y.nb1, X, s, sim=sim, model="NBP", method="ML", seed=1)
p2 = EPPlot(gof.nb1.nbp, envelope=0.95, data.note="NB1")

# Poisson model fit:
gof.nb1.poi = nb.gof.v(y.nb1, X, s, sim=sim, model="Poisson", seed=1)
p3 = EPPlot(gof.nb1.poi, envelope=0.95, data.note="NB1")

multiplot(p1, p2, p3, cols=3, layout=matrix(seq(1,3), nrow=1, byrow=TRUE))
# dev.off()

gu-mi/NBGOF documentation built on Oct. 25, 2020, 3:30 a.m.