timeSeq: Statistical Inference for Time Course RNA-Seq Data using a...

Description

Accurately identifying differentially expressed (DE) genes from time course RNA-seq data is important for building a global picture of cellular function. DE genes in time course RNA-seq data can be classified into two types: parallel DE (PDE) genes and non-parallel DE (NPDE) genes. The former are often biologically irrelevant, whereas the latter are often biologically interesting. This package uses a negative binomial mixed-effects (NBME) model to identify both PDE and NPDE genes in time course RNA-seq data.
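
To make the PDE/NPDE distinction concrete, here is a minimal sketch in base R. It is not the package's NBME model. It assumes, for illustration only, that "parallel" means the two conditions' mean trajectories differ by a constant offset on the log scale, while "non-parallel" means the gap between conditions changes over time. All object names and numeric values below are hypothetical.

```r
set.seed(1)
time <- 1:8                          # eight hypothetical time points

# Log-scale mean trajectory for condition 1 (illustrative values)
log_mu_c1 <- 3 + 0.2 * time

# PDE (assumed meaning): condition 2 is shifted by a constant offset
log_mu_pde <- log_mu_c1 + 1

# NPDE (assumed meaning): condition 2 diverges from condition 1 over time
log_mu_npde <- log_mu_c1 + 0.3 * time

# Gap between the two conditions at each time point
gap_pde  <- log_mu_pde  - log_mu_c1  # constant across time
gap_npde <- log_mu_npde - log_mu_c1  # grows with time

# Draw negative binomial counts around each mean trajectory
# (size is the dispersion parameter; 10 is an arbitrary choice)
counts_c1   <- rnbinom(length(time), size = 10, mu = exp(log_mu_c1))
counts_pde  <- rnbinom(length(time), size = 10, mu = exp(log_mu_pde))
counts_npde <- rnbinom(length(time), size = 10, mu = exp(log_mu_npde))

print(rbind(time, counts_c1, counts_pde, counts_npde))
```

Under these assumptions, `gap_pde` is flat while `gap_npde` increases with time, which is the kind of pattern an NPDE test is meant to pick up.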

Usage

timeSeq(data.count, group.label, gene.names, exon.length = NULL,
        exon.level = FALSE, pvalue = TRUE)

Arguments

data.count

an n by p matrix of expression values. Data should be appropriately normalized beforehand.

group.label

a vector indicating the experimental conditions of each time point.

gene.names

a vector containing all the gene names.

exon.length

a vector containing the exon lengths; only used for exon-level data.

exon.level

logical, indicating whether this is an exon-level dataset. Default is FALSE.

pvalue

logical, indicating whether p-values are returned. Default is TRUE.
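
The following base R sketch shows one way the inputs might be assembled. The layout is an assumption inferred from the argument descriptions, not confirmed by this page: rows of data.count are taken to be replicates (or exons) grouped by gene, columns are taken to be time points across both conditions, group.label has one entry per column coded 1 and 2, and gene.names has one entry per row. The sizes and simulated counts are hypothetical, and the layout should be checked against the bundled simulate.dt dataset before use.

```r
set.seed(42)
n_genes <- 3   # hypothetical number of genes
n_reps  <- 3   # hypothetical replicates per gene
n_time  <- 4   # hypothetical time points per condition

# One gene name per row; replicates of a gene share the same name (assumed layout)
gene.names <- rep(paste0("gene", seq_len(n_genes)), each = n_reps)

# One condition label per column: condition 1 first, then condition 2 (assumed coding)
group.label <- c(rep(1, n_time), rep(2, n_time))

# Simulated counts standing in for normalized expression values
data.count <- matrix(
  rnbinom(length(gene.names) * length(group.label), size = 5, mu = 100),
  nrow = length(gene.names),
  ncol = length(group.label)
)

# Basic consistency checks on the assumed layout
stopifnot(nrow(data.count) == length(gene.names),
          ncol(data.count) == length(group.label))

print(table(gene.names))   # rows (replicates) per gene

# With the timeSeq package installed, the call would then be:
# model.fit <- timeSeq(data.count, group.label, gene.names)
```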

Details

Detects non-parallel differentially expressed (NPDE) genes and parallel differentially expressed (PDE) genes.

Value

A list with components

sorted

an object returned by the timeSeq.sort function. It contains the sorted Kullback-Leibler ratios (KLRs) or p-values used to identify DE genes.

count

the number of exons or replicates for each gene.

NPDE

the NPDE ratios or p-values.

PDE

the PDE ratios or p-values.

genenames

gene names.

table

gene expression values.

data

an n by p matrix of expression values.

gene.names

a vector including all the gene names.

group.label

a vector indicating the experimental conditions of each time point.

group.length

the total number of time points.

group1.length

the number of time points of condition one.

group2.length

the number of time points of condition two.

exon.level

logical, indicating whether this is an exon-level dataset. Default is FALSE.

pvalue

logical, indicating whether p-values are returned. Default is TRUE.

Author(s)

Fan Gao and Xiaoxiao Sun

References

Sun, Xiaoxiao, David Dalpiaz, Di Wu, Jun S. Liu, Wenxuan Zhong, and Ping Ma. "Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model." BMC Bioinformatics, 17(1):324, 2016.

Chong Gu. Model diagnostics for smoothing spline ANOVA models. Canadian Journal of Statistics, 32(4):347-358, 2004.

Chong Gu. Smoothing spline ANOVA models. Springer, second edition, 2013.

Chong Gu and Ping Ma. Optimal smoothing in nonparametric mixed-effect models. Annals of Statistics, 1357-1379, 2005.

Simon N. Wood. mgcv: GAMs and generalized ridge regression for R. R News, 1(2):20-25, 2001.

Examples

#### Data should be appropriately normalized beforehand ####
library(timeSeq)

## Exon-level data (p-value calculation is not supported)
data(pAbp)
attach(pAbp)
model.fit <- timeSeq(data.count, group.label, gene.names, exon.length,
                     exon.level = TRUE, pvalue = FALSE)
# NPDE genes have large KLRs
model.fit$NPDE
detach(pAbp)

## Gene-level data (three replicates)
data(simulate.dt)
attach(simulate.dt)
model.fit <- timeSeq(data.count, group.label, gene.names,
                     exon.level = FALSE, pvalue = TRUE)
# NPDE p-values
model.fit$NPDE
# Detach so the dataset's objects do not linger on the search path
detach(simulate.dt)

Example output

There were 50 or more warnings (use warnings() to see the first 50)
[1] 0.8404626
 [1] 4.056052e-102 1.601623e-216  8.095688e-86 3.998878e-112 8.449241e-278
 [6]  9.973802e-01  7.780800e-01  9.026850e-01  1.000000e+00  9.278024e-01
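
The p-values printed above can be screened for NPDE genes in base R. This is a minimal sketch, assuming the ten values correspond in order to ten genes (the labels gene1 to gene10 are hypothetical) and that a Benjamini-Hochberg adjustment with a 0.05 cutoff is acceptable. Neither the adjustment method nor the cutoff is prescribed by the package.

```r
# NPDE p-values copied from the example output
npde_p <- c(4.056052e-102, 1.601623e-216, 8.095688e-86, 3.998878e-112,
            8.449241e-278, 9.973802e-01, 7.780800e-01, 9.026850e-01,
            1.000000e+00, 9.278024e-01)

# Hypothetical gene labels, one per p-value
names(npde_p) <- paste0("gene", seq_along(npde_p))

# Adjust for multiple testing (Benjamini-Hochberg false discovery rate)
adj_p <- p.adjust(npde_p, method = "BH")

# Genes passing the assumed 0.05 cutoff
npde_hits <- names(adj_p)[adj_p < 0.05]

# Genes ranked from strongest to weakest evidence
ranked <- names(sort(npde_p))

print(npde_hits)
print(ranked)
```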

timeSeq documentation built on May 2, 2019, 3:07 a.m.