PS.Main: detecting differentially expressed genes from RNA-Seq data.

Description

This is the main function of the package. Given the data matrix and the outcome vector, it returns the estimated permutation-based p-values, the estimated permutation-based false discovery rates, and related per-gene statistics.
More detailed instructions and sample data are available at
http://www.stanford.edu/~junli07/research.html.

Usage

 PS.Main(dat, para=list())

Arguments

dat

The input RNA-Seq data. It must have the following three attributes:

(1) n: the data matrix. Rows for genes, columns for experiments (samples).

(2) y: the outcome vector

(3) type: 'twoclass', 'multiclass' or 'quant'

The following attributes are optional. If not specified, the default values will be used.

(4) pair: whether the data are paired. Default value: FALSE. Only takes effect for twoclass data.

(5) gname: gene names. Default value: 1:nrow(n). That is, the i'th gene is named "i".
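
As a minimal sketch of the input structure described above, the following builds a dat object from simulated counts. It assumes that dat is an ordinary R list and that the two classes are coded as 1 and 2 in y; neither detail is stated on this page, and the simulated counts and gene names are purely illustrative.

```r
set.seed(1)

# Simulated read counts: 200 genes (rows) by 6 samples (columns).
n.genes <- 200
n.samples <- 6
counts <- matrix(rpois(n.genes * n.samples, lambda = 20),
                 nrow = n.genes, ncol = n.samples)

# Assumption: dat is a plain list; the class labels 1 and 2 are an assumed coding.
dat <- list(
  n     = counts,                           # data matrix, genes x samples
  y     = c(1, 1, 1, 2, 2, 2),              # outcome vector, one entry per sample
  type  = "twoclass",                       # 'twoclass', 'multiclass' or 'quant'
  pair  = FALSE,                            # optional: unpaired design
  gname = paste0("gene", seq_len(n.genes))  # optional: hypothetical gene names
)

# Basic consistency checks before passing dat to PS.Main.
stopifnot(length(dat$y) == ncol(dat$n),
          length(dat$gname) == nrow(dat$n))
```

Checking that y has one entry per column of n before calling PS.Main is a cheap way to catch a transposed matrix early.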

para

A list of parameters. It can have the following attributes:

(1) trans: whether to transform the data using the order transformation. Default value: TRUE.

(2) npermu: number of permutations. Default value: 100.

(3) seed: random seed used to generate the permutation indexes. Default value: 10.

(4) ct.sum: if the total number of reads of a gene across all experiments <= ct.sum, this gene will not be considered for differential expression detection. Default value: 5.

(5) ct.mean: if the mean number of reads of a gene across all experiments <= ct.mean, this gene will not be considered for differential expression detection. Default value: 0.5.

(6) div: the number of divisions of genes for estimating theta. Default value: 10.

(7) pow.file: the file to store the power transform curve (mean(log(mu)) ~ 1/theta). Default value: 'pow.txt'.

All the above attributes are optional.
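
The parameter list and the two read-count filters can be illustrated with base R alone. The sketch below is not the package's internal code: the toy matrix is made up, and it assumes that a gene is excluded as soon as either the ct.sum or the ct.mean condition holds.

```r
# Parameter list with the documented defaults, except npermu, raised as an example.
para <- list(trans = TRUE, npermu = 200, seed = 10,
             ct.sum = 5, ct.mean = 0.5, div = 10, pow.file = "pow.txt")

# Toy count matrix: 4 genes (rows) by 4 samples (columns).
counts <- rbind(g1 = c(0, 1, 0, 1),     # sum 2,  mean 0.5
                g2 = c(1, 2, 1, 1),     # sum 5,  mean 1.25
                g3 = c(3, 2, 4, 1),     # sum 10, mean 2.5
                g4 = c(10, 12, 9, 11))  # sum 42, mean 10.5

# Assumption: a gene is dropped if either documented condition holds.
dropped <- rowSums(counts) <= para$ct.sum | rowMeans(counts) <= para$ct.mean
kept <- rownames(counts)[!dropped]
print(kept)
```

Here g1 and g2 fall at or below the ct.sum threshold of 5, so only g3 and g4 remain. The same list would be passed to the function as PS.Main(dat, para = para).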

Value

A data frame containing the following columns. Each row corresponds to a gene, and the genes are sorted from most to least significant.

nc

The number of significant genes called when the sorted list is cut at this row. nc = 1:(number of genes).

gname

The sorted gene names.

tt

The score statistics of the genes.

pval

Permutation-based p-values of the genes.

fdr

Estimated false discovery rate.

log.fc

Estimated log fold change of the genes. Only available for twoclass outcomes.
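
A typical next step is to select genes below a chosen FDR cutoff. The sketch below uses a mock data frame with the columns documented above; all values and gene names are invented for illustration, and the 10% cutoff is an arbitrary choice rather than a package default.

```r
# Mock result with the documented columns; values are made up for illustration.
res <- data.frame(
  nc     = 1:5,
  gname  = c("g7", "g2", "g9", "g4", "g1"),
  tt     = c(25.1, 18.4, 9.7, 3.2, 0.8),
  pval   = c(0.0001, 0.0010, 0.0200, 0.1500, 0.6000),
  fdr    = c(0.005, 0.020, 0.080, 0.300, 0.750),
  log.fc = c(1.9, -1.4, 0.9, 0.3, -0.1),
  stringsAsFactors = FALSE
)

# Keep genes whose estimated FDR is at most 10%.
sig <- res[res$fdr <= 0.10, ]
print(sig$gname)
```

Because the rows are already sorted by significance, the selected genes form the top block of the table.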

Author(s)

Jun Li.

References

Li J, Witten DM, Johnstone I, Tibshirani R (2012). Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics 13(3): 523-38.

Examples

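 library(PoissonSeq)   # load the package
 data(dat)             # load the example data set
 res <- PS.Main(dat)   # run with the default parameters
 head(res)             # inspect the most significant genes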

PoissonSeq documentation built on May 1, 2019, 7:33 p.m.