EBTest: Using EM algorithm to calculate the posterior probabilities...


EBTest    R Documentation

Using EM algorithm to calculate the posterior probabilities of being DE

Description

Based on the NB-Beta empirical Bayes model, the EM algorithm is used to estimate the posterior probability of each transcript being differentially expressed (DE).

Usage

EBTest(Data, NgVector = NULL, Conditions, sizeFactors, fast = T,
    Alpha = NULL, Beta = NULL, Qtrm = 1, QtrmCut = 0, maxround = 50, 
    step1 = 1e-06, step2 = 0.01, thre = log(2), sthre = 0, 
    filter = 10, stopthre = 1e-4) 

Arguments

Data

A data matrix containing expression values for each transcript (gene or isoform level). Rows should be transcripts and columns should be samples.

NgVector

A vector giving the uncertainty group assignment of each isoform. For example, if the number of isoforms in the host gene is used to define the uncertainty groups and an isoform belongs to a gene with 2 isoforms, Ng for that isoform should be 2. The length of this vector should equal the number of rows in Data. For gene-level data, NgVector can be left as NULL.

Conditions

A factor indicating the condition to which each sample belongs.

sizeFactors

The normalization factors. This should be a vector of lane-specific numbers; its length should equal the number of samples, in the same order as the columns of Data.

fast

Boolean indicator of whether to use fast EBSeq or full EBSeq.

Alpha

Starting value of the hyperparameter alpha.

Beta

Starting value of the hyperparameter beta.

Qtrm, QtrmCut

Transcripts whose Qtrm-th quantile is <= QtrmCut are removed before testing. The defaults are Qtrm = 1 and QtrmCut = 0, so by default transcripts with all 0's are not tested.

maxround

Number of iterations. The default value is 50. Users should always check convergence by inspecting Alpha and Beta in the output. If the hyperparameter estimates have not converged within 50 iterations, a larger number is suggested.

step1

Step size for gradient ascent of alpha.

step2

Step size for gradient ascent of beta.

thre

Threshold for determining the state of a position.

sthre

Shrinkage threshold for iterative pruning during the EM updates.

filter

Filter threshold for low-expression units.

stopthre

Stopping threshold for EM.
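The NgVector and Qtrm/QtrmCut descriptions above can be illustrated with a short base-R sketch. This is not EBSeq's internal code: the toy matrix, the gene_of_isoform mapping, and the variable names are hypothetical, and the filter simply applies the rule as worded in the argument description (drop a transcript when its Qtrm-th quantile is <= QtrmCut).

```r
# Toy isoform-level count matrix: 6 isoforms (rows) x 4 samples (columns).
# All values and names are hypothetical.
Data <- matrix(
  c(10, 12,  9, 11,
     0,  0,  0,  0,
     5,  0,  7,  6,
     0,  0,  3,  0,
    20, 25, 22, 19,
     1,  0,  0,  2),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("iso", 1:6), paste0("S", 1:4))
)

# Hypothetical mapping from each isoform to its host gene.
gene_of_isoform <- c("geneA", "geneB", "geneB", "geneC", "geneC", "geneC")

# Ng for an isoform = number of isoforms in its host gene.
NgVector <- as.vector(table(gene_of_isoform)[gene_of_isoform])

# The vector must have one entry per row of Data.
stopifnot(length(NgVector) == nrow(Data))

# Quantile filter as described for Qtrm / QtrmCut (defaults shown).
Qtrm    <- 1
QtrmCut <- 0
row_quantile <- apply(Data, 1, quantile, probs = Qtrm)
keep <- row_quantile > QtrmCut   # FALSE for rows that would not be tested

print(NgVector)
print(keep)
```

With the defaults, the Qtrm-th quantile is the row maximum, so only the all-zero row (iso2 here) is excluded.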

Details

For each transcript gi within a condition, the model assumes:

X_gis | mu_gi ~ NB(r_gi0 * l_s, q_gi)

q_gi | alpha, beta^N_g ~ Beta(alpha, beta^N_g)

where l_s is the size factor (sizeFactors) of sample s.

The function tests "H0: q_gi^C1 = q_gi^C2" against "H1: q_gi^C1 != q_gi^C2".
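To make the two hypotheses concrete, the following base-R sketch simulates counts for one transcript under H0 (a single q shared by both conditions) and under H1 (a separate q per condition). The hyperparameter values, r0, and the size factors are made up for illustration, and mapping NB(r, q) onto rnbinom(size = r, prob = q) is an assumption about the parameterization rather than something this help page states.

```r
set.seed(1)

# Hypothetical hyperparameters and inputs, chosen only for illustration.
alpha <- 2                     # first shape parameter of the Beta prior on q
beta  <- 2                     # second shape parameter (one uncertainty group)
r0    <- 20                    # transcript-specific r before size-factor scaling
sizeFactors <- c(0.8, 1.0, 1.2, 0.9, 1.1, 1.0)
Conditions  <- factor(rep(c("C1", "C2"), each = 3))

simulate_transcript <- function(de) {
  # H0 (EE): both conditions share one draw of q.
  # H1 (DE): each condition gets its own draw of q.
  q <- if (de) rbeta(2, alpha, beta) else rep(rbeta(1, alpha, beta), 2)
  names(q) <- levels(Conditions)
  # Assumed mapping: X_gis ~ NB(size = r0 * l_s, prob = q of the sample's condition).
  counts <- rnbinom(length(sizeFactors),
                    size = r0 * sizeFactors,
                    prob = q[as.character(Conditions)])
  list(q = q, counts = counts)
}

ee <- simulate_transcript(de = FALSE)
de <- simulate_transcript(de = TRUE)

print(ee)
print(de)
```

Under this parameterization a sample's expected count is r0 * l_s * (1 - q) / q, so a difference in q between conditions translates directly into a difference in mean expression.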

Value

Alpha

Fitted parameter alpha of the prior beta distribution.

Beta

Fitted parameter beta of the prior beta distribution.

P

Global proportion of DE patterns.

RList

The fitted values of r for each transcript.

MeanList

The mean of each transcript (across conditions).

VarList

The variance of each transcript (across conditions).

QList

The fitted q values of each transcript within the two conditions.

Mean

The mean of each transcript within the two conditions (adjusted by normalization factors).

Var

The estimated variance of each transcript within the two conditions (adjusted by normalization factors).

PoolVar

The variance of each transcript (the pooled value of the within-condition EstVar).

DataNorm

Normalized expression matrix.

AllZeroIndex

Transcripts with expression 0 in all samples (these are not tested).

Iso

Same as NgVector.

PPMat

A matrix containing the posterior probabilities of being EE (first column) or DE (second column). Rows are transcripts. Transcripts with expression 0 in all samples are not included in this matrix.

AllParti

Selected patterns.

PPMatWith0

A matrix containing the posterior probabilities of being EE (first column) or DE (second column). Rows are transcripts. Transcripts with expression 0 in all samples are shown as PP(EE) = PP(DE) = NA in this matrix. The transcript order is exactly the same as the order of the input data.

Conditions

The input conditions.
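As a rough illustration of how a matrix laid out like PPMatWith0 might be used downstream, the sketch below flags transcripts whose PP(DE) meets a cutoff while leaving untested (NA) rows unflagged. The matrix is hand-made, the column names "PPEE" and "PPDE" are assumed for readability, and the 0.95 cutoff is an arbitrary illustrative choice, not a package default.

```r
# Hypothetical posterior-probability matrix in the layout described for
# PPMatWith0: column 1 = PP(EE), column 2 = PP(DE), NA for untested rows.
PPMatWith0 <- matrix(
  c(0.02, 0.98,
      NA,   NA,
    0.60, 0.40,
    0.01, 0.99),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("tx", 1:4), c("PPEE", "PPDE"))
)

pp_cut <- 0.95                          # illustrative cutoff only

tested <- !is.na(PPMatWith0[, 2])       # rows that were actually tested
is_de  <- tested & PPMatWith0[, 2] >= pp_cut
de_names <- rownames(PPMatWith0)[is_de]

print(de_names)
```

Because untested rows are kept as NA rather than dropped, the logical vector is_de stays aligned with the rows of the original input data.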

Author(s)

Ning Leng, Xiuyu Ma

References

Ning Leng, John A. Dawson, James A. Thomson, Victor Ruotti, Anna I. Rissman, Bart M.G. Smits, Jill D. Haag, Michael N. Gould, Ron M. Stewart, and Christina Kendziorski. EBSeq: An empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics (2013)

See Also

EBMultiTest, PostFC, GetPPMat

Examples

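library(EBSeq)
data(GeneMat)
str(GeneMat)
# Estimate library size factors by median normalization
Sizes = MedianNorm(GeneMat)
# Two conditions, five samples each
EBOut = EBTest(Data=GeneMat, Conditions=as.factor(rep(c("C1","C2"),each=5)),
       sizeFactors = Sizes)
# Extract the posterior probabilities of being EE and DE
PP = GetPPMat(EBOut)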

wiscstatman/EBSeq documentation built on June 3, 2023, 7:34 a.m.