EBMultiTest: Using EM algorithm to calculate the posterior probabilities...

Description

'EBMultiTest' is built on the NB-Beta empirical Bayes model. It uses the EM algorithm to estimate the posterior probability of each pattern of interest.
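
As a rough illustration of the EM idea, the sketch below computes the posterior probability of each pattern from mixing proportions and per-pattern likelihoods, then updates the proportions. It is a generic mixture-model step in base R, not EBSeq's implementation; the names `lik`, `p`, `post`, and `p_new`, and all numbers, are assumptions made up for illustration.

```r
# Toy inputs (made-up numbers): 3 transcripts x 2 patterns.
# lik[g, k] stands for the likelihood of transcript g under pattern k.
lik <- matrix(c(0.02, 0.10,
                0.30, 0.05,
                0.01, 0.01), nrow = 3, byrow = TRUE)
p <- c(0.5, 0.5)  # current mixing proportions of the two patterns

# E-step: posterior probability of each pattern for each transcript.
weighted <- sweep(lik, 2, p, `*`)
post <- weighted / rowSums(weighted)

# M-step (proportions only): average posterior probability per pattern.
p_new <- colMeans(post)
```

Each row of `post` sums to 1, which is the property the posterior probabilities returned by the function are expected to share.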

Usage

EBMultiTest(Data, NgVector = NULL, Conditions, AllParti = NULL,
	sizeFactors, maxround, Pool = F, NumBin = 1000,
	ApproxVal = 10^-10, PoolLower = .25, PoolUpper = .75, Print = T, Qtrm = 1, QtrmCut = 0)

Arguments

Data

A data matrix containing expression values for each transcript (gene or isoform level). Rows should be transcripts and columns should be samples.

NgVector

A vector giving the uncertainty group assignment of each isoform. For example, if the number of isoforms in the host gene is used to define the uncertainty groups, an isoform in a gene with 2 isoforms should have Ng = 2. The length of this vector should equal the number of rows in Data. For gene-level data, NgVector can be left as NULL.
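
A minimal base-R sketch of building such a vector from an isoform-to-gene mapping follows. The isoform and gene names are hypothetical, and the code only illustrates the "number of isoforms in the host gene" rule described above.

```r
# Hypothetical isoform-to-gene mapping, one entry per row of Data.
iso_names  <- c("iso1_1", "iso1_2", "iso2_1", "iso3_1", "iso3_2", "iso3_3")
gene_names <- c("gene1", "gene1", "gene2", "gene3", "gene3", "gene3")

# Number of isoforms in each host gene.
iso_per_gene <- table(gene_names)

# Ng of an isoform = isoform count of its host gene.
NgVector <- as.integer(iso_per_gene[gene_names])
names(NgVector) <- iso_names
```

Here the two isoforms of gene1 get Ng = 2, the single isoform of gene2 gets Ng = 1, and the three isoforms of gene3 get Ng = 3.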

Conditions

A vector indicating the condition to which each sample belongs.

AllParti

A matrix describing the patterns of interest. Columns should be conditions and rows should be patterns. The matrix can be obtained with the GetPatterns function. If AllParti = NULL, all possible patterns are used.
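
To show what "all possible patterns" means, the sketch below enumerates every way of grouping K conditions into sets with equal expression. It is a base-R illustration, not the GetPatterns implementation; the function name `all_patterns` is hypothetical, and the row order may differ from what GetPatterns returns.

```r
# Enumerate all ways to group K conditions into equal-expression sets.
# Each row is one pattern; equal labels mean equal expression levels.
all_patterns <- function(K) {
  rows <- list(1L)                       # condition 1 always gets label 1
  if (K > 1) {
    for (k in 2:K) {
      nxt <- list()
      for (r in rows) {
        # condition k may join an existing group or open a new one
        for (lab in seq_len(max(r) + 1L)) {
          nxt[[length(nxt) + 1L]] <- c(r, lab)
        }
      }
      rows <- nxt
    }
  }
  out <- do.call(rbind, rows)
  colnames(out) <- paste0("C", seq_len(K))
  rownames(out) <- paste0("Pattern", seq_len(nrow(out)))
  out
}

patterns <- all_patterns(3)
```

For three conditions this gives five patterns, from "all equal" (1, 1, 1) to "all different" (1, 2, 3). The count grows quickly with K, which is one reason to pass a reduced AllParti.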

sizeFactors

The normalization factors. This should be a vector of lane-specific numbers whose length equals the number of samples, in the same order as the columns of Data.
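
One common way to obtain one factor per sample is a median-of-ratios scheme, sketched below in base R on made-up counts. This is an illustration of the idea under that assumption; in practice the package's own MedianNorm function, used in the Examples, is the natural choice.

```r
# Toy count matrix (made-up numbers): 4 transcripts x 3 samples.
counts <- matrix(c(10, 20, 30,
                   100, 200, 300,
                   5, 10, 15,
                   50, 100, 150), nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("t", 1:4), paste0("s", 1:3)))

# Geometric mean of each transcript across samples.
geo_means <- exp(rowMeans(log(counts)))

# Size factor of a sample = median ratio of its counts to the geometric means,
# using only transcripts with a positive geometric mean.
size_factors <- apply(counts, 2, function(x) median((x / geo_means)[geo_means > 0]))
```

The result has one entry per column of the count matrix, in column order, which is the shape sizeFactors requires.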

maxround

Number of iterations. The default value is 5. Users should always check convergence by looking at Alpha and Beta in the output. If the hyper-parameter estimates have not converged within 5 iterations, a larger number is suggested.
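
A simple way to inspect convergence is to compare the last two rows of a per-iteration trace. The sketch below uses a hypothetical Alpha trace with made-up numbers; the helper `last_change` and the tolerance `tol` are illustrative assumptions, not part of EBSeq.

```r
# Hypothetical hyper-parameter trace (made-up numbers): one row per iteration.
Alpha <- matrix(c(0.80, 0.72, 0.705, 0.7005, 0.7001), ncol = 1,
                dimnames = list(paste0("iter", 1:5), "Alpha"))

# Largest absolute change between the last two iterations.
last_change <- function(trace) {
  n <- nrow(trace)
  if (n < 2) return(NA_real_)
  max(abs(trace[n, ] - trace[n - 1, ]))
}

tol <- 1e-3  # illustrative tolerance, not an EBSeq default
converged <- last_change(Alpha) < tol
```

The same helper can be applied to the Beta trace; if either change is still large, rerunning with a larger maxround is the remedy suggested above.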

Pool

When working without replicates, set Pool = TRUE to enable pooling.

NumBin

With NumBin = 1000, EBSeq groups genes with similar means into 1,000 bins.

PoolLower, PoolUpper

Assuming that only a subset of the genes are differentially expressed (DE) in the data set, genes whose fold changes (FC) fall within the PoolLower to PoolUpper quantile range of all FCs are taken as candidate genes (the default is 25%-75%).

For each bin, the bin-wise variance estimate is defined as the median of the cross-condition variance estimates of the candidate genes within that bin.

Candidate genes use their own cross-condition variance estimates; non-candidate genes use the bin-wise variance estimate of their host bin.
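
The pooling steps above can be sketched in base R as follows. This is an illustration on simulated data, not EBSeq's code: the simulation settings, the `+ 1` in the fold change, the rank-based binning, and the small NumBin are all assumptions chosen to keep the toy example readable.

```r
set.seed(1)
n  <- 2000
mu <- rexp(n, rate = 1 / 200) + 1
x1 <- rnbinom(n, mu = mu, size = 5)   # condition 1, single sample
x2 <- rnbinom(n, mu = mu, size = 5)   # condition 2, single sample

NumBin <- 20; PoolLower <- 0.25; PoolUpper <- 0.75

gene_mean <- (x1 + x2) / 2
gene_var  <- apply(cbind(x1, x2), 1, var)   # cross-condition variance
fc <- (x1 + 1) / (x2 + 1)                   # +1 avoids division by zero (sketch only)

# Candidate genes: FC within the PoolLower to PoolUpper quantile range.
fc_range  <- unname(quantile(fc, c(PoolLower, PoolUpper)))
candidate <- fc >= fc_range[1] & fc <= fc_range[2]

# Group genes with similar means into NumBin bins of roughly equal size.
bin <- cut(rank(gene_mean, ties.method = "first"), breaks = NumBin, labels = FALSE)

# Bin-wise variance: median cross-condition variance of the candidates in each bin.
bin_var <- tapply(gene_var[candidate],
                  factor(bin[candidate], levels = seq_len(NumBin)), median)

# Candidates keep their own estimate; the rest borrow their bin's estimate.
pooled_var <- ifelse(candidate, gene_var, unname(bin_var[bin]))
```

The design point is that genes with extreme fold changes, which are more likely to be DE, do not contribute their inflated cross-condition variance; they borrow a typical value from genes of similar mean instead.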

ApproxVal

The variances of the transcripts with mean < var will be approximated as mean/(1-ApproxVal).

Print

Whether to print the elapsed time while running the test.

Qtrm, QtrmCut

Transcripts whose Qtrm-th quantile is <= QtrmCut are removed before testing. The defaults are Qtrm = 1 and QtrmCut = 0, so by default transcripts with all 0's are not tested.
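
The filtering rule can be illustrated with base R on a made-up matrix. The variable names `row_q`, `keep`, and `DataFiltered` are hypothetical; the sketch only mirrors the rule as described, not the internal code.

```r
# Toy expression matrix (made-up numbers): 4 transcripts x 4 samples.
Data <- matrix(c(0, 0, 0, 0,
                 0, 3, 0, 1,
                 5, 8, 2, 9,
                 0, 0, 0, 7), nrow = 4, byrow = TRUE,
               dimnames = list(paste0("t", 1:4), paste0("s", 1:4)))

Qtrm    <- 1   # quantile to inspect; 1 means the row maximum
QtrmCut <- 0   # transcripts at or below this value are dropped

# Row-wise quantile, then keep rows strictly above the cutoff.
row_q <- apply(Data, 1, quantile, probs = Qtrm)
keep  <- row_q > QtrmCut
DataFiltered <- Data[keep, , drop = FALSE]
```

With the default settings only t1, whose values are all 0, is dropped. Lowering Qtrm makes the filter stricter, because more of a transcript's samples must then exceed QtrmCut.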

Value

Alpha

Fitted parameter alpha of the prior beta distribution. Rows are the values for each iteration.

Beta

Fitted parameter beta of the prior beta distribution. Rows are the values for each iteration.

P, PFromZ

The Bayes estimate of the probability of following each pattern of interest. Rows are the values for each iteration.

Z, PoissonZ

The posterior probability of following each pattern of interest for each transcript. (The order may differ from the input order.)

RList

The fitted values of r for each transcript.

MeanList

The mean of each transcript (across conditions).

VarList

The variance of each transcript (across conditions).

QList

The fitted q values of each transcript within each condition.

SPMean

The mean of each transcript within each condition (adjusted by the normalization factors).

SPEstVar

The estimated variance of each transcript within each condition (adjusted by the normalization factors).

PoolVar

The variance of each transcript (the pooled value of the within-condition EstVar).

DataList

A list of the data grouped by Ng and bias.

PPpattern

The Posterior Probability of following each pattern (columns) for each transcript (rows). Transcripts with expression 0 for all samples are not shown in this matrix.

f

The prior predictive likelihood of each pattern for each transcript.

AllParti

The matrix describing the patterns.

PPpatternWith0

The Posterior Probability of following each pattern (columns) for each transcript (rows). Transcripts with expression 0 for all samples are shown in this matrix with PP(any_pattern) = NA.

ConditionOrder

The condition assignment for C1Mean, C2Mean, etc.
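
A typical use of a posterior probability matrix such as PPpattern is to pick the most likely pattern for each transcript. The sketch below does this in base R on a hypothetical matrix with made-up numbers; the cutoff is an illustrative threshold, not an EBSeq default.

```r
# Hypothetical posterior probability matrix (made-up numbers):
# rows are transcripts, columns are patterns, each row sums to 1.
PP <- matrix(c(0.90, 0.05, 0.05,
               0.10, 0.70, 0.20,
               0.30, 0.30, 0.40), nrow = 3, byrow = TRUE,
             dimnames = list(paste0("t", 1:3), paste0("Pattern", 1:3)))

# Most likely pattern and its posterior probability for each transcript.
best <- colnames(PP)[max.col(PP, ties.method = "first")]
names(best) <- rownames(PP)
best_pp <- apply(PP, 1, max)

# Keep calls whose top posterior probability reaches an illustrative cutoff.
cutoff <- 0.5
confident <- best[best_pp >= cutoff]
```

Transcript t3 has no pattern above the cutoff, so it is left unassigned rather than forced into its weakly preferred pattern.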

Author(s)

Ning Leng

References

Ning Leng, John A. Dawson, James A. Thomson, Victor Ruotti, Anna I. Rissman, Bart M.G. Smits, Jill D. Haag, Michael N. Gould, Ron M. Stewart, and Christina Kendziorski. EBSeq: An empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics (2013)

See Also

EBTest, GetMultiPP, GetMultiFC

Examples

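library(EBSeq)
# Load the example data and keep a small subset of rows.
data(MultiGeneMat)
MultiGeneMat.small = MultiGeneMat[201:210,]
# Two samples in each of three conditions.
Conditions = c("C1","C1","C2","C2","C3","C3")
# Enumerate all possible patterns, then drop the third one.
PosParti = GetPatterns(Conditions)
Parti = PosParti[-3,]
# Compute size factors, then run the EM fit for 5 iterations.
MultiSize = MedianNorm(MultiGeneMat.small)
MultiOut = EBMultiTest(MultiGeneMat.small, NgVector = NULL,
	Conditions = Conditions, AllParti = Parti,
	sizeFactors = MultiSize, maxround = 5)
# Extract the posterior probabilities from the fit.
MultiPP = GetMultiPP(MultiOut)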

Example output

Loading required package: blockmodeling
Loading required package: gplots

Attaching package: 'gplots'

The following object is masked from 'package:stats':

    lowess

Loading required package: testthat
iteration 1 done 

time 0.87 

iteration 2 done 

time 0.85 

iteration 3 done 

time 0.48 

iteration 4 done 

time 0.88 

iteration 5 done 

time 0.41 

EBSeq documentation built on Nov. 8, 2020, 6:52 p.m.