SIBER: Fit Mixture Model on The RNAseq Data and Calculates...


View source: R/siberRaw2.R

Description

SIBER proceeds in two steps. The first step fits a two-component mixture model. The second step calculates the Bimodality Index corresponding to the assumed mixture distribution. Five types of mixture models are implemented: log normal (LN), negative binomial (NB), generalized Poisson (GP), beta (Beta) and normal (NL). The normal mixture model was developed by Wang et al. to identify bimodal genes from microarray data. It is included here in case the user has already transformed the RNAseq data. The Beta mixture model can be applied to methylation data, where the observed values lie between 0 and 1 and represent methylation rates. Behind the scenes, SIBER calls the fitNB, fitGP, fitLN or fitNL function with model=E, depending on which distribution is specified. When the observed percentage of zero counts exceeds the user-specified threshold zeroPercentThr, the 0-inflated model overrides the E model and is fitted instead. Type vignette('SIBER') in the R console to open the user manual in PDF format.
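To make the second step concrete, the sketch below computes the Bimodality Index in its original equal-variance form from Wang et al., where delta = |mu1 - mu2| / sigma and BI = sqrt(p1 * (1 - p1)) * delta. The helper bimodality_index is hypothetical and not part of the SIBER package. This page does not state how SIBER combines sigma1 and sigma2 when they differ, so the sketch assumes a single common sigma.

```r
# Hypothetical helper, NOT part of the SIBER package.
# Equal-variance Bimodality Index (Wang et al.):
#   delta = |mu1 - mu2| / sigma
#   BI    = sqrt(p1 * (1 - p1)) * delta
# Assumption: both components share one standard deviation `sigma`.
bimodality_index <- function(mu1, mu2, sigma, p1) {
  stopifnot(sigma > 0, p1 >= 0, p1 <= 1)
  delta <- abs(mu1 - mu2) / sigma
  c(delta = delta, BI = sqrt(p1 * (1 - p1)) * delta)
}

# Two well-separated, equally weighted components
bimodality_index(mu1 = 0, mu2 = 4, sigma = 1, p1 = 0.5)
# delta = 4, BI = 2
```

BI is largest for a given delta when the two components are balanced (p1 = 0.5) and shrinks toward 0 as one component dominates.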

Usage

SIBER(y, d = NULL, model = c("LN", "NB", "GP", "Beta", "NL", "BetaReg"),
  zeroPercentThr = 0.2, base = exp(1), eps = 10)

Arguments

y

A vector of raw RNAseq counts, or of transformed values if model='NL'.

d

A vector of the same length as y representing the normalization constant to be applied to the data.

model

Character string specifying the mixture model type. It can be one of LN, NB, GP, Beta or NL.

zeroPercentThr

A scalar specifying the threshold on the percentage of zeros in y, used to handle zero-inflation in RNAseq count data. When the percentage of zeros exceeds this threshold, a one-component LN model is used to estimate mu and sigma from the nonzero counts. This parameter is relevant only if model='LN'.
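As an illustration of the check described above, the following sketch compares the fraction of zeros in a count vector against the threshold. The helper exceeds_zero_threshold is hypothetical, and both the strict inequality and the treatment of zeroPercentThr as a proportion (matching the default of 0.2) are assumptions rather than details confirmed by this page.

```r
# Hypothetical helper, NOT part of the SIBER package.
# Assumption: zeroPercentThr is a proportion in [0, 1] and the
# comparison is strict ("exceeds").
exceeds_zero_threshold <- function(y, zeroPercentThr = 0.2) {
  mean(y == 0) > zeroPercentThr
}

y <- c(0, 0, 0, 5, 12, 7, 0, 30, 2, 9)   # 4 of 10 counts are zero
exceeds_zero_threshold(y)                 # TRUE:  0.4 > 0.2
exceeds_zero_threshold(y, 0.5)            # FALSE: 0.4 <= 0.5
```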

base

The logarithm base used to express the parameter estimates from the LN model on the logarithm scale. It is relevant only if model='LN'.

eps

A scalar to be added to the count data. This parameter is relevant only when model='LN'.
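The interaction of base and eps can be pictured with a small sketch: the counts are shifted by eps so that zeros have a finite logarithm, then logged in the chosen base. The helper log_shift is hypothetical, and the exact transformation SIBER applies internally (including how the normalization constant d enters) is not stated on this page, so this is only an assumed reading of the two arguments.

```r
# Hypothetical helper, NOT part of the SIBER package.
# Assumption: the LN model works on log(y + eps) in the given base.
log_shift <- function(y, base = exp(1), eps = 10) {
  log(y + eps, base = base)
}

counts <- c(0, 6, 90, 990)
log_shift(counts, base = 10)   # log10 of 10, 16, 100, 1000
log_shift(counts)              # natural log of the same shifted counts
```

With eps = 10, a zero count maps to log(10) rather than -Inf, which keeps every observation usable in the fit.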

Value

A vector consisting of the estimates of mu1, mu2, sigma1, sigma2, p1, delta and BI.

References

Tong, P., Chen, Y., Su, X. and Coombes, K. R. (2012). Systematic Identification of Bimodally Expressed Genes Using RNAseq Data. Bioinformatics, submitted.

Examples

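library(SIBER)
set.seed(100)
# Simulate bimodal data on (0, 1): 100 draws from Beta(1, 4) and 200 from Beta(4, 1)
y <- c(rbeta(100, 1, 4), rbeta(200, 4, 1))
# Fit the two-component Beta mixture directly
fitBeta(y = y)
# Fit the same model through SIBER and compute the Bimodality Index
SIBER(y, model = 'Beta')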

nickytong/SIBER documentation built on May 23, 2019, 5:08 p.m.