iDEG: Identification of individualized Differentially...
In QikeLi/iDEG: Identify Differentially Expressed Genes without requiring replicates


Identify differentially expressed genes between two conditions when only one transcriptome is collected for each condition.

iDEG(baseline, case, normalization = F, dataDistribution = c("NB",
  "Poisson"), numBin = 100, rankBaseline = T, estBaseline = F,
  estSize = F, spar = NULL, plot = 0, constDisp = T, df = 7,
  nulltype = 1, pct = 1e-04)

`baseline`	a vector of gene expression levels of the baseline transcriptome (e.g., healthy tissue)
`case`	a vector of gene expression levels of the case transcriptome (e.g., tumor tissue)
`normalization`	a logical variable indicating if normalization has been done
`dataDistribution`	the distributional assumption for the RNA-Seq data under analysis. Possible values are 'Poisson' and 'NB'. Default is 'NB' (negative binomial).
`numBin`	number of bins into which all genes are grouped. Default is 100.
`rankBaseline`	if True, iDEG groups all genes based on the gene expression levels of the baseline transcriptome. If False, iDEG groups all genes based on the average gene expression levels of the baseline and case transcriptomes.
`estBaseline`	compute the dispersion parameter only using the baseline transcriptome
`estSize`	if True, the size parameter is estimated from each bin; if False, the dispersion parameter is estimated from each bin.
`spar`	smoothing parameter used to fit a smoothing spline, typically (but not necessarily) in (0,1]. The coefficient lambda of the integral of the squared second derivative in the fit (penalized log likelihood) criterion is a monotone function of ‘spar’, see the details from `help(smooth.spline)`
`plot`	plots desired. 0 gives no plots. 1 gives single plot showing the histogram of zz and fitted densities f and p0*f0.
`constDisp`	if True, iDEG assumes the dispersion is constant across all genes. If False, iDEG assumes the dispersion is a smooth function of the expression mean.
`df`	the degrees of freedom used for estimating the marginal distribution.
`nulltype`	type of null distribution assumed in computing the probability of gene differential expression. 0 is the theoretical null N(0,1), 1 is maximum likelihood estimation.
`pct`	the percentage of genes excluded from fitting the two-group mixture model.
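
To make the `numBin` and `rankBaseline` arguments more concrete, the following base R sketch groups genes into equal-sized bins by rank. It is only an illustration of the idea described above, not iDEG's internal code: the helper `assign_bins` and the variables `key_baseline` and `key_average` are hypothetical names, and the simulated counts are placeholders.

```r
# Illustrative sketch only; assign_bins() is a hypothetical helper, not part of iDEG.
set.seed(1)
n_genes  <- 1000
baseline <- rnbinom(n = n_genes, size = 60, mu = rexp(n_genes, 1/500) + 1)
case     <- rnbinom(n = n_genes, size = 60, mu = rexp(n_genes, 1/500) + 1)
numBin   <- 100

# Analogue of rankBaseline = TRUE: rank genes by baseline expression
key_baseline <- baseline
# Analogue of rankBaseline = FALSE: rank genes by the average of both transcriptomes
key_average <- (baseline + case) / 2

assign_bins <- function(key, numBin) {
  # ties.method = "first" gives every gene a unique rank, so bins are equal-sized
  r <- rank(key, ties.method = "first")
  ceiling(r * numBin / length(key))
}

bin_baseline <- assign_bins(key_baseline, numBin)
bin_average  <- assign_bins(key_average, numBin)

# Number of genes in the first few bins
table(bin_baseline)[1:5]
```

Genes that share a bin have similar expression levels, which is what allows a parameter such as the dispersion to be estimated per bin even though no replicates are available.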

'iDEG' produces a list containing the following elements:

results: a table of iDEG results for each gene. The first two columns are the gene expression values of the two transcriptomes provided by the user. The third column is the local false discovery rate, which provides the probability of a gene being differentially expressed. The fourth column is the statistic used to compute the local false discovery rate, and can be used as an effect size.
sizeHat: When the assumption of constant dispersion across genes is made, this is a single estimate of the common dispersion. When the assumption of non-constant dispersion is made, this is a vector of estimates for the dispersion parameter of each gene.
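
As a sketch of how the `results` table might be used, the example below filters a mock table by position, following the column layout described above. The data frame `mock_results`, its column names, and the cutoff `fdr_cutoff` are all hypothetical stand-ins. It also assumes the usual convention in which a small local false discovery rate marks a gene as likely differentially expressed, so check the actual output before relying on this.

```r
# Mock stand-in for the `results` element; names and values are made up for illustration.
mock_results <- data.frame(
  baseline  = c(120, 45, 800, 10),
  case      = c(118, 460, 790, 95),
  local_fdr = c(0.95, 0.01, 0.88, 0.04),
  statistic = c(-0.1, 6.2, -0.2, 4.8)
)

# Arbitrary illustrative threshold on the local false discovery rate (third column)
fdr_cutoff <- 0.2
called <- mock_results[mock_results[[3]] <= fdr_cutoff, ]

# Order the selected genes by the magnitude of the statistic (fourth column)
called <- called[order(abs(called[[4]]), decreasing = TRUE), ]
called
```

Indexing by position rather than by name keeps the sketch independent of whatever column names the real table uses.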

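library(iDEG)

set.seed(1)
# True expression means for 20,000 genes; the first 100 are 10-fold higher in the case
exp_mean1 <- rexp(20000, 1/500) + 1
exp_mean2 <- exp_mean1
exp_mean2[1:100] <- exp_mean2[1:100] * 10
# Draw one negative binomial transcriptome per condition (no replicates)
transcriptome1 <- rnbinom(n = length(exp_mean1), size = 60, mu = exp_mean1)
transcriptome2 <- rnbinom(n = length(exp_mean2), size = 60, mu = exp_mean2)
res <- iDEG(transcriptome1, transcriptome2)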
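
The simulation above draws counts with `size = 60`. Under R's `rnbinom` parameterization the variance is `mu + mu^2 / size`, so the sketch below checks that relationship and recovers the size with a simple method-of-moments estimate. It assumes that "dispersion" means `1 / size`, which is a common convention but is not confirmed here for iDEG, and it is not iDEG's estimator. The names `disp_mom` and `size_mom` are hypothetical.

```r
# Illustrative check of the negative binomial mean-variance relationship in base R.
set.seed(1)
mu   <- 500
size <- 60
x <- rnbinom(n = 100000, size = size, mu = mu)

# Variance implied by the rnbinom(size, mu) parameterization
theoretical_var <- mu + mu^2 / size

# Method-of-moments estimates (assumes dispersion = 1 / size)
disp_mom <- (var(x) - mean(x)) / mean(x)^2
size_mom <- 1 / disp_mom

c(sample_var = var(x), theoretical_var = theoretical_var, size_mom = size_mom)
```

With many draws from a single mean the estimate sits close to the true size. With one transcriptome per condition, each gene contributes only a single count, which is why pooling genes with similar expression becomes necessary.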