estimate_differential_expression: Estimating differential expression statistics for Genes...

View source: R/wrapper2.R

Description

Using the mapped reads together with the exon and gene annotations, this function calculates statistics for assessing differential expression of genes under both the Poisson model and the generalized Poisson model. In both models, the statistic returned by the function is a chi-square distributed random variable with 1 degree of freedom. Because estimating the parameters of the generalized Poisson model relies on convergence of the Newton-Raphson algorithm, the statistics are valid only if the algorithm has converged for both of the conditions being compared.
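
Since the statistic is referred to a chi-square distribution with 1 degree of freedom, a p-value can be read off with base R's pchisq. The following is a minimal sketch; the value of stat is made up for illustration and is not output from this function.

```r
# Hypothetical test statistic (illustrative value, not produced by the package)
stat <- 3.84

# Upper-tail probability under a chi-square distribution with 1 degree of freedom
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)
print(p_value)
```

A small p_value suggests differential expression, provided the convergence condition described above holds.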

Usage

estimate_differential_expression(reads, exons, genes, norm_gp, norm_p, do_permute)

Arguments

reads

The mapped reads output. See Details.

exons

The exon annotations and the reads mapping to a particular exon

genes

The gene annotations and the exons mapping to the particular gene

norm_gp

The normalizing values for the generalized Poisson model, one for each condition. For example, the normalizing factor for condition 3 vs condition 2 is norm_gp[2]/norm_gp[3].

norm_p

The normalizing values for the Poisson Model.

do_permute

If do_permute = 1, a permutation test is performed and a Gamma distribution is fitted to the null distribution of the test statistic.
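
The ratio given in the norm_gp description can be illustrated with a few lines of R. This is only a sketch: the three normalizing values below are hypothetical and chosen so the arithmetic is easy to follow.

```r
# Hypothetical normalizing values for three conditions (illustrative only)
norm_gp <- c(1.0, 0.8, 1.6)

# Normalizing factor for condition 3 vs condition 2, following the
# norm_gp[2]/norm_gp[3] convention stated in the argument description
factor_3_vs_2 <- norm_gp[2] / norm_gp[3]
print(factor_3_vs_2)
```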

Details

reads The reads data is a (num_reads x (3+num_conditions)) matrix, where num_reads is the number of reads mapped in the experiment and num_conditions is the number of conditions under which the experiment was carried out. Column 1 is the gene ID of the gene the read maps to, column 2 is the exon ID of the exon the read maps to, column 3 is the position of the read, and columns 4 to (3+num_conditions) are the coverage of that position in each of the conditions.
exons The exons data is a (num_exons x 4) matrix, where num_exons is the total number of uniquely annotated exons. Column 1 is the exon ID. Columns 2 and 3 are the start and end positions, within the reads dataset, of the reads mapping to the exon. For example, if column 2 = 4 and column 3 = 15, reads 4 to 15 map to this exon. Column 4 is the length of the exon.
genes The genes data is a (num_genes x 4) matrix, where num_genes is the total number of uniquely annotated genes. Column 1 is the gene ID, column 2 is the starting exon and column 3 is the ending exon of the gene. For example, if column 2 = 1 and column 3 = 4, then exons 1, 2, 3 and 4 are inside the gene. Column 4 is the gene length.
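
To make the three layouts concrete, here is a small sketch that builds toy matrices with one gene, two exons and two conditions, then follows the index columns from gene to exons to reads. All values, including the lengths, are made up, and the lookups only illustrate the indexing described above; they are not taken from the package's internals.

```r
# Toy data following the layouts described above (all values are made up)
num_conditions <- 2

# Columns: gene ID, exon ID, position, coverage in condition 1, coverage in condition 2
reads <- rbind(
  c(1, 1, 101, 4, 7),
  c(1, 1, 102, 5, 6),
  c(1, 1, 103, 3, 9),
  c(1, 2, 201, 2, 1),
  c(1, 2, 202, 0, 2)
)

# Columns: exon ID, first read index, last read index, exon length
exons <- rbind(
  c(1, 1, 3, 3),
  c(2, 4, 5, 2)
)

# Columns: gene ID, starting exon, ending exon, gene length
genes <- rbind(c(1, 1, 2, 5))

# The reads matrix must have 3 + num_conditions columns
stopifnot(ncol(reads) == 3 + num_conditions)

# Rows of reads that map to exon 2
exon2_rows <- exons[2, 2]:exons[2, 3]
print(reads[exon2_rows, ])

# Rows of reads that map to any exon of gene 1, and their total coverage per condition
gene_exons <- genes[1, 2]:genes[1, 3]
gene_rows <- unlist(lapply(gene_exons, function(e) exons[e, 2]:exons[e, 3]))
gene_totals <- colSums(reads[gene_rows, 4:(3 + num_conditions), drop = FALSE])
print(gene_totals)
```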

Value

gp_comparison

gp_comparison[[i,j,k]]$Gptest is the test statistic for differential expression of gene i between conditions j and k (j < k) when the gene counts are modeled as a generalized Poisson random variable. This test statistic is valid only if gp_comparison[[i,j,k]]$mark = 1. If gp_comparison[[i,j,k]] is NULL, the test could not be carried out because the MLE of one of the parameters of the generalized Poisson model could not be calculated. If do_permute = 1 and the algorithm has converged, gp_comparison[[i,j,k]]$shape is the shape parameter and gp_comparison[[i,j,k]]$scale is the scale parameter of the gamma distribution used to model the null distribution of the test statistic.
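
The validity rules above (NULL entries, the mark flag, and the optional shape and scale fields) can be wrapped in a small helper. The sketch below is hypothetical: gp_p_value is not part of the package, the gp_comparison object is a hand-built mock of the documented structure, and taking the upper tail of the fitted gamma as a p-value is an assumption about how the shape and scale fields would be used.

```r
# Mock of the documented structure: a list array indexed as [[gene, j, k]]
# (hand-built for illustration; real values come from estimate_differential_expression)
gp_comparison <- array(vector("list", 1 * 2 * 2), dim = c(1, 2, 2))
gp_comparison[[1, 1, 2]] <- list(Gptest = 5.2, mark = 1, shape = 0.5, scale = 2)

# Hypothetical helper: returns NA when the statistic is missing or not valid
gp_p_value <- function(entry) {
  # NULL means the test could not be carried out
  if (is.null(entry)) return(NA_real_)
  # The statistic is only valid when mark equals 1
  if (!isTRUE(entry$mark == 1)) return(NA_real_)
  # Assumption: when a gamma null was fitted, use its upper tail
  if (!is.null(entry$shape) && !is.null(entry$scale)) {
    return(pgamma(entry$Gptest, shape = entry$shape, scale = entry$scale,
                  lower.tail = FALSE))
  }
  # Otherwise fall back to the chi-square reference with 1 degree of freedom
  pchisq(entry$Gptest, df = 1, lower.tail = FALSE)
}

p_valid <- gp_p_value(gp_comparison[[1, 1, 2]])
p_missing <- gp_p_value(gp_comparison[[1, 1, 1]])
print(c(p_valid, p_missing))
```

The mock uses shape = 0.5 and scale = 2 because a gamma distribution with those parameters coincides with a chi-square distribution with 1 degree of freedom, so both branches of the helper agree on this entry.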

p_comparison

p_comparison[[i,j,k]]$Ptest is the test statistic for differential expression of gene i between conditions j and k (j < k) when the gene counts are modeled as a Poisson random variable. If p_comparison[[i,j,k]] is NULL, the test statistic for the Poisson model was not calculated because gp_comparison[[i,j,k]] is NULL.

Author(s)

Sudeep Srivastava, Liang Chen

References

Consul, P. C. (1989) Generalized Poisson Distributions: Properties and Applications. New York: Marcel Dekker.
Srivastava, S. and Chen, L. (2010) A two-parameter generalized Poisson model to improve the analysis of RNA-Seq data. Nucleic Acids Research, Advance Access published July 29, 2010. doi:10.1093/nar/gkq670

See Also

likelihood_ratio_tissue_generalized_poisson, likelihood_ratio_tissue_poisson, reads, exons, genes

Examples

library(GPseq)

# Load the example datasets shipped with the package
data(reads)
data(exons)
data(genes)

# Random normalizing values, for illustration only
set.seed(666)
norm_gp <- runif(6, 0, 1)
norm_p <- runif(6, 0, 1)

# Run without the permutation test (do_permute = 0)
output <- estimate_differential_expression(reads, exons, genes, norm_gp, norm_p, 0)

## Comparing Gene 1 between conditions 1 and 2
cat("Mark = ", output$gp_comparison[[1,1,2]]$mark, " Test Statistic with Generalized Poisson Model = ", output$gp_comparison[[1,1,2]]$Gptest, "\n")

cat("Test Statistic with Poisson Model = ", output$p_comparison[[1,1,2]]$Ptest, "\n")

GPseq documentation built on May 30, 2017, 3:11 a.m.