Introduction

Riborex is a R package for identifying differentially translated genes from Ribo-seq data. Riborex integrates both RNA- and Ribo-seq read count data into a single generalized linear model (GLM) and generates a modified design matrix reflecting the integration. At its core, Riborex applies existing RNA-seq analysis tools such as edgeR, DESeq2 and Voom to this modified design matrix and identifies differential translation across conditions.

Detailed example

First, we need to load Riborex library.

library(riborex)

The input for Riborex are two read count tables summarized from RNA-seq and Ribo-seq data respectively. The read count table should be organized as a data frame with rows correspond to genes and columns correspond to samples as shown below.

data(riborexdata)
RNACntTable <- rna
RiboCntTable <- ribo

We can check the first five lines of the table:

head(RNACntTable,5)
head(RiboCntTable,5)

Then we need to prepare two vectors to indicate the treatments of samples in RNA- and Ribo-seq data. Both RNA-seq and Ribo-seq can have different number of samples in control and treated condtions, and RNA-seq and Ribo-seq data can have different number of samples.

rnaCond <- c("control", "control", "treated", "treated")
riboCond <- c("control", "control", "treated", "treated")

After the two read count table and two condition vectors are ready, we can use riborex (), and we can choose which engine to use. By default, DESeq2 is used as the engine if you don't specify the engine option. Use help(riborex) in R to see more details about this function.

res.deseq2 <- riborex(RNACntTable, RiboCntTable, rnaCond, riboCond)

The format of the result is the same when DESeq2 is used in RNA-seq analysis.

res.deseq2

You can check the distribution p-values.

hist(res.deseq2$pvalue, main = 'DESeq2 unadjusted p-values', xlab='Unadjusted p-values')

We can see for this dataset, the p-value distribution is as expected based on DESeq2 manual which is uniformly distribution with differrentially expressed genes enriched with small p-values. We will show another dataset later for which the p-value distribution is skew to the right and how it can be fixed with fdrtool.

Also, you can use summary () for your results.

summary(res.deseq2)

And results can be saved by:

write.table(res.deseq2, "riborex_res_deseq2.txt", quote=FALSE)

If you want to use edgeR as your engine, you can use riborex () as:

res.edgeR <- riborex(RNACntTable, RiboCntTable, rnaCond, riboCond, "edgeR")

The format of the result is the same when edgeR is used in RNA-seq analysis.

head(res.edgeR$table)

For edgeR engine, you can also choose to estimate dispersion of RNA-seq and Ribo-seq data separately by specifying engine as "edgeRD".

res.edgeRD <- riborex(RNACntTable, RiboCntTable, rnaCond, riboCond, "edgeRD")

If you want to use Voom as the engine, you can run riborex () as:

res.voom <- riborex(RNACntTable, RiboCntTable, rnaCond, riboCond, "Voom")

The format of the result is the same when Voom is used in RNA-seq analysis.

head(res.voom)

Case-study with "incorrect" p-value distribution

RNACntTable.corrected <- rna.null
RiboCntTable.corrected <- ribo.null

We can check the first five lines of the table:

head(RNACntTable.corrected)
head(RiboCntTable.corrected)

The condition vectors can be created as:

rnaCond.corrected <- c(rep('T0', 3), rep('T24',3))
riboCond.corrected <- rnaCond.corrected
rnaCond.corrected
riboCond.corrected

The results from DESeq2 can be obtained as:

res.deseq2.corrected <- riborex(RNACntTable.corrected, RiboCntTable.corrected, rnaCond.corrected, riboCond.corrected)

We can check the p-value distribution as:

hist(res.deseq2.corrected$pvalue, main = 'DESeq2 unadjusted p-values', xlab='Unadjusted p-values')

We can see from the histogram that the distribution of p-values is skew to the right, that means the null distribution is not "correct", we can fix it by reestimating the p-values using fdrtool:

results.corrected <- correctNullDistribution(res.deseq2.corrected)

We can see the p-value distribution after correction:

hist(results.corrected$pvalue, main = 'DESeq2 unadjusted p-values after correction', 
     xlab='Corrected unadjusted p-values')

We can see after the correction, the distribution of p-values is as expected. And the adjusted pvalues are corrected also.

Multi-factor experiment

Since we don't find any available ribosome profiling data generated in a multi-factor experiement, here we generate a pseudo dataset to demonstrate the usage of riborex in a multi-factor experiment. The pseudo dataset have 8 samples in RNA-seq and Ribo-seq, and two factors are included.

rna <- RNACntTable[,c(1,2,3,4,1,2,3,4)]
ribo <- RiboCntTable[,c(1,2,3,4,1,2,3,4)]

For multi-factor experiment, we prepare two data frames to indicate the treatment under each factor. Here for the 8 samples in both RNA- and Ribo-seq experiement, the 3rd and 4th samples are treated with drug1 and the 7th and 8th samples are treated with drug2.

rnaCond <- data.frame(factor1=(c("control1", "control1", "treated1", "treated1", 
                                 "control1", "control1", "control1", "control1")),
                      factor2=(c("control2", "control2", "control2", "control2", 
                                 "control2", "control2", "treated2", "treated2")))

riboCond <- data.frame(factor1=(c("control1", "control1", "treated1", "treated1",
                                  "control1", "control1", "control1", "control1")),
                       factor2=(c("control2", "control2", "control2", "control2", 
                                  "control2", "control2", "treated2", "treated2")))

Also we need to prepare a contrast to specify the comparison we want to perform, for example, if we want to compare the influence of the usage of drug2. The contrast can be constructed as:

contrast = c("factor2", "control2", "treated2")

Then riborex () is used with contrast specified.

res.deseq2 <- riborex(rna, ribo, rnaCond, riboCond, "DESeq2", contrast = contrast)

We can see the summary of the result:

summary(res.deseq2)

edgeR and edgeRD can be used in a similar way.

res.edgeR <- riborex(rna, ribo, rnaCond, riboCond, "edgeR", contrast = contrast)
res.edgeRD <- riborex(rna, ribo, rnaCond, riboCond, "edgeRD", contrast = contrast)

Currently, you can't choose Voom as the engine in a multi-factor experiment yet.

Setup

This analysis was conducted on

sessionInfo()


smithlabcode/riborex documentation built on April 12, 2021, 11:46 a.m.