knitr::opts_chunk$set(
  echo = TRUE
)
library("DeeDee")
data(DE_results_IFNg_naive, package = "DeeDee")
IFNg_naive <- deedee_prepare(IFNg_naive, "DESeq2")
data(DE_results_IFNg_both, package = "DeeDee")
IFNg_both <- deedee_prepare(IFNg_both, "DESeq2")
data(DE_results_Salm_naive, package = "DeeDee")
Salm_naive <- deedee_prepare(Salm_naive, "DESeq2")
data(DE_results_Salm_both, package = "DeeDee")
Salm_both <- deedee_prepare(Salm_both, "DESeq2")

DeeDee_obj <- list(IFNg_naive = IFNg_naive, 
                   IFNg_both = IFNg_both, 
                   Salm_naive = Salm_naive, 
                   Salm_both = Salm_both)

What DeeDee is for

When you want to compare results from multiple Differential Expression Analyses (DEAs), the DeeDee package is your friend. It contains various functions, and a Shiny App combining them all, that help shed light on the similarities and differences between the experiments in question. DeeDee is designed to be used after a DE analysis program (like DESeq2, edgeR or limma) has been applied to each individual DEA.

How to Install

You can install the current development version from GitHub:

library("remotes")
remotes::install_github("lea-rothoerl/DeeDee", 
                        dependencies = TRUE, build_vignettes = TRUE)

Example Dataset Used in this Vignette

The examples in the following chapters use data from the Bioconductor package macrophage (Human macrophage immune response). The macrophage dataset includes data from 24 RNA-seq samples of human macrophages exposed to one of four conditions: naive, IFNg (an interferon), SL1344 (a strain of Salmonella), or both. The data were processed with DESeq2, yielding the following four DEAs: naive vs. IFNg, IFNg vs. both, naive vs. Salmonella, and Salmonella vs. both.

What to Provide

The main DeeDee functions work on tables with a column of log fold changes ("logFC") and a column of p-values ("pval") from a DEA, with the gene identifiers as row names. Such a table can be built manually or with deedee_prepare. The deedee_prepare function accepts results from three of the most widely used R packages for DE analysis (DESeq2, edgeR, limma) and converts them to tables in this format. The package that was used to analyze the raw data must be specified with the parameter input_type. If, for example, you have a DESeq2 result called DESeq2_res and want to convert it to a DeeDee table, you can use the following code chunk.

inp <- deedee_prepare(data = DESeq2_res, input_type = "DESeq2")
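For the manual route, a minimal sketch in base R may help. It assumes only the layout described above (numeric columns named "logFC" and "pval", gene identifiers as row names); the gene names and values are made up for illustration, and manual_tab is a hypothetical object name.

```r
# Hypothetical values for three genes; a real table would come from a DEA.
manual_tab <- data.frame(
  logFC = c(1.8, -0.4, 2.3),
  pval  = c(0.001, 0.60, 0.03),
  row.names = c("geneA", "geneB", "geneC")
)

# Check the layout described above: two numeric columns named logFC and pval,
# and unique gene identifiers as row names.
stopifnot(
  identical(colnames(manual_tab), c("logFC", "pval")),
  is.numeric(manual_tab$logFC),
  is.numeric(manual_tab$pval),
  !anyDuplicated(rownames(manual_tab))
)
```

Whether DeeDee performs additional checks on manually built tables is not covered here, so treat the assertions as a sanity check rather than a guarantee of compatibility.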

Every DeeDee main function needs to be fed with a (named) list of at least two of these tables as input (data). A threshold for p-values can be specified with the parameter pthresh (default = 0.05). Besides these, most functions have additional parameters that will be explained in the respective sections below. All functions produce colorblind-friendly output.

For our example, the deedee_prepare results are inp1 to inp4 (naive vs. IFNg, IFNg vs. both, naive vs. Salmonella, and Salmonella vs. both from macrophage). The named input list is created as follows.

DeeDee_obj <- list(naive_IFNg = inp1, IFNg_both = inp2, naive_Salm = inp3, Salm_both = inp4)

DESeq2

When DESeq2 is applied to analyze the raw data, deedee_prepare accepts the output from the results() function.

edgeR

When edgeR is applied to analyze the raw data, deedee_prepare accepts the output from the exactTest() function.

limma

When limma is applied to analyze the raw data, deedee_prepare accepts the output from the topTable() function.

\pagebreak

Exported Functions

DeeDee Scatter Plot

The function deedee_scatter() creates a scatterplot of logFC values of the genes in two input datasets. If the input DeeDee list contains more than two datasets (like our example DeeDee_obj does), the select1 (default = 1) and select2 (default = 2) parameters specify which ones will be plotted (selected by list index). deedee_scatter() includes a color_by parameter that can be set to "pval1" (default) or "pval2". The points will be colored according to the color scheme given on the right side of the output.

deedee_scatter(data = DeeDee_obj, select1 = 2, select2 = 3, color_by = "pval1", pthresh = 0.05)

\pagebreak

DeeDee Heatmap

The function deedee_heatmap() creates a heatmap of the logFC values for all common genes in every input dataset. The color key is given on the right side of the plot. In addition to the standard parameters explained in What to Provide, the function takes a numeric show_first value (default = 25) specifying the number of genes depicted (if the total number of genes is smaller than show_first, all genes are shown). It also accepts a logical show_gene_names value (default = FALSE) that determines whether the gene identifiers (row names in deedee_prepare results) are displayed in the heatmap, and a logical show_na that defines whether genes with NAs (in less than half of the contrasts) are included. The distance measure (dist, values: euclidean (default), manhattan, pearson, spearman) and clustering method (clust, values: single, complete, average (default), centroid) can be chosen as well.

deedee_heatmap(data = DeeDee_obj, show_first = 25, show_gene_names = FALSE, dist = "manhattan", clust = "centroid", show_na = FALSE, pthresh = 0.05)

Because the heatmap is produced with the InteractiveComplexHeatmap package, a Shiny window with the heatmap and its interactive features can be opened by running InteractiveComplexHeatmap::ht_shiny(res) on a result res of deedee_heatmap(). For more information, please refer to the InteractiveComplexHeatmap vignette.

\pagebreak

DeeDee Venn Diagram

The function deedee_venn() creates a Venn diagram depicting the overlaps of differentially expressed genes in the input datasets. To keep the Venn diagram easy on the eye, the data list may contain no more than four datasets. To compare more DEAs in a similar, set-based manner, please use deedee_upset(). The parameter mode can be set to up, down, or both (default), specifying whether only up-regulated, only down-regulated, or all DE genes are counted.

# deedee_venn(data = DeeDee_obj, mode = "both", pthresh = 0.05)
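The sets behind such a diagram can be sketched in base R. This is an illustration of the idea, not DeeDee's implementation: de_genes is a hypothetical helper, the toy tables contain made-up values, and the strict comparison pval < pthresh is an assumption.

```r
# Hypothetical helper, not part of DeeDee: return the IDs of DE genes in one
# table, given a p-value threshold and a direction ("up", "down" or "both").
de_genes <- function(tab, pthresh = 0.05, mode = "both") {
  keep <- !is.na(tab$pval) & tab$pval < pthresh
  if (mode == "up")   keep <- keep & tab$logFC > 0
  if (mode == "down") keep <- keep & tab$logFC < 0
  rownames(tab)[which(keep)]
}

# Toy tables with made-up values.
tab1 <- data.frame(logFC = c(2, -1, 0.5), pval = c(0.01, 0.02, 0.50),
                   row.names = c("g1", "g2", "g3"))
tab2 <- data.frame(logFC = c(1.5, 1, -2), pval = c(0.03, 0.20, 0.01),
                   row.names = c("g1", "g2", "g3"))

# One gene set per contrast, then the genes that are DE in every contrast.
sets   <- lapply(list(A = tab1, B = tab2), de_genes, mode = "both")
shared <- Reduce(intersect, sets)
```

Each region of a Venn diagram corresponds to one combination of such set memberships; shared is the central region where all contrasts overlap.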

\pagebreak

DeeDee UpSet Plot

The function deedee_upset() creates an UpSet plot depicting the overlaps of differentially expressed genes in the input datasets. In contrast to deedee_venn(), the UpSet plot can compare many datasets (including more than four) in a visually pleasing way. The parameter mode can be set to up, down, both, or both_colored (default), specifying whether only up-regulated, only down-regulated, or all DE genes are depicted. If both_colored is chosen, the result is the same UpSet plot as for both, but the shares of the intersections where all samples have positive/negative logFC values are colored according to the color key. The minimum size for an intersection to be included in the plot can be defined via the parameter min_setsize (default = 10).

deedee_upset(data = DeeDee_obj, mode = "both_colored", min_setsize = 15, pthresh = 0.05)

\pagebreak

DeeDee Q-Q Plot

The function deedee_qq() compares the statistical distributions of two input datasets. If the input data list contains more than two datasets (like our example DeeDee_obj does), the select1 (default = 1) and select2 (default = 2) parameters specify which ones are to be used. If the resulting curve resembles a straight line with a slope of 1, the distributions are similar (perfect straight line = identical distributions). deedee_qq() includes a color_by parameter that can be set to pval1 (default) or pval2. The points generating the curve will be colored according to the color key displayed on the right of the output.

deedee_qq(data = DeeDee_obj, select1 = 1, select2 = 3, color_by = "pval2", pthresh = 0.05)
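To make the reading of such a plot concrete, here is a minimal base R sketch of the quantile comparison. It assumes that logFC values are the quantity being compared and uses simulated numbers as stand-ins; it relies on stats::qqplot() and is not DeeDee's own code.

```r
set.seed(1)
x <- rnorm(200)          # stand-in for the logFC values of the first contrast
y <- rnorm(150, sd = 2)  # stand-in for the second contrast, with a wider spread

# qqplot() pairs up the quantiles of both samples; with unequal sample sizes
# it interpolates the larger sample down to the size of the smaller one.
qq <- qqplot(x, y, plot.it = FALSE)

# Slope of a straight line through the quantile pairs. A slope close to 1
# would indicate similar distributions; here y is more spread out than x,
# so the slope should come out noticeably larger than 1.
slope <- unname(coef(lm(qq$y ~ qq$x))[2])
```

Setting plot.it = TRUE (the default) draws the curve directly, which is closer to what the plot above conveys.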

The function deedee_qqmult() makes the same calculations as deedee_qq(), but it can depict multiple Q-Q lines of different contrasts against the same reference in one plot. The curves are then colored by contrast. Coloring by p-value, as deedee_qq() allows, is not possible with deedee_qqmult().

deedee_qqmult(data = DeeDee_obj, ref = 1, pthresh = 0.05)

\pagebreak

DeeDee CAT Plot

The function deedee_cat() creates a plot depicting Concordance At the Top curves for a reference contrast (parameter ref, select contrast by list index). For each contrast except the reference, a curve indicating the concordance of the top genes in the contrast's logFC-sorted gene list, against the reference's, is displayed. The argument mode specifies the way of sorting the genes: by highest (up, default), lowest (down) or highest absolute (both) logFC value. The highest rank for which the concordance is calculated is given by the maxrank (default = 1000) parameter.

deedee_cat(data = DeeDee_obj, ref = 2, maxrank = 800, pthresh = 0.05)
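The concordance behind such a curve can be sketched in base R. This sketch assumes the common definition of concordance at rank k (the number of genes shared by both top-k lists, divided by k) and the sorting of mode up. The helper cat_curve and the toy values are hypothetical, and DeeDee's own computation may differ in detail.

```r
# Hypothetical helper, not DeeDee's implementation: for each rank k, the share
# of genes common to the top-k lists of two contrasts, with genes sorted by
# decreasing logFC (corresponding to mode "up").
cat_curve <- function(ref, other, maxrank = 1000) {
  top_ref   <- rownames(ref)[order(ref$logFC, decreasing = TRUE)]
  top_other <- rownames(other)[order(other$logFC, decreasing = TRUE)]
  maxrank   <- min(maxrank, length(top_ref), length(top_other))
  vapply(seq_len(maxrank), function(k) {
    length(intersect(top_ref[seq_len(k)], top_other[seq_len(k)])) / k
  }, numeric(1))
}

# Toy contrasts with made-up logFC values for four genes.
genes <- c("g1", "g2", "g3", "g4")
ref   <- data.frame(logFC = c(3, 2, 1, -1), row.names = genes)
other <- data.frame(logFC = c(2, 3, -1, 1), row.names = genes)

conc <- cat_curve(ref, other)
# conc holds one concordance value per rank: 0, 1, 2/3, 1
```

The top gene differs between the two toy contrasts (concordance 0 at rank 1), while the top two genes are the same set (concordance 1 at rank 2).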

\pagebreak

DeeDee Summary

deedee_summary() combines the results of the other DeeDee functions into an HTML document in a report-like manner. The function has arguments to set the parameters for all included functions (for more information see the manual). The summary is saved to the path specified with the argument output_path (default = "DeeDee_Summary.html" in the working directory). The parameter overwrite decides whether an existing document at the designated location is overwritten (default = TRUE). The parameter silent suppresses messages (default = FALSE), and open_file decides whether the resulting document is opened (default = TRUE).

deedee_summary(deedee_list = DeeDee_obj,
               output_path = "DeeDee_Summary.html",
               overwrite = FALSE,
               pthresh = 0.05,
               scatter_select1 = 1,
               scatter_select2 = 2,
               scatter_color_by = "pval1",
               heatmap_show_first = 25,
               heatmap_show_gene_names = FALSE,
               heatmap_dist = "euclidean",
               heatmap_clust = "average",
               heatmap_show_na = FALSE,
               venn_mode = "both",
               upset_mode = "both_colored",
               upset_min_setsize = 10,
               qqmult_ref = 1,
               cat_ref = 1,
               cat_maxrank = 1000,
               cat_mode = "up",
               silent = FALSE,
               open_file = TRUE)

\pagebreak

DeeDee Shiny App

Besides the standalone functions, the DeeDee package also contains an interactive Shiny web application. To open it, run the following command:

deedee_app()

It can work on a diverse range of input files (.txt, .xlsx, .RDS) with different contents (single DeeDee tables, lists of DeeDee tables, raw DEA result objects), as described in the INFO panel in the input tab. The app has one tab for each of the functions presented above. The respective parameters can be set via input boxes in each tab. Additional interactive functionality is implemented, namely the interactive heatmap, brushing (area selection) of points in the Q-Q and scatter plots, and the possibility to run an over-representation analysis on the brushed points in the scatter plot. More information on the usage can be found in the INFO panel in each respective tab.

Optionally, you can pass a list of DeeDee tables to the app via the data argument, as in deedee_app(data = DeeDee_obj). This input can be combined with uploads and used just like them.

Session Info {-}

sessionInfo()


lea-rothoerl/DeeDee documentation built on Dec. 21, 2021, 9:47 a.m.