R/VolcanoPlot.R

Defines functions VolcanoPlot

Documented in VolcanoPlot

#' The VolcanoPlot class
#'
#' The VolcanoPlot is a \linkS4class{RowDataPlot} subclass that is dedicated to creating a volcano plot.
#' It retrieves the log-fold change and p-value from and creates a row-based plot where each point represents a feature.
#'
#' @section Slot overview:
#' The following slots control the thresholds used in the visualization:
#' \itemize{
#' \item \code{PValueThreshold}, a numeric scalar in (0, 1] specifying the threshold to use on the (adjusted) p-value.
#' Defaults to 0.05.
#' \item \code{LogFCThreshold}, a non-negative numeric scalar specifying the threshold to use on the log-fold change.
#' Defaults to 0.
#' \item \code{PValueCorrection}, a string specifying the multiple testing correction to apply.
#' Defaults to \code{"BH"}, but can take any value from \code{\link{p.adjust.methods}}.
#' }
#'
#' In addition, this class inherits all slots from its parent \linkS4class{RowDataPlot},
#' \linkS4class{RowDotPlot}, \linkS4class{DotPlot} and \linkS4class{Panel} classes.
#'
#' @section Constructor:
#' \code{VolcanoPlot(...)} creates an instance of a VolcanoPlot class,
#' where any slot and its value can be passed to \code{...} as a named argument.
#'
#' Users are expected to load relevant statistics into the \code{\link{rowData}} of a \linkS4class{SummarizedExperiment}.
#' This panel expects one or more columns containing the p-values and log-fold changes for each gene/row - see Examples.
#' The expected column names (and how to tune them) are listed at \code{?"\link{registerPValueFields}"}.
#'
#' @section Supported methods:
#' In the following code snippets, \code{x} is an instance of a \linkS4class{RowDataPlot} class.
#' Refer to the documentation for each method for more details on the remaining arguments.
#'
#' For setting up data values:
#' \itemize{
#' \item \code{\link{.cacheCommonInfo}(x, se)} returns \code{se} after being loaded with class-specific constants.
#' This includes \code{"valid.p.fields"} and \code{"valid.lfc.fields"}, character vectors containing the names of valid \code{\link{rowData}} columns for the p-values and log-fold changes, respectively.
#' \item \code{\link{.refineParameters}(x, se)} returns \code{x} after setting \code{XAxis="Row data"} and the various \code{*Pattern} fields to their cached values.
#' This will also call the equivalent \linkS4class{RowDataPlot} method for further refinements to \code{x}.
#' If valid p-value and log-fold change fields are not available, \code{NULL} is returned instead.
#' }
#'
#' For defining the interface:
#' \itemize{
#' \item \code{\link{.defineDataInterface}(x, se, select_info)} returns a list of interface elements for manipulating all slots described above.
#' \item \code{\link{.panelColor}(x)} will return the specified default color for this panel class.
#' \item \code{\link{.allowableXAxisChoices}(x, se)} returns a character vector specifying the acceptable log-fold change-related variables in \code{\link{rowData}(se)} that can be used as choices for the x-axis.
#' \item \code{\link{.allowableYAxisChoices}(x, se)} returns a character vector specifying the acceptable p-value-related variables in \code{\link{rowData}(se)} that can be used as choices for the y-axis.
#' \item \code{\link{.hideInterface}(x, field)} will return \code{TRUE} for \code{field="XAxis"},
#' otherwise it will call the \linkS4class{RowDataPlot} method.
#' \item \code{\link{.fullName}(x)} will return \code{"Volcano plot"}.
#' }
#'
#' For monitoring reactive expressions:
#' \itemize{
#' \item \code{\link{.createObservers}(x, se, input, session, pObjects, rObjects)} sets up observers for all new slots described above, as well as in the parent classes via the \linkS4class{RowDataPlot} method.
#' }
#'
#' For creating the plot:
#' \itemize{
#' \item \code{\link{.generateDotPlotData}(x, envir)} will create a data.frame of row metadata variables in \code{envir}.
#' This should contain negative log-transformed p-values on the y-axis and log-fold changes on the x-axis,
#' in addition to an extra field specifying whether or not the feature was considered to be significantly up or down.
#' The method will return the commands required to do so as well as a list of labels.
#' \item \code{\link{.prioritizeDotPlotData}(x, envir)} will create variables in \code{envir} marking the priority of points.
#' Significant features receive higher priority (i.e., are plotted over their non-significant counterparts) and are less aggressively downsampled when \code{Downsample=TRUE}.
#' The method will return the commands required to do this as well as a logical scalar indicating that rescaling of downsampling resolution is performed.
#' \item \code{\link{.colorByNoneDotPlotField}(x)} will return a string specifying the field of the data.frame (generated by \code{\link{.generateDotPlotData}}) containing the significance information.
#' This is to be used for coloring when \code{ColorBy="None"}.
#' \item \code{\link{.colorByNoneDotPlotScale}(x)} will return a string containing a \pkg{ggplot2} command to add a default color scale when \code{ColorBy="None"}.
#' \item \code{\link{.generateDotPlot}(x, labels, envir)} returns a list containing \code{plot} and \code{commands}, using the inital \linkS4class{ColumnDataPlot} \link{ggplot} and adding vertical lines demarcating the log-fold change threshold.
#' }
#'
#' For documentation:
#' \itemize{
#' \item \code{\link{.definePanelTour}(x)} returns an data.frame containing the steps of a panel-specific tour.
#' \item \code{\link{.getDotPlotColorHelp}(x, color_choices)} returns a function that generates an \pkg{rintrojs} tour for the color choice UI.
#' }
#'
#' @docType methods
#' @aliases VolcanoPlot VolcanoPlot-class
#' initialize,VolcanoPlot-method
#' .cacheCommonInfo,VolcanoPlot-method
#' .refineParameters,VolcanoPlot-method
#' .defineDataInterface,VolcanoPlot-method
#' .createObservers,VolcanoPlot-method
#' .hideInterface,VolcanoPlot-method
#' .fullName,VolcanoPlot-method
#' .panelColor,VolcanoPlot-method
#' .generateDotPlotData,VolcanoPlot-method
#' .allowableXAxisChoices,VolcanoPlot-method
#' .allowableYAxisChoices,VolcanoPlot-method
#' .prioritizeDotPlotData,VolcanoPlot-method
#' .colorByNoneDotPlotField,VolcanoPlot-method
#' .colorByNoneDotPlotScale,VolcanoPlot-method
#' .generateDotPlot,VolcanoPlot-method
#' .definePanelTour,VolcanoPlot-method
#' .getDotPlotColorHelp,VolcanoPlot-method
#'
#' @examples
#' # Making up some results:
#' se <- SummarizedExperiment(matrix(rnorm(10000), 1000, 10))
#' rownames(se) <- paste0("GENE_", seq_len(nrow(se)))
#' rowData(se)$PValue <- runif(nrow(se))
#' rowData(se)$LogFC <- rnorm(nrow(se))
#' rowData(se)$AveExpr <- rnorm(nrow(se))
#'
#' if (interactive()) {
#'     iSEE(se, initial=list(VolcanoPlot()))
#' }
#'
#' @author Aaron Lun
#'
#' @importFrom SummarizedExperiment rowData
#'
#' @seealso
#' \link{RowDataPlot}, for the base class.
#' @name VolcanoPlot-class
NULL

#' @export
setClass("VolcanoPlot", contains="RowDataPlot",
    slots=c(PValueThreshold="numeric", LogFCThreshold="numeric", PValueCorrection="character"))

#' @export
setMethod(".fullName", "VolcanoPlot", function(x) "Volcano plot")

#' @export
setMethod(".panelColor", "VolcanoPlot", function(x) "#DEAE10")

#' @export
setMethod("initialize", "VolcanoPlot", function(.Object, PValueThreshold=0.05,
    LogFCThreshold=0, PValueCorrection="BH", ...)
{
    args <- list(PValueThreshold=PValueThreshold,
        LogFCThreshold=LogFCThreshold, PValueCorrection=PValueCorrection, ...)

    do.call(callNextMethod, c(list(.Object), args))
})

#' @export
#' @importFrom methods new
VolcanoPlot <- function(...) {
    new("VolcanoPlot", ...)
}

#' @importFrom stats p.adjust.methods
setValidity2("VolcanoPlot", function(object) {
    msg <- .define_de_validity(object, patterns=c("PValuePattern", "LogFCPattern"))
    if (length(msg)) msg else TRUE
})

#' @export
setMethod(".cacheCommonInfo", "VolcanoPlot", function(x, se) {
    if (!is.null(.getCachedCommonInfo(se, "VolcanoPlot"))) {
        return(se)
    }

    se <- callNextMethod()
    all.cont <- .getCachedCommonInfo(se, "RowDotPlot")$continuous.rowData.names

    p.okay <- .matchPValueFields(se, all.cont)
    lfc.okay <- .matchLogFCFields(se, all.cont)

    .setCachedCommonInfo(se, "VolcanoPlot", valid.lfc.fields=lfc.okay, valid.p.fields=p.okay)
})

#' @export
#' @importFrom methods callNextMethod
setMethod(".refineParameters", "VolcanoPlot", function(x, se) {
    x <- callNextMethod() # Trigger warnings from base classes.
    if (is.null(x)) {
        return(NULL)
    }

    x[["XAxis"]] <- "Row data"
    x
})

#' @export
setMethod(".allowableXAxisChoices", "VolcanoPlot", function(x, se) .getCachedCommonInfo(se, "VolcanoPlot")$valid.lfc.fields)

#' @export
setMethod(".allowableYAxisChoices", "VolcanoPlot", function(x, se) .getCachedCommonInfo(se, "VolcanoPlot")$valid.p.fields)

#' @export
#' @importFrom shiny numericInput selectInput hr
#' @importFrom stats p.adjust.methods
setMethod(".defineDataInterface", "VolcanoPlot", function(x, se, select_info) {
    plot_name <- .getEncodedName(x)
    input_FUN <- function(field) paste0(plot_name, "_", field)

    .addSpecificTour(class(x), "YAxis", function(plot_name) {
        data.frame(
            rbind(
                c(
                    element=paste0("#", plot_name, "_", "YAxis + .selectize-control"),
                    intro="Here, we select the <code>rowData</code> field containing the p-values for all features.
This is presumably generated from some comparison between conditions, e.g., for differential gene expression.
In general, the values here should not be corrected, and they should certainly not be transformed in any way."
                ),
                c(
                    element=paste0("#", plot_name, "_", "XAxisRowData + .selectize-control"),
                    intro="Similarly, we can select the <code>rowData</code> field containing the log-fold changes for all features.
This should have been generated from the same analysis that was used to obtain the p-values."
                )
            )
        )
    })

    .define_gene_sig_tours(x)

    c(callNextMethod(), 
        list(hr()), 
        .define_gene_sig_ui(x)
    )
})

#' @export
setMethod(".hideInterface", "VolcanoPlot", function(x, field) {
    if (field == "XAxis") TRUE else callNextMethod()
})

#' @export
setMethod(".createObservers", "VolcanoPlot", function(x, se, input, session, pObjects, rObjects) {
    callNextMethod()

    plot_name <- .getEncodedName(x)

    .createUnprotectedParameterObservers(plot_name,
        fields=c("PValueThreshold", "LogFCThreshold", "PValueCorrection"),
        input=input, pObjects=pObjects, rObjects=rObjects)
})

#' @export
setMethod(".generateDotPlotData", "VolcanoPlot", function(x, envir) {
    output <- callNextMethod()

    extra_cmds <- c(
        .define_de_status(x, lfc="plot.data$X", pval="plot.data$Y"),
        "plot.data$IsSig <- c('down', 'none', 'up')[.de_status];",
        "plot.data$Y <- -log10(plot.data$Y)"
    )
    eval(parse(text=extra_cmds), envir)

    output$commands <- c(output$commands, extra_cmds)
    output$labels$Y <- sprintf("-Log10[%s]", output$labels$Y)
    output
})

#' @export
setMethod(".prioritizeDotPlotData", "VolcanoPlot", function(x, envir) .define_de_priority(envir))

#' @export
setMethod(".colorByNoneDotPlotField", "VolcanoPlot", function(x) "IsSig")

#' @export
setMethod(".colorByNoneDotPlotScale", "VolcanoPlot", function(x) .de_color_scale(x[["Downsample"]]))

#' @export
#' @importFrom ggplot2 geom_vline
setMethod(".generateDotPlot", "VolcanoPlot", function(x, labels, envir) {
    output <- callNextMethod()

    # Adding the horizontal lines.
    extras <- "dot.plot <- dot.plot +"
    lfc <- x[["LogFCThreshold"]]
    if (lfc > 0) {
        # No idea why I need ggplot2:: here, but it just can't find it otherwise.
        extras <- c(extras, sprintf("ggplot2::geom_vline(xintercept=c(-1, 1)*%s, color=\"darkgreen\", linetype=\"dashed\")", lfc))
    }

    if (length(extras) > 1) {
        extras <- paste(extras, collapse="\n    ")
        output$commands <- c(output$commands, list(volcano=extras))
        output$plot <- eval(parse(text=extras), envir=envir)
    }

    output
})

#' @export
setMethod(".definePanelTour", "VolcanoPlot", function(x) {
    prev <- callNextMethod()
    skip <- grep("VisualBoxOpen$", prev$element)
    prev <- prev[-seq_len(skip-1),]

    rbind(
        c(paste0("#", .getEncodedName(x)), sprintf("The <font color=\"%s\">Volcano plot</font> panel shows the (log-transformed) p-value from a differential comparison against the log-fold change. Each point here corresponds to a feature in our <code>SummarizedExperiment</code>, and the number of significantly different features in the comparisons is shown in the legend.", .getPanelColor(x))),
        c(paste0("#", .getEncodedName(x), "_DataBoxOpen"), "The <i>Data parameters</i> box shows the available parameters that can be tweaked in this plot.<br/><br/><strong>Action:</strong> click on this box to open up available options."),
        c(paste0("#", .getEncodedName(x), "_YAxis + .selectize-control"), "We can control the columns containing the p-values for all features, based on the available fields in the <code>rowData</code> of the <code>SummarizedExperiment</code>. Note that these values should not be log-transformed, as the panel will take care of that itself."),
        c(paste0("#", .getEncodedName(x), "_XAxisRowData + .selectize-control"), "Similarly, we can control the columns containing the log-fold change of each feature, again based on the <code>rowData</code> fields."),
        c(paste0("#", .getEncodedName(x), "_PValueThreshold"), "A variety of thresholds can also be tuned to define significant differences; the most relevant of these is the threshold on the false discovery rate."),
        prev
    )
})

#' @export
#' @importFrom shiny tagList
setMethod(".getDotPlotColorHelp", "VolcanoPlot", function(x, color_choices) {
    FUN <- callNextMethod()

    function(plot_name) {
        df <- FUN(plot_name)
        df[1,2] <- "Here, we choose whether to color points by per-row attributes.
When <em>None</em> is selected, the plot defaults to a constant color for all non-significant features,
pink for the significant features with positive log-fold changes,
and blue for the significant features with negative log-fold changes.
The number of features in each category is also shown in the legend.
<br/><br/>
Alternatively, try out some of the different choices here, and note how further options become available when each choice is selected."
        df
    }
})
iSEE/iSEEu documentation built on March 28, 2024, 2:29 a.m.