signalAlongAxis: Visualize how genomic signal in a region set changes along a...

View source: R/visualization.R

signalAlongAxis R Documentation

Visualize how genomic signal in a region set changes along a given axis

Description

Plot genomic signal (e.g., DNA methylation values) in regions of interest across samples, with samples ordered by a variable of interest (e.g., PC score). The heatmap is drawn with the ComplexHeatmap package, and additional parameters for ComplexHeatmap::Heatmap() can be passed through this function to modify the heatmap.

Usage

signalAlongAxis(
  genomicSignal,
  signalCoord,
  regionSet,
  sampleScores,
  orderByCol = "PC1",
  topXVariables = NULL,
  variableScores = NULL,
  decreasing = TRUE,
  cluster_columns = FALSE,
  cluster_rows = FALSE,
  row_title = "Sample",
  column_title = "Genomic Signal",
  column_title_side = "bottom",
  name = "Genomic Signal Value",
  col = c("blue", "#EEEEEE", "red"),
  ...
)

Arguments

genomicSignal

Matrix/data.frame. The genomic signal (e.g., DNA methylation levels). Columns of genomicSignal should be samples/patients. Rows should be individual signals/features (each row corresponds to one genomic coordinate/range). Must have sample names/IDs as column names. These same sample names must be row names of sampleScores.

signalCoord

A GRanges object or data.frame with coordinates for the genomic signal/original epigenetic data. Coordinates should be in the same order as the original data and the feature contribution scores (each item/row in signalCoord corresponds to a row in genomicSignal). If a data.frame, it must have chr and start columns (and can optionally have an end column, depending on the epigenetic data type).
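The data.frame form of signalCoord can be checked before plotting. The sketch below uses made-up coordinates and signal values (all object contents are hypothetical) and verifies only the two requirements stated above: the chr and start columns exist, and there is one coordinate per row of genomicSignal.

```r
# Hypothetical toy coordinates and signal; names and values are illustrative.
signalCoord <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100L, 250L, 75L),
  stringsAsFactors = FALSE
)
genomicSignal <- matrix(
  c(0.1, 0.8, 0.5,
    0.3, 0.6, 0.9),
  nrow = 3,
  dimnames = list(NULL, c("s1", "s2"))
)

# Required columns for the data.frame form of signalCoord.
hasRequiredCols <- all(c("chr", "start") %in% colnames(signalCoord))

# One coordinate per row of genomicSignal (row i of signalCoord
# is assumed to describe row i of genomicSignal).
sameLength <- nrow(signalCoord) == nrow(genomicSignal)

stopifnot(hasRequiredCols, sameLength)
```

These checks cannot confirm that the rows are in matching order; that has to be guaranteed by how the two objects were built.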

regionSet

A genomic ranges (GRanges) object with regions corresponding to the same biological annotation. Must be from the same reference genome as the coordinates for the actual data/samples (signalCoord). These are the regions that will be visualized.

sampleScores

A matrix. Must contain a column for the variable of interest/target variable. For example, the variable of interest could be the principal component scores for the samples. 'sampleScores' must have sample names/IDs as row names. These same sample names must be column names of genomicSignal.
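Because genomicSignal and sampleScores are matched by sample name, a quick consistency check can catch mismatches early. The following is a minimal sketch on hypothetical toy data. The reordering step at the end is an assumption made for clarity, since this page does not state that the two objects must list samples in the same order.

```r
# Hypothetical toy data; sample names and values are illustrative only.
genomicSignal <- matrix(
  seq(0.1, 1.2, by = 0.1),
  nrow = 4,
  dimnames = list(NULL, c("s1", "s2", "s3"))
)
sampleScores <- matrix(
  c(0.5, -1.2, 2.0),
  ncol = 1,
  dimnames = list(c("s3", "s1", "s2"), "PC1")
)

# Every sample (column) in genomicSignal needs a row in sampleScores.
missingSamples <- setdiff(colnames(genomicSignal), rownames(sampleScores))
stopifnot(length(missingSamples) == 0)

# Optional: put the score rows in the same order as the signal columns.
sampleScores <- sampleScores[colnames(genomicSignal), , drop = FALSE]
```

Using drop = FALSE keeps sampleScores a matrix with its "PC1" column name even when it has a single column.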

orderByCol

A character object. The variable to order samples by (rows of the heatmap are ordered by this variable, from high to low value by default; see 'decreasing'). Must be the name of a column in sampleScores. For instance, if doing unsupervised COCOA with PCA, orderByCol might be the name of one of the PCs (e.g. "PC1"). If doing supervised COCOA, orderByCol might be the name of the target variable of the supervised analysis.

topXVariables

Numeric. The number of variables from genomicSignal to plot. The variables with the highest scores according to variableScores will be plotted. This can help reduce the size of the plot.

variableScores

Numeric. A vector that has a numeric score for each variable in genomicSignal (length(variableScores) should equal nrow(genomicSignal)). Only used if topXVariables is given. The 'topXVariables' variables with the highest scores will be plotted.
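The idea of keeping only the highest-scoring variables can be illustrated in base R. This is a sketch of the general technique on hypothetical values, not the function's internal code. In particular, whether the ranking is applied before or after restricting to regionSet is not stated on this page.

```r
# Hypothetical scores, one per row (variable) of genomicSignal.
variableScores <- c(0.1, 0.9, 0.4, 0.7, 0.2)
topXVariables <- 2

genomicSignal <- matrix(
  seq_len(10) / 10,
  nrow = 5,
  dimnames = list(paste0("cpg", 1:5), c("s1", "s2"))
)
stopifnot(length(variableScores) == nrow(genomicSignal))

# Row indices of the topXVariables highest scores.
keep <- order(variableScores, decreasing = TRUE)[seq_len(topXVariables)]

# Subset the signal to those rows only.
topSignal <- genomicSignal[keep, , drop = FALSE]
rownames(topSignal)  # "cpg2" "cpg4"
```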

decreasing

Logical. Whether samples should be sorted in decreasing order of 'orderByCol' or not (FALSE is increasing order).
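The effect of 'decreasing' on sample order can be previewed with base R's order(). This is a sketch on hypothetical scores, assuming the heatmap rows follow this kind of sort on the 'orderByCol' column. The function's actual internals are not shown on this page.

```r
# Hypothetical sample scores; values are illustrative only.
sampleScores <- matrix(
  c(0.5, -1.2, 2.0),
  ncol = 1,
  dimnames = list(c("s1", "s2", "s3"), "PC1")
)
orderByCol <- "PC1"

# decreasing = TRUE: highest score first.
highToLow <- rownames(sampleScores)[
  order(sampleScores[, orderByCol], decreasing = TRUE)
]

# decreasing = FALSE: lowest score first.
lowToHigh <- rownames(sampleScores)[
  order(sampleScores[, orderByCol], decreasing = FALSE)
]

highToLow  # "s3" "s1" "s2"
lowToHigh  # "s2" "s1" "s3"
```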

cluster_columns

Logical. Whether to cluster columns (the genomic signal, e.g. DNA methylation values for each CpG).

cluster_rows

Logical. Whether rows should be clustered. This should be kept as FALSE to keep the correct ranking of samples/observations according to their target variable score.

row_title

Character object, row title

column_title

Character object, column title

column_title_side

Character object, where to put the column title: "top" or "bottom"

name

Character object, legend title

col

A vector of colors or a color mapping function which will be passed to the ComplexHeatmap::Heatmap() function. See ?Heatmap (the "col" parameter) for more details. "#EEEEEE" is the code for a color similar to white.
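A vector of colors for 'col' does not have to be typed by hand. The sketch below builds a finer blue-to-red gradient from the same three anchor colors as the default, using grDevices::colorRampPalette() from base R. The number of steps (9) and the variable name heatColors are arbitrary choices made for illustration.

```r
# Interpolate 9 colors between the three default anchor colors.
heatColors <- grDevices::colorRampPalette(c("blue", "#EEEEEE", "red"))(9)

# The result is a character vector of hex color codes that could be
# supplied as the 'col' argument.
length(heatColors)
```

An odd number of steps keeps the near-white midpoint color in the gradient.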

...

Optional parameters for ComplexHeatmap::Heatmap()

Value

A heatmap of genomic signal values (e.g., DNA methylation levels) in regions of interest (regionSet), with rows ordered by the column of sampleScores given with 'orderByCol'. Each row is a patient/sample and each column is an individual genomic signal value.

Examples

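# Load the package that provides signalAlongAxis() and the example data
library(COCOA)

# Example data: methylation values, their coordinates, a region set,
# and PC scores for the samples
data("brcaMethylData1")
data("brcaMCoord1")
data("esr1_chr1")
data("brcaPCScores")

# Heatmap of the signal in the region set, samples ordered by PC1
signalHM <- signalAlongAxis(genomicSignal=brcaMethylData1,
                            signalCoord=brcaMCoord1,
                            regionSet=esr1_chr1,
                            sampleScores=brcaPCScores,
                            orderByCol="PC1", cluster_columns=TRUE)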


databio/COCOA documentation built on Sept. 1, 2023, 5:50 p.m.