rends_heat: Metaheatmaps of the two extremities of the reads.

View source: R/read_end_metaheatmap_plot.R

rends_heat    R Documentation

Metaheatmaps of the two extremities of the reads.

Description

This function generates four metaheatmaps displaying the abundance of the 5' and 3' extremities of reads mapping around the start and stop codons of annotated CDSs, stratified by read length. Multiple samples and replicates can be handled.

Usage

rends_heat(
  data,
  annotation,
  sample,
  multisamples = "average",
  plot_style = "split",
  scale_factors = "auto",
  transcripts = NULL,
  length_range = NULL,
  cl = 100,
  utr5l = 50,
  cdsl = 50,
  utr3l = 50,
  log_colour = F,
  colour = "black"
)

Arguments

data

Either list of data tables or GRangesList object from bamtolist, bedtolist, length_filter or psite_info.

annotation

Data table as generated by create_annotation.

sample

Either character string, character string vector or named list of character string(s)/character string vector(s) specifying the name of the sample(s) and replicate(s) of interest. If a list is provided, each element of the list is considered as an independent sample associated with one or multiple replicates. Multiple samples and replicates are handled and visualised according to multisamples and plot_style.

multisamples

Either "average" or "independent". It specifies how to handle multiple samples and replicates stored in sample:

  • if sample is a character string vector and multisamples is set to "average", the elements of the vector are considered as replicates of one sample and a single heatmap is returned.

  • if sample is a character string vector and multisamples is set to "independent", each element of the vector is analysed independently of the others. The number of plots returned and their organization are specified by plot_style.

  • if sample is a list, multisamples must be set to "average". Each element of the list is analysed independently of the others, its replicates are averaged and its name is reported in the plot. The number of plots returned and their organization are specified by plot_style.

Note: when this parameter is set to "average", the heatmap associated with each sample displays the length- and position-specific mean signal computed across the replicates. Default is "average".
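As a rough illustration of what "average" means here, the sketch below computes a length- and position-specific mean across two replicates. The matrices, their dimensions and all object names are made up for illustration; this is not the code used internally by rends_heat.

```r
# Toy signal matrices (rows = read lengths, columns = positions).
# Values and names are hypothetical, not riboWaltz output.
rep1 <- matrix(c(2, 4, 6, 8), nrow = 2,
               dimnames = list(c("28", "29"), c("-1", "0")))
rep2 <- matrix(c(4, 8, 10, 12), nrow = 2,
               dimnames = dimnames(rep1))

# Element-wise mean across replicates: one value per length/position pair.
replicates <- list(rep1, rep2)
avg_signal <- Reduce(`+`, replicates) / length(replicates)
avg_signal
```

With "independent", by contrast, each matrix would be plotted on its own and no averaging step would take place.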

plot_style

Either "split" or "facet". It specifies how to organize and display multiple heatmaps:

  • "split": one heatmap for each sample is returned as an independent ggplot object.

  • "facet": the heatmaps are placed one below the other, in independent boxes.

Default is "split".

scale_factors

Either "auto", a named numeric vector or "none". It specifies how heatmap values should be scaled before merging multiple replicates (if any):

  • "auto": each heatmap is scaled so that the average of all values is 1.

  • named numeric vector: scale_factors must have the same length as the unlisted sample, and each scale factor must be named after the corresponding string in the unlisted sample. No specific order is required. Each heatmap value is multiplied by the matching scale factor.

  • "none": no scaling is applied.

Default is "auto".
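The two scaling modes can be sketched in base R as follows. The numbers and the objects heat, auto_scaled and manual_scaled are hypothetical, and dividing by the mean is only one simple way to obtain an average of 1; the actual implementation may differ in detail.

```r
# Toy heatmap values for two replicates (hypothetical numbers).
heat <- list(Samp1 = c(1, 2, 3, 6), Samp2 = c(10, 20, 30, 60))

# "auto": divide each heatmap by its own mean so that the average of
# all its values becomes 1.
auto_scaled <- lapply(heat, function(x) x / mean(x))

# Named numeric vector: factors are matched to samples by name, so the
# order of the vector does not matter. Each value is multiplied by the
# matching factor.
scale_factors <- c(Samp2 = 0.1, Samp1 = 1)
manual_scaled <- Map(function(x, nm) x * scale_factors[[nm]],
                     heat, names(heat))
```

In this toy case both replicates end up on a comparable scale before any averaging, which is the purpose of scaling ahead of merging.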

transcripts

Character string vector listing the name of transcripts to be included in the analysis. Default is NULL, i.e. all transcripts are used. Please note: transcripts with either 5' UTR, coding sequence or 3' UTR shorter than utr5l, 2*cdsl and utr3l, respectively, are automatically discarded.
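The length filter described above can be sketched on a toy annotation table. The column names l_utr5, l_cds and l_utr3 are assumed here to match the layout produced by create_annotation, and the transcripts and lengths are invented.

```r
# Toy annotation; column names are an assumption about the
# create_annotation layout, and the values are made up.
annotation <- data.frame(
  transcript = c("tx1", "tx2", "tx3"),
  l_utr5 = c(60, 20, 80),
  l_cds  = c(300, 500, 90),
  l_utr3 = c(100, 70, 120),
  stringsAsFactors = FALSE
)
utr5l <- 50
cdsl  <- 50
utr3l <- 50

# Keep a transcript only if no region is shorter than required:
# 5' UTR >= utr5l, CDS >= 2 * cdsl (start and stop flanks), 3' UTR >= utr3l.
keep <- annotation$l_utr5 >= utr5l &
  annotation$l_cds >= 2 * cdsl &
  annotation$l_utr3 >= utr3l
kept_transcripts <- annotation$transcript[keep]
kept_transcripts
```

Here tx2 is dropped for its short 5' UTR and tx3 for a CDS shorter than 2*cdsl.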

length_range

Integer or integer vector for restricting the plot to a chosen range of read lengths. Default is NULL, i.e. all read lengths are used. If specified, this parameter prevails over cl.

cl

Integer value in [1,100] specifying a confidence level for restricting the plot to an automatically-defined range of read lengths. The new range is computed from the most frequent read lengths: it accounts for cl% of the sample and is defined by discarding the (100-cl)% of read lengths falling in the tails of the read length distribution. If multiple samples are analysed, a single range of read lengths is computed such that at least cl% of all samples is represented. Default is 100.
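One way to picture the effect of cl is to trim an equal share of the (100-cl)% from each tail of the read length distribution using quantiles. The sketch below does this on invented read lengths; splitting the discarded fraction evenly between the two tails is an assumption, not a statement about the internal code.

```r
# Hypothetical read lengths for one sample (100 reads).
read_lengths <- c(rep(20, 5), rep(28, 40), rep(29, 30),
                  rep(30, 20), rep(40, 5))

cl <- 85
# Fraction of reads discarded in each tail (assumed symmetric).
tail_frac <- (1 - cl / 100) / 2

# Lower and upper bounds of the retained range of read lengths.
length_bounds <- quantile(read_lengths, c(tail_frac, 1 - tail_frac))
length_bounds
```

For several samples, a single range covering at least cl% of each one could be obtained by taking the smallest lower bound and the largest upper bound across samples.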

utr5l

Positive integer specifying the length (in nucleotides) of the 5' UTR region flanking the start codon to be considered in the analysis. Default is 50.

cdsl

Positive integer specifying the length (in nucleotides) of the CDS regions flanking both the start and stop codon to be considered in the analysis. Default is 50.

utr3l

Positive integer specifying the length (in nucleotides) of the 3' UTR region flanking the stop codon to be considered in the analysis. Default is 50.

log_colour

Logical value whether to use a logarithmic colour scale (strongly suggested in case of large signal variations). Default is FALSE.

colour

Character string specifying the colour of the plot. The colour scheme is as follows: tiles corresponding to the lowest signal are always white, tiles corresponding to the highest signal are of the specified colour, and the progression between these two colours follows either a linear or a logarithmic gradient (see log_colour). Default is "black".
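The difference between the linear and the logarithmic gradient can be illustrated with base R colour ramps. This is only a sketch with invented signal values and a hypothetical helper, rescale01; it is not how the plot itself builds its colour scale.

```r
# Hypothetical signal values spanning several orders of magnitude.
signal <- c(1, 10, 100, 1000)

# Helper (hypothetical): map values onto the [0, 1] interval.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Ramp from white (lowest signal) to the chosen colour (highest signal).
ramp <- colorRamp(c("white", "black"))

# Linear gradient: colours follow the raw values.
linear_cols <- rgb(ramp(rescale01(signal)), maxColorValue = 255)

# Logarithmic gradient: colours follow log-transformed values, which
# spreads out the low end when the signal varies widely.
log_cols <- rgb(ramp(rescale01(log10(signal))), maxColorValue = 255)
```

On the linear scale the three lowest values are all close to white, whereas on the logarithmic scale they are evenly spaced between white and black.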

Value

List containing: one or more ggplot object(s) and the data table with the corresponding x- and y-axis values and the values defining the colour of the tiles ("plot_dt"); an additional data table with the raw and scaled number of read extremities mapping around the start and stop codons, per length, for each sample ("count_dt").

Examples

library(riboWaltz)

data(reads_list)
data(mm81cdna)

## Generate fake samples and replicates
for(i in 2:6){
  samp_name <- paste0("Samp", i)
  set.seed(i)
  reads_list[[samp_name]] <- reads_list[["Samp1"]][sample(.N, 5000)]
}

## Define the list of samples and replicates to use as input
input_samples <- list("S1" = c("Samp1", "Samp2"),
                      "S2" = c("Samp3", "Samp4", "Samp5"),
                      "S3" = c("Samp6"))

## Generate metaheatmaps for a sub-range of read lengths:
example_ends_heatmap <- rends_heat(reads_list, mm81cdna,
                                   sample = input_samples,
                                   multisamples = "average",
                                   plot_style = "split",
                                   cl = 85,
                                   utr5l = 25, cdsl = 40, utr3l = 25)

LabTranslationalArchitectomics/riboWaltz documentation built on Jan. 17, 2024, 12:18 p.m.