metaprofile_psite: Ribosome occupancy metaprofiles at single-nucleotide...

View source: R/metaplots.R

metaprofile_psite    R Documentation

Ribosome occupancy metaprofiles at single-nucleotide resolution.

Description

This function generates metaprofiles displaying the abundance of P-sites around the start and stop codons of annotated CDSs. For each sample, the signal intensity at each nucleotide is the number of P-sites (defined by their leftmost position) mapping to that position, summed over all transcripts. Multiple samples and replicates can be handled.
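As a rough illustration of the per-nucleotide sum described above, the following base R sketch aggregates toy P-site counts over transcripts. The data frame and its column names (`transcript`, `dist_from_start`, `count`) are hypothetical and are not the structure produced by psite_info; the sketch only mirrors the idea, not the package's internal code.

```r
# Hypothetical toy input: number of P-sites per transcript and per position,
# with positions expressed as distance (nt) from the start codon.
psites <- data.frame(
  transcript      = c("tx1", "tx1", "tx2", "tx2", "tx3", "tx1"),
  dist_from_start = c(0, 3, 0, 3, 0, -1),
  count           = c(5, 4, 3, 2, 1, 2)
)

# Metaprofile signal: for each position, sum the counts over all transcripts.
signal <- aggregate(count ~ dist_from_start, data = psites, FUN = sum)
print(signal)
```

Each row of `signal` then corresponds to one bar of an unscaled metaprofile.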

Usage

metaprofile_psite(
  data,
  annotation,
  sample,
  multisamples = "average",
  plot_style = "split",
  scale_factors = "auto",
  transcripts = NULL,
  length_range = NULL,
  cl = 100,
  utr5l = 25,
  cdsl = 50,
  utr3l = 25,
  colour = NULL
)

Arguments

data

Either a list of data tables or a GRangesList object generated by psite_info.

annotation

Data table as generated by create_annotation.

sample

Either a character string, a character string vector or a named list of character string(s)/character string vector(s) specifying the name of the sample(s) and replicate(s) of interest. If a list is provided, each element of the list is considered as an independent sample associated with one or multiple replicates. Multiple samples and replicates are handled and visualised according to multisamples and plot_style.

multisamples

Either "average" or "independent". It specifies how to handle multiple samples and replicates stored in sample:

  • if sample is a character string vector and multisamples is set to "average" the elements of the vector are considered as replicates of one sample and a single metaprofile is returned.

  • if sample is a character string vector and multisamples is set to "independent", each element of the vector is analysed independently of the others. The number of plots returned and their organization is specified by plot_style.

  • if sample is a list, multisamples must be set to "average". Each element of the list is analysed independently of the others, its replicates averaged and its name reported in the plot. The number of plots returned and their organization is specified by plot_style.

Note: when this parameter is set to "average", the bar plot associated with each sample displays the nucleotide-specific mean signal computed across the replicates, together with the corresponding standard error. Default is "average".
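The replicate averaging mentioned in the note can be pictured with a small base R sketch. It assumes the standard error is the sample standard deviation divided by the square root of the number of replicates; the documentation does not state the exact formula, so this is an assumption, and the matrix `reps` is made-up data.

```r
# Hypothetical scaled signal: rows are nucleotide positions,
# columns are replicates of the same sample.
reps <- cbind(Samp1 = c(1, 4, 2),
              Samp2 = c(3, 6, 2),
              Samp3 = c(2, 5, 2))

# Nucleotide-specific mean across replicates.
mean_signal <- rowMeans(reps)

# Standard error of the mean (assumed definition: sd / sqrt(n replicates)).
se_signal <- apply(reps, 1, sd) / sqrt(ncol(reps))

print(data.frame(mean = mean_signal, se = se_signal))
```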

plot_style

Either "split", "facet", "overlap" or "mirror". It specifies how to organize and display multiple metaprofiles:

  • "split": one metaprofile for each sample is returned as an independent ggplot object;

  • "facet": the metaprofiles are placed one below the other, in independent boxes;

  • "overlap": the metaprofiles are placed one on top of the other;

  • "mirror": sample must be either a character string vector or a list of exactly two elements and the resulting metaprofiles are mirrored along the x axis.

Default is "split".

scale_factors

Either "auto", a named numeric vector or "none". It specifies how metaprofiles should be scaled before merging multiple replicates (if any):

  • "auto": each metaprofile is scaled so that the area under the curve is 1.

  • named numeric vector: scale_factors must have the same length as unlisted sample and each scale factor must be named after the corresponding string in unlisted sample. No specific order is required. Each metaprofile is multiplied by the matching scale factor.

  • "none": no scaling is applied.

Default is "auto".
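To make the scaling options more concrete, here is a minimal base R sketch on toy profiles. For the "auto" case it assumes that, with one bar per nucleotide, an area of 1 amounts to dividing each profile by its sum; this is an interpretation, not a statement about the package internals. The objects `profiles` and `custom_factors` are hypothetical.

```r
# Hypothetical unscaled metaprofiles, one numeric vector per replicate.
profiles <- list(Samp1 = c(2, 6, 2), Samp2 = c(10, 30, 10))

# "auto"-like scaling (assumption: unit-width bars, so area = sum of values).
auto_scaled <- lapply(profiles, function(p) p / sum(p))

# Named numeric vector: matched by name, so the order does not matter.
custom_factors <- c(Samp2 = 0.1, Samp1 = 0.5)
stopifnot(setequal(names(custom_factors), names(profiles)))
custom_scaled <- Map(function(p, nm) p * custom_factors[[nm]],
                     profiles, names(profiles))

print(auto_scaled)
print(custom_scaled)
```

Matching by name rather than by position is what allows the scale factors to be supplied in any order.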

transcripts

Character string vector listing the name of transcripts to be included in the analysis. Default is NULL, i.e. all transcripts are used. Please note: transcripts with either 5' UTR, coding sequence or 3' UTR shorter than utr5l, 2*cdsl and utr3l, respectively, are automatically discarded.
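The length-based filter described above can be sketched as follows. The column names `l_utr5`, `l_cds` and `l_utr3` are assumed here for the annotation table and the values are invented; check the output of create_annotation for the actual structure.

```r
# Hypothetical annotation table (column names are an assumption).
annotation <- data.frame(
  transcript = c("txA", "txB", "txC", "txD"),
  l_utr5     = c(30, 10, 40, 25),
  l_cds      = c(300, 300, 90, 100),
  l_utr3     = c(50, 50, 50, 20)
)

utr5l <- 25
cdsl  <- 50
utr3l <- 25

# Keep transcripts whose regions are NOT shorter than the requested windows:
# the CDS must host both the start-side and the stop-side window (2 * cdsl).
keep <- annotation$l_utr5 >= utr5l &
  annotation$l_cds >= 2 * cdsl &
  annotation$l_utr3 >= utr3l

kept_transcripts <- annotation$transcript[keep]
print(kept_transcripts)
```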

length_range

Integer or integer vector for restricting the plot to a chosen range of read lengths. Default is NULL, i.e. all read lengths are used. If specified, this parameter prevails over cl.

cl

Integer value in [1,100] specifying a confidence level for restricting the plot to an automatically-defined range of read lengths. The new range is computed from the most frequent read lengths, which account for cl% of the sample, and is defined by discarding the (100-cl)% of read lengths falling in the tails of the read length distribution. If multiple samples are analysed, a single range of read lengths is computed such that at least cl% of all samples is represented. Default is 100.
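One way to picture the effect of cl is a symmetric trimming of the read length distribution by quantiles. The sketch below assumes that the discarded (100-cl)% is split equally between the two tails; the package may compute the range differently, and the vector `read_lengths` is toy data.

```r
# Hypothetical read lengths for one sample (126 reads, 26 to 34 nt).
read_lengths <- rep(26:34, times = c(1, 2, 10, 30, 40, 30, 10, 2, 1))

cl <- 90

# Assumed rule: drop (100 - cl) / 2 percent of reads from each tail.
tail_fraction <- (1 - cl / 100) / 2
min_length <- unname(quantile(read_lengths, tail_fraction))
max_length <- unname(quantile(read_lengths, 1 - tail_fraction))

# Fraction of reads retained by the resulting length range.
retained <- mean(read_lengths >= min_length & read_lengths <= max_length)
print(c(min = min_length, max = max_length, retained = retained))
```

With cl = 100 the same rule keeps the full observed range, which is consistent with the default leaving all read lengths in place.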

utr5l

Positive integer specifying the length (in nucleotides) of the 5' UTR region flanking the start codon to be considered in the analysis. Default is 25.

cdsl

Positive integer specifying the length (in nucleotides) of the CDS regions flanking both the start and stop codon to be considered in the analysis. Default is 50.

utr3l

Positive integer specifying the length (in nucleotides) of the 3' UTR region flanking the stop codon to be considered in the analysis. Default is 25.

colour

Character string or character string vector specifying the colour of the metaprofile(s). If sample is a list of multiple elements and multisamples is set to "average", a colour for each element of the list is required. If this parameter is not specified the R default palette is employed. Default is NULL.

Value

List containing: one or more ggplot object(s) and the data table with the corresponding x- and y-axis values ("plot_dt"); an additional data table with raw and scaled number of P-sites per codon in the selected region for each sample ("count_dt").

Examples

## data(reads_list)
## data(mm81cdna)
##
## ## Generate fake samples and replicates
## for(i in 2:6){
##   samp_name <- paste0("Samp", i)
##   set.seed(i)
##   reads_list[[samp_name]] <- reads_list[["Samp1"]][sample(.N, 5000)]
## }
##
## ## Compute and add p-site details
## psite_offset <- psite(reads_list, flanking = 6, extremity = "auto")
## reads_psite_list <- psite_info(reads_list, psite_offset)
##
## ## Define the list of samples and replicate to use as input
## input_samples <- list("S1" = c("Samp1", "Samp2"),
##                       "S2" = c("Samp3", "Samp4", "Samp5"),
##                       "S3" = c("Samp6"))
##
## ## Generate metaprofiles
## example_metaprofile <- metaprofile_psite(reads_psite_list, mm81cdna,
##                                          sample = input_samples,
##                                          multisamples = "average",
##                                          plot_style = "facet",
##                                          utr5l = 20, cdsl = 40, utr3l = 20,
##                                          colour = c("#333f50", "#39827c", "gray70"))

LabTranslationalArchitectomics/riboWaltz documentation built on Jan. 17, 2024, 12:18 p.m.