make_plot_data: Construct data of percent-spliced-in (PSI) matrices and...

make_plot_data    R Documentation

Construct data of percent-spliced-in (PSI) matrices and "diagonal" for heatmaps and scatter plots

Description

make_matrix() constructs a matrix of PSI values of the given alternative splicing events (ASEs).

make_diagonal() constructs a table of "average" PSI values, with samples grouped by two given conditions (e.g. "group A" and "group B") of a given condition category (e.g. condition "treatment"). See details below.

Usage

make_matrix(
  se,
  event_list,
  sample_list = colnames(se),
  method = c("PSI", "logit", "Z-score"),
  depth_threshold = 10,
  logit_max = 5,
  na.percent.max = 0.1
)

make_diagonal(
  se,
  event_list = rownames(se),
  condition,
  nom_DE,
  denom_DE,
  depth_threshold = 10,
  logit_max = 5
)

Arguments

se

(Required) A NxtSE object generated by MakeSE

event_list

A character vector containing the row names of ASE events (as given by the EventName column of differential ASE results table using limma_ASE() or DESeq_ASE())

sample_list

(default = colnames(se)) In make_matrix(), a list of sample names in the given experiment to be included in the returned matrix

method

In make_matrix(), the values to be returned (default = "PSI"). It can alternatively be "logit", which returns logit-transformed PSI values, or "Z-score", which returns Z-score-transformed PSI values

depth_threshold

(default = 10) Samples in which the number of reads supporting either the included or excluded isoforms is below this value are excluded

logit_max

(default = 5) PSI values close to 0 or 1 are rounded up/down to plogis(-logit_max) and plogis(logit_max), respectively. See details.

na.percent.max

(default = 0.1) The maximum proportion of values in the given dataset that were transformed to NA because of low splicing depth. ASE events with a higher proportion of NA values (by default, more than 10%) are excluded from the final matrix. Most heatmap functions throw an error if there are too many NA values in any given row; this option caps the proportion of NA values to avoid that error.

condition

The name of the column containing the condition values in colData(se)

nom_DE

The condition to be contrasted, e.g. nom_DE = "treatment"

denom_DE

The condition to be contrasted against, e.g. denom_DE = "control"
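
The roles of logit_max and na.percent.max can be illustrated with a minimal base R sketch on a made-up PSI matrix. This is not the package's internal code: the toy values, the use of pmin()/pmax() for clamping, and the "<=" comparison against na.percent.max are assumptions made for illustration, and na.percent.max is set to 0.3 here (rather than the default 0.1) so that the small example keeps more than one row.

```r
# Toy PSI matrix: rows are events, columns are samples (values are made up)
psi <- matrix(
  c(0.00, 0.50, 1.00, 0.20,
    0.30, NA,   NA,   0.40,
    0.90, 0.95, NA,   0.85),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("event", 1:3), paste0("sample", 1:4))
)

logit_max <- 5
na.percent.max <- 0.3  # illustrative value, not the default

# Clamp PSI into [plogis(-logit_max), plogis(logit_max)] so that
# logit(PSI) stays finite even when PSI is exactly 0 or 1
lower <- plogis(-logit_max)
upper <- plogis(logit_max)
psi_clamped <- pmin(pmax(psi, lower), upper)
logit_psi <- qlogis(psi_clamped)

# Keep only events whose proportion of NA values is within the cap
# (assumption: the comparison is inclusive)
na_fraction <- rowMeans(is.na(psi))
kept <- psi_clamped[na_fraction <= na.percent.max, , drop = FALSE]
```

In this toy example, event2 has half of its values missing and is dropped, while event1 and event3 are retained.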

Details

Note that make_diagonal() does not take the arithmetic mean of PSI. It first converts all values to logit(PSI), takes the average logit(PSI) value of each condition, and then converts back to PSI using the inverse logit. This is equivalent to taking the geometric mean of the odds PSI / (1 - PSI).
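
The averaging described above can be sketched in base R on a made-up vector of PSI values. The variable names are hypothetical and the sketch omits the depth filtering and logit_max clamping that the function applies; it only shows how a logit-scale mean differs from an arithmetic mean.

```r
# Made-up PSI values for the samples of one condition
psi_a <- c(0.10, 0.50, 0.90, 0.99)

# Logit-scale mean: logit transform, average, then inverse logit
logit_mean <- plogis(mean(qlogis(psi_a)))

# Arithmetic mean, for comparison
arith_mean <- mean(psi_a)

# The logit-scale mean equals the geometric mean of the odds,
# mapped back to the probability scale
odds <- psi_a / (1 - psi_a)
gm_odds <- exp(mean(log(odds)))
from_odds <- gm_odds / (1 + gm_odds)

isTRUE(all.equal(logit_mean, from_odds))
```

For these values the logit-scale mean is larger than the arithmetic mean, because values near 1 carry more weight on the logit scale.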

Samples with low splicing coverage (either due to insufficient sequencing depth or low gene expression) are excluded from the calculation of mean PSIs. The threshold can be set using depth_threshold. Excluding these samples is appropriate because the uncertainty of PSI is high when the total included / excluded count is low. Note that events where all samples in a condition are excluded will return a value of NaN.
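
A minimal sketch of how depth filtering can lead to NaN, using made-up read counts. It assumes that "depth" means the sum of included and excluded counts, which is an interpretation rather than a confirmed detail of the package; all object names are hypothetical.

```r
# Made-up included / excluded read counts: 2 events x 4 samples
samples <- paste0("sample", 1:4)
events <- paste0("event", 1:2)
included <- matrix(c(20, 3, 15, 8,
                      2, 1, 30, 12),
                   nrow = 2, byrow = TRUE, dimnames = list(events, samples))
excluded <- matrix(c(10, 2, 5, 12,
                      3, 4, 10, 8),
                   nrow = 2, byrow = TRUE, dimnames = list(events, samples))
condition <- c("A", "A", "B", "B")
depth_threshold <- 10

# Assumption: depth is the total of included and excluded counts
total <- included + excluded
psi <- included / total
psi[total < depth_threshold] <- NA  # mask low-coverage samples

# Logit-scale mean that ignores masked samples
mean_logit_psi <- function(x) plogis(mean(qlogis(x), na.rm = TRUE))

mean_a <- apply(psi[, condition == "A", drop = FALSE], 1, mean_logit_psi)
mean_b <- apply(psi[, condition == "B", drop = FALSE], 1, mean_logit_psi)
```

Here both condition "A" samples of event2 fall below the threshold, so mean_a for event2 is NaN, whereas event1 retains one usable "A" sample.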

Using logit-transformed PSI values is appropriate because PSI values are bound to the (0,1) interval, and are often thought to be beta-distributed. The link function often used with beta-distributed models is the logit function, which is defined as logit(x) = log(x / (1 - x)), and is equivalent to stats::qlogis. Its inverse is equivalent to stats::plogis.
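
The equivalences stated above can be checked directly in base R. The helper names logit and inv_logit below are defined only for this illustration.

```r
# Hand-written logit and its inverse (illustrative helpers)
logit <- function(x) log(x / (1 - x))
inv_logit <- function(x) 1 / (1 + exp(-x))

psi <- c(0.01, 0.25, 0.50, 0.75, 0.99)

# logit matches stats::qlogis, and its inverse matches stats::plogis
same_forward <- isTRUE(all.equal(logit(psi), qlogis(psi)))
same_inverse <- isTRUE(all.equal(inv_logit(logit(psi)), plogis(qlogis(psi))))

# The round trip recovers the original PSI values
round_trip <- isTRUE(all.equal(plogis(qlogis(psi)), psi))
```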

Users wishing to calculate arithmetic means of PSI are advised to use make_matrix, followed by rowMeans on subsetted sample columns.
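
As a sketch of that approach, the following uses a small made-up matrix in place of the output of make_matrix(), so it runs without the package. The treatment vector mirrors the grouping used in the Examples section; with real data, mat would be the matrix returned by make_matrix() and the grouping would come from colData(se).

```r
# Stand-in for a PSI matrix as returned by make_matrix() (values are made up)
mat <- matrix(
  c(0.2, 0.4, 0.6, 0.8, 0.6, 0.7,
    0.1, NA,  0.3, 0.5, 0.5, 0.5),
  nrow = 2, byrow = TRUE,
  dimnames = list(paste0("event", 1:2), paste0("sample", 1:6))
)
treatment <- rep(c("A", "B"), each = 3)

# Arithmetic mean PSI per condition, ignoring NA values
mean_a <- rowMeans(mat[, treatment == "A", drop = FALSE], na.rm = TRUE)
mean_b <- rowMeans(mat[, treatment == "B", drop = FALSE], na.rm = TRUE)

arith_means <- data.frame(
  EventName = rownames(mat),
  A = mean_a,
  B = mean_b
)
```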

Value

For make_matrix: A matrix of PSI (or alternate) values, with columns as samples and rows as ASE events.

For make_diagonal: A 3-column data frame, with the first column containing the ASE events given in event_list, and the last two columns containing the average PSI values of the nom_DE and denom_DE conditions, respectively.

Functions

  • make_matrix: constructs a matrix of PSI values of the given alternative splicing events (ASEs)

  • make_diagonal: constructs a table of "average" PSI values

Examples

library(NxtIRFcore)

se <- NxtIRF_example_NxtSE()

colData(se)$treatment <- rep(c("A", "B"), each = 3)

event_list <- rowData(se)$EventName

mat <- make_matrix(se, event_list[1:10])

diag_values <- make_diagonal(se, event_list,
  condition = "treatment", nom_DE = "A", denom_DE = "B"
)

alexchwong/NxtIRFcore documentation built on Oct. 31, 2022, 9:14 a.m.