call_DTU: Calculate differential transcript usage.

View source: R/rats.R

call_DTU    R Documentation

Description

There are two modes for input:

  • Bootstrapped count estimates. This requires the following parameters: boot_data_A and boot_data_B.

  • Count estimates. This requires the following parameters: count_data_A and count_data_B.

Usage

call_DTU(
  annot = NULL,
  TARGET_COL = "target_id",
  PARENT_COL = "parent_id",
  count_data_A = NULL,
  count_data_B = NULL,
  boot_data_A = NULL,
  boot_data_B = NULL,
  name_A = "Condition-A",
  name_B = "Condition-B",
  varname = "condition",
  use_sums = FALSE,
  p_thresh = 0.05,
  abund_thresh = 5,
  dprop_thresh = 0.2,
  correction = "BH",
  scaling = 1,
  testmode = "both",
  lean = TRUE,
  qboot = TRUE,
  qbootnum = 0L,
  qrep_thresh = 0.95,
  rboot = TRUE,
  rrep_thresh = 0.85,
  description = NA_character_,
  verbose = TRUE,
  threads = 1L,
  seed = NA_integer_,
  reckless = FALSE,
  dbg = "0"
)

Arguments

annot

A data.table matching transcript identifiers to gene identifiers. Any additional columns are allowed but ignored.

TARGET_COL

The name of the column for the transcript identifiers in annot. (Default "target_id")

PARENT_COL

The name of the column for the gene identifiers in annot. (Default "parent_id")

count_data_A

A data.table of estimated counts for condition A. One column per sample/replicate, one row per transcript. The first column should contain the transcript identifiers.

count_data_B

A data.table of estimated counts for condition B. One column per sample/replicate, one row per transcript. The first column should contain the transcript identifiers.
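
As an illustration of the layout described for annot, count_data_A and count_data_B, here is a minimal sketch with made-up transcript and gene identifiers. The values, the replicate column names and the use of data.frame as a starting point are assumptions for the example; the tables are converted to data.table only if that package is installed.

```r
# Toy annotation: one row per transcript, mapping it to its parent gene.
# Column names follow the TARGET_COL / PARENT_COL defaults.
annot <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),
  parent_id = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Toy counts: first column holds the transcript identifiers,
# followed by one column per sample/replicate.
count_data_A <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),
  rep1 = c(100, 20, 50),
  rep2 = c(110, 25, 45),
  stringsAsFactors = FALSE
)
count_data_B <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),
  rep1 = c(30, 90, 55),
  rep2 = c(35, 95, 48),
  stringsAsFactors = FALSE
)

# The arguments are documented as data.tables, so convert when possible.
if (requireNamespace("data.table", quietly = TRUE)) {
  annot <- data.table::as.data.table(annot)
  count_data_A <- data.table::as.data.table(count_data_A)
  count_data_B <- data.table::as.data.table(count_data_B)
}

# Sanity check: every quantified transcript appears in the annotation.
ids_ok <- all(count_data_A$target_id %in% annot$target_id) &&
  all(count_data_B$target_id %in% annot$target_id)

# With the package loaded, the call would then take this form (not run here):
# result <- call_DTU(annot = annot,
#                    count_data_A = count_data_A,
#                    count_data_B = count_data_B)
```

Checking the identifiers up front is worthwhile because, unless reckless is enabled, RATs aborts on inconsistencies between the annotation and the quantifications.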

boot_data_A

A list of data.tables, one per sample/replicate of condition A. One bootstrap iteration's estimates per column, one transcript per row. The first column should contain the transcript identifiers.

boot_data_B

A list of data.tables, one per sample/replicate of condition B. One bootstrap iteration's estimates per column, one transcript per row. The first column should contain the transcript identifiers.
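
The bootstrapped input differs from the count input in that each condition is a list with one table per replicate. The sketch below builds such a list from random toy numbers; make_replicate() is a hypothetical helper written for this example, not part of RATs, and the Poisson draws merely stand in for real bootstrap estimates.

```r
set.seed(1)  # only so that the toy numbers are repeatable
tx_ids <- c("tx1", "tx2", "tx3")

# Hypothetical helper: one replicate's table with n_boot bootstrap columns.
make_replicate <- function(ids, n_boot) {
  est <- matrix(rpois(length(ids) * n_boot, lambda = 50), nrow = length(ids))
  colnames(est) <- paste0("boot", seq_len(n_boot))
  # First column: transcript identifiers; remaining columns: one per iteration.
  data.frame(target_id = ids, est, stringsAsFactors = FALSE)
}

# Two replicates per condition, four bootstrap iterations each.
boot_data_A <- list(make_replicate(tx_ids, 4), make_replicate(tx_ids, 4))
boot_data_B <- list(make_replicate(tx_ids, 4), make_replicate(tx_ids, 4))

# The arguments are documented as lists of data.tables, so convert when possible.
if (requireNamespace("data.table", quietly = TRUE)) {
  boot_data_A <- lapply(boot_data_A, data.table::as.data.table)
  boot_data_B <- lapply(boot_data_B, data.table::as.data.table)
}

# With the package loaded, the call would then take this form (not run here):
# result <- call_DTU(annot = annot,
#                    boot_data_A = boot_data_A,
#                    boot_data_B = boot_data_B)
```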

name_A

The name for one condition. (Default "Condition-A")

name_B

The name for the other condition. (Default "Condition-B")

varname

The name of the covariate to which the two conditions belong. (Default "condition").

use_sums

Use each transcript's sum of abundances across the replicates, instead of the mean. This increases sensitivity, but also increases the risk of false positives. Up to and including version 0.6.5, RATs used only the sums. (Default FALSE)

p_thresh

The p-value threshold. (Default 0.05)

abund_thresh

Noise threshold. Minimum mean (across replicates) abundance for transcripts (and genes) to be eligible for testing. (Default 5)
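
To make the noise threshold concrete, the following sketch applies a mean-abundance cutoff to toy counts for a single condition. It only illustrates the idea of the filter; whether RATs compares with >= or >, and how it combines the two conditions, is not stated here and is an assumption of the example.

```r
abund_thresh <- 5

# Toy abundances: one row per transcript, one column per replicate.
counts <- matrix(
  c(100, 110, 90,
      3,   4,  6,
      0,   1,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("tx1", "tx2", "tx3"), c("rep1", "rep2", "rep3"))
)

# Mean abundance across replicates, then the eligibility flag.
mean_abund <- rowMeans(counts)
eligible <- mean_abund >= abund_thresh  # assumed inclusive comparison
```

Here tx2 has one replicate above the threshold, but its mean (about 4.3) falls below it, so only tx1 would be eligible.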

dprop_thresh

Effect size threshold. Minimum change in proportion of a transcript for it to be considered meaningful. (Default 0.20)
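
The effect size is a change in a transcript's share of its gene's total abundance. A minimal sketch with toy numbers for one gene with three transcripts follows; the inclusive comparison against dprop_thresh is an assumption of the example rather than a statement about the RATs internals.

```r
dprop_thresh <- 0.2

# Toy mean abundances of the three transcripts of one gene.
abund_A <- c(tx1 = 60, tx2 = 30, tx3 = 10)
abund_B <- c(tx1 = 30, tx2 = 55, tx3 = 15)

# Each transcript's proportion of the gene total, per condition.
prop_A <- abund_A / sum(abund_A)
prop_B <- abund_B / sum(abund_B)

# Change in proportion between conditions: -0.30, 0.25, 0.05
dprop <- prop_B - prop_A

# Flag changes whose absolute size reaches the threshold.
meaningful <- abs(dprop) >= dprop_thresh
```

With these numbers tx1 and tx2 shift by at least 0.2, while the 0.05 shift of tx3 would not count as meaningful regardless of its p-value.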

correction

The p-value correction to apply, as defined in p.adjust.methods. (Default "BH")
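
The accepted values are those of base R's p.adjust.methods. The sketch below shows, on made-up p-values, how the "BH" correction interacts with a 0.05 threshold; it uses only stats::p.adjust and is independent of RATs.

```r
# Methods accepted by p.adjust(); "BH" is among them.
available <- p.adjust.methods

# Toy raw p-values for five tests.
pvals <- c(0.001, 0.01, 0.02, 0.045, 0.5)

# Benjamini-Hochberg adjusted p-values.
padj <- p.adjust(pvals, method = "BH")

# Compare against the significance threshold.
p_thresh <- 0.05
significant <- padj < p_thresh
```

The fourth test has a raw p-value below 0.05 but an adjusted value above it, so it would no longer be called significant after correction.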

scaling

A scaling factor or vector of scaling factors, to be applied to the abundances *prior* to any thresholding and testing. Useful for scaling TPMs (transcripts per million) to the actual library sizes of the samples. If a vector is supplied, the order should correspond to the samples in group A followed by the samples in group B. WARNING: Improper use of the scaling factor will artificially inflate/deflate the significances obtained. (Default 1)
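
One way to build a per-sample scaling vector, assuming the abundances are TPMs and the library sizes (in reads) are known, is to divide each library size by one million. The library sizes below are invented for the example; the only detail taken from the description is the ordering, group A samples first and group B samples after.

```r
# Toy library sizes in reads, two samples per condition.
lib_sizes_A <- c(25e6, 30e6)
lib_sizes_B <- c(28e6, 22e6)

# One factor per sample, ordered A then B.
scaling <- c(lib_sizes_A, lib_sizes_B) / 1e6

# Effect on a single TPM value in the first sample of group A.
tpm_value <- 12.5
scaled_value <- tpm_value * scaling[1]
```

The resulting vector has one entry per sample across both conditions, so its length should equal the total number of replicates supplied.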

testmode

One of

  • "genes",

  • "transc",

  • "both".

(Default "both")

lean

Reduce memory footprint by not tracking mean/median/max/min/stdev for Dprop and pval across bootstrap iterations. The respective columns will be absent from the output structure. (Default TRUE)

qboot

Bootstrap the DTU robustness against the bootstrapped quantification data. Ignored if the input is count_data. (Default TRUE)

qbootnum

Number of iterations for qboot. (Default 0) If 0, RATs will try to infer a value from the data.

qrep_thresh

Reproducibility threshold for quantification bootstrapping. (Default 0.95)

rboot

Bootstrap the DTU robustness against the replicates. Runs *ALL* 1 vs 1 combinations of replicates. (Default TRUE)

rrep_thresh

Reproducibility threshold for replicate bootstrapping. (Default 0.85) With few replicates per condition, the reproducibility takes heavily quantized values. For example, with 3 replicates per condition there are 9 possible 1 vs 1 comparisons, so the highest reproducibility below 1 is 8/9 ≈ 0.89.
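
The quantization can be worked out directly. The sketch below enumerates the 1 vs 1 replicate pairings for a 3x3 design and lists the reproducibility fractions that are possible; treating the threshold as an inclusive cutoff is an assumption of the example.

```r
reps_A <- 3
reps_B <- 3
rrep_thresh <- 0.85

# Every pairing of one replicate from A with one replicate from B.
pairs <- expand.grid(A = seq_len(reps_A), B = seq_len(reps_B))
n_pairs <- nrow(pairs)

# The fraction of agreeing comparisons can only be a multiple of 1/n_pairs.
possible <- (0:n_pairs) / n_pairs

# Fractions that would satisfy the threshold.
passing <- possible[possible >= rrep_thresh]
```

For a 3x3 design only 8/9 and 9/9 pass the default threshold, so at most one of the nine comparisons may disagree. Raising rrep_thresh above 8/9 would effectively demand unanimity.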

description

Free-text description of the run. You can use this to add metadata to the results object. (Default NA)

verbose

Display progress updates and warnings. (Default TRUE)

threads

Number of threads to use. (Default 1) Multi-threading will be ignored on non-POSIX systems.

seed

An integer used to initialise the random number engine. Use this only if reproducible bootstrap selections are required. (Default NA)

reckless

RATs normally aborts if any inconsistency is detected among the transcript IDs found in the annotation and the quantifications. Enabling reckless mode will downgrade this error to a warning and allow RATs to continue the run. Not recommended unless you know why the inconsistency exists and how it will affect the results. (Default FALSE)

dbg

Debugging mode only. Interrupt execution at the specified flag-point. Used to speed up code-tests by avoiding irrelevant downstream processing. (Default "0": do not interrupt)

Value

List of mixed types. Contains a list of runtime settings, a table of gene-level results, a table of transcript-level results, and a list of two tables with the transcript abundances.


bartongroup/RATS documentation built on June 8, 2022, 12:40 a.m.