collapseTpm: Collapse bundles of transcripts, discard any with (default) <...

View source: R/collapseTpm.R

Description

Collapse bundles of transcripts, discard any with fewer than minTPM TPM per bundle (0.01 in the signature under 'Usage'), and optionally prune any whose joined bundle IDs tend to choke downstream packages, e.g. for pathway- or network-based enrichment analysis. Note that this function may or may not be optimal for your RNA-seq experiment. Please refer to 'Details' for some thought exercises about the nature of 'genes'.

Usage

collapseTpm(kexp, bundleID = "gene_id", minTPM = 0.01,
  discardjoined = TRUE, tx_biotype = NULL, gene_biotype = NULL,
  biotype_class = NULL, ...)

Arguments

kexp

A KallistoExperiment (or something very much like it)

bundleID

The column (in mcols(rowRanges(kexp))) of the bundle IDs

minTPM

Discard transcripts/bundles with < this many TPMs (0.01)

discardjoined

Discard bundles with IDs "joined" by a ";"? (TRUE)

tx_biotype

Restrict to a specific mcols(kexp)$tx_biotype? (NULL)

gene_biotype

Restrict to a specific mcols(kexp)$gene_biotype? (NULL)

biotype_class

Restrict to a specific mcols(kexp)$biotype_class? (NULL)

...

Any additional parameters.
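
The discardjoined argument refers to bundle IDs that have been "joined" by a ";". As a rough illustration of what such a check could look like in base R (the IDs below are made up, and this is a sketch rather than the package's own code):

```r
# Hypothetical bundle IDs; the second one is two IDs joined by ";".
ids <- c("ENSG01", "ENSG02;ENSG03", "ENSG04")

# fixed = TRUE treats ";" as a literal character rather than a regex.
joined <- grepl(";", ids, fixed = TRUE)

# Keep only the IDs that are not joined.
unjoined_ids <- ids[!joined]
print(unjoined_ids)
```

With discardjoined = TRUE, bundles like the second ID here would presumably be the ones dropped.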

Details

This function sums the transcripts per million (TPM) of each transcript within each bundle of transcripts ("bundle" being a user-defined identifier, often but not always a 'gene', sometimes a biotype or a class of repeat elements).
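
The per-bundle summation can be sketched in base R with rowsum(). The matrix values, transcript names, and bundle labels below are invented for illustration, and this is not necessarily how collapseTpm is implemented internally:

```r
# Toy TPM matrix: rows are transcripts, columns are samples (made-up values).
tpm <- matrix(c(0.5,   1.5,
                2.0,   0.0,
                0.001, 0.002),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("tx1", "tx2", "tx3"), c("s1", "s2")))

# Hypothetical bundle ID for each transcript (e.g. taken from a gene_id column).
bundle <- c("geneA", "geneA", "geneB")

# rowsum() adds together the rows that share a group label, column by column.
bundled <- rowsum(tpm, group = bundle)
print(bundled)
```

Here tx1 and tx2 collapse into a single geneA row, while geneB keeps the values of its only transcript.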

The default approach is to discard all rows where the maximum TPM is less than the specified cutoff (minTPM). With the default cutoff of 0.01 TPM shown in 'Usage', this means discarding bundles whose total TPM estimate is < 0.01. (Filtering tends to increase statistical power at a given false-positive rate.)
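
A minimal sketch of this kind of filter in base R, assuming the maximum is taken across samples for each bundle (the matrix and bundle names are made up, and the cutoff simply mirrors the minTPM default from 'Usage'):

```r
# Toy bundle-level TPM matrix: rows are bundles, columns are samples.
bundled <- matrix(c(2.5,   1.5,
                    0.001, 0.002,
                    0.0,   0.05),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

minTPM <- 0.01  # cutoff, matching the default in the Usage signature

# Assumption: a row is kept if its largest value in any sample reaches the cutoff.
keep <- apply(bundled, 1, max) >= minTPM

# drop = FALSE keeps the result a matrix even if only one row survives.
filtered <- bundled[keep, , drop = FALSE]
print(filtered)
```

In this toy case geneB is removed because it stays below the cutoff in both samples, whereas geneC survives on the strength of a single sample.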

Value

A matrix of TPMs, with one row per bundle and one column per sample.


RamsinghLab/arkas_staging documentation built on March 14, 2021, 11:40 a.m.