CorrectLowExpression: Correct sample/gene combinations that have expression values...

View source: R/utility_functions.R

CorrectLowExpression    R Documentation

Correct sample/gene combinations that have expression values of 0 or close to 0 to stabilize results

Description

Correct sample/gene combinations that have expression values of 0 or close to 0 to stabilize results

Usage

CorrectLowExpression(y, CLEParam = 0.05)

Arguments

y

is the data for the current gene/sample combination

CLEParam

is the parameter (between 0 and 1) that controls the correction threshold (see Details for more information)

Details

The CLEParam parameter, denoted $a$ below, works as follows: any TPM value that is less than 100*a percent of the total gene-level expression for the sample is replaced by 100*a percent of this expression. Mathematically, let $T_{ij}$ be the TPM value for transcript $j = 1, ..., D$ in sample $i = 1, ..., n$ within a given gene with $D$ transcripts. Any $T_{ij} < a * (T_{i1} + ... + T_{iD})$ will be replaced by $a * (T_{i1} + ... + T_{iD})$. As a result, the relative transcript abundance fractions (RTAFs) are zero only when every $T_{ij}$ is equal to zero. The value of $a$ can be increased or decreased to modify the observed TPM values more or less heavily. As $a$ increases, the proportions are driven closer to each other, with each converging towards $1/D$ as $a$ converges to 1 (and each equal to $1/D$ for $a > 1$). We find that $a = 0.05$ is a good compromise: it is large enough to stabilize the ilr coordinates sufficiently without over-modifying the observed data. This value is used by default for all CompDTU and CompDTUme results, and we recommend keeping it at the default of 0.05.
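The replacement rule described in Details can be illustrated with a minimal sketch. The helper `correct_low_expression_sketch` below is hypothetical and is not the package's implementation; it assumes the input is a plain numeric vector holding the TPM values of all transcripts of one gene in one sample, which may differ from the form of `y` that CorrectLowExpression expects. The TPM values are made up for illustration.

```r
# Hypothetical helper (not part of CompDTUReg): applies the rule from Details
# to the TPM values of one sample within one gene.
correct_low_expression_sketch <- function(tpm, a = 0.05) {
  stopifnot(is.numeric(tpm), is.numeric(a), length(a) == 1, a >= 0)
  floor_value <- a * sum(tpm)   # a * (T_{i1} + ... + T_{iD})
  pmax(tpm, floor_value)        # raise every value below the floor up to it
}

# Made-up TPM values for a gene with D = 4 transcripts (gene total = 100)
tpm <- c(0, 1, 19, 80)

# With a = 0.05 the floor is 5, so the first two values are raised to 5
corrected <- correct_low_expression_sketch(tpm, a = 0.05)

# RTAFs computed from the corrected values; none is exactly zero
rtaf <- corrected / sum(corrected)

# With a = 1 every value is raised to the gene total, so each RTAF is 1/D
rtaf_a1 <- correct_low_expression_sketch(tpm, a = 1)
rtaf_a1 <- rtaf_a1 / sum(rtaf_a1)

# An all-zero gene is left unchanged, since the floor is also zero
all_zero <- correct_low_expression_sketch(c(0, 0, 0))
```

In this sketch only values below the floor change; the larger values (19 and 80) are kept as observed, so the correction perturbs the data only where the RTAFs would otherwise be zero or near zero.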


skvanburen/CompDTUReg documentation built on Jan. 23, 2025, 9:01 a.m.