preprocess: PreProcess Expression Matrix

View source: R/pipe-preprocess.R

Description

If the data is passed in TPM form, it is first transformed to logTPM. Low-quality cells are then removed, filtered according to a cutoff for the number of genes detected. Next, low-quality genes are removed, filtered according to a cutoff for their average expression across all cells. Finally, the data is centered so that the average expression of each gene across all cells is 0. Note: the standard deviation is left as is, rather than being normalised to 1, because s.d. values are heavily skewed by the sparsity of scRNA data (many 0s).
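
The four steps described above can be sketched in base R. This is an illustration only, not the package's implementation: `sketch_preprocess` is a hypothetical name, the log transform is assumed to be `log2(TPM/10 + 1)`, and the use of `>=` and of the mean log expression for the gene filter are likewise assumptions. The toy cutoffs are far smaller than the function's defaults.

```r
# Hypothetical stand-in for the described steps; not the package's own code.
sketch_preprocess <- function(tpm, complexity_cutoff, genes_cutoff) {
  # 1. Log-transform (assumed form: log2(TPM/10 + 1)).
  logtpm <- log2(tpm / 10 + 1)
  # 2. Keep cells (columns) with at least `complexity_cutoff` detected genes.
  detected <- colSums(tpm > 0)
  logtpm <- logtpm[, detected >= complexity_cutoff, drop = FALSE]
  # 3. Keep genes (rows) whose mean log expression reaches `genes_cutoff`.
  gene_avg <- rowMeans(logtpm)
  logtpm <- logtpm[gene_avg >= genes_cutoff, , drop = FALSE]
  # 4. Center each gene at 0 across the remaining cells; the s.d. is untouched.
  sweep(logtpm, 1, rowMeans(logtpm), "-")
}

# Toy genes-by-cells TPM matrix (made-up values).
tpm <- matrix(c(100, 200,  0, 300,
                 50,   0,  0, 150,
                  0,  10,  0,   0,
                400, 300, 20, 500),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))

out <- sketch_preprocess(tpm, complexity_cutoff = 2, genes_cutoff = 2)
dim(out)       # genes and cells that survived both filters
rowMeans(out)  # each remaining gene is centered at (numerically) 0
```

In this toy input, cell3 expresses only one gene and gene3 is barely expressed, so both are dropped before centering.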

Usage

preprocess(mat, pipeName = "preprocess", cachePath = ".",
  logTransform = TRUE, complexity.cutoff = 3000, genes.cutoff = 4,
  centering = TRUE)

Arguments

mat

matrix of variables by observations, i.e. genes by cells (in TPM or logTPM).

pipeName

a job ID, a name for the project/pipeline. Defaults to function name.

cachePath

a character string giving the path to the cache directory. Passed to cacheCall::cacheCall.

logTransform

if TRUE, apply logTPM to matrix.

complexity.cutoff

if a numeric value, apply complexityCut to the matrix with that value as the cutoff. If FALSE, skip complexityCut.

genes.cutoff

if a numeric value, apply genesCut to the matrix with that value as the cutoff. If FALSE, skip genesCut.

centering

if TRUE, apply center to matrix.
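
The "numeric value or FALSE" convention used by complexity.cutoff and genes.cutoff can be illustrated with a small base-R sketch. The helpers `maybe_cut` and `toy_complexity_cut` are hypothetical and only mimic the documented behaviour; how the package itself dispatches on these arguments is not shown in this page.

```r
# Hypothetical helper, not part of the package: run `step` only when
# `cutoff` is numeric, and return the matrix unchanged when it is FALSE.
maybe_cut <- function(mat, cutoff, step) {
  if (is.numeric(cutoff)) return(step(mat, cutoff))
  if (identical(cutoff, FALSE)) return(mat)
  stop("cutoff must be a numeric value or FALSE")
}

# Stand-in for a complexity filter: keep columns (cells) with at least
# `cutoff` non-zero rows (detected genes).
toy_complexity_cut <- function(mat, cutoff) {
  mat[, colSums(mat > 0) >= cutoff, drop = FALSE]
}

m <- matrix(c(1, 0, 2,
              3, 0, 0,
              4, 5, 0), nrow = 3, byrow = TRUE)

cut_applied <- maybe_cut(m, 2, toy_complexity_cut)      # filter runs
cut_skipped <- maybe_cut(m, FALSE, toy_complexity_cut)  # filter is skipped
ncol(cut_applied)
ncol(cut_skipped)
```

Using FALSE rather than a sentinel number keeps "skip this step" distinct from any cutoff value a user might legitimately pass.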

Value

if all steps are run, returns a centered, log-transformed matrix containing only the cells and genes that pass the user-defined quality cutoffs.


jlaffy/statistrics documentation built on May 23, 2019, 4:04 a.m.