processData: Check and process input to runImpulseDE2()

View source: R/srcImpulseDE2_processData.R

processData    R Documentation

Check and process input to runImpulseDE2()

Description

Checks the validity of the input and processes the count data matrix and annotation into the data structures used later in runImpulseDE2. processData is structured as follows:

  • Subhelper functions:

    • checkNull() Checks whether an object was supplied (is not NULL).

    • checkDimMatch() Checks whether dimensions of matrices agree.

    • checkElementMatch() Checks whether vectors are identical.

    • checkNumeric() Checks whether elements are numeric.

    • checkProbability() Checks whether elements are probabilities.

    • checkCounts() Checks whether elements are count data.

  • Helper functions:

    • checkData() Check format and presence of input data.

    • nameGenes() Name genes if names are not given.

    • procAnnotation() Add categorical time variable to annotation table. Add nested batch column if necessary. Reduce to samples used.

    • reduceCountData() Reduce count data to the data used later.

  • Script body
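
The sub-helper checks listed above can be illustrated with a minimal R sketch. The functions below are hypothetical stand-ins written for illustration (note the Sketch suffix), not the package's actual implementations; their signatures, error messages and exact conditions are assumptions.

```r
# Hypothetical stand-ins for the sub-helper checks; NOT ImpulseDE2 source code.

# Stop if the object was not supplied (is NULL).
checkNullSketch <- function(objectInput, strObjectInput) {
    if (is.null(objectInput)) {
        stop(paste0(strObjectInput, " was not given as input."))
    }
    invisible(TRUE)
}

# Stop if the input is not of numeric type. NA entries are allowed.
checkNumericSketch <- function(matInput, strMatInput) {
    if (!is.numeric(matInput)) {
        stop(paste0(strMatInput, " contains non-numeric elements."))
    }
    invisible(TRUE)
}

# Stop if any observed element lies outside [0, 1].
checkProbabilitySketch <- function(matInput, strMatInput) {
    checkNumericSketch(matInput, strMatInput)
    if (any(matInput < 0 | matInput > 1, na.rm = TRUE)) {
        stop(paste0(strMatInput, " contains elements outside [0, 1]."))
    }
    invisible(TRUE)
}

# Stop if any observed element is negative or not a whole number.
checkCountsSketch <- function(matInput, strMatInput) {
    checkNumericSketch(matInput, strMatInput)
    if (any(matInput < 0, na.rm = TRUE)) {
        stop(paste0(strMatInput, " contains negative elements."))
    }
    if (any(matInput %% 1 != 0, na.rm = TRUE)) {
        stop(paste0(strMatInput, " contains non-integer elements."))
    }
    invisible(TRUE)
}

# Toy count matrix with one unobserved (NA) entry; passes silently.
matToy <- matrix(c(0, 5, NA, 12), nrow = 2)
checkCountsSketch(matToy, "matToy")
```

Each check stops with an error on the first violation, so a caller can run the checks in sequence and rely on reaching the end only for valid input.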

Usage

processData(dfAnnotation, matCountData, boolCaseCtrl, vecConfounders,
  vecDispersionsExternal, vecSizeFactorsExternal)

Arguments

dfAnnotation

(data frame samples x covariates) Sample, Condition, Time (numeric), TimeCateg (str) (and confounding variables if given). Annotation table with covariates for each sample.

matCountData

(matrix genes x samples) [Default NULL] Read count data; unobserved entries are NA.

boolCaseCtrl

(bool) Whether to perform case-control analysis. Does case-only analysis if FALSE.

vecConfounders

(vector of strings number of confounding variables) Factors to correct for during batch correction. Dispersion factors must be supplied if more than one confounding variable is given. Names refer to columns in dfAnnotation.

vecDispersionsExternal

(vector length number of genes in matCountData) [Default NULL] Externally generated list of gene-wise dispersion factors which override the DESeq2-generated dispersion factors.

vecSizeFactorsExternal

(vector length number of samples in matCountData) [Default NULL] Externally generated list of size factors which override size factor computation in ImpulseDE2.
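
As a rough illustration of the argument shapes described above, the following base-R sketch builds a toy annotation table and count matrix. All sample names, gene names, the condition label "case", the Batch column, the named vectors and all values are assumptions made up for this example. TimeCateg is left out because procAnnotation() is described as adding it. The call to processData() appears only as a comment, since the function is normally called by runImpulseDE2.

```r
# Toy inputs; all names and values are illustrative assumptions.
set.seed(1)
vecSamples <- paste0("sample_", 1:6)

# Annotation table: one row per sample, plus one confounder column (Batch).
dfAnnotation <- data.frame(
    Sample = vecSamples,
    Condition = rep("case", 6),
    Time = rep(c(0, 2, 4), each = 2),
    Batch = rep(c("B1", "B2"), times = 3),
    stringsAsFactors = FALSE
)

# Count matrix: genes in rows, samples in columns, NA for unobserved entries.
matCountData <- matrix(
    rpois(4 * 6, lambda = 20),
    nrow = 4,
    dimnames = list(paste0("gene_", 1:4), vecSamples)
)
matCountData[2, 3] <- NA

# Optional external parameters: one size factor per sample,
# one dispersion factor per gene.
vecSizeFactorsExternal <- setNames(rep(1, 6), vecSamples)
vecDispersionsExternal <- setNames(rep(0.1, 4), rownames(matCountData))

# Consistency checks mirroring the documented shapes.
stopifnot(
    all(colnames(matCountData) %in% dfAnnotation$Sample),
    length(vecSizeFactorsExternal) == ncol(matCountData),
    length(vecDispersionsExternal) == nrow(matCountData),
    "Batch" %in% colnames(dfAnnotation)
)

# Hypothetical call, shown as a comment only:
# lsProcessed <- processData(dfAnnotation, matCountData,
#     boolCaseCtrl = FALSE, vecConfounders = "Batch",
#     vecDispersionsExternal, vecSizeFactorsExternal)
```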

Value

(list length 5)

  • matCountDataProc (matrix genes x samples) Read count data.

  • dfAnnotationProc (data frame samples x covariates) Sample, Condition, Time (numeric), TimeCateg (str) (and confounding variables if given). Processed annotation table with covariates for each sample.

  • vecSizeFactorsExternalProc (numeric vector number of samples) Model scaling factors for each sample which take sequencing depth into account (size factors).

  • vecDispersionsExternalProc (vector number of genes) Gene-wise negative binomial dispersion hyper-parameter.

  • strReportProcessing (str) String of stdout of processData().
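
To make the structure of the return value concrete, the sketch below builds a mock list with the element names listed above and accesses its parts by name. The list is not produced by processData(); every value, including the format of the TimeCateg strings and the report text, is a placeholder assumption.

```r
# Mock of the documented return structure; all values are placeholders.
vecGenes <- c("gene_1", "gene_2")
vecSamples <- c("sample_1", "sample_2")

lsProcessedMock <- list(
    matCountDataProc = matrix(
        c(3L, 0L, 7L, NA), nrow = 2,
        dimnames = list(vecGenes, vecSamples)
    ),
    dfAnnotationProc = data.frame(
        Sample = vecSamples,
        Condition = c("case", "case"),
        Time = c(0, 2),
        TimeCateg = c("0", "2"),  # placeholder format, an assumption
        stringsAsFactors = FALSE
    ),
    vecSizeFactorsExternalProc = setNames(c(1, 1), vecSamples),
    vecDispersionsExternalProc = setNames(c(0.1, 0.1), vecGenes),
    strReportProcessing = "placeholder processing report"
)

# Elements are accessed by name.
matCounts <- lsProcessedMock$matCountDataProc
dfAnnot <- lsProcessedMock$dfAnnotationProc

# The documented dimensions should agree with each other.
stopifnot(
    ncol(matCounts) == nrow(dfAnnot),
    ncol(matCounts) == length(lsProcessedMock$vecSizeFactorsExternalProc),
    nrow(matCounts) == length(lsProcessedMock$vecDispersionsExternalProc),
    is.character(lsProcessedMock$strReportProcessing)
)
```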

Author(s)

David Sebastian Fischer

See Also

Called by runImpulseDE2.


YosefLab/ImpulseDE2 documentation built on Sept. 17, 2022, 2:45 a.m.