processSCData: Prepare single cell data for analysis

Description

Checks that the input is correctly supplied and formatted, then processes the input data for analysis. This function catches errors that would otherwise arise downstream from incorrect combinations of input parameters and explains the input error to the user. Input that passes this function should not cause errors due to incorrect input.

Helper functions:

checkNull() Checks whether an object was supplied (is not NULL).
checkNumeric() Checks whether elements are numeric.
checkCounts() Checks whether elements are count data.
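
The three kinds of check named here can be illustrated with a minimal sketch. The function names, signatures and error messages below are hypothetical stand-ins and are not the package's own helpers; in particular, treating "count data" as non-negative whole numbers with NA allowed is an assumption.

```r
# Hypothetical stand-ins for the helper checks; not the package's own code.
sketchCheckNull <- function(obj, name) {
  # Fail early if the object was not supplied at all.
  if (is.null(obj)) stop(name, " was not supplied (is NULL).")
  invisible(TRUE)
}

sketchCheckNumeric <- function(obj, name) {
  # Fail if the object does not hold numeric values.
  if (!is.numeric(obj)) stop(name, " must be numeric.")
  invisible(TRUE)
}

sketchCheckCounts <- function(obj, name) {
  # Assumption: counts are non-negative whole numbers; NA marks unobserved.
  sketchCheckNumeric(obj, name)
  vals <- obj[!is.na(obj)]
  if (any(vals < 0)) stop(name, " contains negative entries.")
  if (any(vals %% 1 != 0)) stop(name, " contains non-integer entries.")
  invisible(TRUE)
}

# Toy genes x cells matrix with one unobserved entry.
matToy <- matrix(c(0, 3, NA, 7, 1, 0), nrow = 2,
                 dimnames = list(c("geneA", "geneB"),
                                 c("cell1", "cell2", "cell3")))
sketchCheckNull(matToy, "counts")
sketchCheckCounts(matToy, "counts")
```

Reporting the offending argument by name in the error message is what lets such a check explain the input error to the user rather than fail later with an unrelated message.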

Usage

processSCData(counts, dfAnnotation, vecConfoundersDisp, vecConfoundersMu,
  matPiConstPredictors, vecNormConstExternal, strDispModelFull, strDispModelRed,
  strMuModel, scaDFSplinesDisp, scaDFSplinesMu, scaMaxEstimationCycles,
  boolVerbose, boolSuperVerbose)

Arguments

counts

(matrix genes x cells (sparseMatrix or standard) or file) Matrix: count data of all cells; unobserved entries are NA. File: .mtx file from which the count matrix is to be read.
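
As a rough illustration of the file option, a Matrix Market (.mtx) file can be read into a sparse matrix with readMM() from the Matrix package. Whether processSCData reads the file this way internally is an assumption; the toy matrix, temporary file and dimnames below are made up for the example.

```r
library(Matrix)

# Build a toy sparse genes x cells count matrix and write it to a
# temporary .mtx file so that the example is self-contained.
matSparse <- sparseMatrix(i = c(1, 2, 2), j = c(1, 2, 3),
                          x = c(5, 1, 8), dims = c(2, 3))
strFile <- tempfile(fileext = ".mtx")
writeMM(matSparse, file = strFile)

# Read the file back. The Matrix Market format stores no row or column
# names, so gene and cell names have to be attached separately.
matRead <- readMM(strFile)
dimnames(matRead) <- list(c("geneA", "geneB"),
                          c("cell1", "cell2", "cell3"))
```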

dfAnnotation

(data frame cells x meta characteristics) Annotation table which contains meta data on cells.

vecConfoundersDisp

(vector of strings, length number of confounders on dispersion) [Default NULL] Confounders to correct for in the dispersion batch correction model. Must be a subset of the column names of dfAnnotation that describe confounding variables.

vecConfoundersMu

(vector of strings, length number of confounders on mean) [Default NULL] Confounders to correct for in the mu batch correction model. Must be a subset of the column names of dfAnnotation that describe confounding variables.
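
The subset requirement on both confounder vectors can be verified before calling the function. The following is a minimal sketch; the annotation table, its column names and the error message are hypothetical.

```r
# Toy annotation table (cells x meta characteristics); columns are made up.
dfAnnotationToy <- data.frame(
  cell = c("cell1", "cell2", "cell3"),
  batch = c("b1", "b2", "b1"),
  pseudotime = c(0.1, 0.5, 0.9),
  stringsAsFactors = FALSE
)
vecConfoundersToy <- c("batch")

# Every confounder must name a column of the annotation table.
vecMissing <- setdiff(vecConfoundersToy, colnames(dfAnnotationToy))
if (length(vecMissing) > 0) {
  stop("Confounders not found in dfAnnotation: ",
       paste(vecMissing, collapse = ", "))
}
```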

matPiConstPredictors

(numeric matrix genes x number of constant gene-wise drop-out predictors) Predictors for the logistic drop-out fit other than the offset and the mean parameter (i.e. parameters which are constant for all observations in a gene and externally supplied). NULL if no constant predictors are supplied.

vecNormConstExternal

(numeric vector, length number of cells) Model scaling factors, one per cell, supplied by the user. These factors linearly scale the mean model for evaluation of the log-likelihood. Must be named according to the column names of the count matrix (counts).
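
One way to build such a named vector is sketched below. Scaling by relative sequencing depth is only one common choice and is an assumption here; the source does not prescribe how the factors are computed, only that there is one per cell and that the names match the cells.

```r
# Toy genes x cells count matrix.
matToy <- matrix(c(2, 0, 4, 1, 6, 3), nrow = 2,
                 dimnames = list(c("geneA", "geneB"),
                                 c("cell1", "cell2", "cell3")))

# Assumption: use each cell's total count relative to the mean total.
vecDepth <- colSums(matToy)
vecNormConstToy <- vecDepth / mean(vecDepth)

# colSums() keeps the column names, so the vector is already named by cell.
stopifnot(identical(names(vecNormConstToy), colnames(matToy)))
```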

strDispModelFull

(str) "constant" [Default "constant"] Model according to which dispersion parameter is fit to each gene as a function of population structure in the alternative model (H1).

strDispModelRed

(str) "constant" [Default "constant"] Model according to which dispersion parameter is fit to each gene as a function of population structure in the null model (H0).

strMuModel

(str) "constant" [Default "impulse"] Model according to which the mean parameter is fit to each gene as a function of population structure in the alternative model (H1).

scaDFSplinesDisp

(sca) [Default 3] If strDispModelFull=="splines" or strDispModelRed=="splines", the degrees of freedom of the natural cubic spline to be used as a dispersion parameter model.

scaDFSplinesMu

(sca) [Default 3] If strMuModel=="splines", the degrees of freedom of the natural cubic spline to be used as a mean parameter model.
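
To show what the degrees of freedom control, a natural cubic spline basis can be built with ns() from the splines package. This is an illustration only: that the package constructs its basis with ns(), and that the covariate is a pseudotime-like coordinate, are assumptions.

```r
library(splines)

# Hypothetical continuous covariate, one value per cell.
vecPseudotime <- seq(0, 1, length.out = 10)

# df = 3 yields a basis with three columns (no intercept column), so a
# spline model of this kind has three spline coefficients per gene.
matBasis <- ns(vecPseudotime, df = 3)
dim(matBasis)
```

Raising the degrees of freedom adds basis columns and therefore flexibility, at the cost of more parameters to estimate per gene.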

scaMaxEstimationCycles

(integer) [Default 20] Maximum number of estimation cycles performed in fitZINB(). One cycle contains one estimation of each parameter of the zero-inflated negative binomial model, performed as coordinate ascent.
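
The idea of a capped number of coordinate ascent cycles can be sketched on a toy problem. The example below fits a plain negative binomial (no zero inflation) to simulated data, updating the mean and then the dispersion in each cycle. It is not the fitZINB() implementation; the starting values, search intervals and convergence threshold are all assumptions.

```r
set.seed(1)
# Simulated counts from a negative binomial with mean 5 and size 2.
vecY <- rnbinom(200, mu = 5, size = 2)

# Log-likelihood of the toy model as a function of both parameters.
loglik <- function(mu, size) {
  sum(dnbinom(vecY, mu = mu, size = size, log = TRUE))
}

scaMu <- 1
scaSize <- 1
scaMaxCycles <- 20
scaLLStart <- loglik(scaMu, scaSize)
scaLLOld <- scaLLStart

for (i in seq_len(scaMaxCycles)) {
  # One cycle: update each parameter once while the other is held fixed.
  scaMu <- optimize(function(m) loglik(m, scaSize),
                    interval = c(1e-3, 100), maximum = TRUE)$maximum
  scaSize <- optimize(function(s) loglik(scaMu, s),
                      interval = c(1e-3, 100), maximum = TRUE)$maximum
  scaLLNew <- loglik(scaMu, scaSize)
  # Stop early once a full cycle no longer improves the log-likelihood.
  if (scaLLNew - scaLLOld < 1e-6) break
  scaLLOld <- scaLLNew
}
```

The cap guarantees termination even if the log-likelihood keeps improving by small amounts, which is the role scaMaxEstimationCycles plays in the description above.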

boolVerbose

(bool) Whether to follow convergence of the iterative parameter estimation with one report per cycle.

boolSuperVerbose

(bool) Whether to follow convergence of the iterative parameter estimation in high detail with local convergence flags and step-by-step loglikelihood computation.

Value

list (length 3)

Author(s)

David Sebastian Fischer

See Also

Called by runLineagePulse.


YosefLab/LineagePulse documentation built on May 6, 2019, 2:19 p.m.