plotCountDepth: Evaluate the count-depth relationship before (or after)...

plotCountDepth    R Documentation

Evaluate the count-depth relationship before (or after) normalizing the data.

Description

Quantile regression is used to estimate the dependence of read counts on sequencing depth for every gene. If multiple conditions are provided, a separate plot is produced for each condition and the filters are applied within each condition separately. The plot can be used to evaluate the extent of the count-depth relationship in the dataset, or to evaluate data normalized by alternative methods.
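
As a rough illustration of what a gene-specific count-depth slope is, the sketch below fits a median regression of one simulated gene's log counts on log sequencing depth. It is not SCnorm's code: the simulated data, the log-log scale, and the helper names check_loss and fit_quantile_slope are assumptions made for this example. The fit is hand-rolled in base R so that the block runs without extra packages; in practice a quantile regression routine such as rq() from the quantreg package would normally be used.

```r
# Illustrative sketch only; data and helper names are hypothetical.
set.seed(1)
n_cells <- 500

# Simulated per-cell sequencing depths (assumption: log-normal spread).
depth <- exp(rnorm(n_cells, mean = 10, sd = 0.5))

# One simulated gene whose counts scale proportionally with depth,
# plus noise on the log scale.
log_counts <- log(depth) - 8 + rnorm(n_cells, sd = 0.3)

# Quantile ("check") loss: tau * r for r >= 0, (1 - tau) * |r| for r < 0.
check_loss <- function(r, tau) sum(r * (tau - (r < 0)))

# Fit y = a + b * x at quantile tau. For a fixed slope b, the best
# intercept is a tau-quantile of the residuals, so only b is searched.
fit_quantile_slope <- function(x, y, tau = 0.5) {
  objective <- function(b) {
    r <- y - b * x
    a <- quantile(r, probs = tau, type = 1, names = FALSE)
    check_loss(r - a, tau)
  }
  optimize(objective, interval = c(-10, 10))$minimum
}

slope <- fit_quantile_slope(log(depth), log_counts, tau = 0.5)
slope
```

On this log-log scale a slope near 1 means the gene's counts grow in proportion to depth, while a slope near 0 means the counts are unrelated to depth.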

Usage

plotCountDepth(
  Data,
  NormalizedData = NULL,
  Conditions = NULL,
  Tau = 0.5,
  FilterCellProportion = 0.1,
  FilterExpression = 0,
  NumExpressionGroups = 10,
  NCores = NULL,
  ditherCounts = FALSE
)

Arguments

Data

can be a matrix of single-cell expression where rows are genes and columns are samples (cells). Gene names should not be a column in this matrix, but should be assigned to rownames(Data). Data can also be an object of class SummarizedExperiment that contains the single-cell expression matrix and other metadata. The assays slot contains the expression matrix, which is named "Counts". This matrix should have one row for each gene and one column for each sample. The colData slot should contain a data.frame with one row per sample and columns that contain metadata for each sample. This data.frame should contain a variable that represents biological condition, in the same order as the columns of the expression matrix. Additional information about the experiment can be contained in the metadata slot as a list.

NormalizedData

matrix of normalized expression counts. Rows are genes and columns are samples. Only supply this when evaluating data that have already been normalized.

Conditions

vector of condition labels; this should correspond to the columns of the un-normalized expression matrix. If not provided, the data are assumed to come from the same condition/batch.

Tau

quantile used in the quantile regression that estimates the gene-specific slopes. Default is Tau = 0.5 (the median).

FilterCellProportion

the proportion of non-zero expression estimates a gene must have to be included in the evaluation. Default is 0.10; the proportion used will not go below one that corresponds to fewer than 10 total cells/samples.

FilterExpression

exclude genes whose median non-zero expression is below this threshold from the count-depth plots (default is 0).

NumExpressionGroups

the number of groups to split the data into. Genes are split into equally sized groups based on their non-zero median expression (default is 10).

NCores

number of cores to use. The default (NULL) uses detectCores() - 1. NCores is used to set up a parallel environment with the BiocParallel package, using either MulticoreParam (Linux, Mac) or SnowParam (Windows).

ditherCounts

whether to dither/jitter the counts. This may be useful for data with many ties. Default is FALSE.
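
The filtering and grouping arguments above can be pictured with the following base R sketch. It is only an interpretation of the argument descriptions, not SCnorm's internal code: the simulated counts matrix, the variable names, the way the 10-cell floor is computed, and the rank-based group assignment are all assumptions made for illustration.

```r
# Illustrative sketch only; all names and data below are hypothetical.
set.seed(2)
n_genes <- 300
n_cells <- 50

# Simulated count matrix: rows are genes, columns are cells.
gene_means <- rexp(n_genes, rate = 1 / 2)
counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_means, n_cells)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

FilterCellProportion <- 0.1
FilterExpression <- 0
NumExpressionGroups <- 10

# Assumed floor: never require fewer than 10 cells with non-zero counts.
min_prop <- max(FilterCellProportion, 10 / n_cells)

# Per-gene proportion of non-zero counts and median of the non-zero counts.
prop_nonzero <- rowMeans(counts > 0)
median_nonzero <- apply(counts, 1, function(x) {
  nz <- x[x > 0]
  if (length(nz) > 0) median(nz) else NA_real_
})

# Keep genes that pass both filters.
keep <- prop_nonzero >= min_prop &
  !is.na(median_nonzero) &
  median_nonzero >= FilterExpression
kept <- median_nonzero[keep]

# Split the kept genes into (near) equally sized groups by rank of
# their non-zero median expression.
ranks <- rank(kept, ties.method = "first")
groups <- ceiling(ranks * NumExpressionGroups / length(kept))

table(groups)
```

With 50 cells, the assumed floor raises the effective proportion from 0.1 to 0.2, so a gene needs non-zero counts in at least 10 cells to be kept.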

Value

returns a data.frame containing each gene's slope (its count-depth relationship) and its associated expression group. A plot is also produced.

Author(s)

Rhonda Bacher

Examples

 
library(SCnorm)
data(ExampleSimSCData)
# Condition labels: 90 copies of 1 followed by 90 copies of 2
Conditions <- rep(c(1, 2), each = 90)
# plotCountDepth(Data = ExampleSimSCData, Conditions = Conditions,
#   FilterCellProportion = 0.1)

rhondabacher/SCnorm documentation built on July 8, 2023, 11:36 p.m.