factor_sample_filter: Factor-based Sample Filtering: Function to filter single-cell...

View source: R/sample_filtering.R

factor_sample_filter    R Documentation

Factor-based Sample Filtering: Function to filter single-cell RNA-Seq libraries.

Description

This function returns a sample-filtering report for each cell in the input expression matrix, indicating whether the cell passed factor-based filtering, which is based on a PCA of quality metrics.

Usage

factor_sample_filter(
  expr,
  qual,
  gene_filter = NULL,
  max_exp_pcs = 5,
  qual_select_q_thresh = 0.01,
  force_metrics = NULL,
  good_metrics = NULL,
  min_qual_variance = 0.7,
  zcut = 1,
  mixture = TRUE,
  dip_thresh = 0.01,
  plot = FALSE,
  hist_breaks = 20
)

Arguments

expr

matrix. The data matrix (genes in rows, cells in columns).

qual

matrix. Quality metric data matrix (cells in rows, metrics in columns).

gene_filter

Logical vector indexing genes that will be used for PCA. If NULL, all genes are used.

max_exp_pcs

numeric. Number of expression PCs used in quality metric selection. Default 5.

qual_select_q_thresh

numeric. q-value threshold for quality/expression correlation significance tests. Default 0.01.

force_metrics

logical. If not NULL, indexes the quality metrics to be forcibly included in the quality PCA.

good_metrics

logical. If not NULL, indexes the quality metrics for which a higher value indicates better quality.

min_qual_variance

numeric. Minimum proportion of selected quality variance addressed in filtering. Default 0.70

zcut

A numeric value determining threshold Z-score for sd, mad, and mixture sub-criteria. Default 1.

mixture

A logical value determining whether the mixture modeling sub-criterion will be applied per primary criterion (quality score). If TRUE, a dip test is applied to each quality score. If a score is multimodal, it is fit to a two-component normal mixture model. Samples deviating by more than zcut standard deviations from the optimal mean (in the inferior direction) fail this sub-criterion.

dip_thresh

A numeric value determining dip test p-value threshold. Default 0.01.

plot

logical. Should a plot be produced?

hist_breaks

Value passed to the breaks argument of hist(). Ignored if plot = FALSE.
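
The zcut rule described under the mixture argument can be illustrated with a minimal base R sketch. It is not scone's implementation: the helper name fails_mixture_rule is hypothetical, the component mean and standard deviation are supplied by hand rather than estimated from a fitted mixture model, the dip test is omitted, and the sketch assumes that higher scores mean better quality.

```r
# Hypothetical helper (not part of scone): flag samples whose quality score
# lies more than `zcut` standard deviations below the mean of the better
# ("optimal") mixture component. Assumes higher scores mean better quality.
fails_mixture_rule <- function(score, best_mean, best_sd, zcut = 1) {
  score < best_mean - zcut * best_sd
}

# Toy quality scores. The component parameters below are made up for
# illustration; in practice they would come from a fitted two-component
# normal mixture.
score <- c(0, 1.4, 1.6, 3)
failed <- fails_mixture_rule(score, best_mean = 2, best_sd = 0.5, zcut = 1)

# The cutoff is 2 - 1 * 0.5 = 1.5, so the scores 0 and 1.4 fail.
print(failed)
```

Raising zcut moves the cutoff further from the optimal mean and therefore fails fewer samples.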

Details

None
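
As an illustration of how min_qual_variance might act, the following base R sketch runs a PCA on a toy quality matrix and keeps the smallest number of leading PCs whose cumulative variance reaches the threshold. This is one plausible reading of the argument description, not a statement of the package's exact procedure; the metric names and simulated values are assumptions made for the example.

```r
set.seed(1)

# Toy quality matrix: 50 cells in rows, 4 made-up metrics in columns.
qual <- cbind(
  NCOUNTS = rnorm(50, mean = 1e4, sd = 2e3),
  NGENES  = rnorm(50, mean = 3e3, sd = 500),
  PCT_MT  = runif(50, min = 0, max = 0.2),
  ALIGNED = runif(50, min = 0.6, max = 1)
)
min_qual_variance <- 0.7

# PCA of centered and scaled quality metrics.
qpc <- prcomp(qual, center = TRUE, scale. = TRUE)
var_explained <- qpc$sdev^2 / sum(qpc$sdev^2)

# Smallest number of leading PCs whose cumulative proportion of variance
# reaches the threshold.
num_qual_pcs <- which(cumsum(var_explained) >= min_qual_variance)[1]

# Per-cell scores on the retained quality PCs.
qual_scores <- qpc$x[, seq_len(num_qual_pcs), drop = FALSE]
dim(qual_scores)
```

Each retained column of qual_scores would then play the role of a quality score to which a filtering criterion is applied.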

Value

A logical vector indicating which samples pass the factor-based filter.

Examples

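# Simulated count matrix: 100 genes (rows) by 10 cells (columns)
mat <- matrix(rpois(1000, lambda = 5), ncol = 10)
colnames(mat) <- paste("X", 1:ncol(mat), sep = "")

# Two quality metrics per cell: total counts and number of detected genes
qc <- as.matrix(cbind(colSums(mat), colSums(mat > 0)))
rownames(qc) <- colnames(mat)
colnames(qc) <- c("NCOUNTS", "NGENES")

# Apply the factor-based filter; a q-value threshold of 1 keeps all metrics
mfilt <- factor_sample_filter(expr = mat,
    qc, plot = TRUE, qual_select_q_thresh = 1)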
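
A typical next step is to use the returned logical to subset the expression matrix. The sketch below does not call factor_sample_filter; keep is a hand-written stand-in for a result such as mfilt, and it assumes the result has one entry per cell, in the same order as the columns of the expression matrix.

```r
# Stand-in inputs: `mat` mimics an expression matrix (genes x cells) and
# `keep` mimics the logical returned by the filter (hypothetical values).
mat <- matrix(1:40, ncol = 10)
colnames(mat) <- paste0("X", 1:ncol(mat))
keep <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE)

# Keep only the cells that passed; drop = FALSE preserves the matrix shape
# even if a single cell remains.
mat_filtered <- mat[, keep, drop = FALSE]

# Summarize the outcome: number of cells kept and names of cells removed.
sum(keep)
colnames(mat)[!keep]
```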


YosefLab/scone documentation built on Oct. 21, 2024, 4:39 p.m.