View source: R/sample_filtering.R
metric_sample_filter | R Documentation |
This function returns a sample-filtering report for each cell in the input expression matrix, describing which filtering criteria are satisfied.
metric_sample_filter(
expr,
nreads = colSums(expr),
ralign = NULL,
gene_filter = NULL,
pos_controls = NULL,
scale. = FALSE,
glen = NULL,
AUC_range = c(0, 15),
zcut = 1,
mixture = TRUE,
dip_thresh = 0.05,
hard_nreads = 25000,
hard_ralign = 15,
hard_breadth = 0.2,
hard_auc = 10,
suff_nreads = NULL,
suff_ralign = NULL,
suff_breadth = NULL,
suff_auc = NULL,
plot = FALSE,
hist_breaks = 10,
...
)
expr |
matrix The data matrix (genes in rows, cells in columns). |
nreads |
A numeric vector giving the number of reads in each library. Defaults to 'colSums' of 'expr'. |
ralign |
A numeric vector giving the proportion of reads aligned to the reference genome in each library. If NULL, filtered_ralign is returned as NA. |
gene_filter |
A logical vector indexing the genes used to compute library transcriptome breadth. If NULL, filtered_breadth is returned as NA. |
pos_controls |
A logical, numeric, or character vector indicating the positive control genes used to compute false-negative rate (FNR) characteristics. If NULL, filtered_fnr is returned as NA. |
scale. |
logical. Should expression be scaled by total expression for FNR computation? Default FALSE. |
glen |
Gene lengths for gene-length normalization (normalized data used in FNR computation). |
AUC_range |
A numeric vector of two values giving the range over which FNR AUC is computed, in units of log(expr_units). Default c(0, 15). |
zcut |
A numeric value determining threshold Z-score for sd, mad, and mixture sub-criteria. Default 1. If NULL, only hard threshold sub-criteria will be applied. |
mixture |
A logical value determining whether the mixture modeling sub-criterion is applied to each primary criterion (metric). If TRUE, a dip test is applied to each metric. If a metric is multimodal, it is fit with a two-component normal mixture model. Samples that deviate by zcut sd's from the optimal mean (in the inferior direction) fail this sub-criterion. |
dip_thresh |
A numeric value determining dip test p-value threshold. Default 0.05. |
hard_nreads |
numeric. Hard (lower bound on) nreads threshold. Default 25000. |
hard_ralign |
numeric. Hard (lower bound on) ralign threshold. Default 15. |
hard_breadth |
numeric. Hard (lower bound on) breadth threshold. Default 0.2. |
hard_auc |
numeric. Hard (upper bound on) fnr auc threshold. Default 10. |
suff_nreads |
numeric. If not null, serves as an overriding upper bound on nreads threshold. |
suff_ralign |
numeric. If not null, serves as an overriding upper bound on ralign threshold. |
suff_breadth |
numeric. If not null, serves as an overriding upper bound on breadth threshold. |
suff_auc |
numeric. If not null, serves as an overriding lower bound on fnr auc threshold. |
plot |
logical. Should a plot be produced? |
hist_breaks |
hist() breaks argument. Ignored if 'plot=FALSE'. |
... |
Arguments to be passed to methods. |
For each primary criterion (metric), a sample is evaluated based on 4 sub-criteria:
1) Hard (encoded) threshold
2) Adaptive thresholding via sd's from the mean
3) Adaptive thresholding via mad's from the median
4) Adaptive thresholding via sd's from the mean (after mixture modeling)
A sample must pass all sub-criteria to pass the primary criterion.
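The way the sub-criteria combine can be sketched in base R. This is an illustration only, not the package's implementation: the helper name flag_low_metric is hypothetical, sub-criterion 4 is left out because it needs a dip test and a mixture fit, and capping the adaptive cutoffs at the suff value is an assumption based on the description of the suff_* arguments.

```r
# Illustrative sketch only; NOT the implementation used by the package.
# Flags samples whose metric is too low under sub-criteria 1 to 3.
# `flag_low_metric` and its arguments are hypothetical names.
flag_low_metric <- function(x, hard, zcut = 1, suff = NULL) {
  # Adaptive cutoffs: zcut sd's below the mean, zcut mad's below the median
  cut_sd  <- mean(x) - zcut * sd(x)
  cut_mad <- median(x) - zcut * mad(x)

  # Assumption: a "sufficient" value caps the adaptive cutoffs from above
  if (!is.null(suff)) {
    cut_sd  <- min(cut_sd, suff)
    cut_mad <- min(cut_mad, suff)
  }

  # A sample must pass every sub-criterion, so failing any one flags it
  (x < hard) | (x < cut_sd) | (x < cut_mad)
}

# Made-up read counts: five similar libraries and one shallow library
nreads <- c(30000, 32000, 31000, 29000, 33000, 5000)
filtered <- flag_low_metric(nreads, hard = 25000)
print(filtered)
```

For a metric where high values are bad, such as FNR AUC, the comparisons would run in the opposite direction.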
A list with the following elements:
filtered_nreads Logical. Sample has too few reads.
filtered_ralign Logical. Sample has too few reads aligned.
filtered_breadth Logical. Sample has too few genes detected (low breadth).
filtered_fnr Logical. Sample has a high FNR AUC.
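One way to turn such a report into a single keep/drop decision per sample is sketched below. The report list is hypothetical data shaped like the documented return value, and the sketch assumes that a criterion that was not evaluated comes back as NA (a single NA or an all-NA vector).

```r
# Hypothetical report shaped like the documented return value;
# the values are made up for illustration.
report <- list(
  filtered_nreads  = c(FALSE, TRUE, FALSE, FALSE),
  filtered_ralign  = NA,   # e.g. ralign was not supplied
  filtered_breadth = c(FALSE, FALSE, TRUE, FALSE),
  filtered_fnr     = NA    # e.g. pos_controls was not supplied
)

# Keep only the criteria that were actually evaluated (not entirely NA)
evaluated <- Filter(function(f) !all(is.na(f)), report)

# A sample is dropped if any evaluated criterion flags it
drop <- Reduce(`|`, evaluated)
keep <- !drop
print(keep)
```

The resulting logical vector could then be used to subset the columns of the expression matrix, for example expr[, keep].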
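library(scone)

# Simulate a count matrix with 100 genes (rows) and 10 cells (columns)
mat <- matrix(rpois(1000, lambda = 5), ncol = 10)
colnames(mat) <- paste0("X", seq_len(ncol(mat)))

# Per-cell QC metrics: total counts and number of detected genes
qc <- cbind(NCOUNTS = colSums(mat), NGENES = colSums(mat > 0))

# Filter on read counts only; hard_nreads = 0 effectively disables the hard threshold
mfilt <- metric_sample_filter(expr = mat, nreads = qc[, "NCOUNTS"],
                              plot = TRUE, hard_nreads = 0)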