perFeatureQCMetrics | R Documentation |
Compute per-feature quality control metrics for a count matrix or a SummarizedExperiment.
perFeatureQCMetrics(x, ...)
## S4 method for signature 'ANY'
perFeatureQCMetrics(
x,
subsets = NULL,
threshold = 0,
BPPARAM = SerialParam(),
flatten = TRUE,
detection_limit = NULL
)
## S4 method for signature 'SummarizedExperiment'
perFeatureQCMetrics(x, ..., assay.type = "counts", exprs_values = NULL)
x |
A numeric matrix of counts with cells in columns and features in rows. Alternatively, a SummarizedExperiment or SingleCellExperiment object containing such a matrix. |
... |
For the generic, further arguments to pass to specific methods. For the SummarizedExperiment and SingleCellExperiment methods, further arguments to pass to the ANY method. |
subsets |
A named list containing one or more vectors (a character vector of cell names, a logical vector, or a numeric vector of indices), used to identify interesting sample subsets such as negative control wells. |
threshold |
A numeric scalar specifying the threshold above which a gene is considered to be detected. |
BPPARAM |
A BiocParallelParam object specifying how parallelization should be performed. |
flatten |
Logical scalar indicating whether the nested DataFrames in the output should be flattened. |
detection_limit , exprs_values |
Soft deprecated equivalents to the arguments described above. |
assay.type |
A string or integer scalar indicating which |
This function calculates useful QC metrics for features, including the mean across all cells and the number of expressed features (i.e., counts above the detection limit).
If subsets
is specified, the same statistics are computed for each subset of cells.
This is useful for obtaining statistics for cell sets of interest, e.g., negative control wells.
These statistics are stored as nested DataFrames in the output.
For example, if subsets
contained "empty"
and "cellpool"
, the output would look like:
output |-- mean |-- detected +-- subsets |-- empty | |-- mean | |-- detected | +-- ratio +-- cellpool |-- mean |-- detected +-- ratio
The ratio
field contains the ratio of the mean within each subset to the mean across all cells.
If flatten=TRUE
, the nested DataFrames are flattened by concatenating the column names with underscores.
This means that, say, the subsets$empty$mean
nested field becomes the top-level subsets_empty_mean
field.
A flattened structure is more convenient for end-users performing interactive analyses,
but less convenient for programmatic access as artificial construction of strings is required.
A DataFrame of QC statistics where each row corresponds to a row in x
.
This contains the following fields:
mean
: numeric, the mean counts for each feature.
detected
: numeric, the percentage of observations above threshold
.
If flatten=FALSE
, the output DataFrame also contains the subsets
field.
This a nested DataFrame containing per-feature QC statistics for each subset of columns.
If flatten=TRUE
, subsets
is flattened to remove the hierarchical structure.
Aaron Lun
addPerFeatureQCMetrics
, to add the QC metrics to the row metadata.
example_sce <- mockSCE()
stats <- perFeatureQCMetrics(example_sce)
stats
# With subsets.
stats2 <- perFeatureQCMetrics(example_sce, subsets=list(Empty=1:10))
stats2
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.