View source: R/downsample_txis.R
Originally a helper function for label_cells, this is useful in its own
right and is now exported. Note that maxcells is a MAXIMUM: if there are
30 cells in a cluster for a sample and maxcells == 50, then only 30 cells
will be returned for that particular combination of cluster and sample.
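The capping behaviour described above can be sketched in base R. This is an illustration only, not the package's implementation: the toy data frame and the names `cells`, `blocks`, and `kept` are hypothetical.

```r
# Toy metadata, one row per cell (hypothetical data, not from the package).
set.seed(1)
cells <- data.frame(
  cell    = paste0("cell", 1:110),
  cluster = rep(c("A", "B"), times = c(80, 30)),
  sample  = "s1",
  stringsAsFactors = FALSE
)
maxcells <- 50

# Group cell names into cluster-by-sample blocks.
blocks <- split(cells$cell, list(cells$cluster, cells$sample), drop = TRUE)

# Cap each block at maxcells; smaller blocks are returned whole.
kept <- lapply(blocks, function(ids) {
  if (length(ids) > maxcells) sample(ids, maxcells) else ids
})

lengths(kept)  # block A.s1 is cut to 50 cells, block B.s1 keeps all 30
```

Cluster A has 80 cells and is reduced to 50, while cluster B has only 30 and is left untouched, because maxcells is an upper bound rather than a target.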
downsample_txis(
  txis,
  maxcells = 20,
  mincells = 10,
  ret = c("sce", "colnames"),
  ...
)
txis | SingleCellExperiment where !is.null(colLabels(txis))
maxcells | maximum cells per cluster per sample (see Details); default 20
mincells | minimum cells per cluster per sample (see Details); default 10
ret | whether to return colnames ("colnames") or "sce" (the default)
... | additional arguments to accommodate bootstrapping (not yet implemented)
Especially with the default Louvain clustering approach, some samples will have no cells in a given cluster, and some clusters will have no cells from a given sample. To avoid artifacts when sample==TRUE, we fit a mixture model to the number of cells in each cluster and exclude samples with few or no cells in that cluster from block sampling. Don't use this on SmartSeq-type data.
Note that attr(downsample_txis(txis, ret="colnames"), "scheme") is a list with elements 'mincells', 'maxcells', and 'eligible'. 'mincells' and 'maxcells' are integers, while 'eligible' is an integer matrix with counts of cells post-filtering (i.e., subject to mincells and per-cluster mixture fits).
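To make the shape of the "scheme" attribute concrete, here is a minimal base R sketch that builds an object of the same form by hand. It rests on assumptions not stated above: that 'eligible' has clusters in rows and samples in columns, and that blocks below mincells are recorded as zero. The mixture-fit filter is omitted, and all data and names (`cluster`, `sample_id`, `picked`) are hypothetical.

```r
# Hypothetical cell-level labels (illustrative only).
cluster   <- rep(c("c1", "c2"), times = c(40, 17))
sample_id <- c(rep(c("s1", "s2"), times = c(25, 15)),  # c1: 25 in s1, 15 in s2
               rep(c("s1", "s2"), times = c(12, 5)))   # c2: 12 in s1,  5 in s2
mincells <- 10L
maxcells <- 20L

# Cells per cluster (rows) per sample (columns); this orientation is an assumption.
counts <- table(cluster, sample_id)
eligible <- matrix(as.integer(counts), nrow = nrow(counts),
                   dimnames = dimnames(counts))

# Assumption: blocks with fewer than mincells cells are zeroed out.
eligible[eligible < mincells] <- 0L

# Placeholder for the sampled column names, carrying the scheme as an attribute.
picked <- paste0("cell", seq_along(cluster))
attr(picked, "scheme") <- list(mincells = mincells,
                               maxcells = maxcells,
                               eligible = eligible)
str(attr(picked, "scheme"))
```

In this toy case the c2/s2 block holds only 5 cells, so it falls below mincells and its entry in 'eligible' becomes zero.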
The mixture fits assume that a two-component mixture model on either log(1+cells) or directly on cell number per cluster will remove "noise" elements. This assumption may not hold; if so, the user will have to investigate.
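As a rough illustration of the idea, the sketch below fits a two-component Gaussian mixture to log(1+cells) for a single cluster using a small hand-written EM loop in base R. This is a stand-in under stated assumptions, not the fit the package performs: the Gaussian components, the starting values, the 0.5 posterior cutoff, the toy counts, and the helper name `fit_two_gaussians` are all hypothetical.

```r
# Hypothetical per-sample cell counts for one cluster (illustrative data only).
cells <- c(s1 = 0, s2 = 1, s3 = 2, s4 = 85, s5 = 110, s6 = 140, s7 = 95)
x <- log1p(cells)

# Minimal EM for a two-component 1-D Gaussian mixture (stand-in, not the package's code).
fit_two_gaussians <- function(x, iters = 100) {
  mu     <- range(x)          # start the two means at the extremes
  sigma  <- rep(sd(x), 2)
  lambda <- c(0.5, 0.5)       # mixing proportions
  for (i in seq_len(iters)) {
    # E-step: posterior probability that each point belongs to the "high" component
    d_low  <- lambda[1] * dnorm(x, mu[1], sigma[1])
    d_high <- lambda[2] * dnorm(x, mu[2], sigma[2])
    post_high <- d_high / (d_low + d_high)
    post_low  <- 1 - post_high
    # M-step: update proportions, means, and standard deviations
    lambda <- c(mean(post_low), mean(post_high))
    mu     <- c(weighted.mean(x, post_low), weighted.mean(x, post_high))
    sigma  <- c(sqrt(sum(post_low  * (x - mu[1])^2) / sum(post_low)),
                sqrt(sum(post_high * (x - mu[2])^2) / sum(post_high)))
    sigma  <- pmax(sigma, 1e-3)  # floor to keep a component from collapsing
  }
  list(mu = mu, sigma = sigma, lambda = lambda, post_high = post_high)
}

fit <- fit_two_gaussians(x)

# Samples assigned to the high-count component are treated as eligible here.
eligible_samples <- names(cells)[fit$post_high > 0.5]
eligible_samples
```

With counts this well separated, the low component absorbs the samples with 0 to 2 cells and the remaining samples stay eligible. When the counts do not form two clear groups, the split can be arbitrary, which is the situation the caveat above warns about.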
colnames(txis) satisfying the sampling scheme (see Details)
find_eligible_cells
label_cells