Calculates cell-specific mixing scores based on Euclidean distances within a subspace of integrated data.
sce |
A |
k |
Numeric. Number of k-nearest neighbours (knn) to use. |
group |
Character. Name of group/batch variable.
Needs to be one of |
dim_red |
Character. Name of embeddings to use as subspace for distance distributions. Default is "PCA". |
assay_name |
Character. Name of the assay to use for PCA.
Only relevant if no existing 'dim_red' is provided.
Must be one of |
res_name |
Character. Appendix of the result score's name (e.g. method used to combine batches). |
k_min |
Numeric. Minimum number of knn to include. Default is NA (see Details). |
smooth |
Logical. Indicates whether cms results should be smoothed within each neighbourhood using the weighted mean. |
n_dim |
Numeric. Number of dimensions to include to define the subspace. |
cell_min |
Numeric. Minimum number of cells from each group to be included in the AD test. |
batch_min |
Numeric. Minimum number of cells per batch to include in the AD test. If set, neighbours will be included until batch_min cells from each batch are present. |
unbalanced |
Logical. If TRUE, neighbourhoods with only one batch present will be set to NA. This way they are not included in any summaries or smoothing. |
BPPARAM |
A BiocParallelParam object specifying whether cms scores shall be calculated in parallel. |
The cms function tests the hypothesis that group-specific distance
distributions of knn cells have the same underlying unspecified distribution.
It performs Anderson-Darling (AD) tests as implemented in the kSamples package.
By default, the function uses all distances and group labels defined in knn.
Alternatively, a density-based neighbourhood can be defined by specifying
k_min. In this case the first local minimum of the overall distance
distribution with at least k_min cells is used. This can be used to adapt to
the local structure of the dataset, e.g. to prevent cells from a
different cluster from being included. Third, the neighbourhood can be defined by
batch occurrences: batch_min specifies the minimal number of cells from
each batch that should be included to define the neighbourhood.
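The three neighbourhood definitions described above can be sketched as follows. This is a minimal illustration, not the package's own example: the toy object (random counts, a random 5-dimensional "PCA" embedding, two batches named "b1" and "b2") and the chosen values for k, k_min, batch_min and res_name are assumptions made only so the calls have something to run on.

```r
library(SingleCellExperiment)
library(CellMixS)

# Toy data (assumption): 100 well-mixed cells from two batches.
set.seed(42)
n_cells <- 100
cell_ids <- paste0("cell", seq_len(n_cells))
counts <- matrix(rpois(20 * n_cells, lambda = 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), cell_ids))
sce <- SingleCellExperiment(
    assays = list(counts = counts),
    colData = DataFrame(batch = factor(rep(c("b1", "b2"), times = n_cells / 2)))
)
# Random embedding stored under the default 'dim_red' name.
reducedDim(sce, "PCA") <- matrix(rnorm(n_cells * 5), ncol = 5,
                                 dimnames = list(cell_ids, NULL))

# 1) Default: all k nearest neighbours are used.
sce_knn <- cms(sce, k = 50, group = "batch", n_dim = 5, res_name = "knn")

# 2) Density-based neighbourhood: cut at the first local minimum of the
#    distance distribution that keeps at least k_min cells.
sce_dens <- cms(sce, k = 50, group = "batch", n_dim = 5,
                k_min = 20, res_name = "kmin")

# 3) Batch-based neighbourhood: include neighbours until each batch
#    contributes batch_min cells.
sce_batch <- cms(sce, k = 50, group = "batch", n_dim = 5,
                 batch_min = 10, res_name = "batchmin")

# Scores are appended to colData; inspect the added columns.
colnames(colData(sce_batch))
```

Using a different res_name per call keeps the result columns distinguishable if several variants are later computed on the same object.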
If 'dim_red' is not defined or is left at its default, cms will calculate a PCA using
runPCA. Results will be appended to colData(sce).
Names can be specified using res_name.
If multiple cores are available, cms scores can be calculated in parallel
(this does not work on Windows). Parallelization can be specified using BPPARAM.
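A parallel run might look like the sketch below. It assumes the BiocParallel package is installed and uses MulticoreParam with two workers as one possible BiocParallelParam choice; the toy object is again invented purely for illustration.

```r
library(SingleCellExperiment)
library(CellMixS)
library(BiocParallel)

# Toy data (assumption): 100 cells, two batches, random 5-d embedding.
set.seed(1)
cell_ids <- paste0("cell", 1:100)
sce <- SingleCellExperiment(
    assays = list(counts = matrix(rpois(2000, 5), nrow = 20,
                                  dimnames = list(paste0("gene", 1:20), cell_ids))),
    colData = DataFrame(batch = factor(rep(c("b1", "b2"), times = 50)))
)
reducedDim(sce, "PCA") <- matrix(rnorm(500), ncol = 5,
                                 dimnames = list(cell_ids, NULL))

# Distribute the per-cell tests over two forked workers.
sce_par <- cms(sce, k = 30, group = "batch", n_dim = 5,
               BPPARAM = MulticoreParam(workers = 2))
```

For small data sets like this one, the overhead of starting workers can outweigh any gain, so the serial default is often sufficient.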
A SingleCellExperiment with cms (and cms_smooth) within colData.
Scholz, F. W. and Stephens, M. A. (1987). K-Sample Anderson-Darling Tests. J. Am. Stat. Assoc.