Sampling: Sampling Primary Clusters


View source: R/sampling.R

Description

Samples cells from the primary clusters, with the sampled proportion decreasing exponentially as cluster size increases.

Usage

Sampling(object, nsamples = 500, method = "sps",
  optm_parameters = FALSE, pinit = 0.195, pfin = 0.9, K = 500)

Arguments

object

A SingleCellExperiment object containing normalized expression values in "normcounts".

nsamples

integer, total number of samples to return post sampling; ignored when optm_parameters = FALSE.

method

character, one of c("sps","random"). Structure Preserving Sampling (sps) selects a proportional number of members from each cluster obtained by partitioning an approximate nearest neighbour graph.

optm_parameters

logical, when TRUE the parameters (pinit, pfin, K) are optimized such that exactly nsamples are returned. Optimization is performed using simulated annealing.

pinit

numeric in [0, 0.5], minimum probability that sampling occurs from a cluster; ignored when optm_parameters = TRUE.

pfin

numeric in [0.5, 1], maximum probability that sampling occurs from a cluster; ignored when optm_parameters = TRUE.

K

numeric, scaling factor analogous to the Boltzmann constant; ignored when optm_parameters = TRUE.
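The interaction between optm_parameters and nsamples can be illustrated with a rough sketch. It is not dropClust's own optimizer: it uses base R's optim(method = "SANN") as a stand-in simulated annealer, the cluster sizes are made up, and the helper names n_returned and objective are hypothetical. It also assumes a bounded form of the decay rule described under Details, in which the exponential term is scaled by (pinit - pfin).

```r
set.seed(1)

# Made-up cluster sizes; in practice these come from the primary clustering.
cluster_sizes <- c(40, 120, 300, 900, 2500)
nsamples <- 500

# ASSUMPTION: bounded exponential-decay rule for the per-cluster proportion.
# Hypothetical helper: total number of cells returned for (pinit, pfin, K).
n_returned <- function(par, sizes) {
  pinit <- par[1]
  pfin <- par[2]
  K <- par[3]
  p <- pinit - exp(-sizes / K) * (pinit - pfin)
  sum(round(p * sizes))
}

# Objective: distance between the returned count and the target, with a
# large penalty outside the documented parameter ranges.
objective <- function(par, sizes, target) {
  in_range <- par[1] >= 0 && par[1] <= 0.5 &&
    par[2] >= 0.5 && par[2] <= 1 && par[3] > 0
  if (!in_range) return(1e6)
  abs(n_returned(par, sizes) - target)
}

# Start from the documented defaults and anneal towards the target count.
start <- c(0.195, 0.9, 500)
fit <- optim(
  par = start,
  fn = objective,
  sizes = cluster_sizes,
  target = nsamples,
  method = "SANN",
  control = list(maxit = 5000)
)

fit$par                             # tuned (pinit, pfin, K)
n_returned(fit$par, cluster_sizes)  # count achieved with the tuned values
```

Annealing is stochastic, so the tuned values depend on the seed, and this sketch does not guarantee that the target count is hit exactly.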

Details

To ensure that enough representative transcriptomes are selected from small clusters, cells are sampled in inverse proportion to cluster size, with an exponential decay function determining the proportion of transcriptomes sampled from each cluster. For the $i^{th}$ cluster, the proportion of expression profiles $p_i$ is obtained as follows.
$$p_i = p_l - e^{-\frac{S_i}{K}}$$
where $S_i$ is the size of cluster $i$, $K$ is a scaling factor, and $p_i$ is the proportion of cells to be sampled from the $i^{th}$ Louvain cluster. $p_l$ and $p_u$ are the lower and upper bounds of the proportion value, respectively.
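A small numeric sketch may help show how the decay rule favours small clusters. The equation as printed does not use $p_u$, so the code below assumes a bounded variant, $p_i = p_l - e^{-S_i/K}(p_l - p_u)$, which tends to $p_u$ for very small clusters and to $p_l$ for very large ones. That variant, the helper name decay_proportion, the mapping of pinit and pfin to $p_l$ and $p_u$, and the cluster sizes are all assumptions made for illustration.

```r
# Hypothetical helper: proportion of cells to sample from a cluster of size S.
# ASSUMPTION: bounded form p_l - exp(-S / K) * (p_l - p_u); the documented
# equation omits the (p_l - p_u) factor.
decay_proportion <- function(S, p_l = 0.195, p_u = 0.9, K = 500) {
  p_l - exp(-S / K) * (p_l - p_u)
}

# Made-up cluster sizes, for illustration only.
cluster_sizes <- c(20, 150, 800, 5000)

# Proportion and number of cells drawn from each cluster.
p <- decay_proportion(cluster_sizes)
n_sampled <- round(p * cluster_sizes)

print(data.frame(size = cluster_sizes,
                 proportion = round(p, 3),
                 sampled = n_sampled))
```

Under this assumed form the proportion falls monotonically with cluster size, so rare populations keep most of their cells while very large clusters are thinned to roughly the lower bound.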

Value

A SingleCellExperiment object with an additional column named Sampling in colData. The column stores a logical value for each cell, indicating whether it has been sampled.

References

\insertRef{sengupta2013reformulated}{dropClust}

Examples

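library(SingleCellExperiment)
library(dropClust)  # provides CountNormalize(), RankGenes() and Sampling()

# Simulate a small count matrix: genes in rows, cells in columns
ncells <- 100
ngenes <- 2000
x <- matrix(rpois(ncells * ngenes, lambda = 10), ncol = ncells, nrow = ngenes, byrow = TRUE)
rownames(x) <- paste0("Gene", seq_len(ngenes))
colnames(x) <- paste0("Cell", seq_len(ncells))

# Build the object, normalize, rank genes, then sample
sce <- SingleCellExperiment(list(counts = x))
sce <- CountNormalize(sce)
sce <- RankGenes(sce)
sce <- Sampling(sce)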

debsin/dropClust documentation built on Nov. 4, 2019, 10:22 a.m.