speck: Abundance estimation for single cell RNA-sequencing...


speck R Documentation

Abundance estimation for single cell RNA-sequencing (scRNA-seq) data.

Description

Performs normalization, reduced rank reconstruction (RRR) and thresholding on an m x n scRNA-seq matrix with m samples and n genes. The speck() function first calls randomizedRRR() on the scRNA-seq matrix to produce an m x n RRR matrix. It then applies ckmeansThreshold() to each gene (column) of the RRR matrix, yielding an m x n thresholded matrix. See the documentation for randomizedRRR() and ckmeansThreshold() for implementation details.
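The reduced rank reconstruction step can be illustrated with a minimal base R sketch. It is not the code used by randomizedRRR(): it uses the exact svd() from base R in place of a randomized SVD, the log-scaled normalization is an assumption made only for illustration, and rrr_sketch is a hypothetical helper name.

```r
# Simulated counts: 80 samples (rows) x 230 genes (columns).
set.seed(10)
counts <- matrix(rbinom(n = 18400, size = 230, prob = 0.01), nrow = 80)

# ASSUMPTION: library-size scaling followed by log1p, for illustration only.
norm.mat <- log1p(counts / rowSums(counts) * 1e4)

# Hypothetical helper: rank-k reconstruction from a truncated SVD.
rrr_sketch <- function(mat, k) {
  s <- svd(mat, nu = k, nv = k)
  # Keep only the k largest singular values and their singular vectors.
  s$u %*% diag(s$d[1:k], nrow = k) %*% t(s$v)
}

rrr.mat <- rrr_sketch(norm.mat, k = 10)
dim(rrr.mat)        # same shape as the input matrix
qr(rrr.mat)$rank    # numerical rank, at most k
```

The reconstruction has the same dimensions as the input but is built from only k components, which is what makes the later per-gene thresholding operate on denoised values.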

Usage

speck(
  counts.matrix,
  rank.range.end = 100,
  min.consec.diff = 0.01,
  rep.consec.diff = 2,
  manual.rank = NULL,
  max.num.clusters = 4,
  seed.rsvd = 1,
  seed.ckmeans = 2
)

Arguments

counts.matrix

m x n scRNA-seq counts matrix with m samples and n genes.

rank.range.end

Upper value of the rank for RRR.

min.consec.diff

Minimum difference in the rate of change between a pair of successive standard deviation estimates.

rep.consec.diff

Frequency of the minimum difference in the rate of change between a pair of successive standard deviation estimates.

manual.rank

Optional, user-specified upper value of the rank used for RRR as an alternative to automatically computed rank.

max.num.clusters

Maximum number of clusters to consider when clustering the values of each gene for thresholding.

seed.rsvd

Seed specified to ensure reproducibility of the RRR.

seed.ckmeans

Seed specified to ensure reproducibility of the clustered thresholding.
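One possible reading of how min.consec.diff and rep.consec.diff could combine into a rank-selection rule is sketched below. This is an assumption, not code taken from SPECK: pick_rank_sketch is a hypothetical helper, and both the plateau rule and the index it returns are illustrative. The randomizedRRR() documentation describes the actual rule.

```r
# Hypothetical helper: pick a rank where the component standard deviations
# level off. ASSUMPTION: "level off" means the absolute change in the rate of
# change stays below min.consec.diff for rep.consec.diff consecutive pairs.
pick_rank_sketch <- function(component.stdev,
                             min.consec.diff = 0.01,
                             rep.consec.diff = 2) {
  n <- length(component.stdev)
  # Relative rate of change between successive standard deviations.
  rate <- diff(component.stdev) / component.stdev[-n]
  # Absolute difference between successive rates of change.
  consec.diff <- abs(diff(rate))
  # Run-length encode the indicator "difference is below the minimum".
  runs <- rle(consec.diff < min.consec.diff)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  hit <- which(runs$values & runs$lengths >= rep.consec.diff)
  # No qualifying run: fall back to the largest rank considered.
  if (length(hit) == 0) return(n)
  starts[hit[1]]
}

stdev <- c(10, 6, 4, 3.9, 3.8, 3.7, 3.6)
pick_rank_sketch(stdev)                       # 3: the plateau starts at index 3
pick_rank_sketch(stdev, rep.consec.diff = 4)  # 7: no run of 4, so the fallback
```

Under this reading, a smaller min.consec.diff or a larger rep.consec.diff demands a flatter, longer plateau before a rank is accepted, while manual.rank bypasses the automatic choice altogether.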

Value

  • thresholded.mat - An m x n thresholded RRR matrix with m samples and n genes.

  • rrr.mat - An m x n RRR matrix with m samples and n genes.

  • rrr.rank - Automatically computed rank.

  • component.stdev - A vector corresponding to standard deviations of non-centered sample principal components.

  • clust.num - A vector of length n indicating the number of clusters identified by the Ckmeans.1d.dp() algorithm for each gene.

  • clust.max.prop - A vector of length n indicating the proportion of samples with the specified maximum number of clusters for each gene.
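To make the per-gene thresholding behind thresholded.mat and clust.num more concrete, here is a minimal base R sketch for a single gene. It rests on several assumptions: it uses kmeans() from base R as a stand-in for Ckmeans.1d.dp(), it fixes the number of clusters at two instead of choosing it up to max.num.clusters, and it assumes that thresholding sets the values in the lowest-mean cluster to zero. threshold_gene_sketch is a hypothetical helper. See the ckmeansThreshold() documentation for the actual behavior.

```r
# Hypothetical helper: threshold one gene's values with a two-cluster split.
threshold_gene_sketch <- function(x) {
  # Nothing to separate if the gene takes a single value.
  if (length(unique(x)) < 2) return(x)
  # Seed the two centers at the extremes so the result is deterministic.
  km <- kmeans(x, centers = range(x))
  # ASSUMPTION: values in the lowest-mean cluster are set to zero.
  low <- which.min(km$centers[, 1])
  x[km$cluster == low] <- 0
  x
}

gene <- c(0.10, 0.20, 0.15, 0.05, 3.10, 2.90, 3.30)
# The four low values become 0; the three high values are kept unchanged.
threshold_gene_sketch(gene)
```

For any numeric matrix mat with genes in columns, apply(mat, 2, threshold_gene_sketch) applies the same rule gene by gene and returns a matrix of the same dimensions.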

Examples

library(SPECK)

# Simulate an 80 x 230 counts matrix (80 samples, 230 genes)
set.seed(10)
data.mat <- matrix(data = rbinom(n = 18400, size = 230, prob = 0.01), nrow = 80)

# Run normalization, RRR and thresholding with automatic rank selection
speck.full <- speck(counts.matrix = data.mat, rank.range.end = 60,
                    min.consec.diff = 0.01, rep.consec.diff = 2,
                    manual.rank = NULL, max.num.clusters = 4,
                    seed.rsvd = 1, seed.ckmeans = 2)

# Inspect the rank selection and the per-gene clustering summaries
print(speck.full$component.stdev)
print(speck.full$rrr.rank)
head(speck.full$clust.num); table(speck.full$clust.num)
head(speck.full$clust.max.prop); table(speck.full$clust.max.prop)

# Extract the thresholded matrix
speck.output <- speck.full$thresholded.mat
dim(speck.output); str(speck.output)


SPECK documentation built on Nov. 18, 2023, 1:12 a.m.