ssc.clustSubsamplingClassification: Clustering with subsampling and classification
In Japrin/sscClust: simpler single cell RNAseq data clustering

Description

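Clusters a random subsample of the samples, then trains a classifier on the resulting cluster labels and uses it to assign the remaining samples to clusters.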

Usage

ssc.clustSubsamplingClassification(
  obj,
  assay.name = "exprs",
  frac = 0.4,
  method.vgene = "HVG.sd",
  method.reduction = "iCor",
  method.clust = "kmeans",
  method.classify = "knn",
  pca.npc = NULL,
  iCor.niter = 1,
  use.proj = T,
  vis.proj = F,
  ncore = NULL,
  k.batch = 2:6,
  seed = NULL
)

Arguments

`obj`	object of `SingleCellExperiment` class
`assay.name`	character; name of the assay to use. (default: "exprs")
`frac`	numeric; fraction of the original samples to subsample. (default: 0.4)
`method.vgene`	character; method used to identify variable genes. (default: "HVG.sd")
`method.reduction`	character; dimension reduction method to use, one of "iCor", "pca" and "none". (default: "iCor")
`method.clust`	character; clustering method to use, one of "kmeans" and "hclust". (default: "kmeans")
`method.classify`	character; classification method to use, one of "knn" and "RF". (default: "knn")
`pca.npc`	integer; number of principal components to use. Only used with reduction method "pca". (default: NULL)
`iCor.niter`	integer; number of iterations of the correlation calculation. Only used with reduction method "iCor". (default: 1)
`use.proj`	logical; whether to use the projected data for classification. (default: T)
`vis.proj`	logical; whether to compute a low-dimensional representation for visualization. (default: F)
`ncore`	integer; number of CPU cores to use. (default: NULL)
`k.batch`	integer vector; numbers of clusters to evaluate. (default: 2:6)
`seed`	integer; seed for random number generation. (default: NULL)

Details

The function first subsamples the samples to the specified fraction (such as 40%) and clusters the subsample to make labels for the subsampled samples. Using these labels, a classifier is trained on either the original data or the data projected by the method specified in "method.reduction". The classifier then predicts the labels of the samples that were not subsampled, using the original or the projected data depending on the option use.proj. The final cluster labels, combining those of both the sampled and the unsampled samples, are stored in the colData of the SingleCellExperiment object, in columns named in the format {method.reduction}.{method}k{k}, where {k} takes its value(s) from k.batch.
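
The workflow described above can be sketched in plain R. This is only an illustration under stated assumptions, not the package's implementation: the helper `sketch_subsample_classify` and the toy matrix are hypothetical, PCA via `prcomp` stands in for the reduction step, `kmeans` for the clustering step, and `class::knn` for the classifier. Column names follow the pattern quoted in the text, reading {method} as the clustering method, which is also an assumption.

```r
# Toy stand-in for an expression matrix: genes in rows, cells in columns.
set.seed(1)
n.gene <- 50
n.cell <- 200
grp <- rep(1:2, each = n.cell / 2)            # two simulated groups of cells
mat <- matrix(rnorm(n.gene * n.cell), nrow = n.gene,
              dimnames = list(paste0("gene", seq_len(n.gene)),
                              paste0("cell", seq_len(n.cell))))
mat[1:10, grp == 2] <- mat[1:10, grp == 2] + 4  # shift 10 genes in group 2

# Hypothetical helper (not part of sscClust): subsample, cluster, classify.
sketch_subsample_classify <- function(mat, frac = 0.4, k.batch = 2:6,
                                      use.proj = TRUE, npc = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- ncol(mat)
  idx.sub <- sort(sample(n, round(frac * n)))   # subsampled cells
  idx.rest <- setdiff(seq_len(n), idx.sub)      # cells left for the classifier

  x <- t(mat)                                   # cells in rows
  if (use.proj) {
    # Assumption: fit the projection on the subsample, then project all cells.
    pca <- prcomp(x[idx.sub, , drop = FALSE], center = TRUE, scale. = FALSE)
    feat <- predict(pca, x)[, seq_len(npc), drop = FALSE]
    red.name <- "pca"
  } else {
    feat <- x
    red.name <- "none"
  }

  out <- data.frame(row.names = colnames(mat))
  for (k in k.batch) {
    # Cluster only the subsample to obtain training labels.
    km <- kmeans(feat[idx.sub, , drop = FALSE], centers = k, nstart = 10)
    labels <- integer(n)
    labels[idx.sub] <- km$cluster
    # Train on the subsample, predict the cells that were not subsampled.
    pred <- class::knn(train = feat[idx.sub, , drop = FALSE],
                       test = feat[idx.rest, , drop = FALSE],
                       cl = factor(km$cluster), k = 5)
    labels[idx.rest] <- as.integer(as.character(pred))
    # One column of combined labels per value of k.
    out[[sprintf("%s.kmeansk%d", red.name, k)]] <- labels
  }
  out
}

res <- sketch_subsample_classify(mat, frac = 0.4, k.batch = 2:4, seed = 42)
head(res)
```

In the real function the labels are written to the colData of the object rather than returned as a data frame; `colnames(colData(obj))` would show the actual column names produced.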