cidr-package: Clustering through Imputation and Dimensionality Reduction

Ultrafast and accurate clustering through imputation and dimensionality reduction for single-cell RNA-Seq (scRNA-Seq) data.

Peijie Lin <p.lin@victorchang.edu.au>, Michael Troup

## Load the package (assumes cidr is already installed).
library(cidr)
par(ask=FALSE)
## Generate simulated single-cell RNA-Seq tags.
N <- 3  ## 3 cell types
k <- 50 ## 50 cells per cell type
sData <- scSimulator(N=N, k=k)
## tags - the tag matrix
tags <- as.matrix(sData$tags)
cols <- c(rep("RED",k), rep("BLUE",k), rep("GREEN",k))
## Standard principal component analysis on log2(tags per million + 1),
## with cells as rows after transposing.
ltpm <- log2(t(t(tags)/colSums(tags))*1000000+1)
pca <- prcomp(t(ltpm))
plot(pca$x[,c(1,2)],col=cols,pch=1,xlab="PC1",ylab="PC2",main="prcomp")
## Use cidr to analyse the simulated dataset.
## The input for cidr should be a tag matrix.
sData <- scDataConstructor(tags)
sData <- determineDropoutCandidates(sData)
sData <- wThreshold(sData)
sData <- scDissim(sData)
sData <- scPCA(sData)
sData <- nPC(sData)
nCluster(sData)
sData <- scCluster(sData)
## Two dimensional visualization: different colors denote different cell types,
## while different plotting symbols denote the clusters output by cidr.
plot(sData@PC[,c(1,2)], col=cols,
     pch=sData@clusters, main="CIDR", xlab="PC1", ylab="PC2")
## Use Adjusted Rand Index to measure the accuracy of the clustering output by cidr.
adjustedRandIndex(sData@clusters,cols)
## 0.92 (the data are simulated, so the exact value can differ between runs)
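
The Adjusted Rand Index used in the last step compares two labelings of the same cells and does not depend on how the groups are named, which is why color names can be compared directly with numeric cluster ids. The following is a minimal base-R sketch of the standard Hubert-Arabie formula, given only to illustrate what the score measures. The helper name ari_sketch is hypothetical and is not part of cidr; in practice, use the adjustedRandIndex call shown above.

```r
## Hypothetical helper (not part of cidr): Adjusted Rand Index computed
## from the contingency table of two labelings of the same items.
ari_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                          # cross-tabulate the two labelings
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))            # pairs placed together in both labelings
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs placed together in x
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs placed together in y
  expected  <- sum_rows * sum_cols / choose(n, 2)  # agreement expected by chance
  max_index <- (sum_rows + sum_cols) / 2
  ## Assumption made for this sketch: when the denominator is zero both
  ## labelings are the same trivial partition, so return 1.
  if (max_index == expected) return(1)
  (sum_cells - expected) / (max_index - expected)
}

## Toy labels: the same partition under different names.
truth    <- c("RED", "RED", "RED", "BLUE", "BLUE", "BLUE")
clusters <- c(2, 2, 2, 1, 1, 1)
ari_sketch(clusters, truth)   # 1: identical partitions, only the names differ
```

A score of 1 means the two partitions agree exactly, while scores near 0 indicate agreement no better than chance, so the value printed for the simulated dataset should be read on that scale.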