The function classifies samples in an unsupervised fashion by:
Runs a principal component analysis.
Uses Horn's technique (parallel analysis), via paran, to determine how many components to retain.
Finds k nearest neighbors in PCA space.
Calculates the Euclidean distance between samples in PCA space.
Constructs a weighted graph where each sample is connected to the k nearest neighbors with an edge weight = 1 - Euclidean distance.
Uses the Louvain community detection algorithm to classify the samples.
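The graph-building steps above can be sketched roughly as follows. This is an illustrative sketch on toy data, not the source of kNNclassify: the toy matrix `pcs`, the rescaling of distances to [0, 1] (so that 1 - distance stays non-negative), and the use of the igraph package for Louvain clustering are all assumptions made for the example.

```r
# Illustrative sketch only; not the kNNclassify implementation.
set.seed(1)

# Toy input: 30 samples x 5 retained principal components (hypothetical data)
pcs <- matrix(rnorm(30 * 5), nrow = 30,
              dimnames = list(paste0("s", 1:30), paste0("PC", 1:5)))
k <- 5

# Pairwise Euclidean distances in PCA space, rescaled to [0, 1]
# (assumption: rescaling keeps the edge weight 1 - distance non-negative)
d <- as.matrix(dist(pcs))
d <- d / max(d)

# k x n matrix: column i holds the indices of sample i's k nearest
# neighbours (position 1 of the ordering is the sample itself, so skip it)
nn <- apply(d, 1, function(x) order(x)[2:(k + 1)])

# Edge list with one row per (sample, neighbour) pair
edges <- data.frame(
  from = rep(rownames(pcs), each = k),
  to = rownames(pcs)[as.vector(nn)],
  stringsAsFactors = FALSE
)
edges$weight <- 1 - d[cbind(edges$from, edges$to)]

# Louvain community detection, if igraph is installed (assumed dependency)
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = FALSE)
  # Mutual neighbours produce duplicate edges; keep one copy of each
  g <- igraph::simplify(g, edge.attr.comb = "first")
  cl <- igraph::cluster_louvain(g, weights = igraph::E(g)$weight)
  classes <- data.frame(sample = igraph::V(g)$name,
                        louvain = as.integer(igraph::membership(cl)))
}
```

Larger values of `k` connect each sample to more neighbours and tend to give fewer, larger communities.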
kNNclassify(cpm, geneIdx, PCiter, k, pca = NULL, quietly = TRUE)
cpm      Matrix; counts per million.
geneIdx  Integer; indices of genes to include in the PCA.
PCiter   Integer; length 1 vector indicating the number of iterations to perform when determining the number of retained principal components.
k        Integer; length 1 vector indicating the number of nearest neighbors for each sample.
pca      Matrix; optional pre-computed PCA. If NULL, the PCA is computed within the function.
quietly  Logical; indicates whether the function should suppress verbose output.
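To illustrate what the PCiter iterations are for, here is a simplified sketch of Horn's parallel analysis in base R. It compares the eigenvalues of the observed correlation matrix with the mean eigenvalues from random, uncorrelated data of the same dimensions, and keeps the leading components that exceed the random baseline. The helper `hornRetained` and the toy data are hypothetical; the paran package may differ in detail (for example, in how it adjusts eigenvalues or uses centiles).

```r
# Simplified Horn's parallel analysis; a sketch, not paran's implementation.
hornRetained <- function(x, iterations) {  # hypothetical helper
  n <- nrow(x)
  p <- ncol(x)
  # Eigenvalues of the observed correlation matrix
  observed <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values
  # p x iterations matrix of eigenvalues from random normal data
  random <- replicate(iterations, {
    r <- matrix(rnorm(n * p), nrow = n)
    eigen(cor(r), symmetric = TRUE, only.values = TRUE)$values
  })
  # Count leading components whose eigenvalue beats the random mean
  keep <- observed > rowMeans(random)
  sum(cumprod(keep))
}

set.seed(1)
# Toy data: 100 observations of 6 variables driven by 2 latent factors
latent <- matrix(rnorm(100 * 2), ncol = 2)
noise <- matrix(rnorm(100 * 6, sd = 0.3), ncol = 6)
x <- latent[, rep(1:2, each = 3)] + noise

nRetained <- hornRetained(x, iterations = 20)
```

More iterations give a more stable random baseline at the cost of run time.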
Returns a tibble with two columns: the first gives the sample name and the second gives its classification.
Jason T. Serviss
# set up input data: keep sample columns, drop ERCC spike-in rows
s <- stringr::str_detect(colnames(testCounts), "^s")
e <- stringr::str_detect(rownames(testCounts), "^ERCC\\-[0-9]*$")
counts <- testCounts[!e, s]
# normalize each sample to counts per million
cpm <- t(t(counts) / colSums(counts) * 10^6)
# pre-run PCA
pca <- gmodels::fast.prcomp(t(cpm), scale. = TRUE)$x
# run KNN graph classification
kc <- kNNclassify(cpm, seq_len(nrow(counts)), 20, 15, pca = pca)
# plot the first two principal components, colored by class
pData <- merge(kc, matrix_to_tibble(pca[, 1:2], "sample"))
plot(pData$PC1, pData$PC2, col = rainbow(4)[pData$louvain], pch = 16)