runLSA | R Documentation |
This function takes a snap object with a bmat/pmat/gmat slot as input and runs Latent Semantic Analysis (LSA).
runLSA(obj, input.mat = c("bmat", "pmat"), pc.num = 50, logTF = FALSE, scale.factor = 1e+05, min.cell = 10, seed.use = 10)
obj |
A snap object. |
input.mat |
Input matrix to be used for LSA c("bmat", "pmat"). |
pc.num |
An integer number of dimensions to return [50]. |
logTF |
A logical variable indicating whether to log-scale the term frequency [FALSE]. |
scale.factor |
A numeric variable used to scale the logTF [100000]. |
seed.use |
A numeric value used as the random seed [10]. |
The description below is adapted from the 10X Cell Ranger website. The cell-by-bin (bmat) or cell-by-peak (pmat) matrix is first normalized via the inverse document frequency (idf) transform, where each peak/bin count is scaled by the log of the ratio of the number of barcodes in the matrix to the number of barcodes in which the peak has a non-zero count. This gives greater weight to counts in peaks that occur in fewer barcodes. Singular value decomposition (SVD) is then performed on the normalized matrix using IRLBA, without scaling or centering, to produce the transformed matrix in a lower-dimensional space, along with the components and the singular values that indicate the importance of each component.
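The idf weighting described above can be illustrated on a toy matrix. This is a minimal sketch in base R, not the package's own code: the matrix values and the names `counts`, `n_barcodes`, `n_nonzero`, `idf` and `normalized` are made up for illustration.

```r
# Toy cell-by-peak binary matrix: 4 barcodes (rows) x 3 peaks (columns).
# matrix() fills column by column, so each group of 4 values is one peak.
counts <- matrix(c(1, 1, 1, 1,   # peak1: non-zero in all 4 barcodes
                   1, 1, 0, 0,   # peak2: non-zero in 2 barcodes
                   1, 0, 0, 0),  # peak3: non-zero in 1 barcode
                 nrow = 4, ncol = 3,
                 dimnames = list(paste0("cell", 1:4), paste0("peak", 1:3)))

n_barcodes <- nrow(counts)
n_nonzero  <- colSums(counts > 0)        # barcodes with a non-zero count, per peak
idf        <- log(n_barcodes / n_nonzero)

# Scale every peak (column) by its idf weight.
normalized <- sweep(counts, 2, idf, `*`)
```

A peak present in every barcode gets weight log(1) = 0, while the peak seen in a single barcode gets the largest weight, which is the "greater weight to rarer peaks" behaviour the text describes.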
LSA has four major steps:
1) term frequency - TF = t(t(X) / Matrix::colSums(X)); when logTF is TRUE, TF is also log-scaled.
2) inverse document frequency - IDF = log(1 + ncol(X) / rowSums(X))
3) TF-IDF - TF * IDF
4) SVD - run singular value decomposition
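The four steps can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not the runLSA implementation: X is assumed to be a binary feature-by-cell matrix (rows are bins/peaks, columns are cells), which is the orientation the formulas above imply; base `svd()` stands in for IRLBA; the log-scaling branch is omitted because this page does not give its formula; and `k`, `cell_embedding` and the toy data are hypothetical.

```r
set.seed(10)

# Toy binary matrix; ASSUMED orientation: rows = features, columns = cells.
X <- matrix(rbinom(20 * 8, size = 1, prob = 0.4), nrow = 20, ncol = 8)
# Drop empty features/cells so the divisions below are well defined.
X <- X[rowSums(X) > 0, colSums(X) > 0, drop = FALSE]

# 1) Term frequency: divide each cell (column) by its total count.
TF <- t(t(X) / colSums(X))

# 2) Inverse document frequency: one weight per feature (row).
IDF <- log(1 + ncol(X) / rowSums(X))

# 3) TF-IDF: the length-nrow vector is recycled down each column,
#    so row i is multiplied by IDF[i].
TFIDF <- TF * IDF

# 4) SVD without centering or scaling (base svd() used here in place of IRLBA).
k <- 3
s <- svd(TFIDF, nu = k, nv = k)

# Cells are columns of TFIDF, so the right singular vectors give
# one k-dimensional coordinate per cell.
cell_embedding <- s$v
```

For large sparse matrices a truncated solver such as IRLBA is the practical choice, since a full SVD computes every component even when only the leading ones are kept.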
data(demo.sp);
demo.sp = makeBinary(demo.sp);
demo.sp = runLSA(obj=demo.sp, input.mat="bmat", pc.num=50, logTF=TRUE, min.cell=0);