Article: IKAP - Identifying K mAjor cell Population groups in single-cell RNA-seq analysis. Yun-Ching Chen, Abhilash Suresh, Chingiz Underbayev, Clare Sun, Komudi Singh, Fayaz Seifuddin, Adrian Wiestner, Mehdi Pirooznia. https://academic.oup.com/gigascience/article/8/10/giz121/5579995
* Note: for Seurat 3, please see the Seurat3_code folder.
Please install the following R libraries before installing IKAP: Seurat, dplyr, reshape2, PRROC, WriteXLS, rpart, stringr, and rpart.plot
First, you need to install the devtools package. You can do this from CRAN. Invoke R and then type
```{r, eval = FALSE}
install.packages("devtools")
```

Install IKAP:

```{r, eval = FALSE}
devtools::install_github("NHLBI-BCB/IKAP")
```

Run IKAP on a Seurat object:

```{r, eval = FALSE}
Seurat_obj <- IKAP(Seurat_obj, out.dir = "./IKAP")
```

Returned data and output files (saved in the output directory, default = ./IKAP/):

- Seurat object: IKAP returns a Seurat object with all explored sets in the metadata data frame.
- **_PC_K.pdf_**: The heatmap shows the statistics for every combination of k and nPC explored. Candidate sets are marked as 'X', with the best marked as 'B'. The corresponding cell membership can be found in the metadata of the returned Seurat object under the column name 'PC?K?'. For example, if 'B' (the best set) is marked at nPC = 20 and k = 8, the corresponding cell membership is stored in column 'PC20K8' of the metadata.
- **_data.xls_** and **_markers.all.rds_**: The first sheet of data.xls holds the statistics (plotted in PC_K.pdf) used to determine candidate sets. The other sheets list the (upregulated) marker genes for the candidate sets. The R object markers.all.rds contains a data frame of marker genes for every candidate set.
- **_*.png_**: Heatmaps showing expression of the top 10 marker genes (ranked by expression fold change) from each cell group for the candidate sets. They are plotted using the Seurat DoHeatmap function.
- **_DT_plot.pdf_**, **_DT_summary.rds_**, and **_DT.rds_**: Decision tree output files. A decision tree is built from marker genes for every cell group in every candidate set using the R package rpart. All decision trees are plotted in DT_plot.pdf. Classification errors are summarized in the R object DT_summary.rds. DT.rds is the output object from rpart.
- **_*tSNE.pdf_**: tSNE plots for the candidate sets.
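The 'PC?K?' column naming described above lends itself to a programmatic lookup. The following is a minimal sketch, assuming the best set is marked at nPC = 20 and k = 8 as in the example. The data frame `meta` is a hypothetical stand-in; with a real result it would be the metadata of the returned object (`Seurat_obj@meta.data`).

```r
# Hypothetical stand-in for the metadata of the object returned by IKAP;
# with a real result, use: meta <- Seurat_obj@meta.data
set.seed(1)
meta <- data.frame(
  PC20K8 = sample(1:8, 100, replace = TRUE),
  row.names = paste0("cell", 1:100)
)

# nPC and k as read off the 'B' mark in PC_K.pdf (values assumed here)
n_pc <- 20
k <- 8

# Build the column name following the 'PC?K?' convention
col_name <- paste0("PC", n_pc, "K", k)

# Extract the cell membership and count the cells in each group
membership <- meta[[col_name]]
names(membership) <- rownames(meta)
print(table(membership))
```

The same pattern applies to any candidate set marked 'X' in PC_K.pdf, by changing `n_pc` and `k`.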
Functions in the R script:
--------------------------
- IKAP: The main function runs the following steps:
  - (1) regress out confounding variables and scale data using Seurat::ScaleData;
  - (2) find variable genes for principal component analysis (PCA) using Seurat::FindVariableGenes;
  - (3) perform PCA using Seurat::RunPCA;
  - (4) estimate k.max;
  - (5) explore ranges of k and nPC and compute gap statistics;
    - GapStatistic, ObservedLogW, and ExpectedLogW: Compute gap statistics given a data matrix (used for computing Euclidean distances between data points) and K sets of clusters with k = 1 … K. GapStatistic calls ObservedLogW and ExpectedLogW to compute the sum of within-group distances for observed data and random data, respectively.
    - BottomUpMerge and NearestCluster: Generate sets of cell groups by exploring ranges of k and nPC. BottomUpMerge finds k.max groups using Seurat::FindClusters and gradually merges the two nearest clusters, as measured by NearestCluster.
  - (6) select candidate sets;
    - SelectCandidate: Select candidate sets based on gap statistics.
  - (7) compute marker genes using Seurat::FindAllMarkers;
    - ComputeMarkers: Compute marker genes for all cell groups in all candidate sets using Seurat::FindAllMarkers. In addition, compute the Area Under the ROC curve (AUROC) for each marker gene using the R package PRROC. Plot marker gene heatmap(s) using Seurat::DoHeatmap.
  - (8) build decision trees;
    - DecisionTree: Build decision trees for all cell groups in all candidate sets using the R package rpart and compute the classification error for each candidate set.
  - (9) plot tSNE plots and PC_K.pdf.
    - PlotSummary: Mark the best set based on classification error and plot PC_K.pdf.

License
--------
MIT license: https://opensource.org/licenses/MIT

Contact
--------
If you have any questions, please contact: yun-ching.chen@nih.gov
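
Example: gap statistic sketch
--------
The gap statistic computed in step (5) compares the within-group dispersion of the observed data with that of random data. The following base R sketch illustrates the idea only and is not IKAP's implementation. Its assumptions: `stats::kmeans` stands in for the Seurat::FindClusters and BottomUpMerge grouping, dispersion is measured as the pooled sum of squared Euclidean distances to group centroids, random data are drawn uniformly over the range of each column, and the function names `log_w` and `gap_statistic` are hypothetical.

```r
# Log of the pooled within-group dispersion: sum of squared Euclidean
# distances from each point to its group centroid (assumed definition).
log_w <- function(x, groups) {
  w <- sum(vapply(split(seq_len(nrow(x)), groups), function(idx) {
    pts <- x[idx, , drop = FALSE]
    sum(sweep(pts, 2, colMeans(pts))^2)
  }, numeric(1)))
  log(w)
}

# Gap statistic for k = 1 .. k_max: mean log dispersion of uniform
# reference data minus the log dispersion of the observed data.
gap_statistic <- function(x, k_max = 4, n_ref = 10) {
  lower <- apply(x, 2, min)
  upper <- apply(x, 2, max)
  vapply(seq_len(k_max), function(k) {
    observed <- log_w(x, kmeans(x, centers = k, nstart = 5)$cluster)
    expected <- mean(replicate(n_ref, {
      # Reference data: uniform over the observed range of each column
      ref <- sapply(seq_len(ncol(x)), function(j) {
        runif(nrow(x), lower[j], upper[j])
      })
      log_w(ref, kmeans(ref, centers = k, nstart = 5)$cluster)
    }))
    expected - observed
  }, numeric(1))
}

# Toy data: two well-separated groups of 30 points in two dimensions
set.seed(1)
x <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 3, sd = 0.3), ncol = 2)
)
gap <- gap_statistic(x)
print(round(gap, 2))
```

On this toy data the gap at k = 2 exceeds the gap at k = 1, reflecting the two planted groups. How IKAP turns gap statistics into candidate sets is handled by SelectCandidate and is not reproduced here.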