This package contains functions that trains a Random Forest Model with a labelled Seurat Object, for predicting cell types/states in unlabelled datasets. It also contains a pre-trained Random Forest model, as well as example datasets.
Manual cell annotation of scRNAseq datasets, typically based on marker genes, can be time-consuming and biased. Being able to automatically predict cell types/states in a cell-by-cell and cluster-unbiased way is useful for fast and accurate phenotyping.
In addition, despite the increasing amounts of scRNAseq datasets being generated, thorough analysis of these datasets is lagging, and/or done in silos. This package comes with a preloaded Random Forest model based on different datasets and cell types/states, that will be constantly updated.
You can install this package from GitHub with devtools::install_github("kimberle9/rfca")
View the manual here: https://kimberle9.github.io/rfca/articles/vignette.html
# Load the rfca and Seurat libraries, as well as example datasets library(rfca) library(Seurat) # Create Seurat Object with PCA and UMAP calculated # Cell ranger raw data ("/filtered_feature_bc_matrix" directory) is required for this step mySeuratObject <- createSeuratObjectPipeline(data.dir = "~/filtered_feature_bc_matrix", nFeature_RNA_lower = 500, nFeature_RNA_upper = 5000, percent.mt = 5, nfeatures = 2000, dims = 20, clusterResolution = 0.8) # Assign cell type Idents to mySeuratObject manually, if you want to use it as a training dataset
library(rfca) library(Seurat) data("exampleSeuratObjectUnlabelled") data("exampleSeuratObjectLabelled") # Create Random Forest Model with your labelled Seurat Object myRandomForestModel <- createRFModel(exampleSeuratObjectLabelled) # Create marker gene list from random forest model markerGeneList <- createGeneLists(myRandomForestModel) # Visualize Feature Plot based on marker gene list myPlot <- cellMarkerPlots(exampleSeuratObjectLabelled, geneList = markerGeneList) myPlot # Predict cells based on your own Random Forest Model created above autoLabelledSeuratObject <- predictCells(exampleSeuratObjectUnlabelled, myRandomForestModel) # Predict cells based on my pre-loaded and pre-trained Random Forest Model # With no model passed in, default model used is mouseBrain. # See documentation for available tissue types. autoLabelledSeuratObject <- predictCells(exampleSeuratObjectUnlabelled) # Visualize autoLabelledSeuratObject DimPlot(autoLabelledSeuratObject)
library(rfca) library(Seurat) # Create Seurat object (microglia-only data) microglia <- createSeuratObjectPipeline(data.dir = "~/filtered_feature_bc_matrix") # Define microglia subtypes based on cluster numbers DimPlot(microglia) # Build Random Forest Model with cluster number as labels microgliaModel <- createRFModel(microglia) # Phenotype subtype proportion for different mouse models mouseModel1 <- predictCells(mouseModel1, microgliaModel) mouseModel2 <- predictCells(mouseModel2, microgliaModel) mouseModel3 <- predictCells(mouseModel3, microgliaModel)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.