knitr::opts_chunk$set(echo = TRUE)
SingleR is a package that performs reference-based annotation of single-cell RNA-seq data. Here we show different ways to create SingleR objects. These objects can then be used with visualization functions available in the SingleR package and the SingleR web app (http://comphealth.ucsf.edu/SingleR/).
SingleR provides built-in wrapper functions that run a complete pipeline with a single call. It supports Seurat (http://satijalab.org/seurat/), but any other scRNA-seq package can be used. The wrapper functions, explained in Case 1 and Case 2, read the single-cell data, calculate labels using different references, and create an object that can be used by the SingleR plotting functions. To run SingleR directly and retrieve labels for each cell, use the following function:
singler = SingleR(method = "single", sc_data, ref_data, types, clusters = NULL, genes = "de", quantile.use = 0.8, p.threshold = 0.05, fine.tune = TRUE, fine.tune.thres = 0.05, sd.thres = 1, do.pvals = T, numCores = SingleR.numCores)
warning('Do not use the scaled.data field in Seurat as input. This field represents relative expression across cells and is not appropriate as input for SingleR. The raw.data and data fields are acceptable, but only if the data come from a non-full-length method.')
This is the basic SingleR function. To use wrapper functions see the cases below.
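To give a feel for what the method = "single" call and the quantile.use = 0.8 parameter do, here is a minimal base-R sketch of the core idea: correlate each cell with every reference sample (Spearman), summarize the correlations per cell type with the 0.8 quantile, and pick the best-scoring type. This is a simplified illustration under stated assumptions, not the package's implementation: it skips variable-gene selection and fine-tuning, and all data and object names below are made up for the example.

```r
# Toy illustration only; this is not the SingleR implementation.
set.seed(1)
genes <- paste0("gene", 1:50)

# Hypothetical reference: 6 samples from 2 cell types, log2-scale values.
profile_a <- rnorm(50, mean = 5)
profile_b <- rnorm(50, mean = 5)
ref_data <- cbind(
  sapply(1:3, function(i) profile_a + rnorm(50, sd = 0.3)),
  sapply(1:3, function(i) profile_b + rnorm(50, sd = 0.3))
)
rownames(ref_data) <- genes
colnames(ref_data) <- paste0("sample", 1:6)
types <- rep(c("TypeA", "TypeB"), each = 3)

# Hypothetical single-cell data: 2 cells near each profile.
sc_data <- cbind(
  profile_a + rnorm(50, sd = 0.5),
  profile_a + rnorm(50, sd = 0.5),
  profile_b + rnorm(50, sd = 0.5),
  profile_b + rnorm(50, sd = 0.5)
)
rownames(sc_data) <- genes
colnames(sc_data) <- paste0("cell", 1:4)

# Spearman correlation of every cell (rows) with every reference sample (columns).
r <- cor(sc_data, ref_data, method = "spearman")

# Aggregate the correlations per cell type using the 0.8 quantile.
scores <- sapply(unique(types), function(tp) {
  apply(r[, types == tp, drop = FALSE], 1, quantile, probs = 0.8)
})

# Assign each cell the type with the highest aggregated score.
labels <- colnames(scores)[apply(scores, 1, which.max)]
names(labels) <- rownames(scores)
print(labels)
```

Using a quantile rather than the maximum makes the per-type score less sensitive to a single outlying reference sample.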
In this case (Case 1) we have count data and no preferred previous analysis of the data.
To create the SingleR object simply run the following function:
singler = CreateSinglerSeuratObject(counts, annot = NULL, project.name,
  min.genes = 200, technology = "10X", species = "Human", # or "Mouse"
  citation = "", ref.list = list(), normalize.gene.length = F,
  variable.genes = "de", fine.tune = T, reduce.file.size = T,
  do.signatures = T, min.cells = 2, npca = 10, regress.out = "nUMI",
  do.main.types = T, reduce.seurat.object = T, numCores = SingleR.numCores)

save(singler, file = 'singler_object.RData')
The returned singler object is a list that can be used for further analyses. See below.
In this case (Case 2) we already have a single-cell object with tSNE coordinates and clusters. We want to annotate this object and reuse those coordinates and clusters.
To create the SingleR object simply run the following function:
singler = CreateSinglerObject(counts, annot = NULL, project.name, min.genes = 0,
  technology = "10X", species = "Human", citation = "", ref.list = list(),
  normalize.gene.length = F, variable.genes = "de", fine.tune = T,
  do.signatures = T, clusters = NULL, do.main.types = T,
  reduce.file.size = T, numCores = SingleR.numCores)

singler$seurat = seurat.object # (optional)
singler$meta.data$orig.ident = seurat.object@meta.data$orig.ident # the original identities, if not supplied in 'annot'

## if using Seurat v3.0 and over use:
singler$meta.data$xy = seurat.object@reductions$tsne@cell.embeddings # the tSNE coordinates
singler$meta.data$clusters = seurat.object@active.ident # the Seurat clusters (if 'clusters' not provided)

## if using a previous Seurat version use:
singler$meta.data$xy = seurat.object@dr$tsne@cell.embeddings # the tSNE coordinates
singler$meta.data$clusters = seurat.object@ident # the Seurat clusters (if 'clusters' not provided)

# this example assumes the previous analysis was performed with Seurat, but any other coordinates and clusters can be used.

save(singler, file = 'singler_object.RData')
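When coordinates and clusters come from a tool other than Seurat, their cell order may differ from the columns of counts. The following base-R sketch shows one way to align them by cell name before assigning them to singler$meta.data$xy and singler$meta.data$clusters. It assumes that cells are identified by name in both objects; the toy data and the names xy_aligned and clusters_aligned are hypothetical.

```r
set.seed(2)
cells <- paste0("cell", 1:5)

# Hypothetical count matrix: genes in rows, cells in columns.
counts <- matrix(rpois(50, 5), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), cells))

# Hypothetical external results, stored in a different cell order.
xy <- matrix(rnorm(10), ncol = 2,
             dimnames = list(rev(cells), c("tSNE_1", "tSNE_2")))
clusters <- setNames(factor(c(2, 2, 1, 1, 1)), rev(cells))

# Reorder by cell name so that row i describes column i of counts.
idx <- match(colnames(counts), rownames(xy))
stopifnot(!anyNA(idx)) # every cell must have coordinates
xy_aligned <- xy[idx, , drop = FALSE]
clusters_aligned <- clusters[colnames(counts)]
```

After this step, xy_aligned and clusters_aligned follow the column order of counts and can be assigned in place of the Seurat slots used above.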
All the parameters are similar to case 1.
We have a reference dataset that we want to use. It contains N samples that can be annotated to n1 main cell types (e.g. macrophages or DCs) and n2 cell states (e.g. alveolar macrophages, interstitial macrophages, pDCs and cDCs).
The gene expression data should be gene-length normalized (TPM, FPKM etc.) and in log2 scale. The rownames must be gene symbols.
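As an illustration of the expected input scale, the sketch below converts raw counts to TPM and then to log2 scale using base R only. The gene lengths are invented for the example, and the pseudocount of 1 is an assumption chosen to avoid log2(0); the source does not prescribe one.

```r
# Hypothetical raw counts: 3 genes x 2 samples, rownames are gene symbols.
counts <- matrix(c(100, 200, 300, 50, 400, 150), nrow = 3,
                 dimnames = list(c("CD14", "CD3E", "MS4A1"), c("s1", "s2")))

# Assumed gene lengths in kilobases (made up for illustration).
gene_length_kb <- c(CD14 = 1.5, CD3E = 1.0, MS4A1 = 3.0)

# Reads per kilobase: each row is divided by the length of its gene.
rate <- counts / gene_length_kb[rownames(counts)]

# TPM: rescale every sample (column) so that it sums to one million.
tpm <- t(t(rate) / colSums(rate)) * 1e6

# log2 scale with a pseudocount of 1 (assumption).
expr <- log2(tpm + 1)
```

The resulting expr matrix keeps gene symbols as rownames and samples as columns, which is the layout the reference definition relies on.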
This is how we define the reference object:
name = 'My_reference'
expr = as.matrix(expr) # the expression matrix
types = as.character(types) # a character vector of the types. Samples from the same type should have the same name.
main_types = as.character(main_types) # a character vector of the main types.

ref = list(name = name, data = expr, types = types, main_types = main_types)

# if using the de method, we can predefine the variable genes
ref$de.genes = CreateVariableGeneSet(expr, types, 200)
ref$de.genes.main = CreateVariableGeneSet(expr, main_types, 300)

# if using the sd method, we need to define an sd threshold
gene.sd = apply(expr, 1, sd) # per-gene standard deviation (named gene.sd to avoid masking sd())
sd.thres = sort(gene.sd, decreasing = T)[4000] # or any other threshold
ref$sd.thres = sd.thres

save(ref, file = 'ref.RData') # it is best to name the object and the file with the same name.

# we can then use this reference in the previous functions. Multiple references can be used.
singler = CreateSinglerObject(..., ref.list = list(immgen, ref, mouse.rnaseq))
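Before saving a reference, it can help to confirm that its pieces are consistent with one another. The sketch below is a hypothetical helper, check_reference, that is not part of SingleR. It verifies that the expression matrix has unique gene-symbol rownames and that types and main_types each have one entry per sample. It also shows how an sd threshold can be picked on a toy matrix that is too small for the 4000-gene cutoff used above.

```r
# Hypothetical helper (not part of SingleR): basic consistency checks.
check_reference <- function(ref) {
  stopifnot(
    is.matrix(ref$data),
    !is.null(rownames(ref$data)),               # gene symbols as rownames
    !anyDuplicated(rownames(ref$data)),         # no repeated symbols
    length(ref$types) == ncol(ref$data),        # one type per sample
    length(ref$main_types) == ncol(ref$data)    # one main type per sample
  )
  invisible(TRUE)
}

# Toy reference: 20 genes x 4 samples (made-up values on a log2-like scale).
set.seed(3)
expr <- matrix(rnorm(80, mean = 5), nrow = 20,
               dimnames = list(paste0("GENE", 1:20), paste0("s", 1:4)))
ref <- list(name = "toy_reference", data = expr,
            types = c("alveolar macrophages", "interstitial macrophages",
                      "pDCs", "cDCs"),
            main_types = c("macrophages", "macrophages", "DCs", "DCs"))
check_reference(ref)

# sd threshold that keeps the n_keep most variable genes.
n_keep <- 10
gene_sd <- apply(expr, 1, sd)
ref$sd.thres <- sort(gene_sd, decreasing = TRUE)[n_keep]
```

A reference whose types vector is shorter than the number of samples makes check_reference stop with an error, which is easier to diagnose than a failure deep inside the annotation step.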
Examples are available at http://comphealth.ucsf.edu/sample-apps/SingleR/SingleR_specifications.html
We will soon add more simple examples with full analysis of datasets.