Signac and SPRING: Learning CD56 NK cells from multi-modal analysis of CITE-seq PBMCs from 10X Genomics
In SignacX: Cell Type Identification and Discovery from Single Cell Gene Expression Data

This vignette shows how to use SignacX with Seurat and SPRING to learn a new cell type category from single cell data.

Load data

We start with CITE-seq data that were already classified with SignacX using the SPRING pipeline.

library(Seurat)
library(SignacX)

Load the expression matrix and the existing SignacX labels from the SPRING directory for the 10X Genomics CITE-seq data.

# load CITE-seq data
data.dir = './CITESEQ_EXPLORATORY_CITESEQ_5K_PBMCS/FullDataset_v1_protein'
E = CID.LoadData(data.dir = data.dir)

# Load labels
json_data = rjson::fromJSON(file=paste0(data.dir,'/categorical_coloring_data.json'))

Create a Seurat object for the protein expression data; we will use this as a reference.

# separate protein and gene expression data
logik = grepl("Total", rownames(E))
P = E[logik,]
E = E[!logik,]
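To see what this split does, here is a small sketch on a made-up count matrix. The row names and values are invented for illustration and only mimic the "TotalSeq" naming of the antibody features; `toy`, `toy_protein` and `toy_genes` are hypothetical names.

```r
# Toy count matrix: two genes and two antibody-derived tags (ADT).
# All names and values are made up for illustration.
toy <- matrix(
  c(5,   0,   3,
    1,   2,   0,
    900, 40,  1500,
    30,  700, 20),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("GNLY", "CD3E", "CD56-TotalSeqB-CD56", "CD16-TotalSeqB-CD16"),
    c("cell1", "cell2", "cell3")
  )
)

# Rows whose names contain "Total" are treated as protein features
is_protein <- grepl("Total", rownames(toy))

# drop = FALSE keeps the result a matrix even if only one row matches
toy_protein <- toy[is_protein, , drop = FALSE]
toy_genes   <- toy[!is_protein, , drop = FALSE]

rownames(toy_protein)
rownames(toy_genes)
```

Note that the pattern matches any row name containing "Total", so it relies on no gene symbol containing that string.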

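# use cell indices as column names so that both matrices share cell identifiers,
# then build the Seurat object and CLR-normalize the protein (ADT) assay
colnames(P) <- seq_len(ncol(P))
colnames(E) <- seq_len(ncol(E))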
reference <- CreateSeuratObject(E)
reference[["ADT"]] <- CreateAssayObject(counts = P)
reference <- NormalizeData(reference, assay = "ADT", normalization.method = "CLR")
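For intuition about the CLR step, the sketch below applies a centered log-ratio style transform to a toy vector in base R. The formula is our understanding of the transform that Seurat applies (each count divided by a geometric-mean-like scale, then `log1p`), and should be read as an assumption, not as Seurat's exact code path; `clr_sketch` and `adt_counts` are hypothetical names.

```r
# Sketch of a CLR-style transform (assumption: mirrors Seurat's CLR formula)
clr_sketch <- function(x) {
  # scale = exp(sum of log1p over positive entries / full vector length)
  scale <- exp(sum(log1p(x[x > 0]), na.rm = TRUE) / length(x))
  log1p(x / scale)
}

# Made-up ADT counts for one feature across four cells
adt_counts <- c(0, 10, 100, 1000)
round(clr_sketch(adt_counts), 3)
```

Zero counts stay at zero and larger counts are compressed, which is what makes antibody counts with very different dynamic ranges comparable.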

Identify CD56 bright NK cells based on protein expression data.

# keep the NK labels, mark all other cells as Unclassified, then gate on CD56 and CD16
lbls = json_data$CellStates$label_list
lbls[lbls != "NK"] = "Unclassified"
CD16 = reference@assays$ADT@counts[rownames(reference@assays$ADT@counts) == "CD16-TotalSeqB-CD16",]
CD56 = reference@assays$ADT@counts[rownames(reference@assays$ADT@counts) == "CD56-TotalSeqB-CD56",]
logik = log2(CD56) > 10 & log2(CD16) < 7.5 & lbls == "NK"
sum(logik) # number of cells that pass the CD56 bright gate
lbls[logik] = "NK.CD56bright"
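The gate above can be tried on a few made-up cells to check how the thresholds behave. The counts and labels below are invented; only the thresholds (`log2(CD56) > 10`, `log2(CD16) < 7.5`) come from the code above.

```r
# Made-up raw ADT counts and prior labels for six cells
cd56   <- c(2000, 1500, 300, 4000, 0, 1200)
cd16   <- c(50, 900, 20, 100, 10, 150)
labels <- c("NK", "NK", "NK", "Unclassified", "NK", "NK")

# Same gate as above: CD56 high, CD16 low, and already labeled NK.
# log2(0) is -Inf, so cells with zero CD56 counts fail the first test.
is_bright <- log2(cd56) > 10 & log2(cd16) < 7.5 & labels == "NK"

labels[is_bright] <- "NK.CD56bright"
table(labels)
```

In raw counts, the gate corresponds to more than 1024 CD56 counts and fewer than about 181 CD16 counts, so here only the first and last cells are relabeled.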

SignacX

Generate a training data set from the reference data and save it for later use.

Note: SignacBoot performs feature selection, bootstrapping, imputation and normalization to derive a training data set from single cell data.

# generate bootstrapped single cell data
R_learned = SignacBoot(E = E, spring.dir = data.dir, L = c("NK", "NK.CD56bright"), labels = lbls, logfc.threshold = 1)

# save the training data
save(R_learned, file = "training_NKBright_v207.rda")
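As a rough illustration of the bootstrapping idea only, the sketch below resamples cells with replacement so that each class contributes the same number of training cells. This is not the SignacBoot implementation (which also does feature selection, imputation and normalization); the data, the names and the choice of 10 cells per class are all assumptions made for the example.

```r
set.seed(1)

# Toy expression matrix: 3 genes x 8 cells, with made-up labels
expr <- matrix(rpois(24, lambda = 5), nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("cell", 1:8)))
labels <- c(rep("NK", 6), rep("NK.CD56bright", 2))

# Draw the same number of cells (with replacement) from each class,
# so the rarer class is not swamped in the training set
n_per_class <- 10
boot_idx <- unlist(lapply(unique(labels), function(l) {
  idx <- which(labels == l)
  idx[sample.int(length(idx), n_per_class, replace = TRUE)]
}))

boot_expr   <- expr[, boot_idx, drop = FALSE]
boot_labels <- labels[boot_idx]
table(boot_labels)
```

Balancing the classes this way matters here because CD56 bright cells are a small subset of the NK population.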

Classify a new data set with the model

Load expression data for a different data set, which was also previously processed with SPRING and SignacX.

# load the new data set that will be classified with the new model
new.data.dir = "./PBMCs_5k_10X/FullDataset_v1"
E = CID.LoadData(data.dir = new.data.dir)
# load cell types identified with Signac
json_data = rjson::fromJSON(file=paste0(new.data.dir,'/categorical_coloring_data.json'))

Generate new labels.

Note: Signac trains an ensemble of 100 neural network classifiers using the new training data set built above (R_learned), and then classifies unseen data (E).

# generate new labels
cr_learned = Signac(E = E, R = R_learned, spring.dir = new.data.dir)
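How the 100 networks are combined into one label per cell is not described in this vignette. Purely to illustrate what an ensemble decision can look like, the sketch below takes a majority vote over made-up predictions from five hypothetical classifiers; majority voting is our assumption, not a statement about SignacX internals.

```r
# Made-up predictions from 5 hypothetical classifiers (rows) for 4 cells (columns)
votes <- matrix(
  c("NK", "NK.CD56bright", "NK",            "NK",
    "NK", "NK.CD56bright", "NK",            "NK.CD56bright",
    "NK", "NK",            "NK",            "NK.CD56bright",
    "NK", "NK.CD56bright", "NK",            "NK.CD56bright",
    "NK", "NK.CD56bright", "NK.CD56bright", "NK"),
  nrow = 5, byrow = TRUE
)

# Majority vote per cell: the most frequent label in each column
consensus <- apply(votes, 2, function(v) names(which.max(table(v))))
consensus
```

Averaging over many classifiers makes the final label less sensitive to the random initialization of any single network.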

Now we amend the existing labels (classified previously with SignacX): we add the new labels and generate a new SPRING layout.

Note: We usually copy the existing SPRING files from "FullDataset_v1" to "FullDataset_v1_Learned" to generate a new layout while preserving the existing layout.

# modify the existing labels
cr = lapply(json_data, function(x) x$label_list)
logik = cr$CellStates == 'NK'
cr$CellStates[logik] = cr_learned[logik]
logik = cr$CellStates_novel == 'NK'
cr$CellStates_novel[logik] = cr_learned[logik]
new.data.dir = paste0(new.data.dir, "_Learned")
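The copy step mentioned in the note can be sketched in base R. The example works on a stand-in directory under `tempdir()` so that it runs anywhere; the placeholder file and the `src` and `dst` names are assumptions, and in practice the source would be the real SPRING directory.

```r
# Stand-in for an existing SPRING directory, created under tempdir()
src <- file.path(tempdir(), "FullDataset_v1")
dir.create(src, recursive = TRUE, showWarnings = FALSE)
writeLines("{}", file.path(src, "categorical_coloring_data.json"))

# Copy every file into a sibling directory with the "_Learned" suffix,
# leaving the original directory (and its layout) untouched
dst <- paste0(src, "_Learned")
dir.create(dst, showWarnings = FALSE)
copied <- file.copy(list.files(src, full.names = TRUE), dst,
                    recursive = TRUE, overwrite = TRUE)

list.files(dst)
```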

Save results

# write the amended labels to the new SPRING directory; the new population is colored red
dat = CID.writeJSON(cr, spring.dir = new.data.dir, new_colors = c('red'), new_populations = c('NK.CD56bright'))

Session Info

sessionInfo()


SignacX documentation built on Nov. 18, 2021, 5:07 p.m.
