Identification of cluster-specific gene sets and cell classification according to their expression.
CellSIUS enables the identification and characterization of (rare) cell sub-populations from complex scRNA-seq datasets. It takes as input the expression values of N cells grouped into M (>1) clusters. Within each cluster, genes with a bimodal distribution are selected, and only genes with cluster-specific expression are retained. Among these candidate marker genes, sets with correlated expression patterns are identified by graph-based clustering. Finally, cells are assigned to subgroups based on their average expression of each gene set. The CellSIUS output reports the rare cell types or subtypes by cell index, together with their transcriptomic signatures.
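The final step described above (assigning cells by their average expression of a gene set) can be illustrated with a toy sketch. This is not the CellSIUS implementation: the data, the gene set, and the use of a two-centre `kmeans` split on the per-cell score are all assumptions made for illustration only.

```r
set.seed(1)

# Hypothetical toy data: 4 genes x 12 cells, values on a log2-like scale
expr <- matrix(rnorm(4 * 12, mean = 1, sd = 0.2), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:12)))

# Cells 10-12 strongly express a hypothetical correlated gene set
gene_set <- c("gene1", "gene2", "gene3")
expr[gene_set, 10:12] <- expr[gene_set, 10:12] + 5

# Per-cell score: average expression over the genes in the set
score <- colMeans(expr[gene_set, , drop = FALSE])

# Split the one-dimensional score into two modes (assumption: k-means, k = 2)
km <- kmeans(score, centers = 2, nstart = 10)
high_mode <- which.max(km$centers)

# Cells falling in the high-expression mode form the candidate subgroup
members <- names(score)[km$cluster == high_mode]
members
```

In this sketch a cell joins the subgroup only if its averaged score falls in the upper mode, which is why a single highly expressed gene is less likely to drive the assignment than a coherent gene set.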
mat.norm
Numeric Matrix: normalized gene expression matrix where
group_id
Character: vector with cluster cell assignment. Make sure that the order of the cluster assignments matches the order of the columns of mat.norm.
min_n_cells
Integer: when identifying bimodal gene distributions, this specifies the minimum number of cells per mode. Clusters with a total number of cells below
min_fc
Numeric: minimum difference in mean [log2] between the two modes of the gene expression distribution. Defaults to 2.
corr_cutoff
Numeric: correlation cutoff for MCL clustering of candidate marker genes. If
iter
0 or 1: relevant for the final step (assigning cells to subgroups). If set to 1, the first mode of gene expression is discarded and cells are assigned based on the 2nd and 3rd modes, which results in a more stringent assignment. Defaults to 0, which is usually stringent enough.
max_perc_cells
Numeric: maximum percentage of cells that can be part of a subcluster. Defaults to 50, implying that a “subgroup” cannot contain more than half of the total observations.
fc_between_cutoff
Numeric: minimum difference [log2] in gene expression between cells in the subcluster and all other cells. The higher the value, the more cluster-specific the gene signature. Note that this should not be set higher than min_fc.
mcl_path
Character: path to the MCL executable. The external GNU MCL software for UNIX needs to be installed by the user.
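Several of the arguments above constrain each other, so it can help to check the inputs before a run. The following is a minimal sketch: `check_cellsius_inputs` is a hypothetical helper and not part of the CellSIUS package, and the toy matrix and cluster labels are made up for illustration. It assumes that the columns of mat.norm are cells, as implied by the description of group_id.

```r
# Hypothetical helper (not part of CellSIUS): basic consistency checks
check_cellsius_inputs <- function(mat.norm, group_id, min_fc, fc_between_cutoff) {
  stopifnot(
    is.matrix(mat.norm), is.numeric(mat.norm),
    is.character(group_id),
    length(group_id) == ncol(mat.norm),  # one cluster label per cell (column)
    length(unique(group_id)) > 1,        # more than one main cluster
    fc_between_cutoff <= min_fc          # should not exceed min_fc
  )
  invisible(TRUE)
}

# Toy input: 5 genes x 6 cells, two clusters of 3 cells each
set.seed(42)
toy_mat <- matrix(rexp(5 * 6), nrow = 5,
                  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6)))
toy_groups <- rep(c("A", "B"), each = 3)

check_cellsius_inputs(toy_mat, toy_groups, min_fc = 2, fc_between_cutoff = 1)

# Cells per cluster, useful to compare against min_n_cells
table(toy_groups)

# If expression is on a log2 scale, a difference of min_fc = 2 is a 4-fold change
2^2
```

The checks stop with an error on the first violated condition, for example when group_id has a different length than the number of columns of mat.norm.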
CellSIUS returns a data.table with the following key/value pairs:
cell_idx Cell indices, corresponding to the colnames of mat.norm
gene_id Gene IDs, corresponding to the rownames of mat.norm
main_cluster Name of the main cluster, as defined in the group_id input
expr Average expression of ‘gene_id’ across the cells in the corresponding ‘main_cluster’
sub_cluster Indicates whether the cell is a member of a CellSIUS subcluster. Subclusters are named as follows: [Name-of-main-cluster]_[number of subcluster]_[0 or 1], where in the last position 0 means the cell is not a member and 1 means it is a member.
N_cells Number of cells in the corresponding CellSIUS subcluster
log2FC log2 fold change between the two modes of the bimodal gene expression distribution across cells in the corresponding ‘main_cluster’
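The sub_cluster naming scheme makes it possible to list the member cells of each subcluster with a few lines of base R. The sketch below uses a small mock data.frame instead of a real CellSIUS result; its layout (one row per cell and gene) and all values are assumptions based only on the column descriptions above.

```r
# Mock result table (hypothetical values, subset of the documented columns)
res <- data.frame(
  cell_idx     = rep(c("cell1", "cell2", "cell3"), each = 2),
  gene_id      = rep(c("geneA", "geneB"), times = 3),
  main_cluster = "cl1",
  sub_cluster  = rep(c("cl1_1_0", "cl1_1_1", "cl1_1_1"), each = 2),
  stringsAsFactors = FALSE
)

# A trailing "_1" marks cells that are members of the subcluster
is_member <- grepl("_1$", res$sub_cluster)

# One row per member cell, with the membership flag stripped from the name
members <- unique(res[is_member, c("cell_idx", "sub_cluster")])
members$sub_cluster <- sub("_[01]$", "", members$sub_cluster)

# List of member cells per subcluster
cells_by_subcluster <- split(members$cell_idx, members$sub_cluster)
cells_by_subcluster
```

For real output, the accessory functions listed under See Also are the intended way to summarize results; this sketch only shows how the naming convention can be read.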
The execution of CellSIUS requires the external GNU Markov Cluster Algorithm (MCL) software for UNIX. It needs to be installed by the user [link].
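Before running CellSIUS, it may be worth confirming that the MCL executable can actually be found. The snippet below is a sketch using base R only; `mcl_is_usable` is a hypothetical helper and not part of the package, and the path shown is the one used in the Examples, which may differ on your system.

```r
# Hypothetical helper: TRUE if the path is an existing, executable file
mcl_is_usable <- function(mcl_path) {
  path <- path.expand(mcl_path)
  isTRUE(file.exists(path) && !dir.exists(path) &&
           unname(file.access(path, mode = 1)) == 0)
}

# Look for an 'mcl' binary on the system PATH (empty string if none is found)
mcl_on_path <- unname(Sys.which("mcl"))
if (nzchar(mcl_on_path)) {
  message("MCL found on PATH: ", mcl_on_path)
} else {
  message("MCL not found on PATH; pass its full location via mcl_path")
}

# Check an explicit location before passing it to CellSIUS
mcl_is_usable("~/local/bin/mcl")
```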
Wegmann, R. et al., 2018. CellSIUS for sensitive and specific identification and characterization of rare cell populations from complex single-cell RNA sequencing data. Nature Communications, submission.
Accessory functions are provided to help the user to summarize and visualize the CellSIUS results:
CellSIUS_GetResults
CellSIUS_plot
CellSIUS_final_cluster_assignment
require(CellSIUS)
require(ggplot2)
require(data.table)
# Assumes my_tsne (2D cell coordinates), norm.counts and clusters already exist in the workspace
# 2D cell projection
qplot(my_tsne[,1],my_tsne[,2],xlab="tSNE1",ylab="tSNE2",main='2D scRNAseq cell projection')
qplot(my_tsne[,1],my_tsne[,2],xlab="tSNE1",ylab="tSNE2",color=as.factor(clusters),main="Cells colored by clusters")
# Run CellSIUS
# WARNING: before running the CellSIUS function, make sure that mcl_path points
# to the executable of your local MCL installation
CellSIUS.out<-CellSIUS(mat.norm = norm.counts,group_id = clusters,min_n_cells=10, min_fc = 2,
corr_cutoff = NULL, iter=0, max_perc_cells = 50,
fc_between_cutoff = 1,mcl_path = "~/local/bin/mcl")
#____________________________
# EXPLORE CellSIUS RESULTS:
#____________________________
# Summary of sub-populations identified by CellSIUS
Result_List=CellSIUS_GetResults(CellSIUS.out=CellSIUS.out)
# 2D visualization of sub-populations identified by CellSIUS
require(RColorBrewer)
CellSIUS_plot(coord = my_tsne,CellSIUS.out = CellSIUS.out)
# Final clustering assignment
Final_Clusters = CellSIUS_final_cluster_assignment(CellSIUS.out=CellSIUS.out, group_id=clusters, min_n_genes = 3)
table(Final_Clusters)
qplot(my_tsne[,1],my_tsne[,2],xlab="tSNE1",ylab="tSNE2",color=as.factor(Final_Clusters))