individual_clustering: Single-cell Aggregated clustering From Ensemble (SAFE)


View source: R/individual_clustering.R

Usage

individual_clustering(inputTags, mt_filter = TRUE, mt.pattern = "^MT-", mt.cutoff = 0.1, 
  SC3 = TRUE, gene_filter = FALSE, svm_num_cells = 5000, CIDR = TRUE, nPC.cidr = NULL, 
  Seurat = TRUE, nGene_filter = TRUE, low.genes = 200, high.genes = 8000, nPC.seurat = NULL, resolution = 0.7,
  tSNE = TRUE, saver = FALSE, dimensions = 3, perplexity = 30, tsne_min_cells = 200, tsne_min_perplexity = 10,
  var_genes = NULL, SEED = 1)

Arguments

inputTags

a G*N matrix with G genes and N cells.

mt_filter

a boolean variable that defines whether to filter outlier cells according to mitochondrial gene percentage. Default is TRUE.

mt.pattern

defines the pattern of mitochondrial gene names in the data, for example, mt.pattern = "^MT-" for human and mt.pattern = "^mt-" for mouse. Default is mt.pattern = "^MT-".

mt.cutoff

defines the upper cutoff for mitochondrial percentage. Default is 0.1 (10%); when mt_filter = TRUE, cells with a higher mitochondrial percentage are filtered out.

SC3

a boolean variable that defines whether to cluster cells using the SC3 method. Default is TRUE.

gene_filter

a boolean variable that defines whether to perform gene filtering before SC3 clustering, when SC3 = TRUE. Default is FALSE.

svm_num_cells

defines the minimum number of cells above which SVM will be run, when SC3 = TRUE. Default is 5000.

CIDR

a boolean variable that defines whether to cluster cells using the CIDR method. Default is TRUE.

nPC.cidr

defines the number of principal coordinates used in CIDR clustering, when CIDR = TRUE. By default (NULL), the value is estimated by nPC of CIDR.

Seurat

a boolean variable that defines whether to cluster cells using the Seurat method. Default is TRUE.

nGene_filter

a boolean variable that defines whether to filter outlier cells according to unique gene count before Seurat clustering. Default is TRUE.

low.genes

defines the lower cutoff for unique gene counts: when nGene_filter = TRUE, cells with fewer genes than this are filtered out. Default is 200.

high.genes

defines the upper cutoff for unique gene counts: when nGene_filter = TRUE, cells with more genes than this are filtered out. Default is 8000.

nPC.seurat

defines the number of principal components used in Seurat clustering, when Seurat = TRUE. By default (NULL), nPC.seurat = nPC.cidr.

resolution

defines the value of resolution used in Seurat clustering, when Seurat = TRUE. Default is resolution = 0.7.

tSNE

a boolean variable that defines whether to cluster cells using the t-SNE method. Default is TRUE.

saver

a boolean variable that defines whether to use the SAVER method to revise the gene expression profile of noisy and sparse single-cell RNA-seq data before the downstream t-SNE analysis. Default is FALSE.

dimensions

sets the number of dimensions to retain in the t-SNE step. Default is 3.

perplexity

sets the perplexity parameter for t-SNE dimension reduction. Default is 30 when the number of cells is >= 200.

tsne_min_cells

defines the number of cells in the input dataset below which tsne_min_perplexity is used for the t-SNE step instead of perplexity. Default is 200.

tsne_min_perplexity

sets the perplexity parameter of the t-SNE step for small datasets (number of cells < 200). Default is 10.

var_genes

defines the number of variable genes used by the t-SNE analysis, when tSNE = TRUE. Default is NULL.

SEED

sets the seed of the random number generator. Setting the seed to a fixed value makes the clustering results reproducible. Default is 1.
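To make the cell-filtering arguments concrete, here is a minimal base-R sketch of what mt.pattern, mt.cutoff, low.genes and high.genes express. It is an illustration only and not the package's internal code: the toy matrix, the toy cutoffs for gene counts, and the use of <= and >= at the boundaries are all assumptions, and the package may apply the two filters at different stages (the gene-count filter is documented as running before Seurat clustering).

```r
# Hypothetical toy count matrix: 6 genes (rows) x 5 cells (columns).
counts <- matrix(
  c( 1, 10,  0,  2,  0,
     1, 15,  0,  1,  0,
    20,  5, 30,  0, 12,
    18,  3, 25, 40,  9,
    10,  0,  0,  0,  7,
     0,  2, 15,  0,  2),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("MT-CO1", "MT-ND1", "ACTB", "GAPDH", "CD3E", "MS4A1"),
    paste0("c", 1:5)
  )
)

mt.pattern <- "^MT-"  # human-style mitochondrial gene names
mt.cutoff  <- 0.1     # documented default (10%)
low.genes  <- 4       # toy value; the documented default is 200
high.genes <- 8       # toy value; the documented default is 8000

# Fraction of each cell's counts that come from mitochondrial genes.
is_mt       <- grepl(mt.pattern, rownames(counts))
mt_fraction <- colSums(counts[is_mt, , drop = FALSE]) / colSums(counts)

# Number of genes detected (non-zero count) in each cell.
n_genes <- colSums(counts > 0)

# Keep cells that pass both filters (boundary handling is an assumption).
keep     <- mt_fraction <= mt.cutoff &
            n_genes >= low.genes & n_genes <= high.genes
filtered <- counts[, keep, drop = FALSE]

colnames(filtered)
```

In this toy data, cell c2 is dropped by the mitochondrial cutoff and cells c3 and c4 by the lower gene-count cutoff. For mouse data the same sketch would use mt.pattern <- "^mt-".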

Value

a matrix of individual clustering results is output, where each row represents the cluster results of each method.

Description

This function performs single-cell clustering using four state-of-the-art methods, SC3, CIDR, Seurat and tSNE+kmeans.

Examples

# Load the example data data_SAFE
data("data_SAFE")

# Zheng dataset
# Run individual_clustering
cluster.result <- individual_clustering(inputTags=data_SAFE$Zheng.expr, SEED=123)
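Since the returned matrix holds one row of cluster labels per method, a quick way to inspect it is to count clusters per method and cross-tabulate two methods. The sketch below uses a hand-made stand-in matrix so that it runs without the package; the row names and the labels are hypothetical, and the real output may be named or ordered differently.

```r
# Hypothetical stand-in for a result matrix: 4 methods (rows) x 6 cells.
cluster.toy <- rbind(
  SC3    = c(1, 1, 2, 2, 3, 3),
  CIDR   = c(2, 2, 1, 1, 3, 3),
  Seurat = c(1, 1, 1, 2, 2, 2),
  tSNE   = c(1, 1, 2, 2, 3, 3)
)

# Number of distinct clusters found by each method.
n_clusters <- apply(cluster.toy, 1, function(x) length(unique(x)))
n_clusters

# Cross-tabulate two methods. Cluster labels are arbitrary per method,
# so agreement shows up as one dominant cell per row, not a diagonal.
tab <- table(SC3 = cluster.toy["SC3", ], CIDR = cluster.toy["CIDR", ])
tab
```

Here the SC3 and CIDR rows describe the same partition with different label numbers, which is why the cross-tabulation is more informative than comparing the rows element by element.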

References

Yuchen Yang, Ruth Huh, Houston Culpepper, Yuan Lin, Michael Love, Yun Li. SAFE (Single-cell Aggregated clustering From Ensemble): Cluster ensemble for single-cell RNA-seq data. 2017

Author(s)

Yuchen Yang <yangyuchensysu@gmail.com>, Ruth Huh <rhuh@live.unc.edu>, Yun Li <yunli@med.unc.edu>


yycunc/SAFEclustering documentation built on March 29, 2021, 5:58 a.m.