classify_cells: Classify cells from multiple models


Description

Classify cells from multiple models

Usage

classify_cells(
  classify_obj,
  classifiers = NULL,
  cell_types = "all",
  chunk_size = 5000,
  path_to_models = c("default", "."),
  ignore_ambiguous_result = FALSE,
  cluster_slot = NULL,
  ...
)

## S4 method for signature 'Seurat'
classify_cells(
  classify_obj,
  classifiers = NULL,
  cell_types = "all",
  chunk_size = 5000,
  path_to_models = c("default", "."),
  ignore_ambiguous_result = FALSE,
  cluster_slot = "seurat_clusters",
  seurat_assay = "RNA",
  seurat_slot = "counts",
  ...
)

## S4 method for signature 'SingleCellExperiment'
classify_cells(
  classify_obj,
  classifiers = NULL,
  cell_types = "all",
  chunk_size = 5000,
  path_to_models = c("default", "."),
  ignore_ambiguous_result = FALSE,
  sce_assay = "logcounts",
  cluster_slot = NULL,
  ...
)

Arguments

classify_obj

the object containing cells to be classified

classifiers

list of classification models. Models are obtained from the train_classifier function or are available in the current workspace. Users may test a model with test_classifier before using this function. If classifiers contains classifiers for sub cell types, the classifier for the parent cell type must come first in the list so that it is applied before the child classifiers. If classifiers is NULL, the method uses all classifiers in the database.

cell_types

list of cell types whose models are to be used for classification. Only applicable if the models have been saved to the package.

chunk_size

size of the data chunks to be predicted separately. Chunking is recommended for large datasets to reduce running time. Defaults to 5000, because smaller datasets can be predicted rapidly.
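
To make the chunking idea concrete, here is a minimal base R sketch that splits cell indices into consecutive chunks of at most chunk_size cells. It is only an illustration: the helper split_into_chunks is hypothetical and not part of the package, and how classify_cells partitions cells internally is not documented on this page.

```r
# Hypothetical helper (not part of the package): split cell indices
# into consecutive chunks holding at most `chunk_size` cells each.
split_into_chunks <- function(n_cells, chunk_size = 5000) {
  idx <- seq_len(n_cells)
  # ceiling(idx / chunk_size) gives each cell its chunk number 1, 2, 3, ...
  unname(split(idx, ceiling(idx / chunk_size)))
}

# 12000 cells with the default chunk size yield three chunks
chunks <- split_into_chunks(n_cells = 12000, chunk_size = 5000)
lengths(chunks)  # number of cells in each chunk
```

Each chunk could then be predicted on its own and the results concatenated in the original cell order.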

path_to_models

path to the folder containing the list of models. By default, the pretrained models in the package are used. If the user has trained new models, indicate the folder containing the new_models.rda file.

ignore_ambiguous_result

whether to set all ambiguous predictions (multiple cell types) to empty. When this parameter is TRUE, the most probable predicted cell types are ignored.
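
The effect of this flag can be pictured with a toy base R sketch. Everything in it is an assumption made for illustration: the probability matrix, the 0.5 cutoff, the "/" separator between cell types, and the helper resolve_predictions are all hypothetical and do not reproduce the package's implementation.

```r
# Toy per-cell probabilities from two hypothetical classifiers
probs <- matrix(
  c(0.90, 0.10,   # cell1: only the B cell classifier is positive
    0.80, 0.70,   # cell2: both classifiers are positive (ambiguous)
    0.20, 0.30),  # cell3: no classifier is positive
  ncol = 2, byrow = TRUE,
  dimnames = list(c("cell1", "cell2", "cell3"), c("B cells", "T cells"))
)
threshold <- 0.5  # assumed cutoff, for illustration only

# Hypothetical helper: join all positive cell types per cell, or return
# an empty string for ambiguous cells when the flag is TRUE.
resolve_predictions <- function(probs, threshold,
                                ignore_ambiguous_result = FALSE) {
  vapply(seq_len(nrow(probs)), function(i) {
    hits <- colnames(probs)[probs[i, ] > threshold]
    if (length(hits) > 1 && ignore_ambiguous_result) {
      return("")  # ambiguous prediction is emptied
    }
    paste(hits, collapse = "/")  # "" when no classifier is positive
  }, character(1))
}

resolve_predictions(probs, threshold)
resolve_predictions(probs, threshold, ignore_ambiguous_result = TRUE)
```

In this sketch only cell2 changes between the two calls, because it is the only cell assigned more than one cell type.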

cluster_slot

name of the slot in the meta data containing cluster information, in case users want an additional cluster-level prediction.

...

arguments passed to other methods

seurat_assay

name of assay to use in Seurat object, defaults to 'RNA' assay.

seurat_slot

type of expression data to use in Seurat object. Some available types are: "counts", "data" and "scale.data". Defaults to "counts", which is unnormalized data.

sce_assay

name of assay to use in SingleCellExperiment object, defaults to 'logcounts' assay.

Value

the input object with new slots in the cell meta data. The new slots are: predicted_cell_type, most_probable_cell_type, and slots of the form [cell_type]_p and [cell_type]_class.
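
As a rough illustration of working with these slots, the base R sketch below uses a toy data frame as a stand-in for the cell meta data. The values, the exact column spellings (for example "B_cells_p") and the "yes"/"no" class labels are assumptions; only the naming pattern comes from the description above.

```r
# Toy stand-in for the cell meta data after classification.
# Column spellings and values are assumed for illustration only.
meta <- data.frame(
  predicted_cell_type     = c("B cells", "T cells", ""),
  most_probable_cell_type = c("B cells", "T cells", ""),
  B_cells_p     = c(0.92, 0.05, 0.10),
  B_cells_class = c("yes", "no", "no"),
  T_cells_p     = c(0.03, 0.88, 0.20),
  T_cells_class = c("no", "yes", "no"),
  row.names = c("cell1", "cell2", "cell3")
)

# Pick out the per-cell-type columns by their suffix
prob_cols  <- grep("_p$", colnames(meta), value = TRUE)
class_cols <- grep("_class$", colnames(meta), value = TRUE)

# Count cells per predicted type and inspect the probabilities
table(meta$predicted_cell_type)
meta[, prob_cols]
```

With a real result, the meta data table would come from the returned object instead, for instance obj[[]] for a Seurat object or colData(obj) for a SingleCellExperiment object.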

Examples

# load the package and the small example dataset
library(scClassifR)
data("tirosh_mel80_example")

# train one classifier per cell type, starting with B cells
# define the genes used to classify this cell type
selected_features_B <- c("CD19", "MS4A1", "CD79A")

# train the classifier (seed fixed for reproducibility)
set.seed(123)
clf_b <- train_classifier(
  train_obj = tirosh_mel80_example,
  features = selected_features_B,
  cell_type = "B cells"
)

# do the same for other cell types, for example T cells
selected_features_T <- c("CD4", "CD8A", "CD8B")
set.seed(123)
clf_t <- train_classifier(
  train_obj = tirosh_mel80_example,
  features = selected_features_T,
  cell_type = "T cells"
)

# create a list of classifiers
classifier_ls <- list(clf_b, clf_t)

# classify cells with the list of classifiers
seurat.obj <- classify_cells(
  classify_obj = tirosh_mel80_example,
  classifiers = classifier_ls
)

grisslab/scClassifR documentation built on Oct. 27, 2021, 12:13 p.m.