cellassign: Annotate cells to cell types using cellassign

View source: R/cellassign.R

Description

Automatically annotate cells to known types based on the expression patterns of a priori known marker genes.

Usage

cellassign(
  exprs_obj,
  marker_gene_info,
  s = NULL,
  min_delta = 2,
  X = NULL,
  B = 10,
  shrinkage = TRUE,
  n_batches = 1,
  dirichlet_concentration = 0.01,
  rel_tol_adam = 1e-04,
  rel_tol_em = 1e-04,
  max_iter_adam = 1e+05,
  max_iter_em = 20,
  learning_rate = 0.1,
  verbose = TRUE,
  sce_assay = "counts",
  return_SCE = FALSE,
  num_runs = 1,
  threads = 0
)

Arguments

exprs_obj

Either a matrix representing gene expression counts or a SummarizedExperiment. See details.

marker_gene_info

Information relating marker genes to cell types. See details.

s

Numeric vector of cell size factors

min_delta

The minimum log fold change a marker gene must be over-expressed by in its cell type

X

Numeric matrix of external covariates. See details.

B

Number of bases to use for RBF dispersion function

shrinkage

Logical - should the delta parameters have hierarchical shrinkage?

n_batches

Number of data subsample batches to use in inference

dirichlet_concentration

Dirichlet concentration parameter for cell type abundances

rel_tol_adam

The change in Q function value (in pct) below which each optimization round is considered converged

rel_tol_em

The change in log marginal likelihood value (in pct) below which the EM algorithm is considered converged

max_iter_adam

Maximum number of ADAM iterations to perform in each M-step

max_iter_em

Maximum number of EM iterations to perform

learning_rate

Learning rate of ADAM optimization

verbose

Logical - should running info be printed?

sce_assay

The assay from the input SingleCellExperiment to use; this assay should always represent raw counts.

return_SCE

Logical - should a SingleCellExperiment be returned with the cell type annotations added? See details.

num_runs

Number of EM optimizations to perform (the one with the maximum log-marginal likelihood value will be used as the final).

threads

Maximum number of threads used by the algorithm (defaults to the number of cores available on the machine)

Details

Input format

exprs_obj should be either a SummarizedExperiment (we recommend the SingleCellExperiment package) or a cell (row) by gene (column) matrix of raw RNA-seq counts. Do not log-transform or otherwise normalize the counts.

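marker_gene_info should be either a gene by cell type binary matrix, where a 1 indicates that a gene is a marker for a cell type and a 0 that it is not, or a list with names corresponding to cell types, where each entry is a vector of marker gene names. A list is converted to the binary matrix form using the marker_list_to_mat function.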
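As an illustration of what marker information can look like, the sketch below builds a binary gene by cell type matrix from a named list in base R. The gene and cell type names are hypothetical, and the assumption that marker_gene_info accepts a 0/1 matrix with genes as rows is inferred from how example_marker_mat is used in the Examples section.

```r
# Hypothetical marker lists: names are cell types, entries are marker genes.
marker_list <- list(
  Group1 = c("Gene1", "Gene2"),
  Group2 = c("Gene2", "Gene3", "Gene4")
)

# All marker genes, in order of first appearance.
genes <- unique(unlist(marker_list))

# One column per cell type; 1 if the gene is a marker for that type, else 0.
marker_mat <- sapply(marker_list, function(markers) as.integer(genes %in% markers))
rownames(marker_mat) <- genes

marker_mat
```

Building the matrix by hand like this makes the gene order explicit, which helps when subsetting the expression object to the same rows.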

Cell size factors

If the cell size factors s are not provided, they are computed using the computeSumFactors function from the scran package.
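Size factors can also be supplied directly, as the Examples section does with total counts per cell. The following is a minimal base R sketch of that simpler approach, assuming a hypothetical cell by gene count matrix. It is plain library-size scaling, not the method used by scran's computeSumFactors, and the rescaling to mean 1 is an optional convention assumed here rather than a requirement stated by the package.

```r
set.seed(1)

# Hypothetical raw counts: 5 cells (rows) by 4 genes (columns).
counts <- matrix(
  rpois(5 * 4, lambda = 5),
  nrow = 5, ncol = 4,
  dimnames = list(paste0("cell", 1:5), paste0("gene", 1:4))
)

# Total counts per cell as a crude size factor.
s <- rowSums(counts)

# Optional: rescale so the size factors average to 1 (an assumed convention).
s <- s / mean(s)
```

The resulting vector has one entry per cell and could then be passed as the s argument.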

Covariates

If X is not NULL, it should be an N by P matrix of covariates for N cells and P covariates. Such a matrix would typically be returned by a call to model.matrix with no intercept. It is also highly recommended that any numerical covariates (i.e. those that are not factors or one-hot encoded) be standardized to have mean 0 and standard deviation 1.
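A covariate matrix of this shape can be sketched in base R as follows. The per-cell metadata and column names are hypothetical; only numeric covariates are shown, each standardized with scale before the intercept-free design matrix is built.

```r
# Hypothetical per-cell metadata for six cells (illustrative values only).
cell_meta <- data.frame(
  n_genes  = c(1200, 950, 1430, 1010, 880, 1325),
  pct_mito = c(2.1, 4.8, 3.3, 5.0, 1.7, 2.9)
)

# Standardize each numeric covariate to mean 0 and standard deviation 1.
cell_meta$n_genes  <- as.numeric(scale(cell_meta$n_genes))
cell_meta$pct_mito <- as.numeric(scale(cell_meta$pct_mito))

# Build the N by P design matrix; "0 +" drops the intercept column.
X <- model.matrix(~ 0 + n_genes + pct_mito, data = cell_meta)

dim(X)
```

Here N is 6 and P is 2, so X has one row per cell and one column per covariate.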

cellassign

A call to cellassign returns an object of class cellassign. To access the maximum likelihood estimates (MLE) of the cell types, use fit$cell_type. To access all MLE parameter estimates, use fit$mle_params.

Returning a SingleCellExperiment

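If return_SCE is TRUE, a call to cellassign returns the input SingleCellExperiment with the following added: a column cellassign_celltype in colData(sce) holding the MAP estimate of the cell type, and a slot sce@metadata$cellassign containing the cellassign fit. A SingleCellExperiment must be provided as exprs_obj for this option to be valid.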

Value

An object of class cellassign or, if return_SCE is TRUE, the input SingleCellExperiment with the cell type annotations added. See Details.

Examples

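# Load the package so the bundled example data sets are available
library(cellassign)

data(example_sce)
data(example_marker_mat)

# Fit on the marker genes only, using total counts per cell as size factors
fit <- cellassign(
  example_sce[rownames(example_marker_mat), ],
  marker_gene_info = example_marker_mat,
  s = colSums(SummarizedExperiment::assay(example_sce, "counts")),
  learning_rate = 1e-2,
  shrinkage = TRUE,
  verbose = FALSE
)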

Irrationone/cellassign documentation built on April 23, 2020, 3:10 p.m.