CleanCITE: Removes noise from CITE-seq protein data by fitting a two...

View source: R/data_processing.R

CleanCITE    R Documentation

Removes noise from CITE-seq protein data by fitting a two-component mixture model and replacing each expression measurement with the cumulative distribution function (CDF) of the component with the higher median, evaluated at that measurement.

Description

This mixture model can be either:
- a Negative Binomial on the expression counts (with optional weighting/normalization by the total ADT counts per cell after cleaning), or
- a Gaussian on the log-normalized expression with zeros removed, similar to the method proposed by Trong et al. in SISUA (https://www.biorxiv.org/content/10.1101/631382v1).
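The Gaussian variant described above can be illustrated with a minimal base-R sketch. It is not the STvEA implementation: the function name `clean_gaussian_sketch`, the use of `log1p` as the log transform, the hand-written EM loop, and the choice to leave zero counts at 0 in the output are all assumptions made for illustration.

```r
# Illustrative sketch only (not STvEA code): two-component Gaussian mixture
# on log-transformed, zero-removed counts for a single protein.
clean_gaussian_sketch <- function(counts, n_iter = 200, tol = 1e-8) {
  x <- log1p(counts[counts > 0])   # assumed transform; zeros are removed

  # Initialise the two components from the lower and upper halves of the data
  med <- median(x)
  mu <- c(mean(x[x <= med]), mean(x[x > med]))
  sigma <- rep(sd(x), 2)
  w <- c(0.5, 0.5)

  loglik_old <- -Inf
  for (i in seq_len(n_iter)) {
    # E-step: weighted densities and responsibilities
    d1 <- w[1] * dnorm(x, mu[1], sigma[1])
    d2 <- w[2] * dnorm(x, mu[2], sigma[2])
    tot <- pmax(d1 + d2, .Machine$double.xmin)  # guard against underflow
    r1 <- d1 / tot
    r2 <- 1 - r1

    # M-step: update weights, means and standard deviations
    w <- c(mean(r1), mean(r2))
    mu <- c(sum(r1 * x) / sum(r1), sum(r2 * x) / sum(r2))
    sigma <- c(sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
               sqrt(sum(r2 * (x - mu[2])^2) / sum(r2)))
    sigma <- pmax(sigma, 1e-6)

    # Stop when the log-likelihood (at the previous parameters) stabilises
    loglik <- sum(log(tot))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }

  # For a Gaussian the median equals the mean, so the component with the
  # higher median is the one with the larger mu.
  signal <- which.max(mu)

  # Cleaned value = CDF of the signal component at each measurement.
  # Assumption: cells with zero counts keep a cleaned value of 0.
  cleaned <- numeric(length(counts))
  cleaned[counts > 0] <- pnorm(x, mu[signal], sigma[signal])

  list(cleaned = cleaned, mu = mu, sigma = sigma, weights = w)
}

# Simulated protein: 300 background cells and 200 expressing cells
set.seed(1)
counts <- c(rpois(300, 3), rpois(200, 60))
res <- clean_gaussian_sketch(counts)
```

Because the output is a CDF value, every cleaned measurement lies between 0 and 1, with background-level counts pushed towards 0.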

Usage

CleanCITE(
  stvea_object,
  model = "nb",
  num_cores = 1,
  maxit = 500,
  factr = 1e-09,
  optim_inits = NULL,
  normalize = TRUE
)

Arguments

stvea_object

STvEA.data object containing CITE-seq protein data

model

"nb" (Negative Binomial) or "gaussian" model to fit

num_cores

number of cores to use for parallelized fits

maxit

maximum number of iterations for the optim function; only used if model is "nb"

factr

convergence tolerance factor (factr) for the optim function; only used if model is "nb"

optim_inits

a matrix of (proteins x params) with initialization parameters for each protein, passed to the optim function. If NULL, the fit starts from two default parameter sets and keeps the better one; only used if model is "nb"

normalize

logical; whether to divide the cleaned CITE-seq expression by the total ADT counts per cell; only used if model is "nb"
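How maxit, factr, multiple starting points and normalize might interact for the "nb" model can be sketched in base R as follows. This is a hedged illustration, not the package's code: the helper names `nb_mix_nll` and `fit_nb_mixture_sketch`, the five-parameter layout, the starting values and the box bounds are hypothetical. Since `factr` is a control option of the "L-BFGS-B" method in `stats::optim`, the sketch assumes that method. It also assumes that the per-cell total used for normalization is the row sum of the cleaned values.

```r
# Illustrative sketch only (not STvEA code).
# Negative log-likelihood of a two-component Negative Binomial mixture.
# par = (log mu1, log size1, log mu2, log size2, logit weight of component 1)
nb_mix_nll <- function(par, x) {
  w <- plogis(par[5])
  d <- w * dnbinom(x, mu = exp(par[1]), size = exp(par[2])) +
    (1 - w) * dnbinom(x, mu = exp(par[3]), size = exp(par[4]))
  -sum(log(pmax(d, .Machine$double.xmin)))  # keep the objective finite
}

# Fit from every row of `inits` and keep the fit with the lowest objective
fit_nb_mixture_sketch <- function(x, inits, maxit = 500, factr = 1e-9) {
  fits <- lapply(seq_len(nrow(inits)), function(i) {
    optim(inits[i, ], nb_mix_nll, x = x, method = "L-BFGS-B",
          lower = rep(-10, 5), upper = rep(10, 5),
          control = list(maxit = maxit, factr = factr))
  })
  values <- vapply(fits, function(f) f$value, numeric(1))
  best <- fits[[which.min(values)]]

  mu <- exp(best$par[c(1, 3)])
  size <- exp(best$par[c(2, 4)])

  # Pick the component with the higher median and use its CDF as the
  # cleaned expression value
  med <- qnbinom(0.5, mu = mu, size = size)
  s <- which.max(med)
  list(cleaned = pnbinom(x, mu = mu[s], size = size[s]),
       mu = mu, size = size, value = best$value)
}

set.seed(42)
# Two simulated proteins measured on the same 600 cells
p1 <- c(rnbinom(400, mu = 5, size = 5), rnbinom(200, mu = 80, size = 10))
p2 <- c(rnbinom(400, mu = 3, size = 5), rnbinom(200, mu = 120, size = 10))

# Two hypothetical starting points, one per row
inits <- rbind(c(log(2), 0, log(50), 0, 0),
               c(log(10), 1, log(200), 1, 0))

fit1 <- fit_nb_mixture_sketch(p1, inits)
fit2 <- fit_nb_mixture_sketch(p2, inits)

# Cells x proteins matrix of cleaned values, then per-cell normalization
cleaned <- cbind(p1 = fit1$cleaned, p2 = fit2$cleaned)
normalized <- cleaned / rowSums(cleaned)
```

A very small `factr` makes the objective-based stopping rule hard to satisfy, so in this sketch the optimizer will often stop for another reason, such as reaching `maxit`; the returned parameters are still usable, but the convergence code from `optim` is worth inspecting.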


CamaraLab/STvEA documentation built on April 2, 2024, 6:07 a.m.