estCutoffs: Estimation of distance separation cutoffs

View source: R/estCutoffs.R

estCutoffsR Documentation

Estimation of distance separation cutoffs

Description

For each sample, estimates a cutoff parameter for the distance between positive and negative barcode populations.

Usage

estCutoffs(x)

Arguments

x

a SingleCellExperiment.

Details

For the estimation of cutoff parameters, we considered yields upon debarcoding as a function of the applied cutoffs. Commonly, this function will be characterized by an initial weak decline, where doublets are excluded, and subsequent rapid decline in yields to zero. In between, low numbers of counts with intermediate barcode separation give rise to a plateau. As an adequate cutoff estimate, we target the point that approximately marks the end of the plateau regime and the onset of yield decline. To facilitate robust cutoff estimation, we fit a linear and a three-parameter log-logistic function to the yields function:

f(x)=\frac{d}{1+e^{b(log(x)-log(e))}}

The goodness of the linear fit relative to the log-logistic fit is weighed with:

w=\frac{RSS_{log-logistic}}{RSS_{log-logistic}+RSS_{linear}}

and the cutoffs for both functions are defined as:

c_{linear}=-\frac{\beta_0}{2\beta_1}

c_{log-logistic}=argmin_x\{\frac{\vert f'(x)\vert}{ f(x)}>0.1\}

The final cutoff estimate is defined as the weighted mean between these estimates:

c=(1-w)\cdot c_{log-logistic}+w\cdot c_{linear}

Value

the input SingleCellExperiment is returned with an additional metadata slot sep_cutoffs.

Author(s)

Helena L Crowell helena.crowell@uzh.ch

References

Finney, D.J. (1971). Probit Analsis. Journal of Pharmaceutical Sciences 60, 1432.

Examples

library(SingleCellExperiment)

# construct SCE
data(sample_ff, sample_key)
sce <- prepData(sample_ff)
    
# assign preliminary barcode IDs
# & estimate separation cutoffs
sce <- assignPrelim(sce, sample_key)
sce <- estCutoffs(sce)

# access separation cutoff estimates
(seps <- metadata(sce)$sep_cutoffs)

# compute population yields
cs <- split(seq_len(ncol(sce)), sce$bc_id)
sapply(names(cs), function(id) {
  sub <- sce[, cs[[id]]]
  mean(sub$delta > seps[id])
})

# view yield plots including current cutoff
plotYields(sce, which = "A1")


HelenaLC/CATALYST documentation built on Nov. 30, 2024, 4:04 a.m.