User guide for executing `CytOpT` on `HIPC` data

library(reticulate)
# this vignette requires python 3.7 or newer to run
eval <- tryCatch({
  numeric_version(py_config()$version) >= "3.7" && py_numpy_available() && 
    py_module_available("scipy") && py_module_available("sklearn")
}, error = function(e) FALSE)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = eval
)

Introduction

CytOpT is a supervised method that directly estimates the cell-population proportions in a flow-cytometry data set. It takes a manually gated source sample as input and relies on regularized optimal transport.
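To make the idea of "estimating proportions from a source gating" more concrete, here is a minimal sketch of the reweighting step as we understand it from the article cited at the end of this guide: each source cell in class k receives weight h_k / n_k, so that class k carries total mass h_k. The labels, the proportions `h`, and all variable names below are hypothetical toy values, not HIPC data and not part of the CytOpT API. The optimal transport optimization over `h` is not shown.

```r
# Toy source labels for 3 hypothetical cell populations (not HIPC data)
lab_source <- factor(c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"))

# Candidate class proportions for the target sample (hypothetical values)
h <- c(A = 0.2, B = 0.5, C = 0.3)

# Number of source cells per class, as a named vector
n_k <- c(table(lab_source))

# Each source cell in class k gets weight h[k] / n_k
idx <- as.character(lab_source)
w <- unname(h[idx] / n_k[idx])

# The total mass per class equals h, and the weights sum to 1
mass_per_class <- tapply(w, lab_source, sum)
print(mass_per_class)
print(sum(w))
```

In this picture, the method searches for the proportions `h` whose reweighted source sample is closest to the target sample in the regularized optimal transport sense.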

Analysis of HIPC data

As an illustrative example, we analyze the flow-cytometry data from the T-cell panel of the Human Immunology Project Consortium (HIPC), publicly available on ImmuneSpace (Gottardo et al., 2014).

An HIPC data set has the following structure (split into 2 files):

Above, xx denotes the center where the data analysis was performed, and y denotes the patient and the replicate of the biological sample in question.

Data load

library(CytOpT)
data("HIPC_Stanford")

Here are the first few lines of the flow-cytometry measurements from patient 1228 replicate 1A:

knitr::kable(head(HIPC_Stanford_1228_1A))

The manual clustering of these data into 10 cell populations (`r paste0(levels(HIPC_Stanford_1228_1A_labels), collapse=", ")`) can be accessed from the `HIPC_Stanford_1228_1A_labels` object.

We will use the manual gating from patient 1228 replicate 1A as our source proportions to infer proportions for patient 1369 replicate 1A.

Computation of the benchmark class proportions for target data

Because we know the true proportions in the target data set HIPC_Stanford_1369_1A in this example, we can assess the gap between the estimate from CytOpT and the cell proportions from the reference manual gating. For this purpose, we compute those manual proportions with:

gold_standard_manual_prop <- c(table(HIPC_Stanford_1369_1A_labels)/length(HIPC_Stanford_1369_1A_labels))
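The expression above divides the per-class counts by the total number of cells. As a small self-contained illustration, the sketch below applies the same computation to toy labels and checks it against the base R idiom `prop.table()`. The label values (`pop1`, `pop2`, `pop3`) are hypothetical and only stand in for the HIPC populations.

```r
# Toy manual-gating labels (hypothetical population names, not HIPC data)
toy_labels <- factor(c("pop2", "pop2", "pop3", "pop2", "pop3", "pop1"))

# Same computation as in the vignette: counts divided by the number of cells
prop_manual <- c(table(toy_labels) / length(toy_labels))

# Equivalent base R idiom
prop_idiom <- c(prop.table(table(toy_labels)))

# Both give the same named vector, and the proportions sum to 1
stopifnot(isTRUE(all.equal(prop_manual, prop_idiom)))
stopifnot(isTRUE(all.equal(sum(prop_manual), 1)))
print(prop_manual)
```

The wrapping `c()` turns the `table` object into a plain named numeric vector.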

CytOpT

Optimization

set.seed(123)
res <- CytOpT(X_s = HIPC_Stanford_1228_1A, X_t = HIPC_Stanford_1369_1A, 
              Lab_source = HIPC_Stanford_1228_1A_labels,
              theta_true = gold_standard_manual_prop,
              method='both', monitoring = TRUE)

Results

The results from CytOpT for both optimization algorithms are:

summary(res)

Some visualizations are provided by the plot() method:

plot(res)

Performance evaluation

Concordance between the manual gating gold standard and the CytOpT estimates can be assessed graphically with Bland-Altman plots:

Bland_Altman(res$proportions)
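For readers unfamiliar with Bland-Altman plots, the quantities they display can be sketched in base R: for each population, the average of the two measurements goes on the x-axis and their difference on the y-axis. The proportions below are made-up toy values, and the sign convention (estimate minus gold standard) is an assumption; the package's `Bland_Altman()` function may differ in convention or scaling.

```r
# Toy proportions for 3 hypothetical populations (not CytOpT output)
gold     <- c(pop1 = 0.50, pop2 = 0.30, pop3 = 0.20)
estimate <- c(pop1 = 0.46, pop2 = 0.33, pop3 = 0.21)

# Bland-Altman coordinates: per-population average and difference
ba <- data.frame(
  population = names(gold),
  average    = (gold + estimate) / 2,
  difference = estimate - gold
)

# Mean difference (bias) and approximate 95% limits of agreement
bias   <- mean(ba$difference)
limits <- bias + c(-1.96, 1.96) * sd(ba$difference)

print(ba)
print(bias)
print(limits)
```

Because both proportion vectors sum to one, the mean difference is zero by construction here; the informative part is the spread of the differences across populations.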

The methods implemented in the CytOpT package are detailed in the following article:

Paul Freulon, Jérémie Bigot, Boris P. Hejblum. CytOpT: Optimal Transport with Domain Adaptation for Interpreting Flow Cytometry data https://arxiv.org/abs/2006.09003




CytOpT documentation built on Feb. 10, 2022, 1:07 a.m.