merge_flow_panels: Merge partially overlapping flow cytometry panels from...

View source: R/cytoMerge.R

merge_flow_panels    R Documentation

Description

Merge partially overlapping flow cytometry panels from multiple files that profile the same biological sample. This function first logicle-transforms the data, then applies quantile normalization across the samples. It then uses XGBoost to predict all markers across all files, including the overlapping markers, which can serve as a quality control for the predictions.
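To illustrate the quantile normalization step, here is a minimal base R sketch of the classical form of the technique, which forces several equally sized samples onto a common reference distribution. It is an assumption-laden illustration, not the package's implementation: the helper name quantile_normalize and the simulated data are hypothetical, and files with different event counts would need an interpolated variant.

```r
# Classical quantile normalization of the columns of a numeric matrix.
# Illustrative helper only; not a function exported by the package.
quantile_normalize <- function(m) {
  # Rank of every value within its own column (ties broken by order).
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns, row by row.
  reference <- rowMeans(apply(m, 2, sort))
  # Replace each value by the reference value that has the same rank.
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(m)
  normalized
}

set.seed(1)
# Simulated intensities of one shared marker measured in two files.
m <- cbind(panel_A = rnorm(100, mean = 0, sd = 1),
           panel_B = rnorm(100, mean = 1, sd = 2))
normalized <- quantile_normalize(m)

# After normalization both columns share the same set of values.
all.equal(sort(normalized[, "panel_A"]), sort(normalized[, "panel_B"]))
```

Within each column the rank order of the events is preserved; only the values they map to are changed.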

Usage

merge_flow_panels(
  input_files,
  excluded_markers = character(),
  rds_dir = paste0(tempdir(), "tmp"),
  output_dir = paste0(tempdir(), "out"),
  writeTmpFilesToDisk = FALSE,
  verbose = TRUE,
  xgboost_params = list(nrounds = 100, eta = 0.05, verbose = 0L),
  umap_params = list(n_neighbors = 50L, n_epochs = 1000L, verbose = FALSE),
  chop_quantiles = 0.005,
  do_UMAP_and_plot = TRUE,
  train_set_fraction = 0.8
)
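A minimal call might look like the sketch below. The file names and the excluded marker names are hypothetical placeholders, and the example assumes that the cytoMerge package is installed and exports merge_flow_panels. All argument names are taken from the usage above.

```r
# Hypothetical input files: two panels acquired on the same biological sample.
input_files <- c("sample1_panelA.fcs", "sample1_panelB.fcs")

# The output directory must be empty at the start of execution,
# so point to a fresh subdirectory of the session's temporary directory.
out_dir <- file.path(tempdir(), "merged_out")

# Only run when the package and the placeholder files are actually available.
if (requireNamespace("cytoMerge", quietly = TRUE) &&
    all(file.exists(input_files))) {
  cytoMerge::merge_flow_panels(
    input_files        = input_files,
    excluded_markers   = c("Time", "Viability"),  # hypothetical marker names
    output_dir         = out_dir,
    do_UMAP_and_plot   = FALSE,                   # skip the UMAP and the PDF plots
    train_set_fraction = 0.8
  )
}
```

Arguments that are not listed keep the defaults shown in the usage.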

Arguments

input_files

Character vector of input FCS files. The files should profile the same biological sample, with distinct and partially overlapping antibody panels.

excluded_markers

Markers to ignore, identified by the "name" parameter description in the FCS files (or by "desc" if "name" is missing). For instance: Viability, Time, DNA content ...

rds_dir

Directory to store temporary (.Rds) files such as the regression models or normalized and harmonized data matrices. Defaults to a temporary subdirectory.

output_dir

Directory to save the output. Must be empty at the start of execution. Defaults to a temporary subdirectory.

writeTmpFilesToDisk

Boolean. Whether to save .Rds intermediary files to disk for further analyses. Defaults to FALSE.

verbose

Boolean. Whether to print progress information. Defaults to TRUE.

xgboost_params

List of named arguments passed to xgboost::xgboost().

umap_params

List of named arguments passed to uwot::umap().

chop_quantiles

Numeric (should be close to 0). If do_UMAP_and_plot is TRUE, values shown on the PDF plots are clipped to the chop_quantiles and 1 - chop_quantiles quantiles. Defaults to 0.005, so values below the 0.005 quantile or above the 0.995 quantile are clipped. This only affects the color mapping of the plots, not the predictions or the UMAP embeddings.
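The effect of this kind of clipping can be sketched in base R as follows. The helper clip_to_quantiles and the simulated values are hypothetical and only illustrate the general technique; the package may implement it differently.

```r
# Clip a numeric vector to its q and 1 - q quantiles.
# Illustrative helper only; not a function exported by the package.
clip_to_quantiles <- function(x, q = 0.005) {
  limits <- quantile(x, probs = c(q, 1 - q), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

set.seed(1)
# Simulated marker intensities with one extreme outlier.
marker_values <- c(rnorm(1000), 50)
clipped <- clip_to_quantiles(marker_values)

range(marker_values)  # includes the outlier at 50
range(clipped)        # bounded by the 0.005 and 0.995 quantiles
```

Without clipping, a single outlier would stretch the color scale and flatten the contrast among the remaining events.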

do_UMAP_and_plot

Boolean. Whether to compute a UMAP dimensionality reduction of the backbone and prediction-enriched expression matrices, and to overlay the expression of measured and predicted markers on the UMAP embeddings. Setting this to TRUE produces PDF files in the output directory (in addition to the FCS files) and adds the UMAP embeddings to the exported FCS files. Defaults to TRUE.

train_set_fraction

Fraction of the data to use as the training set; the remainder forms the test set. Performance is computed on both the training and test sets. Defaults to 0.8.
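As a rough illustration of what such a fraction means, the base R sketch below randomly splits event indices into a training set and a test set. The event count is a hypothetical value, and the way the package actually draws its split is not documented here, so this is an assumption.

```r
set.seed(42)
n_events <- 1000            # hypothetical number of events in one file
train_set_fraction <- 0.8

# Randomly assign each event to either the training set or the test set.
train_idx <- sample(n_events, size = round(train_set_fraction * n_events))
test_idx  <- setdiff(seq_len(n_events), train_idx)

length(train_idx)  # 800
length(test_idx)   # 200
```

Comparing performance on the two sets gives an indication of overfitting: a model that scores much better on the training events than on the held-out events generalizes poorly.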


ebecht/cytoMerge documentation built on Sept. 9, 2024, 5:38 p.m.