infinity_flow: Wrapper to the Infinity Flow pipeline

View source: R/00_master.R

infinity_flow    R Documentation

Description

Wrapper to the Infinity Flow pipeline

Usage

infinity_flow(
  path_to_fcs,
  path_to_output,
  path_to_intermediary_results = tempdir(),
  backbone_selection_file = NULL,
  annotation = NULL,
  isotype = NULL,
  input_events_downsampling = Inf,
  prediction_events_downsampling = 1000,
  cores = 1L,
  your_random_seed = 123,
  verbose = TRUE,
  extra_args_read_FCS = list(emptyValue = FALSE, truncate_max_range = FALSE,
    ignore.text.offset = TRUE),
  regression_functions = list(XGBoost = fitter_xgboost),
  extra_args_regression_params = list(list(nrounds = 500, eta = 0.05)),
  extra_args_UMAP = list(n_neighbors = 15L, min_dist = 0.2, metric = "euclidean",
    verbose = verbose, n_epochs = 1000L, n_threads = cores, n_sgd_threads = cores),
  extra_args_export = list(FCS_export = c("split", "concatenated", "none")[1],
    CSV_export = FALSE),
  extra_args_correct_background = list(FCS_export = c("split", "concatenated",
    "none")[1], CSV_export = FALSE),
  extra_args_plotting = list(chop_quantiles = 0.005),
  neural_networks_seed = NULL
)

Arguments

path_to_fcs

Path to the input directory where input FCS files are stored (one file per well). Will look for FCS files recursively in that directory.

path_to_output

Path to the output directory where final results will be stored.

path_to_intermediary_results

Path to the directory where intermediary results will be stored. If left unset, defaults to a temporary directory (tempdir()). Keeping the intermediary results can be useful to explore the data further, tweak the pipeline or resume computations.

backbone_selection_file

If this argument is missing and R is run interactively, the user will be prompted to state whether each channel in the FCS file should be considered a backbone measurement, an exploratory measurement or ignored. Otherwise, the user should run select_backbone_and_exploratory_markers in an interactive R session, save its output using write.csv(row.names=FALSE) and set this backbone_selection_file parameter to the path of the saved output.
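The non-interactive workflow described above could be prepared along the following lines. This is a sketch only: it assumes that select_backbone_and_exploratory_markers accepts a character vector of FCS file paths (not stated on this page), and the directory, file names and the columns of the stand-in table are hypothetical placeholders.

```r
# Hypothetical input directory; replace with your own.
path_to_fcs <- file.path(tempdir(), "fcs_input")
fcs_files <- list.files(path_to_fcs, pattern = "\\.fcs$", full.names = TRUE,
                        recursive = TRUE, ignore.case = TRUE)

if (interactive() && length(fcs_files) > 0 &&
    requireNamespace("infinityFlow", quietly = TRUE)) {
  # Assumption: the function takes the vector of FCS file paths and prompts
  # the user for each channel.
  backbone_specification <-
    infinityFlow::select_backbone_and_exploratory_markers(fcs_files)
} else {
  # Stand-in table so the saving step can be shown without the package;
  # column names and values are placeholders, not the real output format.
  backbone_specification <- data.frame(
    name = c("FSC-A", "BV421-A", "PE-A"),
    type = c("backbone", "backbone", "exploratory"),
    stringsAsFactors = FALSE
  )
}

# Save without row names, as the argument description requires.
backbone_selection_file <- tempfile(fileext = ".csv")
write.csv(backbone_specification, file = backbone_selection_file,
          row.names = FALSE)

# Reading the file back gives the same columns, with no extra row-name column.
reloaded <- read.csv(backbone_selection_file, stringsAsFactors = FALSE)
```

The resulting path in backbone_selection_file is what would then be passed to infinity_flow.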

annotation

Named character vector. Elements should be the targets of the exploratory antibodies; names should be the names of the FCS files in which each exploratory antibody was measured.

isotype

Named character vector. Elements should be the isotype used in each well (e.g. IgG2). The corresponding isotype should be present in annotation (e.g. Isotype_IgG2, with this capitalization exactly). Autofluorescence measurements should be listed here as "Blank".
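As an illustration of how annotation and isotype fit together, the sketch below builds both vectors for four hypothetical wells and checks that every isotype has a matching Isotype_ entry in annotation. The file names and targets are invented, and two points are assumptions rather than statements from this page: that isotype is named by the same FCS file names as annotation, and that a blank well is annotated as "Blank".

```r
# Hypothetical well file names, for illustration only.
fcs_files <- c("Plate1_A01.fcs", "Plate1_A02.fcs",
               "Plate1_A03.fcs", "Plate1_A04.fcs")

# Elements: target of the exploratory antibody; names: FCS file of that well.
# Assumption: the autofluorescence well is annotated as "Blank".
annotation <- c("CD3", "CD19", "Isotype_IgG2", "Blank")
names(annotation) <- fcs_files

# Elements: isotype used in each well; "Blank" for autofluorescence.
# Assumption: same names as `annotation`.
isotype <- c("IgG2", "IgG2", "IgG2", "Blank")
names(isotype) <- fcs_files

# Each isotype (other than "Blank") should appear in `annotation`
# as "Isotype_<isotype>", with exactly this capitalization.
required_controls <- paste0("Isotype_", setdiff(unique(isotype), "Blank"))
missing_controls <- setdiff(required_controls, annotation)

if (length(missing_controls) > 0) {
  warning("No isotype control well for: ",
          paste(missing_controls, collapse = ", "))
}
```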

input_events_downsampling

How many events should be kept per input FCS file. Defaults to no downsampling. In any case, half of the events will be used to train regression models and half to test the performance. Predictions will be made only on events from the test set, downsampled according to prediction_events_downsampling.

prediction_events_downsampling

How many events should be kept per input FCS file to output predictions for. Defaults to 1000.
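The interaction between the two downsampling arguments can be worked through with a small calculation. The helper below is hypothetical and only mirrors the description above; it is not the package's internal code, and the rounding used for odd event counts is an assumption.

```r
# Event counts per input file implied by the two downsampling arguments.
events_per_file <- function(n_events_in_file,
                            input_events_downsampling = Inf,
                            prediction_events_downsampling = 1000) {
  # Events kept after input downsampling (Inf means keep everything).
  kept <- min(n_events_in_file, input_events_downsampling)
  # Half for training, the rest for testing (rounding is an assumption).
  n_train <- floor(kept / 2)
  n_test <- kept - n_train
  # Predictions are made on test events only, downsampled if needed.
  n_predicted <- min(n_test, prediction_events_downsampling)
  c(kept = kept, train = n_train, test = n_test, predicted = n_predicted)
}

# A 50,000-event file with the defaults: 25,000 test events, 1,000 predicted.
defaults_case <- events_per_file(50000)

# Downsampling the input to 1,500 events leaves only 750 test events,
# so fewer than 1,000 predictions can be output.
small_case <- events_per_file(50000, input_events_downsampling = 1500)
```

Under these assumptions, setting input_events_downsampling below twice prediction_events_downsampling caps the number of predicted events per file.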

cores

Number of cores to use for parallel computing. Defaults to 1 (no parallel computing).

your_random_seed

Deprecated: was used to set a seed for computationally reproducible results, but this is not allowed by Bioconductor. Please set a random seed yourself using set.seed(somenumber) if you want computationally reproducible results.
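A minimal sketch of the recommended pattern: call set.seed() yourself before running the pipeline. The example uses sample() as a stand-in for the random sampling steps, and the seed value is arbitrary.

```r
# Set the seed before the stochastic step; 123 is an arbitrary choice.
set.seed(123)
first_draw <- sample(10)

# Resetting the same seed reproduces the same draw.
set.seed(123)
second_draw <- sample(10)

identical(first_draw, second_draw)

# In practice, set.seed(123) would be called right before infinity_flow(...).
```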

verbose

Whether to print information about progress.

extra_args_read_FCS

list of named arguments to pass to flowCore::read.FCS. Defaults to list(emptyValue=FALSE,truncate_max_range=FALSE,ignore.text.offset=TRUE), which in our experience avoided issues with data loading.

regression_functions

named list of fitter_* functions (see ls("package:infinityFlow") for the complete list). The names should be desired names for the different models. Each object of the list will correspond to a machine learning model to train. Defaults to list(XGBoost = fitter_xgboost).

extra_args_regression_params

list of lists of the same length as the regression_functions argument. Each element should be a named list that will be passed as named arguments to the corresponding fitter_* function. Defaults to list(list(nrounds = 500, eta = 0.05)).
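The pairing between regression_functions and extra_args_regression_params can be checked before launching a long run. The sketch below uses stand-in functions so that it runs without the package; in practice the list elements would be fitter_* functions such as fitter_xgboost. The second model's parameter name is a hypothetical placeholder.

```r
# Stand-in fitters; replace with fitter_* functions from the package.
fitter_stub_a <- function(...) NULL
fitter_stub_b <- function(...) NULL

# Names are the labels under which each model will be reported.
regression_functions <- list(ModelA = fitter_stub_a, ModelB = fitter_stub_b)

# One named list of parameters per model, in the same order.
extra_args_regression_params <- list(
  list(nrounds = 500, eta = 0.05),  # parameters for the first model
  list(some_param = 1)              # hypothetical parameter for the second
)

# Sanity checks on the structure described above.
same_length <-
  length(regression_functions) == length(extra_args_regression_params)
all_named <- !is.null(names(regression_functions)) &&
  all(nzchar(names(regression_functions)))
params_are_lists <- all(vapply(extra_args_regression_params, is.list,
                               logical(1)))
stopifnot(same_length, all_named, params_are_lists)
```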

extra_args_UMAP

list of named arguments to pass to uwot::umap. Defaults to list(n_neighbors=15L,min_dist=0.2,metric="euclidean",verbose=verbose,n_epochs=1000L,n_threads=cores,n_sgd_threads=cores)
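Because extra_args_UMAP is an ordinary default argument, supplying a shorter list replaces the whole default rather than a single entry. One way to change a couple of values while keeping the others is modifyList, as sketched below. This assumes infinity_flow does not merge user-supplied values with the defaults internally, which this page does not state; the new n_neighbors and min_dist values are arbitrary.

```r
# Values the defaults refer to in the Usage section.
cores <- 1L
verbose <- TRUE

# Defaults copied from the Usage section.
default_args_UMAP <- list(
  n_neighbors = 15L, min_dist = 0.2, metric = "euclidean",
  verbose = verbose, n_epochs = 1000L,
  n_threads = cores, n_sgd_threads = cores
)

# Override two entries and keep the remaining defaults unchanged.
extra_args_UMAP <- modifyList(default_args_UMAP,
                              list(n_neighbors = 30L, min_dist = 0.1))
```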

extra_args_export

Whether raw imputed data should be exported. Possible values are list(FCS_export = "split") to export one FCS file per input well, list(FCS_export = "concatenated") to export a single concatenated FCS file containing the whole dataset, and list(FCS_export = "csv") to export a single CSV file containing the whole dataset. You can export multiple modalities by using, for instance, extra_args_export = list(FCS_export = c("split", "concatenated", "csv")). Note that the default shown under Usage selects "split" from c("split", "concatenated", "none") and sets a separate CSV_export flag to FALSE.

extra_args_correct_background

Whether background-corrected imputed data should be exported. Possible values are list(FCS_export = "split") to export one FCS file per input well, list(FCS_export = "concatenated") to export a single concatenated FCS file containing the whole dataset, and list(FCS_export = "csv") to export a single CSV file containing the whole dataset. You can export multiple modalities by using, for instance, extra_args_correct_background = list(FCS_export = c("split", "concatenated", "csv")). Note that the default shown under Usage selects "split" from c("split", "concatenated", "none") and sets a separate CSV_export flag to FALSE.

extra_args_plotting

list of named arguments to pass to plot_results. Defaults to list(chop_quantiles=0.005), which removes the top 0.5% and bottom 0.5% of the scale for each marker when mapping color palettes to intensities.
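To make the effect of chop_quantiles concrete, the sketch below computes the 0.005 and 0.995 quantiles of a simulated marker and limits the values to that range. It illustrates the idea only and is not the code used by plot_results; in particular, clamping values to the limits (rather than dropping them) is an assumption, and the data are simulated.

```r
chop_quantiles <- 0.005

# Simulated marker intensities with one extreme outlier.
set.seed(1)
intensities <- c(rnorm(1000), 50)

# Lower and upper limits: the 0.5% and 99.5% quantiles.
limits <- unname(quantile(intensities,
                          probs = c(chop_quantiles, 1 - chop_quantiles)))

# Clamp values outside the limits so the outlier no longer stretches
# the color scale.
chopped <- pmin(pmax(intensities, limits[1]), limits[2])

range(intensities)
range(chopped)
```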

neural_networks_seed

Seed for computationally reproducible results when using neural networks (in addition to the other sources of stochasticity, such as sampling, which the deprecated your_random_seed argument used to control and which are now made reproducible by calling set.seed() yourself).

Value

Raw and background-corrected imputed expression data for every Infinity antibody
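Putting the arguments together, a call might be assembled as in the sketch below. All paths, file names and annotation values are hypothetical, and the call itself is disabled by a flag so that the sketch can be read or run without the package or any data; it is an outline under these assumptions, not a tested recipe.

```r
# Hypothetical locations; replace with your own.
path_to_fcs <- file.path(tempdir(), "fcs_input")
path_to_output <- file.path(tempdir(), "infinity_output")
backbone_selection_file <- file.path(tempdir(), "backbone_selection_file.csv")

# Hypothetical annotation for two wells and one isotype control well.
fcs_files <- c("Plate1_A01.fcs", "Plate1_A02.fcs", "Plate1_A03.fcs")
annotation <- c("CD3", "CD19", "Isotype_IgG2")
names(annotation) <- fcs_files
isotype <- c("IgG2", "IgG2", "IgG2")
names(isotype) <- fcs_files

# Arguments not listed here keep the defaults shown under Usage.
pipeline_args <- list(
  path_to_fcs = path_to_fcs,
  path_to_output = path_to_output,
  backbone_selection_file = backbone_selection_file,
  annotation = annotation,
  isotype = isotype,
  input_events_downsampling = 1000,
  prediction_events_downsampling = 500,
  cores = 1L
)

# Set to TRUE once the paths above point to real data.
run_pipeline <- FALSE
if (run_pipeline && requireNamespace("infinityFlow", quietly = TRUE)) {
  set.seed(123)  # seed set by the user, as recommended above
  imputed_data <- do.call(infinityFlow::infinity_flow, pipeline_args)
}
```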


ebecht/infinityFlow documentation built on Jan. 31, 2024, 11:31 p.m.