prepare_data: Prepare a directory of .fcs files

prepare_data    R Documentation

Prepare a directory of .fcs files

Description

This is a wrapper function that takes you from a directory of .fcs files or a flowset to a transformed tibble.

Usage

prepare_data(
  data_dir = NULL,
  flowset = NULL,
  markers = NULL,
  pattern = "\\.fcs",
  metadata = NULL,
  filename_col = "filename",
  sample_ids = NULL,
  batch_ids = NULL,
  condition = NULL,
  anchor = NULL,
  down_sample = TRUE,
  sample_size = 5e+05,
  sampling_type = "random",
  seed = 473,
  panel = NULL,
  panel_channel = "fcs_colname",
  panel_antigen = "antigen",
  transform = TRUE,
  cofactor = 5,
  derand = TRUE,
  .keep = FALSE,
  clean_colnames = TRUE
)

Arguments

data_dir

Directory containing the .fcs files

flowset

Optional: A flowset to prepare instead of a directory of .fcs files

markers

The markers to transform

pattern

The pattern to use to find the files in the folder

metadata

Optional: Either the filename of a metadata file or a data.frame containing the metadata. If a filename is given, provide the full path from the working directory to the metadata file.

filename_col

Optional: The column in the metadata containing the fcs filenames. Needed if metadata is given, but sample_ids is not

sample_ids

Optional: If a character, it should be the sample column in the metadata. If it's a vector, it should have the same length as the total flowset. If NULL, sample ids will be the file names. If a single value, all rows will be assigned this value.

batch_ids

Optional: If a character, it should be the column in the metadata containing the batch ids. If it's a vector, it should have the same length as the total flowset. If a single value, all rows will be assigned this value.

condition

Optional: The column in the metadata containing the condition. Will be used as the covariate in ComBat, but can be specified later. You may use this to add a different column of choice, in case you want to use a custom column in the ComBat model matrix.

anchor

Experimental: The column in the metadata referencing the anchor samples (control references). Will be used as a covariate in ComBat, if specified. Please be aware that this column may be confounded with the condition column. You may use this to add a different column of choice, in case you want to use a custom column in the ComBat model matrix. You may use a custom column name, but it is good practice to add the name to the 'non_markers' object exported by cyCombine, to reduce the risk of unexpected errors.

down_sample

If TRUE, the output will be down-sampled to size sample_size

sample_size

The size to down-sample to. If a non-random sampling type is used and a group contains fewer cells than the sample_size, all cells of that group will be used.

sampling_type

The type of down-sampling to use: "random" to randomly select cells across the entire dataset, "batch_ids" to sample evenly (sample_size) from each batch, or "sample_ids" to sample evenly (sample_size) from each sample.

seed

The seed to use for down-sampling

panel

Optional: Panel as a filename or data.frame. Used to define the column names from the panel_antigen column

panel_channel

Optional: Only used if panel is given. It is the column name in the panel data.frame that contains the channel names

panel_antigen

Optional: Only used if panel is given. It is the column name in the panel data.frame that contains the antigen names

transform

If TRUE, the data will be transformed; if FALSE, it will not.

cofactor

The cofactor to use when transforming

derand

Derandomize. Should be TRUE for CyTOF data, otherwise FALSE.

.keep

Keep all channels. If FALSE, channels that are not transformed are removed

clean_colnames

(Default: TRUE). A logical defining whether column names should be cleaned or not. Cleaning involves removing isotope tags, spaces, dashes, underscores, and all bracket types.
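The transform, cofactor, and derand arguments together describe an arcsinh transformation. The following is a minimal sketch in base R, not the package's implementation: `asinh_sketch` is a hypothetical helper, and it assumes that derandomization means rounding each value up to the next integer before dividing by the cofactor.

```r
# Hypothetical helper for illustration only; not part of cyCombine.
# Assumption: "derandomize" is modelled here as ceiling() applied
# before the arcsinh transform.
asinh_sketch <- function(x, cofactor = 5, derand = TRUE) {
  if (derand) {
    x <- ceiling(x)
  }
  asinh(x / cofactor)
}

raw <- c(0, 4.2, 49.7, 1000)

# With derandomization: values are rounded up, then scaled and transformed
with_derand <- asinh_sketch(raw, cofactor = 5, derand = TRUE)

# Without derandomization: values are scaled and transformed as-is
without_derand <- asinh_sketch(raw, cofactor = 5, derand = FALSE)
```

A smaller cofactor stretches the low-intensity range more strongly, which is why the default of 5 is paired here with derand = TRUE, the setting the documentation recommends for CyTOF data.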

See Also

Other dataprep: compile_fcs(), convert_flowset(), transform_asinh()

Examples

## Not run: 
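# Assumes data_dir and markers are already defined,
# and that %>% is available (e.g. via library(magrittr))
uncorrected <- data_dir %>%
  prepare_data(metadata = "metadata.csv",
               markers = markers,
               filename_col = "FCS_name",
               batch_ids = "Batch",
               condition = "condition",
               down_sample = FALSE)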
## End(Not run)
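To illustrate the difference between the sampling_type options, here is a toy sketch in base R that does not call cyCombine. The data frame `cells`, its columns, and the batch sizes are made up for illustration. The grouped branch follows the documented rule that a group with fewer cells than sample_size is kept whole. Capping the random draw at the number of available rows is an assumption of this sketch.

```r
set.seed(473)

# Toy data: 60 cells spread unevenly over three batches (hypothetical)
cells <- data.frame(
  id = seq_len(60),
  batch = rep(c("A", "B", "C"), times = c(10, 20, 30))
)
sample_size <- 15

# "random"-style sampling: draw sample_size cells from the whole dataset
# (capped at the number of rows; the cap is an assumption of this sketch)
n_random <- min(sample_size, nrow(cells))
random_rows <- cells[sample(nrow(cells), n_random), ]

# "batch_ids"-style sampling: draw up to sample_size cells per batch,
# keeping every cell of a batch that has fewer than sample_size cells
per_batch <- do.call(rbind, lapply(split(cells, cells$batch), function(g) {
  n_group <- min(sample_size, nrow(g))
  g[sample(nrow(g), n_group), , drop = FALSE]
}))

# Cells retained per batch after grouped sampling
batch_counts <- table(per_batch$batch)
```

In this toy case, batch A has only 10 cells and is kept in full, while batches B and C are each reduced to 15 cells. Random sampling returns 15 cells in total, with no guarantee about how they are spread over batches.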

biosurf/cyCombine documentation built on May 23, 2024, 4:07 a.m.