PeacoQC: Peak-based detection of high quality cytometry data

View source: R/PeacoQC.R

PeacoQCR Documentation

Peak-based detection of high quality cytometry data

Description

PeacoQC will determine peaks on the channels in the flowframe. Then it will remove anomalies caused by e.g. clogs, changes in speed etc. by using an IsolationTree and/or the MAD method.

Usage

PeacoQC(ff, channels, determine_good_cells="all",
        plot=20, save_fcs=TRUE, output_directory=".",
        name_directory="PeacoQC_results", report=TRUE,
        events_per_bin=FindEventsPerBin(remove_zeros, ff, channels,
        min_cells, max_bins, step), min_cells=150, max_bins=500, step=500,
        MAD=6, IT_limit=0.6, consecutive_bins=5, remove_zeros=FALSE,
        suffix_fcs="_QC", force_IT=150, peak_removal = (1/3),
        min_nr_bins_peakdetection = 10, time_channel_parameter = "Time",
         ...)

Arguments

ff

A flowframe or the location of an fcs file. Make sure that the flowframe is compensated and transformed. If it is mass cytometry data, only a transformation is necessary.

channels

Indices or names of the channels in the flowframe on which peaks have to be determined.

determine_good_cells

If set to FALSE, the algorithm will only determine peaks. If it is set to "all", the bad measurements will be filtered out based on the MAD and IT analysis. It can also be put to "MAD" or "IT" to only use one method of filtering.

plot

When PeacoQC removes more than the specified percentage, an overview plot will be made of all the selected channels and the deleted measurements. If set to TRUE, the PlotPeacoQC function is run to make an overview plot of the deleted measurements, even when nothing is removed. Default is set to 20. If an increasing or decreasing trend is found, a figure will also be made except if plot is set to FALSE.

save_fcs

If set to TRUE, the cleaned fcs file will be saved in the output_directory as: filename_QC.fcs. The _QC name can be altered with the suffix_fcs parameter. An extra column named "Original_ID" is added to this fcs file where the cells are given their original cell id. Default is TRUE.

output_directory

Directory where a new folder will be created that consists of the generated fcs files, plots and report. If set to NULL, nothing will be stored.The default folder is the working directory.

name_directory

Name of folder that will be generated in output_directory. The default is "PeacoQC_results".

report

Overview text report that is generated after PeacoQC is run. If set to FALSE, no report will be generated. The default is TRUE.

events_per_bin

Number of events that are put in one bin. Default is calculated based on the rows in ff

min_cells

The minimum amount of cells (nonzero values) that should be present in one bin. Lowering this parameter can affect the robustness of the peak detection. Default is 150.

max_bins

The maximum number of bins that can be used in the cleaning process. If this value is lowered, larger bins will be made. Default is 500.

step

The step in events_per_bin to which the parameter is reduced to. Default is 500.

MAD

The MAD parameter. Default is 6. If this is increased, the algorithm becomes less strict.

IT_limit

The IsolationTree parameter. Default is 0.55. If this is increased, the algorithm becomes less strict.

consecutive_bins

If 'good' bins are located between bins that are removed, they will also be marked as 'bad'. The default is 5.

remove_zeros

If this is set to TRUE, the zero values will be removed before the peak detection step. They will not be indicated as 'bad' value. This is recommended when cleaning mass cytometry data. Default is FALSE.

suffix_fcs

The suffix given to the new fcs files. Default is "_QC".

force_IT

If the number of determined bins is less than this number, the IT analysis will not be performed. Default is 150 bins.

peak_removal

During the peak detection step, peaks are only kept if they are peak_removal percentage of the maximum height peak. Default is 1/3

min_nr_bins_peakdetection

The percentage of number of bins in which the maximum number of peaks has to be present. Default is 10.

time_channel_parameter

Name of the time channel in ff if present. Default is "Time".

...

Options to pass on to the PlotPeacoQC function (display_cells, manual_cells, prefix)

Value

This function returns a list with a number of items. It will include "FinalFF" where the transformed, compensated and cleaned flowframe is stored. It also contains the starting parameters and the information necessary to give to PlotPeacoQC if the two functions are run seperatly. The GoodCells list is also given where 'good' measurements are indicated as TRUE and the to be removed measurements as FALSE.

Examples

# General pipeline for preprocessing and quality control with PeacoQC

# Read in raw fcs file
fileName <- system.file("extdata", "111.fcs", package="PeacoQC")
ff <- flowCore::read.FCS(fileName)

# Define channels where the margin events should be removed
# and on which the quality control should be done
channels <- c(1, 3, 5:14, 18, 21)

ff <- RemoveMargins(ff=ff, channels=channels, output="frame")

# Compensate and transform the data

ff <- flowCore::compensate(ff, flowCore::keyword(ff)$SPILL)
ff <- flowCore::transform(ff,
                            flowCore::estimateLogicle(ff,
                            colnames(flowCore::keyword(ff)$SPILL)))
#Run PeacoQC
PeacoQC_res <- PeacoQC(ff, channels,
                        determine_good_cells="all",
                        save_fcs=TRUE)


saeyslab/PeacoQC documentation built on Aug. 24, 2023, 8:11 a.m.