Automatic quality control of flow cytometry data.

Description

For a set of FCS files, flow_auto_qc performs a complete and automatic quality control. It detects and removes anomalies by checking three properties of flow cytometry data: 1) flow rate, 2) signal acquisition, 3) dynamic range.

Usage

flow_auto_qc(fcsfiles, remove_from = "all", timeCh = NULL,
  second_fractionFR = 0.1, alphaFR = 0.01, decompFR = TRUE,
  ChRemoveFS = c("FSC", "SSC"), outlierFS = FALSE, pen_valueFS = 200,
  max_cptFS = 3, ChFM = NULL, sideFM = "both", neg_valuesFM = 1,
  html_report = "_QC", mini_report = "QCmini", fcs_QC = "_QC",
  fcs_highQ = FALSE, fcs_lowQ = FALSE, folder_results = "resultsQC")

Arguments

fcsfiles

A character vector with the filenames of the FCS files, a flowSet, or a flowFrame.

remove_from

Selects the steps whose anomalies are excluded from the high quality FCS file. The default option "all" removes the anomalies detected in all three steps. Alternatively, use one of "FR_FS", "FR_FM", "FS_FM", "FR", "FS", "FM" to remove only the anomalies detected in a subset of the steps, where FR stands for flow rate, FS for signal acquisition and FM for dynamic range.
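To illustrate how the option names map onto the three steps, here is a minimal base R sketch. It is not flowAI code: the flags matrix and the helper events_to_remove are hypothetical, and the sketch assumes that an event is dropped whenever any of the selected steps flagged it.

```r
# Hypothetical per-event flags: TRUE means the step marked the event as anomalous.
flags <- cbind(
  FR = c(FALSE, TRUE,  FALSE, FALSE, FALSE),
  FS = c(FALSE, FALSE, TRUE,  FALSE, FALSE),
  FM = c(FALSE, FALSE, FALSE, TRUE,  FALSE)
)

# Illustrative helper (not a flowAI function): which events a given option drops.
events_to_remove <- function(flags, remove_from = "all") {
  steps <- if (remove_from == "all") {
    c("FR", "FS", "FM")
  } else {
    strsplit(remove_from, "_", fixed = TRUE)[[1]]
  }
  stopifnot(all(steps %in% colnames(flags)))
  # Assumption: an event is dropped when any selected step flagged it.
  unname(rowSums(flags[, steps, drop = FALSE]) > 0)
}

removed_all   <- events_to_remove(flags, "all")    # uses FR, FS and FM
removed_fr_fm <- events_to_remove(flags, "FR_FM")  # ignores the FS flags
```

In this toy input the third event is flagged only by the signal acquisition step, so it is dropped with "all" but kept with "FR_FM".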

timeCh

Character string corresponding to the name of the Time Channel in the set of FCS files. The default is NULL, in which case the name is retrieved automatically.

second_fractionFR

The fraction of a second used to split the time channel in order to recreate the flow rate. Set it to "timestep" to recreate the flow rate at the maximum resolution allowed by the flow cytometry instrument. The timestep usually corresponds to 0.01. To shorten the running time of the analysis, the default fraction is 0.1, corresponding to 1/10 of a second.
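The idea of recreating the flow rate by splitting the time channel can be sketched in base R. This is only an illustration of binning events by time, not the package's implementation; the time values and the variable names time_sec, second_fraction and flow_rate are made up for the example.

```r
# Hypothetical acquisition times in seconds for ten events (not real data).
time_sec <- c(0.01, 0.03, 0.04, 0.12, 0.15, 0.18, 0.19, 0.27, 0.31, 0.38)

# Bin width in seconds, mirroring the documented default of 0.1.
second_fraction <- 0.1

# Assign each event to a time bin, then count the events in each bin.
bin_id <- as.integer(floor(time_sec / second_fraction)) + 1L
flow_rate <- tabulate(bin_id)

# flow_rate[i] is the number of events acquired during the i-th bin.
flow_rate
```

A smaller bin width gives a finer-grained flow rate but more bins to analyze, which is consistent with the trade-off against running time described above.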

alphaFR

The level of statistical significance used to accept anomalies detected by the ESD method. The default value is 0.01.

decompFR

Logical indicating whether the flow rate should be decomposed into trend and cyclical components. The default is TRUE, in which case the ESD outlier detection is executed on the trend component penalized by the magnitude of the cyclical component. If FALSE, the ESD outlier detection is executed on the original flow rate.

ChRemoveFS

A character vector with the names, or portions of the names, of the channels to exclude from the signal acquisition check. The default option, c("FSC", "SSC"), excludes the scatter parameters. To include all the parameters in the analysis, use NULL.

outlierFS

Logical indicating whether outliers have to be removed before the changepoint detection of the signal acquisition check. The default is FALSE.

pen_valueFS

The value of the penalty for the changepoint detection algorithm. This can be a numeric value or text giving the formula to use; for instance, you can use the character string "1.5*log(n)", where n indicates the number of cells in the FCS file. The higher the penalty value, the less strict the detection of anomalies. The default is 200.
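To get a feel for the scale of a formula-based penalty, the sketch below evaluates the example formula as an ordinary R expression for a hypothetical file with 10,000 cells. It only shows the arithmetic; how the package or the underlying changepoint algorithm interprets the string internally is not covered here.

```r
# Hypothetical number of cells in an FCS file.
n <- 10000

# Evaluate the text formula as a plain R expression, with n in scope.
penalty_formula <- "1.5*log(n)"
penalty_from_formula <- eval(parse(text = penalty_formula))

# The documented default penalty, for comparison.
penalty_default <- 200

# TRUE here means the formula yields the smaller penalty of the two.
penalty_from_formula < penalty_default
```

Under these assumptions the formula gives a penalty far below 200, so, following the rule stated above, it would make the detection stricter than the default.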

max_cptFS

The maximum number of changepoints that can be detected for each channel. The default is 3.

ChFM

A character vector indicating which channels to include in the dynamic range check. The default option is NULL, which selects all the channels for the analysis.

sideFM

Select whether the dynamic range check has to be executed on both limits, the upper limit or the lower limit. Use one of the options: "both", "upper", "lower". The default is "both".

neg_valuesFM

Scalar indicating the method to use for the removal of the anomalies from the lower limit of the dynamic range. Use 1 to remove negative outliers or use 2 to truncate the negative values to the cut-off indicated in the FCS file. The default is 1.

html_report

Suffix to be added to the FCS filename to name the HTML report of the quality control. The default is "_QC". If you do not want to generate a report use FALSE.

mini_report

Suffix to be added for the filename of the TXT file containing the percentage of anomalies detected in each FCS file analyzed. The default is "QCmini". If you do not want to generate the mini report use FALSE.

fcs_QC

Suffix to be added for the filename of the new FCS file containing a new channel in which the low quality events have a random value between 10,000 and 20,000 (as for flowClean). The default is "_QC". If you do not want to generate this FCS file use FALSE.

fcs_highQ

Suffix to be added for the filename of the new FCS containing only the events that passed the quality control. The default is FALSE and hence the high quality FCS file is not generated.

fcs_lowQ

Suffix to be added for the filename of the new FCS containing only the events that did not pass the quality control. The default is FALSE and hence the low quality FCS file is not generated.

folder_results

Character string used to name the directory that contains the results. The default is "resultsQC". If you intend to return the results in the main directory use FALSE.

Value

A complete quality control is performed on flow cytometry data in FCS format. By default the analysis returns a directory named resultsQC containing:
1. a set of new FCS files with a new parameter to gate out the low quality events,
2. a set of HTML reports, one for each FCS file, that include graphs and tables indicating where the anomalies were detected,
3. a single TXT file reporting the percentage of events removed in each FCS file.

Author(s)

Gianni Monaco, Chen Hao

Examples

## a sample dataset as flowSet object
data(Bcells)

## quality control on a flowFrame object
flow_auto_qc(Bcells[[1]], html_report = FALSE, mini_report = FALSE, fcs_QC = FALSE, folder_results = FALSE)
