FlowHist: FlowHist
In plantarum/flowPloidy: Analyze flow cytometer data to determine sample ploidy

Description

Creates a FlowHist object from an FCS file, setting up the histogram data for analysis.

Usage

FlowHist(
  file,
  channel,
  bins = 256,
  analyze = TRUE,
  linearity = "variable",
  debris = "SC",
  samples = 2,
  pick = FALSE,
  standards = 0,
  g2 = TRUE,
  debrisLimit = 40,
  truncate_max_range = TRUE,
  trimRaw = 0,
  ...
)

batchFlowHist(files, channel, verbose = TRUE, ...)

Arguments

`file`	character, the name of the single file to load
`channel`	character, the name of the data column to use
`bins`	integer, the number of bins to use to aggregate events into a histogram
`analyze`	logical, if TRUE the model will be analyzed immediately
`linearity`	character, either "variable", the default, or "fixed". If "fixed", linearity is fixed at 2; if "variable", linearity is fit as a model parameter.
`debris`	character, either "SC", the default, "MC", or "none", to set the debris model component to the Single-Cut or Multi-Cut models, or to not include a debris component (such as for gated data).
`samples`	integer; the number of samples in the data. Default is 2 (unknown and standard), but can be set to 3 if two standards are used, or up to 6 for endopolyploidy analysis.
`pick`	logical; if TRUE, the user will be prompted to select peaks to use for starting values. Otherwise (the default), starting values will be detected automatically.
`standards`	numeric; the size of the internal standard in pg. When loading a data set where different samples have different standards, a vector of all the standard sizes. If set to 0, calculation of pg for the unknown sample will not be done.
`g2`	a logical value, default is TRUE. Should G2 peaks be included in the model?
`debrisLimit`	an integer value, default is 40. Passed to `cleanPeaks`. Peaks with fluorescence values less than `debrisLimit` will be ignored by the automatic peak-finding algorithm. Used to ignore the debris often found at the left side of the histogram.
`truncate_max_range`	logical, default is TRUE. Can be turned off to avoid truncating extreme positive values from the instrument. See `read.FCS` for details.
`trimRaw`	numeric. If not 0, truncate the raw intensity data to below this threshold. Necessary for some cytometers, which emit a lot of empty data channels.
`...`	additional arguments passed from `batchFlowHist` to `FlowHist`, or to assorted helper functions. See `findPeaks` (arguments `window` and `smooth`).
`files`	character, a vector of file names to load, or a single character value giving the path to a directory; if the latter, all files in the directory will be loaded
`verbose`	logical; if TRUE, `batchFlowHist` will list files as it processes them.
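As a rough illustration of how the tuning arguments combine, the sketch below loads one of the example files from flowPloidyData with a Multi-Cut debris model, fixed linearity, and more bins, and defers model fitting with analyze = FALSE. The particular combination of values is an arbitrary choice for demonstration, not a recommendation, and the object name fh_tuned is hypothetical. The code is guarded so that it does nothing if the packages are not installed.

```r
if (requireNamespace("flowPloidy", quietly = TRUE) &&
    requireNamespace("flowPloidyData", quietly = TRUE)) {
  f <- flowPloidyData::flowPloidyFiles()[1]

  # Arbitrary, illustrative settings; the defaults are usually fine.
  fh_tuned <- flowPloidy::FlowHist(
    file = f,
    channel = "FL3.INT.LIN",
    bins = 512,            # finer histogram than the default 256
    linearity = "fixed",   # fix linearity at 2 instead of fitting it
    debris = "MC",         # Multi-Cut instead of Single-Cut debris model
    analyze = FALSE        # set up the model, but do not fit it yet
  )

  # Inspect the automatically detected starting values before fitting.
  plot(fh_tuned, init = TRUE)
}
```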

Details

For most uses, simply calling FlowHist with a file, channel, and standards argument will do what you need. The other arguments are provided for optional tuning of this process. In practice, it's easier to correct the model fit using browseFlowHist than to determine 'perfect' values to pass in as arguments to FlowHist.

Similarly, batchFlowHist is usually used with only the files, channel, and standards arguments.
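A minimal sketch of the typical call described above, using only the file, channel, and standards arguments. The standard size of 1.96 pg is a placeholder value chosen for illustration; substitute the size of the internal standard actually used in your samples. The object name fh_std is hypothetical, and the code is guarded so that it is skipped if the packages are not installed.

```r
if (requireNamespace("flowPloidy", quietly = TRUE) &&
    requireNamespace("flowPloidyData", quietly = TRUE)) {
  f <- flowPloidyData::flowPloidyFiles()[1]

  # standards = 1.96 is a placeholder size (in pg) for the internal standard.
  fh_std <- flowPloidy::FlowHist(
    file = f,
    channel = "FL3.INT.LIN",
    standards = 1.96
  )

  # Print a summary of the object.
  print(fh_std)
}
```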

In operation, FlowHist starts by reading an FCS file (using the function read.FCS internally). This produces a flowFrame object, which we extend to a FlowHist object as follows:

Extract the fluorescence data from channel.
Remove the top bin, which contains off-scale readings we ignore in the analysis.
Remove negative fluorescence values, which are artifacts of instrument compensation.
Remove the first 5 bins, which often contain noisy values, probably further artifacts of compensation.
Aggregate the raw data into the desired number of bins, as specified with the bins argument. The default is 256, but you may also try 128 or 512. Any integer is technically acceptable, but I wouldn't stray from the default without a good reason. (I've never had a good reason!)
Identify model components to include. By default, all FlowHist objects will have the single-cut debris model, the G1 peak for sample A, and the broadened rectangle for the S-phase of sample A. Depending on the data, additional components for the G2 peak and sample B (G1, G2, S-phase) may also be added. The debris argument can be used to select the Multi-Cut debris model instead, or this can be toggled in browseFlowHist.
Build the NLS model. All the components are combined into a single model.
Identify starting values for Gaussian (G1 and G2 peaks) model components. For reasonably clean data, the built-in peak detection is usually adequate. You can evaluate this by plotting the FlowHist object with the argument init = TRUE. The easiest way to fix bad peak detection is via the browseFlowHist interface. You can also adjust the window and smooth arguments (which is tedious!), or pick the peaks visually yourself with pick = TRUE.
Finally, fit the model and calculate the fitted parameters. Model fitting is suppressed if the analyze argument is set to FALSE.
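The steps above can be illustrated with a simplified sketch in base R. This is not the code flowPloidy uses internally: the data are simulated, the order of operations and the way off-scale values are detected are assumptions, and a single Gaussian fitted to a window around the tallest peak stands in for the full multi-component model. The names max_range, debris_limit, and win are hypothetical.

```r
set.seed(42)

# Simulated raw fluorescence values (all hypothetical): low-intensity debris,
# a G1 peak, a G2 peak, a few negative values, and some off-scale events.
max_range <- 1024
raw <- c(rexp(2000, rate = 1 / 40),
         rnorm(5000, mean = 200, sd = 8),
         rnorm(1000, mean = 400, sd = 14),
         runif(50, min = -20, max = 0),
         rep(max_range, 30))

# Drop negative values and off-scale readings at the top of the range.
kept <- raw[raw >= 0 & raw < max_range]

# Aggregate into equal-width bins covering [0, max_range).
bins <- 256
breaks <- seq(0, max_range, length.out = bins + 1)
counts <- tabulate(findInterval(kept, breaks), nbins = bins)

# Blank the first 5 bins, which are treated as noise.
counts[1:5] <- 0

# Column names mirror the xx and intensity columns of the histdata slot.
histdata <- data.frame(xx = seq_len(bins), intensity = counts)

# Starting value for one peak: the tallest bin to the right of a debris
# limit (assumed here to be expressed in bin units).
debris_limit <- 40
candidates <- histdata[histdata$xx > debris_limit, ]
peak_bin <- candidates$xx[which.max(candidates$intensity)]

# Fit a single Gaussian to a window around the peak with nls().
win <- histdata[abs(histdata$xx - peak_bin) <= 10, ]
fit <- nls(intensity ~ a * exp(-(xx - mu)^2 / (2 * s^2)),
           data = win,
           start = list(a = max(win$intensity), mu = peak_bin, s = 2))
coef(fit)
```

In this toy version the starting values come straight from the tallest bin. With real data the starting values matter much more, which is why the peak-detection step can be checked and corrected before fitting.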

Value

FlowHist returns a FlowHist object.

batchFlowHist returns a list of FlowHist objects.
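Because batchFlowHist returns an ordinary list, its elements can be handled with the usual list tools. The sketch below loads two of the example files without fitting (analyze = FALSE is passed through to FlowHist via ...) and extracts one slot from each object. The object names batch and channels are hypothetical, and the code is skipped if the packages are not installed.

```r
if (requireNamespace("flowPloidy", quietly = TRUE) &&
    requireNamespace("flowPloidyData", quietly = TRUE)) {
  files <- flowPloidyData::flowPloidyFiles()[1:2]

  # Extra arguments such as analyze are forwarded to FlowHist.
  batch <- flowPloidy::batchFlowHist(files, channel = "FL3.INT.LIN",
                                     verbose = FALSE, analyze = FALSE)

  # Number of FlowHist objects loaded.
  print(length(batch))

  # Individual objects are reached with standard list indexing.
  print(batch[[1]])

  # Apply a function to every object, here reading the channel slot.
  channels <- vapply(batch, function(fh) fh@channel, character(1))
  print(channels)
}
```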

Slots

raw: a flowFrame object containing the raw data from the FCS file
channel: character, the name of the data column to use
bins: integer, the number of bins to use to aggregate events into a histogram
linearity: character, either "fixed" or "variable" to indicate if linearity is fixed at 2 or fit as a model parameter
debris: character, either "SC" or "MC" to indicate if the model should include the single-cut or multi-cut model
gate: logical, a vector indicating events to exclude from the analysis. In normal use, the gate will be modified via interactive functions, not set directly by users.
trimRaw: numeric, the threshold for trimming/truncating raw data before binning. The default, 0, means no trimming will be done.
histdata: data.frame, the columns are the histogram bin number (xx), fluorescence intensity (intensity), the raw single-cut and multi-cut debris model values (SCvals and MCvals), and the raw doublet, triplet and quadruplet aggregate values (DBvals, TRvals, and QDvals). The debris and aggregate values are used in the NLS fitting procedures.
peaks: matrix, containing the coordinates used for peaks when calculating initial parameter values.
opts: list, currently unused. A convenient place to store flags when trying out new options.
comps: a list of ModelComponent objects included for these data.
model: the function (built from comps) to fit to these data.
limits: list, a list of lower and upper bounds for model parameters
init: a list of initial parameter estimates to use in fitting the model.
nls: the nls object produced by the model fitting
counts: a list of cells counted in each peak of the fitted model
CV: a list of the coefficients of variation for each peak in the fitted model.
RCS: numeric, the residual chi-square for the fitted model.
samples: numeric, the number of samples included in the data. The default is 2 (i.e., unknown and standard), but if two standards are used it should be set to 3. It can be up to 6 for endopolyploidy analysis, and can be interactively increased (or decreased) via browseFlowHist.
standards: a FlowStandards object.
g2: logical, if TRUE the model will include G2 peaks for each sample (as long as the G1 peak is less than half-way across the histogram). Set to FALSE to drop the G2 peaks for endopolyploidy analyses.
annotation: character, user-added annotation for the sample.
fail: logical, set by the user via the browseFlowHist interface to indicate the sample failed and no model fitting should be done.
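FlowHist is an S4 class, so the slots listed above can be read with the @ operator. The following sketch is for inspection only: as noted for the gate slot, slots are normally modified through the package's functions rather than set directly. The object name fh is hypothetical, and the code is skipped if the packages are not installed.

```r
if (requireNamespace("flowPloidy", quietly = TRUE) &&
    requireNamespace("flowPloidyData", quietly = TRUE)) {
  fh <- flowPloidy::FlowHist(file = flowPloidyData::flowPloidyFiles()[1],
                             channel = "FL3.INT.LIN", analyze = FALSE)

  # Settings recorded when the object was created.
  print(fh@channel)
  print(fh@bins)
  print(fh@debris)

  # Binned histogram data used for model fitting.
  print(head(fh@histdata))

  # Peak coordinates used to compute initial parameter values.
  print(fh@peaks)
}
```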

Author(s)

Tyler Smith

Examples

library(flowPloidyData) 
fh1 <- FlowHist(file = flowPloidyFiles()[1], channel = "FL3.INT.LIN")
fh1
batch1 <- batchFlowHist(flowPloidyFiles(), channel = "FL3.INT.LIN")
batch1

plantarum/flowPloidy documentation built on March 25, 2023, 1:37 a.m.