DebrisModels: Histogram Debris Models
In flowPloidy: Analyze flow cytometer data to determine sample ploidy

Implementation of debris models described by Bagwell et al. (1991).

getSingleCutValsBase(intensity, xx, first.channel)

getMultipleCutVals(intensity, first.channel)

`intensity`	a numeric vector, the histogram intensity in each channel
`xx`	an integer vector, the ordered channels corresponding to the values in `intensity`.
`first.channel`	integer, the lowest bin to include in the modelling process. Determined by the internal function `fhStart`.

getSingleCutVals, the vectorized function built from getSingleCutValsBase, returns the fixed SCvals for the histogram.

getMultipleCutVals, a vectorized function, returns the fixed MCvals for the histogram.

The Single-Cut model is the theoretical probability distribution of the sizes of the pieces formed by a single random cut through an ellipsoid. In other words, we assume that the debris is composed of nuclear fragments generated by cutting a subset of the nuclei in a sample into two pieces.

The model is:

S(x) = SCaP \sum_{j = x + 1}^{n} \sqrt[3]{j} \, Y_j \, P_s(j, x)

x	the histogram channel that we're estimating the debris value for.
SCaP	the amplitude parameter.
Y_j	the histogram intensity for channel j.

where P_s(j, x) is the probability of a nucleus from channel j falling into channel x when cut. That is, for j > x, it is the probability that fragmenting a nucleus from channel j with a single cut will produce a fragment of size x. This probability is calculated as:

P_s(j, x) = \frac{2}{\pi j \sqrt{(x/j) (1 - x/j)}}
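
As a quick sanity check on this expression (a derivation added here for illustration, not taken from Bagwell et al.), treat x as continuous and substitute u = x/j. The density then integrates to 2 over 0 < x < j, which is consistent with each single cut producing two fragments:

```latex
\int_0^j P_s(j, x) \, dx
  = \frac{2}{\pi} \int_0^1 \frac{du}{\sqrt{u (1 - u)}}
  = \frac{2}{\pi} \cdot \pi
  = 2
```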

This model involves a recursive calculation, since the fitted value for channel x depends not just on the intensity for channel x, but also on the intensities at all channels > x. I deal with this by pre-calculating the raw values, which don't depend on the only parameter, SCaP. These raw values are stored in the SCvals column of the histData matrix (which is a slot in the FlowHist object). This must be accommodated by treating SCvals as a 'special parameter' in the ModelComponent definition. See the ModelComponent help page for details.
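
The pre-calculation described above can be sketched in a few lines of R. This is a minimal illustration, not the package's own code: the helper names `singleCutRaw` and `singleCutRawAll`, the toy histogram, and the choice to return zero below `first.channel` are all assumptions made for the example.

```r
## Hypothetical helper, not part of flowPloidy: the raw single-cut value for
## one channel x, i.e. the sum in S(x) without the amplitude SCaP.
singleCutRaw <- function(x, intensity, xx, first.channel) {
  ## Channels below the first modelled channel get no debris value
  ## (an assumption of this sketch).
  if (x < first.channel) return(0)
  above <- xx > x                    # all channels above x
  if (!any(above)) return(0)         # nothing above the last channel
  j <- xx[above]
  Yj <- intensity[above]
  Ps <- 2 / (pi * j * sqrt((x / j) * (1 - x / j)))
  sum(j^(1/3) * Yj * Ps)
}

## Vectorize over x only; the histogram arguments stay fixed.
singleCutRawAll <- Vectorize(singleCutRaw, vectorize.args = "x")

## Toy histogram with eight channels (made-up values).
xx <- 1:8
intensity <- c(0, 5, 12, 30, 80, 30, 5, 1)
SCraw <- singleCutRawAll(xx, intensity, xx, first.channel = 2)

## The fitted debris curve is the raw values scaled by SCaP.
## 0.05 is an arbitrary illustrative value, not a fitted one.
debris <- 0.05 * SCraw
```

Because the raw values depend only on the histogram, they need to be computed once per data set; the model fit then only has to estimate the single multiplier SCaP.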

The Multiple-Cut model extends the Single-Cut model by assuming that a single nucleus may be cut multiple times, thus creating more than two fragments.

The model is:

S(x) = MCaP \, e^{-kx} \sum_{j = x + 1}^{n} Y_j

x	the histogram channel that we're estimating the debris value for.
k	an exponential fitting parameter.
MCaP	the amplitude parameter.
Y_j	the histogram intensity for channel j.

This model involves another recursive or "histogram-dependent" component. Again, the sum is independent of the fitted parameters, so we can pre-compute that and add it to the histData slot, in the column MCvals. This is treated as a 'special parameter' when the Multiple-Cut model is applied, so we only need to fit the parameters k and MCaP.
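
The pre-computed sum described above is a reverse cumulative sum of the histogram. The following is a minimal sketch, not the package's own code: the helper name `multipleCutRaw`, the toy histogram, the parameter values, and the choice to return zero below `first.channel` are assumptions made for the example. It also assumes the channels are numbered 1 to n, so that channel x is element x of `intensity`.

```r
## Hypothetical helper, not part of flowPloidy: for each channel x, the sum
## of the intensities in all channels above x.
multipleCutRaw <- function(intensity, first.channel) {
  n <- length(intensity)
  ## rev(cumsum(rev(y))) gives sum(y[x:n]); subtracting y leaves the sum
  ## over channels x + 1 to n.
  res <- rev(cumsum(rev(intensity))) - intensity
  ## Channels below the first modelled channel are set to zero
  ## (an assumption of this sketch).
  res[seq_len(n) < first.channel] <- 0
  res
}

## Toy histogram with eight channels (made-up values).
intensity <- c(0, 5, 12, 30, 80, 30, 5, 1)
MCraw <- multipleCutRaw(intensity, first.channel = 2)

## Illustrative parameter values only; in practice k and MCaP are fitted.
k <- 0.1
MCaP <- 0.02
xx <- seq_along(intensity)
debris <- MCaP * exp(-k * xx) * MCraw
```

Only `k` and `MCaP` vary during fitting; `MCraw` is fixed by the data, which is why it can be stored alongside the histogram.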

The debris models assume that all debris is composed of nuclei (G1 and G2) that have been cut into two or more fragments. In actual practice, at least when working with plant cells, the debris likely also includes other cellular debris, including secondary compounds. This non-nuclear debris may take up, and interact with, the stain in unpredictable ways. In extreme cases, such as the Vaccinium example in the “flowPloidy Getting Started” vignette, this cellular debris can completely obscure the G1 and G2 peaks, requiring gating.

The ideal gate would be one that excludes all of the non-nuclear debris, and none of the nuclear debris (i.e., the nuclear fragments). If we could accomplish this, then gating would improve our model-fitting. Leaving non-nuclear debris in the data will result in it being fit by some combination of the model components, with a negative impact on their accuracy. On the other hand, excluding nuclear debris will reduce the information used to fit the SC or MC components, which will also reduce model accuracy.

Of course, we can't define an ideal gate, any more than we can optimize our sample preparation such that our histograms are completely free of debris. As a practical approach, we recommend avoiding gating whenever possible, and taking a conservative approach when it is unavoidable.

Tyler Smith

Bagwell, C. B., Mayo, S. W., Whetstone, S. D., Hitchcox, S. A., Baker, D. R., Herbert, D. J., Weaver, D. L., Jones, M. A. and Lovett, E. J. (1991), DNA histogram debris theory and compensation. Cytometry, 12: 107-118. doi: 10.1002/cyto.990120203

## This is an internal function, called from setBins()
## Not run: 
  ## ...
  SCvals <- getSingleCutVals(intensity, xx, startBin)
  MCvals <- getMultipleCutVals(intensity, startBin)
  ## ...
  fhHistData(fh) <- data.frame(xx = xx, intensity = intensity,
                           SCvals = SCvals, MCvals = MCvals,
                           DBvals = DBvals, TRvals = TRvals,
                           QDvals = QDvals, gateResid = gateResid)
  ## ...

## End(Not run)