ModelComponent: An S4 class to represent model components

Description Details Slots Coding Concepts Examples

View source: R/models.R

Description

ModelComponent objects bundle the actual mathematical function for a particular component with various associated data necesarry to incorporate them into a complete NLS model.

Details

To be included in the automatic processing of potential model components, a ModelComponent needs to be added to the variable fhComponents.

Slots

name

character, a convenient name with which to refer to the component

desc

character, a short description of the component, for human readers

color

character, the color to use when plotting the component

includeTest

function, a function which takes a single argument, a FlowHist object, and returns TRUE if the component should be included in the model for that object.

function

function, a single-line function that returns the value of the component. The function can take multiple arguments, which usually will include xx, the bin number (i.e., x value) of the histogram. The other arguments are model parameters, and should be included in the initParams function.

initParams

function, a function with a single argument, a FlowHist object, which returns named list of model parameters and their initial estimates.

specialParams

list, a named list. The names are variables to exclude from the default argument list, as they aren't parameters to fit in the NLS procedure, but are actually fixed values. The body of the list element is the object to insert into the model formula to account for that variable. Note that this slot is not set directly, but should be provided by the value returned by specialParamSetter (which by default is list(xx = substitute(xx))).

specialParamSetter

function, a function with one argument, the FlowHist object, used to set the value of specialParams. This allows parameters to be declared 'special' based on values in the FlowHist object. The default value for this slot is a function which returns list(xx = substitute(xx))

paramLimits

list, a named list with the upper and lower limits of each parameter in the function.

doCounts

logical, should cell counts be evaluated for this component? Used to exclude the debris models, which don't work with R's Integrate function.

Coding Concepts

See the source code file models.R for the actual code used in defining model components. Here are a few examples to illustrate different concepts.

We'll start with the G1 peaks. They are modelled by the components fA1 and fB1 (for the A and B samples). The includeTest for fA1 is simply function(fh) TRUE, since there will always be at least one peak to fit. fB1 is included if there is more than 1 detected peak, and the setting samples is more than 1, so the includeTest is

1
function(fh) nrow(fhPeaks(fh)) > 1 && fhSamples(fh) > 1

The G1 component is defined by the function

1
2
(a1 / (sqrt(2 * pi) * Sa) * exp(-((xx - Ma)^2)/(2 *
  Sa^2)))

with the arguments a1, Ma, Sa, xx. xx is treated specially, by default, and we don't need to deal with it here. The initial estimates for the other parameters are calculated in initParams:

1
2
3
4
5
6
function(fh){
  Ma <- as.numeric(fhPeaks(fh)[1, "mean"])
  Sa <- as.numeric(Ma / 20)
  a1 <- as.numeric(fhPeaks(fh)[1, "height"] * Sa / 0.45)
  list(Ma = Ma, Sa = Sa, a1 = a1)
}

Ma is the mean of the distribution, which should be very close to the peak. Sa is the standard distribution of the distribution. If we assume the CV is 5%, that means the Sa should be 5% of the distribution mean, which gives us a good first estimate. a1 is a scaling parameter, and I came up with the initial estimate by trial-and-error. Given the other two values are going to be reasonably close, the starting value of a1 doesn't seem to be that crucial.

The limits for these values are provided in paramLimits.

1
2
paramLimits = list(Ma = c(0, Inf), Sa = c(0, Inf), a1 =
  c(0, Inf))

They're all bound between 0 and Infinity. The upper bound for Ma and Sa could be lowered to the number of bins, but I haven't had time or need to explore this yet.

The G2 peaks include the d argument, which is the ratio of the G2 peak to the G1 peak. That is, the linearity parameter:

1
2
3
4
func = function(a2, Ma, Sa, d, xx){
  (a2 / (sqrt(2 * pi) * Sa * 2) *
    exp(-((xx - Ma * d)^2)/(2 * (Sa * 2)^2))) 
}

d is the ratio between the G2 and G1 peaks. If linearity = "fixed", it is set to 2. Otherwise, it is fit as a model parameter. This requires special handling. First, we check the linearity value in initParams, and provide a value for d if needed:

1
2
3
res <- list(a2 = a2)
if(fhLinearity(fh) == "variable")
    res <- c(res, d = 2)

Here, a2 is always treated as a parameter, and d is appended to the initial paramter list only if needed.

We also need to use the specialParamSetter function, in this case calling the helper function setLinearity(fh). This function checks the value of linearity, and returns the appropriate object depending on the result.

Note that we use the arguments Ma and Sa appear in the function slot for fA2, but we don't need to provide their initial values or limits. These values are already supplied in the definition of fA1, which is always present when fA2 is.

NB.: This isn't checked in the code! I know fA1 is always present, but there is no automated checking of this fact. If you create a ModelComponent that has parameters that are not defined in that component, and are not defined in other components (like Ma is in this case), you will cause problems. There is also nothing to stop you from defining a parameter multiple times. That is, you could define initial estimates and limits for Ma in fA1 and fA2. This may also cause problems. It would be nice to do some sanity-checking to protect against using parameters without defining initial estimates or limits, or providing multiple/conflicting definitions.

The Single-Cut Debris component is unusual in two ways. It doesn't include the argument xx, but it uses the pre-computed values SCvals. Consequently, we must provide a function for specialParamSetter to deal with this:

1
2
specialParamSetter = function(fh){ list(SCvals =
substitute(SCvals)) } 

The Multi-Cut Debris component MC is similar, but it needs to include xx as a special parameter. The aggregate component AG also includes several special parameters.

For more discussion of the debris components, see DebrisModels.

The code responsible for this is in the file models.R. Accessor functions are provided (but not exported) for getting and setting ModelComponent slots. These functions are named mcSLOT, and include mcFunc, mcColor, mcName, mcDesc, mcSpecialParams, mcSpecialParamSetter, mcIncludeTest, mcInitParams.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
## The 'master list' of components is stored in fhComponents:
flowPloidy:::fhComponents ## outputs a list of component summaries

## adding a new component to the list:
## Not run: 
fhComponents$pois <-
  new("ModelComponent", name = "pois", color = "bisque",
      desc = "A poisson component, as a silly example",
      includeTest = function(fh){
          ## in this case, we check for a flag in the opt slot
          ## We could also base the test on some feature of the
          ## data, perhaps something in the peaks or histData slots
          "pois" %in% fh@opt
      },
      func = function(xx, plam){
          ## The function needs to be complete on a single line, as it
          ## will be 'stitched' together with other functions to make
          ## the complete model.
          exp(-plam)*plam^xx/factorial(xx)
      },
      initParams = function(fh){
          ## If we were to use this function for one of our peaks, we
          ## could use the peak position as our initial estimate of
          ## the Poisson rate parameter:
          plam <- as.numeric(fhPeaks(fh)[1, "mean"])
      },
      ## bound the search for plam between 0 and infinity. Tighter
      ## bounds might be useful, if possible, in speeding up model
      ## fitting and avoiding local minima in extremes.
      paramLimits = list(plam = c(0, Inf)) 
  )

  ## specialParamSetter is not needed here - it will default to a
  ## function that returns "xx = xx", indicating that all other
  ## parameters will be fit. That is what we need for this example. If
  ## the component doesn't include xx, or includes other fixed
  ## parameters, then specialParamSetter will need to be provided.  

  ## Note that if our intention is to replace an existing component with
  ## a new one, we either need to explicitly change the includeTest for
  ## the existing component to account for situations when the new one
  ## is used instead. As a temporary hack, you could add both and then
  ## manually remove one with \code{dropComponents}. 
  
## End(Not run)

flowPloidy documentation built on Nov. 1, 2018, 2:26 a.m.