qfa.fit: Growth curve modelling


View source: R/qfa.R

Description

Given a series of culture density observations from colonyzer.read, this function fits the generalised logistic growth model to the timecourse observations for every colony by least squares, using either the L-BFGS-B algorithm in R's optim function or the stochastic global optimiser in the differential evolution package DEoptim. It also calculates a numerical Area Under Curve (nAUC) fitness measure by integrating under a loess-smoothed version of the data if there are sufficient observations, or under a linear interpolation between observations if they are too infrequent.
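As a rough illustration of the least-squares approach described above, the following sketch fits a simple (symmetric) logistic curve to synthetic data with optim and method="L-BFGS-B". It is not the package's own code: the helper names logistic and sumsq, the synthetic data, the starting values and the parameter bounds are all assumptions made for this example, and g is simply held fixed rather than optimised.

```r
# Simple logistic growth curve: density at time t for carrying capacity K,
# growth rate r and inoculum density g (hypothetical helper, not from qfa).
logistic <- function(K, r, g, t) {
  K * g * exp(r * t) / (K + g * (exp(r * t) - 1))
}

# Synthetic, noise-free observations for illustration only.
tobs <- seq(0, 5, by = 0.25)
xobs <- logistic(K = 0.15, r = 4, g = 0.001, t = tobs)

# Sum of squared residuals as a function of pars = c(K, r); g is held fixed.
sumsq <- function(pars, g, t, x) {
  sum((x - logistic(pars[1], pars[2], g, t))^2)
}

# Box-constrained least squares; the bounds here are assumed values.
fit <- optim(par = c(0.1, 2), fn = sumsq, g = 0.001, t = tobs, x = xobs,
             method = "L-BFGS-B", lower = c(0.01, 0.1), upper = c(1, 10))
fit$par    # estimated c(K, r)
fit$value  # residual sum of squares at the optimum
```

A local optimiser such as L-BFGS-B depends on its starting values, which is presumably why a slower global alternative (DEoptim) is offered through globalOpt.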

Usage

qfa.fit(d,inocguess,ORF2gene="ORF2GENE.txt",fmt="%Y-%m-%d_%H-%M-%S",minK=0.025,
detectThresh=0.0005,globalOpt=FALSE,logTransform=FALSE,fixG=TRUE,AUCLim=5,STP=20,
nCores=1,glog=TRUE,modelFit=TRUE,checkSlow=TRUE,nrate=FALSE,...)

Arguments

d

The data.frame containing the timecourse data for each colony (returned from colonyzer.read).

inocguess

The best guess for the starting density of viable cells in each colony. This is the g parameter in the generalised logistic model. Typically, for dilute-inoculum, 384-format spotted cultures, this value cannot be observed directly by photography. inocguess should be in the same units as the values in the Growth column in d. If fixG=TRUE, only values of g between 0.9*inocguess and 1.1*inocguess will be assessed during optimisation. Otherwise, values between 0.01*inocguess and 100.0*inocguess will be tried. Without a sensible independent estimate of inoculum density, the best we can do is to estimate it from the observed data. Estimating inoculum density this way only works well if the inoculum density is high enough to be measurable (e.g. pinned cultures or concentrated spotted cultures) and is clearly observed. Clearly observed means, for example, that there is no condensation on the plates immediately after they are placed in the incubator. If we are making an independent estimate of inoculum density, then we should also reset the time at which the experiment "begins". This experiment start time should be the time at which the inoculum density is observed.
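The two search ranges for g described above can be written out as follows. This is a minimal sketch: g_bounds is a hypothetical helper introduced for illustration, not a function in the qfa package, and the inocguess value is arbitrary.

```r
# Hypothetical helper: search bounds for g implied by inocguess and fixG,
# using the multipliers stated in the argument description.
g_bounds <- function(inocguess, fixG = TRUE) {
  if (fixG) {
    c(lower = 0.9 * inocguess, upper = 1.1 * inocguess)    # narrow range
  } else {
    c(lower = 0.01 * inocguess, upper = 100 * inocguess)   # wide range
  }
}

g_bounds(1e-4, fixG = TRUE)
g_bounds(1e-4, fixG = FALSE)
```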

ORF2gene

The location of the text file whose first column is of the relevant ORF names and whose second column is of corresponding gene names. If human readable gene names are not important and unique strain identifiers will suffice, set to FALSE.

fmt

The date-time format in which the inoculation time (Inoc.Time) and measurement times (Date.Time) are stored.
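To show what the default format string matches, the sketch below parses two made-up timestamps with base R and computes the time elapsed since inoculation. The timestamps are invented, and expressing the difference in days is an assumption based on STP being described in days; it is not taken from the package source.

```r
fmt <- "%Y-%m-%d_%H-%M-%S"

# Invented example values in the style of Inoc.Time and Date.Time.
inoc <- as.POSIXct("2011-05-04_09-30-00", format = fmt, tz = "UTC")
obs  <- as.POSIXct("2011-05-05_15-30-00", format = fmt, tz = "UTC")

# Time since inoculation, in days (unit assumed for this example).
elapsed_days <- as.numeric(difftime(obs, inoc, units = "days"))
elapsed_days
```

If the timestamps are stored differently, fmt needs to be changed to match; a string that does not match the format is parsed as NA.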

minK

The minimum value of K above which a strain is said to be alive. Strains with K optimised to lie below this value will be classified as dead, by setting r to be zero.

detectThresh

The minimum detectable cell density (or Growth value) which reliably identifies the presence of cells. Cell densities below this value are classified as noise and discarded.

globalOpt

Flag indicating whether qfa.fit should use the slower, but more robust DEoptim global optimisation functions to fit the generalised logistic model to the data, or the quicker optim function.

logTransform

Experimental flag signalling use of a different objective function for optimisation. You should probably ignore this or set it to FALSE.

fixG

Flag indicating whether to allow g parameter to vary over a wide or narrow range during optimisation. fixG=TRUE corresponds to narrow constraints on g.

AUCLim

Numerical AUC (nAUC) is calculated as the integral of an approximation of the growth curve between time 0 and AUCLim.
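The sketch below illustrates the two approximations mentioned in the Description, linear interpolation and loess smoothing, each integrated from 0 to AUCLim with the trapezoid rule on a fine grid. It is only an illustration under stated assumptions: the synthetic data, the trapz helper, the grid size and the default loess settings are choices made for this example, and the rule the package uses to decide when there are enough observations for loess is not reproduced here.

```r
# Synthetic logistic-shaped observations, for illustration only.
tobs <- seq(0, 6, by = 0.5)
xobs <- 0.15 / (1 + 149 * exp(-4 * tobs))
AUCLim <- 5

# Trapezoid rule on a grid (hypothetical helper).
trapz <- function(t, x) sum(diff(t) * (head(x, -1) + tail(x, -1)) / 2)
grid <- seq(0, AUCLim, length.out = 501)

# Area under a linear interpolation between observations.
auc_lin <- trapz(grid, approx(tobs, xobs, xout = grid)$y)

# Area under a loess-smoothed version of the same observations.
lo <- loess(x ~ t, data = data.frame(t = tobs, x = xobs))
auc_lo <- trapz(grid, predict(lo, newdata = data.frame(t = grid)))

c(linear = auc_lin, loess = auc_lo)
```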

STP

Time at which the “Single Time Point” fitness estimate is taken. Defaults to 20 days, which is very late in the growth curve, so the estimate approximates carrying capacity.

nCores

Number of cores across which to attempt to split the model-fitting load in parallel. Experimental; it is probably best to leave this at the default (1).

glog

Boolean (TRUE or FALSE) specifying whether to fit the generalised (asymmetric) logistic model to the growth curve data. When set to FALSE, the simpler logistic model is fitted instead (as in Addinall et al. 2011).

modelFit

Boolean (TRUE or FALSE) specifying whether to carry out any model fitting at all. When set to FALSE, only numerical fitness estimates such as nr, nMDP and nAUC are generated.

checkSlow

Boolean (TRUE or FALSE) specifying whether to re-optimise curve fitting for slow-growing strains. If TRUE, slow-growing or dead strains are identified heuristically and a second round of curve fitting is carried out using global (but slower) optimisation. Heuristic identification of slow-growing strains is currently experimental: the heuristics seem to be over-tuned to datasets captured at Newcastle. If you notice a banding pattern in your MDR or r fitness distributions, please set checkSlow to FALSE.

nrate

Boolean specifying whether to include numerical rate estimates in the output results.

...

Extra arguments passed to optim.

Value

R data.frame, similar to that returned by the colonyzer.read function. The major difference is that instead of a row for every cell density observation for every culture, this object summarises all timecourse density observations for each culture with fitted generalised logistic parameters and numerical fitness estimates.


qfa documentation built on Feb. 22, 2020, 3:01 a.m.
