Given a series of culture density observations from colonyzer.read, this function fits the generalised logistic growth model to the timecourse observations for every colony by least squares, using either the L-BFGS-B algorithm in R's optim function or the stochastic global optimisation (differential evolution) package DEoptim. It also calculates a numerical Area Under Curve (nAUC) fitness measure by integrating under a loess-smoothed version of the dataset if there are sufficient observations, or under a linear interpolation between observations if observations are too infrequent.
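The least-squares fit described above can be sketched for a single colony as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper name glogist_sketch, the Richards-type parameterisation of the generalised logistic curve, the synthetic data, the starting values and the box constraints are all hypothetical choices made for this example.

```r
# Generalised (Richards-type) logistic curve. This parameterisation is an
# assumption for illustration; v = 1 gives the ordinary logistic curve.
glogist_sketch <- function(K, r, g, v, t) {
  K / (1 + ((K / g)^v - 1) * exp(-r * v * t))^(1 / v)
}

# Synthetic, noise-free observations for one hypothetical colony (time in days).
tobs <- seq(0, 5, by = 0.25)
yobs <- glogist_sketch(K = 0.2, r = 4, g = 0.001, v = 1, t = tobs)

# Objective: sum of squared residuals between model and observations.
sse <- function(p) {
  sum((yobs - glogist_sketch(p[1], p[2], p[3], p[4], tobs))^2)
}

# Box-constrained least squares using optim's L-BFGS-B method.
start <- c(0.15, 3, 0.002, 1.2)   # initial guesses for K, r, g, v
lower <- c(0.01, 0.1, 1e-5, 0.1)  # illustrative lower bounds
upper <- c(1, 10, 0.1, 10)        # illustrative upper bounds
fit <- optim(par = start, fn = sse, method = "L-BFGS-B",
             lower = lower, upper = upper)

fit$par    # estimates of K, r, g, v
fit$value  # objective function value at the selected optimum
```

Because r, g and v are strongly correlated in this model, a local optimiser such as L-BFGS-B can settle on different parameter combinations that fit almost equally well, which is one reason a global optimiser may be preferred for difficult curves.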
d |
The data.frame containing the timecourse data for each colony (returned from colonyzer.read). |
inocguess |
The best guess for the starting density of viable cells in each colony. This is the g parameter in the generalised logistic model. Typically, for dilute inoculum 384-format spotted cultures, this value cannot be observed directly by photography. inocguess should be in the same units as the values in the Growth column in d. If fixG=TRUE, only values of g between 0.9*inocguess and 1.1*inocguess will be assessed during optimisation. Otherwise, values between 0.01*inocguess and 100.0*inocguess will be tried. Without a sensible independent estimate of inoculum density, the best we can do is to estimate it from the observed data. Estimating inoculum density only works well if the inoculum density is high enough to be measurable (e.g. pinned cultures or concentrated spotted cultures) and is clearly observed. Clearly observed means, for example, that there is no condensation on plates immediately after they are placed in the incubator. If we estimate inoculum density from the data in this way, then we should also reset the time at which the experiment "begins". This experiment start time should be the time at which the inoculum density is observed. |
ORF2gene |
The location of a text file whose first column contains the relevant ORF names and whose second column contains the corresponding gene names. If human-readable gene names are not important and unique strain identifiers will suffice, set to FALSE. |
fmt |
The date-time format in which the inoculation time (Inoc.Time) and measurement times (Date.Time) are stored. |
minK |
The minimum value of K above which a strain is said to be alive. Strains with K optimised to lie below this value will be classified as dead, by setting r to be zero. |
detectThresh |
The minimum detectable cell density (or Growth value) which reliably identifies the presence of cells. Cell densities below this value are classified as noise and discarded. |
globalOpt |
Flag indicating whether qfa.fit should use the slower, but more robust DEoptim global optimisation functions to fit the generalised logistic model to the data, or the quicker optim function. |
logTransform |
Experimental flag signalling use of a different objective function for optimisation. You should probably ignore this or set it to FALSE. |
fixG |
Flag indicating whether to allow g parameter to vary over a wide or narrow range during optimisation. fixG=TRUE corresponds to narrow constraints on g. |
AUCLim |
Numerical AUC (nAUC) is calculated as the integral of an approximation of the growth curve between time 0 and AUCLim. |
STP |
Time to use for the “Single Time Point” fitness estimate. Defaults to 20 days, which is very late in the growth curve, so the default estimate approximates carrying capacity. |
nCores |
Number of parallel cores across which qfa.fit attempts to split the model fitting load. Experimental; it is probably best to leave this value set to the default (1). |
glog |
Boolean (TRUE or FALSE) specifying whether to fit the generalised (asymmetric) logistic model to growth curve data. When set to FALSE, the simpler logistic model is fitted instead (as in Addinall et al. 2011). |
modelFit |
Boolean (TRUE or FALSE) specifying whether to carry out any model fitting at all. When set to FALSE, only numerical fitness estimates such as nr, nMDP and nAUC are generated. |
checkSlow |
Boolean (TRUE or FALSE) specifying whether to re-optimise curve-fitting for slow-growing strains. If TRUE, slow-growing or dead strains are identified heuristically and a second round of curve fitting using global (but slower) optimisation is carried out. Heuristic identification of slow-growing strains is currently experimental; the heuristics appear to be over-tuned to datasets captured at Newcastle. If you notice a banding pattern in your MDR or r fitness distributions, please set checkSlow to FALSE. |
nrate |
Boolean specifying whether to include numerical rate estimates in the output results. |
... |
Extra arguments passed to optim |
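The effect of inocguess, fixG, detectThresh and minK described in the arguments above can be sketched with a few small helpers. These functions (g_bounds_sketch, drop_noise_sketch, apply_minK_sketch) are hypothetical and written only to illustrate the documented rules; they are not part of the qfa package, and the numeric values in the usage lines are arbitrary illustrations rather than the function's defaults.

```r
# Search range for g implied by inocguess and fixG (hypothetical helper).
g_bounds_sketch <- function(inocguess, fixG = TRUE) {
  if (fixG) {
    # Narrow constraints: within 10% of the guess.
    c(lower = 0.9 * inocguess, upper = 1.1 * inocguess)
  } else {
    # Wide constraints: two orders of magnitude either side of the guess.
    c(lower = 0.01 * inocguess, upper = 100 * inocguess)
  }
}

# Discard observations whose Growth value is below the detection threshold.
drop_noise_sketch <- function(d, detectThresh) {
  d[d$Growth >= detectThresh, , drop = FALSE]
}

# Classify a fitted strain as dead (r set to zero) when K lies below minK.
# pars is a named numeric vector with at least elements "K" and "r".
apply_minK_sketch <- function(pars, minK) {
  if (pars[["K"]] < minK) pars[["r"]] <- 0
  pars
}

# Illustrative usage with arbitrary values.
g_bounds_sketch(0.001, fixG = TRUE)
g_bounds_sketch(0.001, fixG = FALSE)
obs <- data.frame(Growth = c(0.0001, 0.01, 0.1))
drop_noise_sketch(obs, detectThresh = 0.0005)
apply_minK_sketch(c(K = 0.01, r = 3), minK = 0.025)
```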
R data.frame, similar to that returned by the colonyzer.read function. The major difference is that instead of a row for every cell density observation for every culture, this object summarises all timecourse density observations for each culture with fitted generalised logistic parameters and numerical fitness estimates.
Barcode - Unique plate identifier
Row - Row number (counting from top of image) of culture in rectangular gridded array
Col - Column number (counting from left of image) of culture in rectangular gridded array
ScreenID - Unique identifier for this QFA screen
Treatment - Conditions applied externally to plates (e.g. temperature(s) at which cultures were grown, UV irradiation applied, etc.)
Medium - Nutrients/drugs in plate agar
ORF - Systematic, unique identifier for genotype in this position in arrayed library
Screen.Name - Name of screen (identifies biological repeats, and experiment)
Library.Name - Name of library, specifying particular culture location
MasterPlate Number - Library plate identifier
Timeseries order - Sequential photograph number
Inoc.Time - User specified date and time of inoculation (specified in ExptDescription.txt file)
TileX - Culture tile width (pixels)
TileY - Culture tile height (pixels)
XOffset - x-coordinate of top left corner of rectangular tile bounding culture (number of pixels from left of image)
YOffset - y-coordinate of top left corner of rectangular tile bounding culture (number of pixels from top of image)
Threshold - Global pixel intensity threshold used for image segmentation (after lighting correction)
EdgeLength - Number of culture pixels classified as being microcolony edge pixels (useful for classifying contaminants in cultures grown from dilute inoculum)
EdgePixels - Number of pixels classified as culture on edge of square tile
RepQuad - Integer identifying which of the quadrants of a 1536 plate was used to inoculate the current 384 plate (for example, set equal to 1 for all cultures in 1536 format)
K - Generalised logistic model carrying capacity
r - Generalised logistic model rate parameter
g - Generalised logistic model inoculum density (referred to in vignette as $g_0$)
v - Generalised logistic model shape parameter (set to 1 to recover logistic model)
objval - Objective function value at selected optimum
tshift - Shift applied to observation times before fitting the logistic model (the same shift must be applied before overlaying the fitted curve on experimental observations). Default is zero (the experiment starts at the inoculation time specified in the experimental description file), but if qfa.fit is called with inocguess=NULL, the start of the experiment is redefined as the time of the first reliable density observation.
t0 - Time of first detectable cell density observation (i.e. above detectThresh)
d0 - Normalised cell density of first observation (be careful about condensation on plates when using this). Note this is not necessarily the density at t0.
nAUC - Numerical Area Under Curve. This is a model-free fitness estimate.
nSTP - Single Time Point fitness. Cell density at time STP, as estimated with approximating function. This is a model-free fitness estimate.
nr - Numerical estimate of intrinsic growth rate. Growth rate estimated by fitting smoothing function to log of data, calculating numerical slope estimate across range of data and selecting the maximum estimate (should occur during exponential phase).
nr_t - Time at which maximum slope of log observations occurs
maxslp - Numerical estimate of the maximum slope of the growth curve. The slope is estimated by fitting a smoothing function to the untransformed data, calculating a numerical slope estimate from the smoothed data and selecting the maximum (which should occur approximately half way through growth). This fitness measure is affected by both rate of growth and final colony size. Final colony size is expected to be strongly affected by competition between cultures.
maxslp_t - Time at which maximum slope of observations occurs
Client - Client for whom screen was carried out
ExptDate - A representative/approximate date for the experiment (note that genome-wide QFA screens typically take weeks to complete)
User - Person who actually carried out screen
PI - Principal investigator leading project that screen is part of
Condition - The most important defining characteristic of the screen, as specified by the user (e.g. the temperature at which the screen was carried out if it is part of a multi-temperature set of screens, the query mutation if it is part of a set of screens comparing query mutations, or the drugs present in the medium if it is part of a set of drug screens)
Inoc - Qualitative identifier of inoculation type (e.g. "DIL" for dilute inoculum, "CONC" for concentrated). Used to distinguish between experiments carried out with different methods of inoculation.
Gene - Identifier for genotype at a particular location on an agar plate. Typically prefer unambiguous, systematic gene names here.
TrtMed - Combination of treatment and medium identifiers, specifying the environment in which the cells have grown
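The model-free nAUC and nSTP columns listed above can be illustrated with a small sketch for the sparse-data case, where the approximating function is a linear interpolation between observations. The data, the AUCLim and STP values and the choice to hold the curve flat outside the observed range (rule = 2) are assumptions made for this example and may differ from what qfa.fit does internally.

```r
# Hypothetical observations for one culture (time in days, normalised density).
tobs <- c(0, 1, 2, 3, 4)
yobs <- c(0, 0.1, 0.2, 0.2, 0.2)

# Linear interpolant between observations; rule = 2 holds the end values
# constant outside the observed time range (an assumption for this sketch).
f <- approxfun(tobs, yobs, rule = 2)

# nAUC-style estimate: integrate the interpolant between time 0 and AUCLim.
# The trapezoid rule is exact for a piecewise-linear function when the grid
# contains every observation time inside the integration range.
AUCLim <- 4
grid <- sort(unique(c(0, tobs[tobs > 0 & tobs < AUCLim], AUCLim)))
ygrid <- f(grid)
nAUC <- sum(diff(grid) * (head(ygrid, -1) + tail(ygrid, -1)) / 2)

# nSTP-style estimate: value of the approximating function at time STP.
STP <- 2.5
nSTP <- f(STP)

nAUC
nSTP
```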