fit_growth: Fitting microbial growth

View source: R/top_fit.R

fit_growth    R Documentation

Description

[Stable]

This function provides a top-level interface for fitting growth models to data describing the variation of the population size through time, under either constant or dynamic environmental conditions. See below for details on the calculations.

Usage

fit_growth(
  fit_data,
  model_keys,
  start,
  known,
  environment = "constant",
  algorithm = "regression",
  approach = "single",
  env_conditions = NULL,
  niter = NULL,
  ...,
  check = TRUE,
  logbase_mu = logbase_logN,
  logbase_logN = 10,
  formula = logN ~ time
)

Arguments

fit_data

observed microbial growth. The format varies depending on the type of model fit. See the relevant sections (and examples) below for details.

model_keys

a named list assigning equations for the primary and secondary models. See the relevant sections (and examples) below for details.

start

a named numeric vector assigning initial guesses to the model parameters to estimate from the data. See relevant section (and examples) below for details.

known

named numeric vector of fixed model parameters, using the same conventions as for "start".

environment

type of environment. Either "constant" (default) or "dynamic". See below for details on the calculations for each case.

algorithm

either "regression" (default; Levenberg-Marquardt algorithm) or "MCMC" (Adaptive Monte Carlo algorithm).

approach

approach for model fitting. Either "single" (the model is fitted to a unique experiment) or "global" (the model is fitted to several dynamic experiments).

env_conditions

Tibble describing the variation of the environmental conditions for dynamic experiments. See the relevant sections (and examples) below for details. Ignored for environment="constant".

niter

number of iterations of the MCMC algorithm. Ignored when algorithm!="MCMC".

...

Additional arguments for modFit().

check

Whether to check the validity of the models. TRUE by default.

logbase_mu

Base of the logarithm the growth rate is referred to. By default, the same as logbase_logN. See vignette about units for details.

logbase_logN

Base of the logarithm for the population size. By default, 10 (i.e. log10). See vignette about units for details.

formula

An object of class "formula" defining the names of the x and y variables in the data. Defaults to logN ~ time.
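The logbase_mu and logbase_logN arguments matter because a growth rate can be expressed per unit of log10 or per unit of natural-log population size. The following is a minimal worked conversion that relies only on the change-of-base identity, not on biogrowth internals; the helper convert_mu() is hypothetical and is not part of the package.

```r
## Hypothetical helper (not a biogrowth function): re-express a growth rate
## given in log-base-`from` units per time in log-base-`to` units per time.
## Uses log_to(N) = log_from(N) * ln(from) / ln(to).
convert_mu <- function(mu, from, to) {
  mu * log(from) / log(to)
}

## A made-up rate of 0.2 log10 units per hour, expressed in ln units per hour
mu_log10 <- 0.2
mu_ln <- convert_mu(mu_log10, from = 10, to = exp(1))

## Converting back recovers the original value
mu_back <- convert_mu(mu_ln, from = exp(1), to = 10)
```

The package vignette about units remains the reference for how fit_growth() interprets each base.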

Value

If approach="single", an instance of GrowthFit. If approach="global", an instance of GlobalGrowthFit.

Please check the help pages of each class for additional information.

Fitting under constant conditions

When environment="constant", the function fits a primary growth model to the population size observed during an experiment. In this case, the data has to be a tibble (or data.frame) with two columns:

  • time: the elapsed time

  • logN: the logarithm of the observed population size

Nonetheless, the names of the columns can be modified with the formula argument.

The model equation is defined through the model_keys argument. It must include an entry named "primary" assigned to a model. Valid model keys can be retrieved calling primary_model_data().

The model is fitted by non-linear regression (using modFit()). This algorithm needs initial guesses for every model parameter. These are defined as a named numeric vector. The names must be valid parameter keys for the chosen model, which can be retrieved using primary_model_data() (see example below). Apart from that, any model parameter can be fixed using the "known" argument. This is a named numeric vector, with the same conventions as "start".
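As a minimal sketch of the formula argument mentioned above, assume a hypothetical data frame whose columns are called t and log_count instead of the default time and logN. The column names and values are made up for illustration, and the call to fit_growth() only runs when biogrowth is installed.

```r
## Hypothetical dataset whose columns do not use the default names
my_data <- data.frame(t = c(0, 25, 50, 75, 100),
                      log_count = c(2, 2.5, 7, 8, 8))

## The formula names the y variable (log count) and the x variable (time)
my_formula <- log_count ~ t

## Sanity check: every variable in the formula is a column of the data
stopifnot(all(all.vars(my_formula) %in% names(my_data)))

if (requireNamespace("biogrowth", quietly = TRUE)) {
  custom_fit <- biogrowth::fit_growth(
    my_data,
    list(primary = "Baranyi"),
    start = c(logNmax = 8, lambda = 25, logN0 = 2),
    known = c(mu = .2),
    environment = "constant",
    formula = my_formula
  )
  print(custom_fit)
}
```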

Fitting under dynamic conditions to a single experiment

When environment="dynamic" and approach="single", a dynamic growth model combining the Baranyi primary growth model with the gamma approach for the effect of the environmental conditions on the growth rate is fitted to an experiment gathered under dynamic conditions. In this case, the data is similar to fitting under constant conditions: a tibble (or data.frame) with two columns:

  • time: the elapsed time

  • logN: the logarithm of the observed population size

Note that these default names can be changed using the formula argument.

The values of the experimental conditions during the experiment are defined using the "env_conditions" argument. It is a tibble (or data.frame) with one column (named "time" by default) defining the elapsed time. Note that this default name can be modified using the formula argument of the function. The tibble needs to have as many additional columns as environmental conditions included in the model, providing the values of the environmental conditions.

The model equations are defined through the model_keys argument. It must be a named list where the names match the column names of "env_conditions" and the values are model keys. These can be retrieved using secondary_model_data().
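The relationship between "env_conditions" and model_keys can be sketched in base R as follows. The profile values are made up for illustration; the factor names (temperature, aw) and the "CPM" key follow the examples at the end of this page.

```r
## Hypothetical temperature and water activity profiles (made-up values)
env_conditions <- data.frame(
  time = c(0, 5, 10, 15),
  temperature = c(20, 25, 30, 35),
  aw = c(0.99, 0.98, 0.97, 0.96)
)

## One secondary model per environmental factor.
## The list names must match the column names of env_conditions.
sec_models <- list(temperature = "CPM", aw = "CPM")

## Every column other than the time column is an environmental factor
factor_cols <- setdiff(names(env_conditions), "time")

## Check that each factor has a model and each model has a column
stopifnot(setequal(factor_cols, names(sec_models)))
```

Running a check like this before calling fit_growth() can catch a misspelled column name early.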

The model can be fitted using regression (modFit()) or an adaptive Monte Carlo algorithm (modMCMC()). Both algorithms require initial guesses for every model parameter to fit. These are defined through the named numeric vector "start". Each parameter must be named as factor+"_"+parameter, where factor is the name of the environmental factor defined in "model_keys". The parameter is a valid key that can be retrieved from secondary_model_data(). For instance, parameter Xmin for the factor temperature would be defined as "temperature_xmin".
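The naming convention for secondary model parameters can be illustrated with a short base R sketch. It assumes the CPM parameter keys are xmin, xopt, xmax and n, as used in the examples on this page; in practice they should be retrieved with secondary_model_data("CPM")$pars. Primary model parameters (such as Nmax, N0, Q0 and mu_opt in the examples) are left out of this check.

```r
sec_models <- list(temperature = "CPM", aw = "CPM")

## Assumed CPM parameter keys (taken from the examples on this page)
cpm_pars <- c("xmin", "xopt", "xmax", "n")

## Build every factor_parameter combination, e.g. "temperature_xmin"
sec_par_names <- as.vector(
  outer(names(sec_models), cpm_pars, paste, sep = "_")
)

## Split the secondary parameters between initial guesses and fixed values
start <- c(temperature_xmin = 25, temperature_xopt = 35,
           temperature_xmax = 40, aw_xopt = .95)
known <- c(temperature_n = 1, aw_xmax = 1, aw_xmin = .9, aw_n = 1)

## Each secondary parameter should appear exactly once, in start or in known
assigned <- c(names(start), names(known))
stopifnot(setequal(assigned, sec_par_names),
          anyDuplicated(assigned) == 0)
```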

Note that the argument ... allows passing additional arguments to the fitting functions.

Fitting under dynamic conditions to multiple experiments (global fitting)

When environment="dynamic" and approach="global", fit_growth tries to find the vector of model parameters that best describes the observations of several growth experiments.

The input requirements are very similar to the case when approach="single". The models (equations, initial guesses, known parameters, algorithms...) are identical. The only difference is that "fit_data" must be a list, where each element describes the results of an experiment (using the same conventions as when approach="single"). In a similar fashion, "env_conditions" must be a list describing the values of the environmental factors during each experiment. Although it is not mandatory, it is recommended that the elements of both lists are named. Otherwise, the function assigns automatically-generated names, and matches them by order.
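A minimal base R sketch of the two lists, using made-up values and hypothetical experiment names (exp1, exp2), could look like this. Naming both lists with the same keys makes the pairing between counts and conditions explicit rather than dependent on order.

```r
## Observed counts for two hypothetical experiments (made-up values)
fit_data <- list(
  exp1 = data.frame(time = c(0, 10, 20), logN = c(2, 3.5, 5)),
  exp2 = data.frame(time = c(0, 10, 20), logN = c(2, 4, 6))
)

## Environmental profiles, deliberately listed in a different order
env_conditions <- list(
  exp2 = data.frame(time = c(0, 20), temperature = c(30, 35), pH = c(7, 6)),
  exp1 = data.frame(time = c(0, 20), temperature = c(25, 30), pH = c(6.5, 6.5))
)

## Both lists must describe the same set of experiments
stopifnot(setequal(names(fit_data), names(env_conditions)))

## Optionally reorder the conditions so both lists share the same order
env_conditions <- env_conditions[names(fit_data)]
```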

Examples


## Example 1 - Fitting a primary model --------------------------------------

## A dummy dataset describing the variation of the population size 

my_data <- data.frame(time = c(0, 25, 50, 75, 100), 
                      logN = c(2, 2.5, 7, 8, 8))
                      
## A list of model keys can be gathered from 

primary_model_data()
                      
## The primary model is defined as a list

models <- list(primary = "Baranyi")

## The keys of the model parameters can also be gathered from primary_model_data

primary_model_data("Baranyi")$pars

## Any model parameter can be fixed

known <- c(mu = .2)

## The remaining parameters need initial guesses 

start <- c(logNmax = 8, lambda = 25, logN0 = 2)

primary_fit <- fit_growth(my_data, models, start, known,
                          environment = "constant")
                          
## The instance of GrowthFit includes several useful methods

print(primary_fit)
plot(primary_fit)
coef(primary_fit)
summary(primary_fit)

## time_to_size can be used to calculate the time for some concentration

time_to_size(primary_fit, 4)

## Example 2 - Fitting under dynamic conditions------------------------------

## We will use the example data included in the package

data("example_dynamic_growth")

## And the example environmental conditions (temperature & aw)

data("example_env_conditions")

## Valid keys for secondary models can be retrieved from

secondary_model_data()

## We need to assign a model equation (secondary model) to each environmental factor

sec_models <- list(temperature = "CPM", aw = "CPM")

## The keys of the model parameters can be gathered from the same function

secondary_model_data("CPM")$pars

## Any model parameter (of the primary or secondary models) can be fixed

known_pars <- list(Nmax = 1e4,  # Primary model
                   N0 = 1e0, Q0 = 1e-3,  # Initial values of the primary model
                   mu_opt = 4, # mu_opt of the gamma model
                   temperature_n = 1,  # Secondary model for temperature
                   aw_xmax = 1, aw_xmin = .9, aw_n = 1  # Secondary model for water activity
                   )
                   
## The remaining parameters need initial guesses (as required by the regression)

my_start <- list(temperature_xmin = 25, temperature_xopt = 35,
                 temperature_xmax = 40, aw_xopt = .95)
                 
## We can now fit the model


dynamic_fit <- fit_growth(example_dynamic_growth, 
                          sec_models, 
                          my_start, known_pars,
                          environment = "dynamic",
                          env_conditions = example_env_conditions
                          ) 
                          
## The instance of GrowthFit has several S3 methods

plot(dynamic_fit, add_factor = "temperature")
summary(dynamic_fit)

## We can use time_to_size to calculate the time required to reach a given size

time_to_size(dynamic_fit, 3)



## Example 3 - Fitting under dynamic conditions using MCMC ------------------

## We can reuse most of the arguments from the previous example
## We just need to define the algorithm and the number of iterations


set.seed(12421)
MCMC_fit <- fit_growth(example_dynamic_growth, 
                       sec_models, 
                       my_start, known_pars,
                       environment = "dynamic",
                       env_conditions = example_env_conditions,
                       algorithm = "MCMC",
                       niter = 1000
                       ) 
                       
## The instance of GrowthFit has several S3 methods

plot(MCMC_fit, add_factor = "aw")
summary(MCMC_fit)

## We can use time_to_size to calculate the time required to reach a given size

time_to_size(MCMC_fit, 3)

## It can also make growth predictions including uncertainty

uncertain_growth <- predictMCMC(MCMC_fit, 
                                seq(0, 10, length = 1000),  
                                example_env_conditions, 
                                niter = 1000)

## The instance of MCMCgrowth includes several nice S3 methods

plot(uncertain_growth)
print(uncertain_growth)

## time_to_size can calculate the time to reach some count

time_to_size(uncertain_growth, 2)
time_to_size(uncertain_growth, 2, type = "distribution")



## Example 4 - Fitting a unique model to several dynamic experiments --------

## We will use the data included in the package

data("multiple_counts")
data("multiple_conditions")

## We need to assign a model equation for each environmental factor

sec_models <- list(temperature = "CPM", pH = "CPM")

## Any model parameter (of the primary or secondary models) can be fixed

known_pars <- list(Nmax = 1e8, N0 = 1e0, Q0 = 1e-3,
                   temperature_n = 2, temperature_xmin = 20, 
                   temperature_xmax = 35,
                   pH_n = 2, pH_xmin = 5.5, pH_xmax = 7.5, pH_xopt = 6.5)
                   
## The remaining parameters need initial guesses

my_start <- list(mu_opt = .8, temperature_xopt = 30)

## We can now fit the model


global_fit <- fit_growth(multiple_counts, 
                         sec_models, 
                         my_start, 
                         known_pars,
                         environment = "dynamic",
                         algorithm = "regression",
                         approach = "global",
                         env_conditions = multiple_conditions
                         ) 
                         
## The instance of GlobalGrowthFit has nice S3 methods

plot(global_fit)
summary(global_fit)
print(global_fit)

## We can use time_to_size to calculate the time to reach a given size

time_to_size(global_fit, 4.5)



## Example 5 - MCMC fitting a unique model to several dynamic experiments ---

## Again, we can re-use all the arguments from the previous example
## We just need to define the right algorithm and the number of iterations
## On top of that, we will also pass upper and lower bounds to modMCMC


set.seed(12421)
global_MCMC <- fit_growth(multiple_counts, 
                         sec_models, 
                         my_start, 
                         known_pars,
                         environment = "dynamic",
                         algorithm = "MCMC",
                         approach = "global",
                         env_conditions = multiple_conditions,
                         niter = 1000,
                         lower = c(.2, 29),  # lower limits of the model parameters
                         upper = c(.8, 34)  # upper limits of the model parameters
                         ) 
                         
## The instance of GlobalGrowthFit has nice S3 methods

plot(global_MCMC)
summary(global_MCMC)
print(global_MCMC)

## We can use time_to_size to calculate the time to reach a given size

time_to_size(global_MCMC, 3)

## It can also be used to make model predictions with parameter uncertainty

uncertain_prediction <- predictMCMC(global_MCMC,
                                    seq(0, 50, length = 1000), 
                                    multiple_conditions[[1]], 
                                    niter = 100
                                    )

## The instance of MCMCgrowth includes several nice S3 methods

plot(uncertain_prediction)
print(uncertain_prediction)

## time_to_size can calculate the time to reach some count

time_to_size(uncertain_prediction, 2)
time_to_size(uncertain_prediction, 2, type = "distribution")




biogrowth documentation built on Aug. 19, 2023, 1:06 a.m.