growthSS | R Documentation |
Output from this should be passed to fitGrowth to fit the specified model.
growthSS(
model,
form,
sigma = NULL,
df,
start = NULL,
pars = NULL,
type = "brms",
tau = 0.5,
hierarchy = NULL
)
model |
The name of a model as a character string.
Supported options are c("logistic", "gompertz", "weibull", "frechet", "gumbel", "monomolecular",
"exponential", "linear", "power law", "bragg", "lorentz", "beta",
"double logistic", "double gompertz", "gam", "int"), with "int" representing an intercept only model
which is only used in brms (and is expected to only be used in threshold models or to model
homoskedasticity). Note that the dose response curves (bragg, lorentz, and beta) may be difficult
to fit using the |
form |
A formula describing the model. The left hand side should only be
the outcome variable (phenotype), and a cutoff if you are making a survival model (see details).
The right hand side needs at least the x variable (which should be numeric and is
typically time). Grouping is also described in this formula using roughly lme4
style syntax,with formulas like |
sigma |
Other models for distributional parameters.
This argument is only used with "brms" and "nlme" models and is handled differently for each.
When type="brms" this can be supplied as a model or as a list of models.
It is turned into a formula (or list of formulas) with an entry corresponding to each distributional
parameter (after the mean) of the growth model family.
If no family was specified ( |
df |
A dataframe to use. Must contain all the variables listed in the formula. Note that rows with NA or infinite values in x, y, or hierarchical predictors are removed. |
start |
An optional named list of starting values OR means for prior distributions.
If this is not provided then starting values are picked with |
pars |
Optionally specify which parameters should change by group. Not this is model dependent and is not implemented for brms models due to their more flexible hypothesis testing. |
type |
Type of model to fit, options are "brms", "nlrq", "nlme", "nls", and "mgcv".
Note that the "mgcv" option only supports "gam" models.
Survival models can use the "survreg" model type
(this will be called if any non-brms/flexsurv type is given) or the "flexsurv" model type
which requires the flexsurv package to be installed.
Note that for non-brms models variables in the model will be labeled by the factor level of the
group, not necessarily by the group name.
This is done for ease of use with different modeling functions, the levels are alphabetically sorted
and can be checked using:
|
tau |
A vector of quantiles to fit for nlrq models. |
hierarchy |
Optionally a list of model parameters that should themselves by modeled by another predictor variable. This is only used with the brms backend. |
Default priors are not provided, but these can serve as starting points for each distribution.
You are encouraged to use growthSim
to consider what kind
of trendlines result from changes to your prior and for interpretation of each parameter.
The plotPrior function can be used to do prior predictive checks.
You should not looking back and forth at your data trying to match your
observed growth exactly with a prior distribution,
rather this should be informed by an understanding of the plants you
are using and expectations based on previous research.
For the "double" models the parameter interpretation is the same
as for their non-double counterparts except that there are A and A2, etc.
It is strongly recommended to familiarize yourself with the double sigmoid
distributions using growthSim before attempting to model one. Additionally,
those distributions are intended for use with long delays in an experiment,
think stress recovery experiments, not for minor hiccups in plant growth.
Logistic: list('A' = 130, 'B' = 12, 'C' = 3)
Gompertz: list('A' = 130, 'B' = 12, 'C' = 1.25)
Weibull: list('A' = 130, 'B' = 2, 'C' = 2)
Frechet: list('A' = 130, 'B' = 5, 'C' = 6)
Gumbel: list('A' = 130, 'B' = 6, 'C' = 4)
Double Logistic: list('A' = 130, 'B' = 12, 'C' = 3,
'A2' = 200, 'B2' = 25, 'C2' = 1)
Double Gompertz: list('A' = 130, 'B' = 12, 'C' = 0.25,
'A2' = 220, 'B2' = 30, 'C2' = 0.1)
Monomolecular: list('A' = 130, 'B' = 2)
Exponential: list('A' = 15, 'B' = 0.1)
Linear: list('A' = 1)
Power Law: list('A' = 13, 'B' = 2)
See details below about parameterization for each model option.
Logistic: 'A / (1 + exp( (B-x)/C) )' Where A is the asymptote, B is the inflection point, C is the growth rate.
Gompertz: 'A * exp(-B * exp(-C*x))' Where A is the asymptote, B is the inflection point, C is the growth rate.
Weibull: 'A * (1-exp(-(x/C)^B))' Where A is the asymptote, B is the weibull shape parameter, C is the weibull scale parameter.
Frechet: 'A * exp(-((x-0)/C)^(-B))' Where A is the asymptote, B is the frechet shape parameter, C is the frechet scale parameter. Note that the location parameter (conventionally m) is 0 in these models for simplicity but is still included in the formula.
Gumbel: 'A * exp(-exp(-(x-B)/C))' Where A is the asymptote, B is the inflection point (location), C is the growth rate (scale).
Double Logistic: 'A / (1+exp((B-x)/C)) + ((A2-A) /(1+exp((B2-x)/C2)))' Where A is the asymptote, B is the inflection point, C is the growth rate, A2 is the second asymptote, B2 is the second inflection point, and C2 is the second growth rate.
Double Gompertz: 'A * exp(-B * exp(-C*x)) + ((A2-A) * exp(-B2 * exp(-C2*(x-B))))' Where A is the asymptote, B is the inflection point, C is the growth rate, A2 is the second asymptote, B2 is the second inflection point, and C2 is the second growth rate.
Monomolecular: 'A-A * exp(-B * x)' Where A is the asymptote and B is the growth rate.
Exponential: 'A * exp(B * x)' Where A is the scale parameter and B is the growth rate.
Linear: 'A * x' Where A is the growth rate.
Power Law: 'A * x^(B)' Where A is the scale parameter and B is the growth rate.
Bragg: 'A * exp(-B * (x - C) ^ 2)' This models minima and maxima as a dose-response curve where A is the max response, B is the "precision" or slope at inflection, and C is the x position of the max response.
Lorentz: 'A / (1 + B * (x - C) ^ 2)' This models minima and maxima as a dose-response curve where A is the max response, B is the "precision" or slope at inflection, and C is the x position of the max response. Generally Bragg is preferred to Lorentz for dose-response curves.
Beta: 'A * (((x - D) / (C - D)) * ((E - x) / (E - C)) ^ ((E - C) / (C - D))) ^ B' This models minima and maxima as a dose-response curve where A is the Maximum Value, B is a shape/concavity exponent similar to the sum of alpha and beta in a Beta distribution, C is the position of maximum value, D is the minimum position where distribution > 0, E is the maximum position where distribution > 0. This is a difficult model to fit but can model non-symmetric dose-response relationships which may sometimes be worth the extra effort.
Note that for these distributions parameters do not exist in a vacuum.
Changing one will make the others look different in the resulting data.
The growthSim
function can be helpful in familiarizing further with these distributions.
Using the brms
backend the sigma
argument optionally specifies a sub model to account
for heteroskedasticity.
Any combination of models (except for decay models) can be specified in the sigma
term.
If you need variance to raise and lower then a gam/spline is the most appropriate option.
Using the brms
backend a model with lots of parameters may be difficult to estimate if there
are lots of groups.
If you have very many levels of your "group" variable in a complex model then consider fitting models
to subsets of the "group" variable and using combineDraws to make a data.frame for
hypothesis testing.
Limits on the Y variable can be specified in the brms
backend. This should generally be
unnecessary and will make the model slower to fit and potentially more difficult to set priors on.
If you do have a limited phenotype (besides the normal positive constraint for growth models)
then this may be helpful, one situation may be canopy coverage percentage which is naturally bounded
at an upper and lower limit.
To specify these limits add square brackets to the Y term with upper and lower limits such as
"y[0,100] ~ time|id/group"
. Other "Additional response information" such as resp_weights or
standard errors can be specified using the brms
backend, with those options documented fully
in the brms::brmsformula
details.
There are also three supported submodel options for nlme
models, but a varFunc
object
can also be supplied, see ?nlme::varClasses
.
none: varIdent(1|group)
, which models a constant variance separately for each
group.
power: varPower(x|group)
, which models variance as a power of x per group.
exp: varExp(x|group)
, which models variance as an exponent of x per group.
Survival models can be fit using the "survival" keyword in the model specification.
Using the "brms" backend (type argument) you can specify "weibull" (the default) or "binomial" for
the distribution to use in that model so that the final model string would be "survival binomial" or
"survival weibull" which is equivalent to "survival". Time to event data is very different than
standard phenotype data, so the formula argument should include a cutoff for the Y variable to count
as an "event". For example, if you were checking germination using area and wanted to use 50 pixels
as a germinated plant your formula would be area > 50 ~ time|id/group
.
Internally the input dataframe will be converted to time-to-event data based on that formula.
Alternatively you can make your own time to event data and supply that to growthSS. In that case your
data should have columns called "n_events"
(number of individuals experiencing the event at this time) and "n_eligible"
(number of individuals who had not experienced the event at least up to this time)
for the binomial model family OR "event" (binary 1,0 for TRUE, FALSE) for the Weibull model family.
Note that since these are linear models using different model families the priors are handled
differently. For survival models the default priors are weak regularizing priors (Normal(0,5))
on all parameters. If you wish to specify your own priors you can supply them as brmsprior objects
or as a list such as priors = list("group1" = c(0,3), "group2" = c(0,1))
where the order of
values is Mu, Sigma.
Any non-brms backend will instead use survival::survreg
to fit the model unless the
"flexsurv" type is specified.
Distributions will be passed to survreg
where options are "weibull", "exponential",
"gaussian", "logistic","lognormal" and "loglogistic" if type = "survreg" or to
flexsurv::flexsurvreg
if type = "flexsurv" where options are "gengamma", "gengamma.orig",
"genf", "genf.orig", "weibull", "gamma", "exp", "llogis", "lnorm", "gompertz", "exponential",
and "lognormal". In flexsurvreg
distributional modeling is supported and additional
formula can be passed as a list to the sigma argument of growthSS in the same way as to the anc
argument of flexsurv::flexsurvreg
.
Further additional arguments should be supplied via fitGrowth
if desired.
A named list of elements to make it easier to fit non linear growth models with several R packages.
For brms
models the output contains:
formula
: A brms::bf
formula specifying the growth model, autocorrelation,
variance submodel, and models for each variable in the growth model.
prior
: A brmsprior/data.frame object.
initfun
: A function to randomly initialize chains using a random draw from a gamma
distribution (confines initial values to positive and makes correct number
of initial values for chains and groups).
df
The data input, with dummy variables added if needed and a column to link groups to their
factor levels.
family
The model family, currently this will always be "student".
pcvrForm
The form argument unchanged. This is returned so that
it can be used later on in model visualization. Often it may be a good idea
to save the output of this function with the fit model, so having this can
be useful later on.
For quantreg::nlrq
models the output contains:
formula
: An nls
style formula specifying the growth model with groups if specified.
taus
: The quantiles to be fit
start
: The starting values, typically these will be generated from the growth model and your
data in a similar way as shown in stats::selfStart
models.
df
The input data for the model.
pcvrForm
The form argument unchanged.
For nls
models the output is the same as for quantreg::nlrq
models but without
taus
returned.
For nlme::nlme
models the output contains:
formula
: An list of nlme
style formulas specifying the model, fixed and random effects,
random effect grouping, and variance model (weights).
start
: The starting values, typically these will be generated from the growth model and your
data in a similar way as shown in stats::selfStart
models.
df
The input data for the model.
pcvrForm
The form argument unchanged.
For all models the type and model are also returned for simplicity downstream.
fitGrowth for fitting the model specified by this list and mvSS for the multi-value trait equivalent.
simdf <- growthSim("logistic",
n = 20, t = 25,
params = list("A" = c(200, 160), "B" = c(13, 11), "C" = c(3, 3.5))
)
ss <- growthSS(
model = "logistic", form = y ~ time | id / group,
sigma = "spline", df = simdf,
start = list("A" = 130, "B" = 12, "C" = 3), type = "brms"
)
lapply(ss, class)
ss$initfun()
# the next step would typically be compiling/fitting the model
# here we use very few chains and very few iterations for speed, but more of both is better.
fit_test <- fitGrowth(ss,
iter = 500, cores = 1, chains = 1, backend = "cmdstanr",
control = list(adapt_delta = 0.999, max_treedepth = 20)
)
# formulas and priors will look different if there is only one group in the data
ex <- growthSim("linear", n = 20, t = 25, params = list("A" = 2))
ex_ss <- growthSS(
model = "linear", form = y ~ time | id / group, sigma = "spline",
df = ex, start = list("A" = 1), type = "brms"
)
ex_ss$prior # no coef level grouping for priors
ex_ss$formula # intercept only model for A
ex2 <- growthSim("linear", n = 20, t = 25, params = list("A" = c(2, 2.5)))
ex2_ss <- growthSS(
model = "linear", form = y ~ time | id / group, sigma = "spline",
df = ex2, start = list("A" = 1), type = "brms"
)
ex2_ss$prior # has coef level grouping for priors
ex2_ss$formula # specifies an A intercept for each group and splines by group for sigma
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.