modelsearch: Develop Best-fit Vital Rate Estimation Models for MPM...
In lefko3: Historical and Ahistorical Population Projection Matrix Analysis

modelsearch

R Documentation

Develop Best-fit Vital Rate Estimation Models for MPM Development

Description

Function modelsearch() runs exhaustive model building and selection for each vital rate needed to estimate a function-based MPM or IPM. It returns best-fit models for each vital rate, model table showing all models tested, and model quality control data. The final output can be used as input in other functions within this package.

Usage

modelsearch(
  data,
  stageframe = NULL,
  historical = TRUE,
  approach = "mixed",
  suite = "size",
  interactions = FALSE,
  bestfit = "AICc&k",
  vitalrates = c("surv", "size", "fec"),
  surv = c("alive3", "alive2", "alive1"),
  obs = c("obsstatus3", "obsstatus2", "obsstatus1"),
  size = c("sizea3", "sizea2", "sizea1"),
  sizeb = c(NA, NA, NA),
  sizec = c(NA, NA, NA),
  repst = c("repstatus3", "repstatus2", "repstatus1"),
  fec = c("feca3", "feca2", "feca1"),
  stage = c("stage3", "stage2", "stage1"),
  matstat = c("matstatus3", "matstatus2", "matstatus1"),
  indiv = "individ",
  patch = NA,
  year = "year2",
  density = NA,
  test.density = FALSE,
  sizedist = "gaussian",
  sizebdist = NA,
  sizecdist = NA,
  fecdist = "gaussian",
  size.zero = FALSE,
  sizeb.zero = FALSE,
  sizec.zero = FALSE,
  size.trunc = FALSE,
  sizeb.trunc = FALSE,
  sizec.trunc = FALSE,
  fec.zero = FALSE,
  fec.trunc = FALSE,
  patch.as.random = TRUE,
  year.as.random = TRUE,
  juvestimate = NA,
  juvsize = FALSE,
  jsize.zero = FALSE,
  jsizeb.zero = FALSE,
  jsizec.zero = FALSE,
  jsize.trunc = FALSE,
  jsizeb.trunc = FALSE,
  jsizec.trunc = FALSE,
  fectime = 2,
  censor = NA,
  age = NA,
  test.age = FALSE,
  indcova = NA,
  indcovb = NA,
  indcovc = NA,
  random.indcova = FALSE,
  random.indcovb = FALSE,
  random.indcovc = FALSE,
  test.indcova = FALSE,
  test.indcovb = FALSE,
  test.indcovc = FALSE,
  annucova = NA,
  annucovb = NA,
  annucovc = NA,
  test.annucova = FALSE,
  test.annucovb = FALSE,
  test.annucovc = FALSE,
  test.group = FALSE,
  show.model.tables = TRUE,
  global.only = FALSE,
  accuracy = TRUE,
  data_out = FALSE,
  quiet = FALSE
)

Arguments

`data`	The vertical dataset to be used for analysis. This dataset should be of class `hfvdata`, but can also be a data frame formatted similarly to the output format provided by functions `verticalize3()` or `historicalize3()`, as long as all needed variables are properly designated.
`stageframe`	The stageframe characterizing the life history model used. Optional unless `test.group = TRUE`, in which case it is required. Defaults to `NULL`.
`historical`	A logical variable denoting whether to assess the effects of state in occasion t-1, in addition to state in occasion t. Defaults to `TRUE`.
`approach`	The statistical approach to be taken for model building. The default is `"mixed"`, which uses the mixed model approach utilized in packages `lme4` and `glmmTMB`. Other options include `"glm"`, which uses generalized linear modeling assuming that all factors are fixed.
`suite`	Either a single string value or a vector of 14 strings for each vital rate model. Describes the global model for each vital rate estimation, and has the following possible values: `full`, includes main effects and all two-way interactions of size and reproductive status; `main`, includes main effects only of size and reproductive status; `size`, includes only size (also interactions between size in historical model); `rep`, includes only reproductive status (also interactions between status in historical model); `age`, all vital rates estimated with age and y-intercepts only; `cons`, all vital rates estimated only as y-intercepts. If `approach = "glm"` and `year.as.random = FALSE`, then year is also included as a fixed effect, and, if `interactions = TRUE`, is included in two-way interactions. Order of models in the string vector if more than 1 value is used is: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `size`.
`interactions`	A variable denoting whether to include two-way interactions between all fixed factors in the global model. Defaults to `FALSE`.
`bestfit`	A variable indicating the model selection criterion for the choice of best-fit model. The default is `AICc&k`, which chooses the best-fit model as the model with the lowest AICc or, if not the same model, then the model that has the lowest degrees of freedom among models with `\Delta AICc <= 2.0`. Alternatively, `AICc` may be chosen, in which case the best-fit model is simply the model with the lowest AICc value.
`vitalrates`	A vector describing which vital rates will be estimated via linear modeling, with the following options: `surv`, survival probability; `obs`, observation probability; `size`, overall size; `repst`, probability of reproducing; and `fec`, amount of reproduction (overall fecundity). May also be set to `vitalrates = "leslie"`, which is equivalent to setting `c("surv", "fec")` for a Leslie MPM. This choice also determines how internal data subsetting for vital rate model estimation will work. Defaults to `c("surv", "size", "fec")`.
`surv`	A vector indicating the variable names coding for status as alive or dead in occasions t+1, t, and t-1, respectively. Defaults to `c("alive3", "alive2", "alive1")`.
`obs`	A vector indicating the variable names coding for observation status in occasions t+1, t, and t-1, respectively. Defaults to `c("obsstatus3", "obsstatus2", "obsstatus1")`.
`size`	A vector indicating the variable names coding for the primary size variable on occasions t+1, t, and t-1, respectively. Defaults to `c("sizea3", "sizea2", "sizea1")`.
`sizeb`	A vector indicating the variable names coding for the secondary size variable on occasions t+1, t, and t-1, respectively. Defaults to `c(NA, NA, NA)`, in which case `sizeb` is not used.
`sizec`	A vector indicating the variable names coding for the tertiary size variable on occasions t+1, t, and t-1, respectively. Defaults to `c(NA, NA, NA)`, in which case `sizec` is not used.
`repst`	A vector indicating the variable names coding for reproductive status in occasions t+1, t, and t-1, respectively. Defaults to `c("repstatus3", "repstatus2", "repstatus1")`.
`fec`	A vector indicating the variable names coding for fecundity in occasions t+1, t, and t-1, respectively. Defaults to `c("feca3", "feca2", "feca1")`.
`stage`	A vector indicating the variable names coding for stage in occasions t+1, t, and t-1. Defaults to `c("stage3", "stage2", "stage1")`.
`matstat`	A vector indicating the variable names coding for maturity status in occasions t+1, t, and t-1. Defaults to `c("matstatus3", "matstatus2", "matstatus1")`.
`indiv`	A text value indicating the variable name coding individual identity. Defaults to `"individ"`.
`patch`	A text value indicating the variable name coding for patch, where patches are defined as permanent subgroups within the study population. Defaults to `NA`.
`year`	A text value indicating the variable coding for observation occasion t. Defaults to `year2`.
`density`	A text value indicating the name of the variable coding for spatial density, should the user wish to test spatial density as a fixed factor affecting vital rates. Defaults to `NA`.
`test.density`	Either a logical value indicating whether to include `density` as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`sizedist`	The probability distribution used to model primary size. Options include `"gaussian"` for the Normal distribution (default), `"poisson"` for the Poisson distribution, `"negbin"` for the negative binomial distribution (quadratic parameterization), and `"gamma"` for the Gamma distribution.
`sizebdist`	The probability distribution used to model secondary size. Options include `"gaussian"` for the Normal distribution, `"poisson"` for the Poisson distribution, `"negbin"` for the negative binomial distribution (quadratic parameterization), and `"gamma"` for the Gamma distribution. Defaults to `NA`.
`sizecdist`	The probability distribution used to model tertiary size. Options include `"gaussian"` for the Normal distribution, `"poisson"` for the Poisson distribution, `"negbin"` for the negative binomial distribution (quadratic parameterization), and `"gamma"` for the Gamma distribution. Defaults to `NA`.
`fecdist`	The probability distribution used to model fecundity. Options include `"gaussian"` for the Normal distribution (default), `"poisson"` for the Poisson distribution, `"negbin"` for the negative binomial distribution (quadratic parameterization), and `"gamma"` for the Gamma distribution.
`size.zero`	A logical variable indicating whether the primary size distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`sizeb.zero`	A logical variable indicating whether the secondary size distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`sizec.zero`	A logical variable indicating whether the tertiary size distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`size.trunc`	A logical variable indicating whether the primary size distribution should be zero-truncated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`. Cannot be `TRUE` if `size.zero = TRUE`.
`sizeb.trunc`	A logical variable indicating whether the secondary size distribution should be zero-truncated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`. Cannot be `TRUE` if `sizeb.zero = TRUE`.
`sizec.trunc`	A logical variable indicating whether the tertiary size distribution should be zero-truncated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`. Cannot be `TRUE` if `sizec.zero = TRUE`.
`fec.zero`	A logical variable indicating whether the fecundity distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`fec.trunc`	A logical variable indicating whether the fecundity distribution should be zero-truncated. Only applies to the Poisson and negative binomial distributions. Defaults to `FALSE`. Cannot be `TRUE` if `fec.zero = TRUE`.
`patch.as.random`	If set to `TRUE` and `approach = "mixed"`, then `patch` is included as a random factor. If set to `FALSE` and `approach = "glm"`, then `patch` is included as a fixed factor. All other combinations of logical value and `approach` lead to `patch` not being included in modeling. Defaults to `TRUE`.
`year.as.random`	If set to `TRUE` and `approach = "mixed"`, then `year` is included as a random factor. If set to `FALSE`, then `year` is included as a fixed factor. All other combinations of logical value and `approach` lead to `year` not being included in modeling. Defaults to `TRUE`.
`juvestimate`	An optional variable denoting the stage name of the juvenile stage in the vertical dataset. If not `NA`, and `stage` is also given (see below), then vital rates listed in `vitalrates` other than `fec` will also be estimated from the juvenile stage to all adult stages. Defaults to `NA`, in which case juvenile vital rates are not estimated.
`juvsize`	A logical variable denoting whether size should be used as a term in models involving transition from the juvenile stage. Defaults to `FALSE`, and is only used if `juvestimate` does not equal `NA`.
`jsize.zero`	A logical variable indicating whether the primary size distribution of juveniles should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`jsizeb.zero`	A logical variable indicating whether the secondary size distribution of juveniles should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`jsizec.zero`	A logical variable indicating whether the tertiary size distribution of juveniles should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to `FALSE`.
`jsize.trunc`	A logical variable indicating whether the primary size distribution in juveniles should be zero-truncated. Defaults to `FALSE`. Cannot be `TRUE` if `jsize.zero = TRUE`.
`jsizeb.trunc`	A logical variable indicating whether the secondary size distribution in juveniles should be zero-truncated. Defaults to `FALSE`. Cannot be `TRUE` if `jsizeb.zero = TRUE`.
`jsizec.trunc`	A logical variable indicating whether the tertiary size distribution in juveniles should be zero-truncated. Defaults to `FALSE`. Cannot be `TRUE` if `jsizec.zero = TRUE`.
`fectime`	A variable indicating which year of fecundity to use as the response term in fecundity models. Options include `2`, which refers to occasion t, and `3`, which refers to occasion t+1. Defaults to `2`.
`censor`	A vector denoting the names of censoring variables in the dataset, in order from occasion t+1, followed by occasion t, and lastly followed by occasion t-1. Defaults to `NA`.
`age`	Designates the name of the variable corresponding to age in time t in the vertical dataset. Defaults to `NA`, in which case age is not included in linear models. Should only be used if building Leslie or age x stage matrices.
`test.age`	Either a logical value indicating whether to include `age` as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`indcova`	Vector designating the names in occasions t+1, t, and t-1 of an individual covariate. Defaults to `NA`.
`indcovb`	Vector designating the names in occasions t+1, t, and t-1 of a second individual covariate. Defaults to `NA`.
`indcovc`	Vector designating the names in occasions t+1, t, and t-1 of a third individual covariate. Defaults to `NA`.
`random.indcova`	A logical value indicating whether `indcova` should be treated as a random categorical factor, rather than as a fixed factor. Defaults to `FALSE`.
`random.indcovb`	A logical value indicating whether `indcovb` should be treated as a random categorical factor, rather than as a fixed factor. Defaults to `FALSE`.
`random.indcovc`	A logical value indicating whether `indcovc` should be treated as a random categorical factor, rather than as a fixed factor. Defaults to `FALSE`.
`test.indcova`	Either a logical value indicating whether to include the `indcova` variable as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`test.indcovb`	Either a logical value indicating whether to include the `indcovb` variable as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`test.indcovc`	Either a logical value indicating whether to include the `indcovc` variable as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`annucova`	A numeric vector of annual covariates to test within all vital rate models. If `historical = TRUE`, then the number of elements must be equal to the number of values of `year2` in the dataset plus one, and the vector should start with the value to be used in time t-1 in the first year of the dataset. In all other cases, the length of the vector must equal the number of values of `year2` in the dataset. Defaults to `NA`.
`annucovb`	A second numeric vector of annual covariates to test within all vital rate models. If `historical = TRUE`, then the number of elements must be equal to the number of values of `year2` in the dataset plus one, and the vector should start with the value to be used in time t-1 in the first year of the dataset. In all other cases, the length of the vector must equal the number of values of `year2` in the dataset. Defaults to `NA`.
`annucovc`	A third numeric vector of annual covariates to test within all vital rate models. If `historical = TRUE`, then the number of elements must be equal to the number of values of `year2` in the dataset plus one, and the vector should start with the value to be used in time t-1 in the first year of the dataset. In all other cases, the length of the vector must equal the number of values of `year2` in the dataset. Defaults to `NA`.
`test.annucova`	A logical value indicating whether to test the variable given as `annucova` as a fixed factor in analysis. Defaults to `FALSE`.
`test.annucovb`	A logical value indicating whether to test the variable given as `annucovb` as a fixed factor in analysis. Defaults to `FALSE`.
`test.annucovc`	A logical value indicating whether to test the variable given as `annucovc` as a fixed factor in analysis. Defaults to `FALSE`.
`test.group`	Either a logical value indicating whether to include the `group` variable from the input `stageframe` as a fixed categorical variable in linear models, or a logical vector of such values for 14 models, in order: 1) survival, 2) observation, 3) primary size, 4) secondary size, 5) tertiary size, 6) reproductive status, 7) fecundity, 8) juvenile survival, 9) juvenile observation, 10) juvenile primary size, 11) juvenile secondary size, 12) juvenile tertiary size, 13) juvenile reproductive status, and 14) juvenile maturity status. Defaults to `FALSE`.
`show.model.tables`	If set to TRUE, then includes full modeling tables in the output. Defaults to `TRUE`.
`global.only`	If set to TRUE, then only global models will be built and evaluated. Defaults to `FALSE`.
`accuracy`	A logical value indicating whether to test accuracy of models. See `Notes` section for details on how accuracy is assessed. Defaults to `TRUE`.
`data_out`	A logical value indicating whether to append all subsetted datasets used in model building and selection to the output. Defaults to `FALSE`.
`quiet`	May be a logical value, or any one of the strings `"yes"`, `"no"`, or `"partial"`. If set to `TRUE` or `"yes"`, then model building and selection will proceed with most warnings and diagnostic messages silenced. If set to `FALSE` or `"no"`, then all warnings and diagnostic messages will be displayed. If set to `"partial"`, then only messages related to transitions between different vital rate models will be displayed. Defaults to `FALSE`.

Value

This function yields an object of class lefkoMod, which is a list in which the first 14 elements are the best-fit models for survival, observation status, primary size, secondary size, tertiary size, reproductive status, fecundity, juvenile survival, juvenile observation, juvenile primary size, juvenile secondary size, juvenile tertiary size, juvenile transition to reproduction, and juvenile transition to maturity, respectively. This is followed by 14 elements corresponding to the model tables for each of these vital rates, in order, followed by a data frame showing the order and names of variables used in modeling, followed by a single character element denoting the criterion used for model selection, and ending on a data frame with quality control data:

`survival_model`	Best-fit model of the binomial probability of survival from occasion t to occasion t+1. Defaults to `1`.
`observation_model`	Best-fit model of the binomial probability of observation in occasion t+1 given survival to that occasion. Defaults to `1`.
`size_model`	Best-fit model of the primary size metric on occasion t+1 given survival to and observation in that occasion. Defaults to `1`.
`sizeb_model`	Best-fit model of the secondary size metric on occasion t+1 given survival to and observation in that occasion. Defaults to `1`.
`sizec_model`	Best-fit model of the tertiary size metric on occasion t+1 given survival to and observation in that occasion. Defaults to `1`.
`repstatus_model`	Best-fit model of the binomial probability of reproduction in occasion t+1, given survival to and observation in that occasion. Defaults to `1`.
`fecundity_model`	Best-fit model of fecundity in occasion t+1 given survival to, and observation and reproduction in that occasion. Defaults to `1`.
`juv_survival_model`	Best-fit model of the binomial probability of survival from occasion t to occasion t+1 of an immature individual. Defaults to `1`.
`juv_observation_model`	Best-fit model of the binomial probability of observation in occasion t+1 given survival to that occasion of an immature individual. Defaults to `1`.
`juv_size_model`	Best-fit model of the primary size metric on occasion t+1 given survival to and observation in that occasion of an immature individual. Defaults to `1`.
`juv_sizeb_model`	Best-fit model of the secondary size metric on occasion t+1 given survival to and observation in that occasion of an immature individual. Defaults to `1`.
`juv_sizec_model`	Best-fit model of the tertiary size metric on occasion t+1 given survival to and observation in that occasion of an immature individual. Defaults to `1`.
`juv_reproduction_model`	Best-fit model of the binomial probability of reproduction in occasion t+1, given survival to and observation in that occasion of an individual that was immature in occasion t. This model is technically not a model of reproduction probability for individuals that are immature, rather reproduction probability here is given for individuals that are mature in occasion t+1 but immature in occasion t. Defaults to `1`.
`juv_maturity_model`	Best-fit model of the binomial probability of becoming mature in occasion t+1, given survival to that occasion of an individual that was immature in occasion t. Defaults to `1`.
`survival_table`	Full dredge model table of survival probability.
`observation_table`	Full dredge model table of observation probability.
`size_table`	Full dredge model table of the primary size variable.
`sizeb_table`	Full dredge model table of the secondary size variable.
`sizec_table`	Full dredge model table of the tertiary size variable.
`repstatus_table`	Full dredge model table of reproduction probability.
`fecundity_table`	Full dredge model table of fecundity.
`juv_survival_table`	Full dredge model table of immature survival probability.
`juv_observation_table`	Full dredge model table of immature observation probability.
`juv_size_table`	Full dredge model table of primary size in immature individuals.
`juv_sizeb_table`	Full dredge model table of secondary size in immature individuals.
`juv_sizec_table`	Full dredge model table of tertiary size in immature individuals.
`juv_reproduction_table`	Full dredge model table of immature reproduction probability.
`juv_maturity_table`	Full dredge model table of the probability of an immature individual transitioning to maturity.
`paramnames`	A data frame showing the names of variables from the input data frame used in modeling, their associated standardized names in linear models, and a brief comment describing each variable.
`criterion`	Character variable denoting the criterion used to determine the best-fit model.
`qc`	Data frame with five variables: 1) Name of vital rate, 2) number of individuals used to model that vital rate, 3) number of individual transitions used to model that vital rate, 4) parameter distribution used to model the vital rats, and 5) accuracy of model, given as detailed in Notes section.
`subdata`	An optional list of data frames, each of which is the data frame used to develop each model in the `lefkoMod` object, in order.

Notes

When modelsearch() is called, it first trims the dataset down to just the variables that will be used (including all response terms and independent variables). It then subsets the data to only complete cases for those variables. Next, it builds global models for all vital rates, and runs them. If a global model fails, then the function proceeds by dropping terms. If approach = "mixed", then it will determine which random factor term contains the most categories in the respective subset, and drop that term. If this fails, or if approach = "glm", then it will drop any two-way interactions and run the model. If this fails, then the function will attempt to drop further terms, first patch alone, then year alone, then individual covariates by themselves, then combinations of these four, and finally individual identity. If all of these attempts fail and the approach used is mixed, then the function will try running a glm version of the original failed model. Finally, if all attempts fail, then the function displays a warning and returns 1 to allow model building assuming a constant rate or probability.

Including annual covariates is easy via the arguments annucova, annucovb, and annucovc together with test.annucova, test.annucovb, and test.annucovc. Rather than incorporate a nnual covariates into the dataset, the values corresponding to each year may be concatenated into a numeric vector, and then used in one of these three arguments. Function modelsearch() will then append the value associated with each year into the dataset, and proceed with model building. Two-way interactions can be explored with other main effects fixed variables by setting interactions = TRUE.

Setting suite = "cons" prevents the inclusion of size and reproductive status as fixed, independent factors in modeling. However, it does not prevent any other terms from being included. Density, age, individual covariates, individual identity, patch, and year may all be included.

The mechanics governing model building are fairly robust to errors and exceptions. The function attempts to build global models, and simplifies models automatically should model building fail. Model building proceeds through the functions lm() (GLM with Gaussian response), glm() (GLM with Poisson, Gamma, or binomial response), glm.nb() (GLM with negative binomial response), zeroinfl() (GLM with zero-inflated Poisson or negative binomial response), vglm() (GLM with zero-truncated Poisson or negative binomial response), lmer() (mixed model with Gaussian response), glmer() (mixed model with binomial, Poisson, or Gamma response), and glmmTMB() (mixed model with negative binomial, or zero-truncated or zero-inflated Poisson or negative binomial response). See documentation related to these functions for further information. Any response term that is invariable in the dataset will lead to a best-fit model for that response represented by a single constant value.

Exhaustive model building and selection proceeds via the dredge() function in package MuMIn. This function is verbose, so that any errors and warnings developed during model building, model analysis, and model selection can be found and dealt with. Interpretations of errors during global model analysis may be found in documentation for the functions and packages mentioned. Package MuMIn is used for model dredging (see dredge()), and errors and warnings during dredging can be interpreted using the documentation for that package. Errors occurring during dredging lead to the adoption of the global model as the best-fit, and the user should view all logged errors and warnings to determine the best way to proceed. The quiet = TRUE and quiet = "partial" options can be used to silence dredge warnings, but users should note that automated model selection can be viewed as a black box, and so care should be taken to ensure that the models run make biological sense, and that model quality is prioritized.

Exhaustive model selection through dredging works best with larger datasets and fewer tested parameters. Setting suite = "full" and interactions = TRUE may initiate a dredge that takes a dramatically long time, particularly if the model is historical, individual or annual covariates are used, or a zero-inflated distribution is assumed. In such cases, the number of models built and tested will run at least in the millions. Small datasets will also increase the error associated with these tests, leading to adoption of simpler models overall. Note also that zero-inflated models are processed as two models, and so include twice the assumed number of parameters. If suite = "full", then this function will switch to a main effects global model for the zero-inflated parameter models if the total number of parameters to test rises above the limits imposed by the dredge() function in package MuMIn.

Accuracy of vital rate models is calculated differently depending on vital rate and assumed distribution. For all vital rates assuming a binomial distribution, including survival, observation status, reproductive status, and juvenile version of these, accuracy is calculated as the percent of predicted responses equal to actual responses. In all other models, accuracy is actually assessed as a simple R-squared in which the observed response values per data subset are compared to the predicted response values according to each best-fit model. Note that some situations in which factor variables are used may result in failure to assess accuracy. In these cases, function modelsearch() simply yields NA values.

Care must be taken to build models that test the impacts of state in occasion t-1 for historical models, and that do not test these impacts for ahistorical models. Ahistorical matrix modeling particularly will yield biased transition estimates if historical terms from models are ignored. This can be dealt with at the start of modeling by setting historical = FALSE for the ahistorical case, and historical = TRUE for the historical case.

This function handles generalized linear models (GLMs) under zero-inflated distributions using the zeroinfl() function, and zero- truncated distributions using the vglm() function. Model dredging may fail with these functions, leading to the global model being accepted as the best-fit model. However, model dredges of mixed models work for all distributions. We encourage the use of mixed models in all cases.

The negative binomial and truncated negative binomial distributions use the quadratic structure emphasized in Hardin and Hilbe (2018, 4th Edition of Generalized Linear Models and Extensions). The truncated negative binomial distribution may fail to predict size probabilities correctly when dispersion is near that expected of the Poisson distribution. To prevent this problem, we have integrated a cap on the overdispersion parameter. However, when using this distribution, please check the matrix column sums to make sure that they do not predict survival greater than 1.0. If they do, then please use either the negative binomial distribution or the zero-truncated Poisson distribution.

If density dependence is explored through function modelsearch(), then the interpretation of density is not the full population size but rather the spatial density term included in the dataset.

Users building vital rate models for Leslie matrices must set vitalrates = c("surv", "fec") or vitalrates = "leslie" rather than the default, because only survival and fecundity should be estimated in these cases. Also, the suite setting can be set to either age or cons, as the results will be exactly the same.

Users wishing to test age, density, group, or individual covariates, must include test.age = TRUE, test.density = TRUE, test.group = TRUE, or test.indcova = TRUE (or test.indcovb = TRUE or test.indcovc = TRUE, whichever is most appropriate), respectively, in addition to stipulating the name of the variable within the dataset. The default for these options is always FALSE.

lefkoMod objects can be quite large when datasets are large and models are complicated. To reduce the amount of memory taken up by models, vrm_input objects can be created to summarize all relevant aspects of the vital rate models using function miniMod().

Examples


data(lathyrus)

sizevector <- c(0, 4.6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8,
  9)
stagevector <- c("Sd", "Sdl", "Dorm", "Sz1nr", "Sz2nr", "Sz3nr", "Sz4nr",
  "Sz5nr", "Sz6nr", "Sz7nr", "Sz8nr", "Sz9nr", "Sz1r", "Sz2r", "Sz3r", 
  "Sz4r", "Sz5r", "Sz6r", "Sz7r", "Sz8r", "Sz9r")
repvector <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
obsvector <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
immvector <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
  0)
indataset <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 4.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 
  0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)

lathframeln <- sf_create(sizes = sizevector, stagenames = stagevector, 
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector, 
  immstatus = immvector, indataset = indataset, binhalfwidth = binvec, 
  propstatus = propvector)

lathvertln <- verticalize3(lathyrus, noyears = 4, firstyear = 1988,
  patchidcol = "SUBPLOT", individcol = "GENET", blocksize = 9, 
  juvcol = "Seedling1988", sizeacol = "lnVol88", repstracol = "Intactseed88",
  fecacol = "Intactseed88", deadacol = "Dead1988", 
  nonobsacol = "Dormant1988", stageassign = lathframeln, stagesize = "sizea",
  censorcol = "Missing1988", censorkeep = NA, NAas0 = TRUE, censor = TRUE)

lathvertln$feca2 <- round(lathvertln$feca2)
lathvertln$feca1 <- round(lathvertln$feca1)
lathvertln$feca3 <- round(lathvertln$feca3)

lathmodelsln3 <- modelsearch(lathvertln, historical = TRUE, 
  approach = "mixed", suite = "main", 
  vitalrates = c("surv", "obs", "size", "repst", "fec"), juvestimate = "Sdl",
  bestfit = "AICc&k", sizedist = "gaussian", fecdist = "poisson", 
  indiv = "individ", patch = "patchid", year = "year2", year.as.random = TRUE,
  patch.as.random = TRUE, show.model.tables = TRUE, quiet = "partial")

# Here we use supplemental() to provide overwrite and reproductive info
lathsupp3 <- supplemental(stage3 = c("Sd", "Sd", "Sdl", "Sdl", "mat", "Sd", "Sdl"), 
  stage2 = c("Sd", "Sd", "Sd", "Sd", "Sdl", "rep", "rep"),
  stage1 = c("Sd", "rep", "Sd", "rep", "Sd", "mat", "mat"),
  eststage3 = c(NA, NA, NA, NA, "mat", NA, NA),
  eststage2 = c(NA, NA, NA, NA, "Sdl", NA, NA),
  eststage1 = c(NA, NA, NA, NA, "Sdl", NA, NA),
  givenrate = c(0.345, 0.345, 0.054, 0.054, NA, NA, NA),
  multiplier = c(NA, NA, NA, NA, NA, 0.345, 0.054),
  type = c(1, 1, 1, 1, 1, 3, 3), type_t12 = c(1, 2, 1, 2, 1, 1, 1),
  stageframe = lathframeln, historical = TRUE)

lathmat3ln <- flefko3(year = "all", patch = "all", stageframe = lathframeln, 
  modelsuite = lathmodelsln3, data = lathvertln, supplement = lathsupp3, 
  reduce = FALSE)

lefko3 documentation built on June 8, 2025, 10:04 a.m.