View source: R/modelselection.R
Function modelsearch() returns both a best-fit model for each vital
rate, and a model table showing all models tested. The final output can be
used as input in other functions within this package.
modelsearch(
  data,
  historical = TRUE,
  approach = "mixed",
  suite = "size",
  bestfit = "AICc&k",
  vitalrates = c("surv", "size", "fec"),
  surv = c("alive3", "alive2", "alive1"),
  obs = c("obsstatus3", "obsstatus2", "obsstatus1"),
  size = c("sizea3", "sizea2", "sizea1"),
  repst = c("repstatus3", "repstatus2", "repstatus1"),
  fec = c("feca3", "feca2", "feca1"),
  stage = c("stage3", "stage2", "stage1"),
  indiv = "individ",
  patch = NA,
  year = "year2",
  sizedist = "gaussian",
  fecdist = "gaussian",
  size.zero = FALSE,
  size.trunc = FALSE,
  fec.zero = FALSE,
  fec.trunc = FALSE,
  patch.as.random = TRUE,
  year.as.random = TRUE,
  juvestimate = NA,
  juvsize = FALSE,
  jsize.zero = FALSE,
  jsize.trunc = FALSE,
  fectime = 2,
  censor = NA,
  age = NA,
  indcova = NA,
  indcovb = NA,
  indcovc = NA,
  show.model.tables = TRUE,
  global.only = FALSE,
  quiet = FALSE
)

data 
The vertical dataset to be used for analysis. This dataset should
be of class 
historical 
A logical variable denoting whether to assess the effects of state in time t-1 in addition to state in time t. Defaults to TRUE. 
approach 
The statistical approach to be taken for model building. The
default is "mixed".
suite 
This describes the global model for each vital rate estimation
and has the following possible values: 
bestfit 
A variable indicating the model selection criterion for the
choice of best-fit model. The default is "AICc&k".
vitalrates 
A vector describing which vital rates will be estimated via
linear modeling, with the following options: 
surv 
A vector indicating the variable names coding for status as alive
or dead in times t+1, t, and t-1, respectively. Defaults
to c("alive3", "alive2", "alive1").
obs 
A vector indicating the variable names coding for observation
status in times t+1, t, and t-1, respectively. Defaults
to c("obsstatus3", "obsstatus2", "obsstatus1").
size 
A vector indicating the variable names coding for size in times
t+1, t, and t-1, respectively. Defaults to
c("sizea3", "sizea2", "sizea1").
repst 
A vector indicating the variable names coding for reproductive
status in times t+1, t, and t-1, respectively. Defaults
to c("repstatus3", "repstatus2", "repstatus1").
fec 
A vector indicating the variable names coding for fecundity in
times t+1, t, and t-1, respectively. Defaults to
c("feca3", "feca2", "feca1").
stage 
A vector indicating the variables coding for stage in times
t+1, t, and t-1, respectively. Defaults to c("stage3", "stage2", "stage1").
indiv 
A variable indicating the variable name coding for individual
identity. Defaults to "individ".
patch 
A variable indicating the variable name coding for patch, where patches are defined as permanent subgroups within the study population. Defaults to NA. 
year 
A variable indicating the variable coding for observation time in
time t. Defaults to "year2".
sizedist 
The probability distribution used to model size. Options
include 
fecdist 
The probability distribution used to model fecundity. Options
include 
size.zero 
A logical variable indicating whether size distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to FALSE. 
size.trunc 
A logical variable indicating whether size distribution is
zero-truncated. Defaults to FALSE. Cannot be TRUE if 
fec.zero 
A logical variable indicating whether fecundity distribution should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to FALSE. 
fec.trunc 
A logical variable indicating whether fecundity distribution
is zero-truncated. Defaults to FALSE. Cannot be TRUE if

patch.as.random 
If set to TRUE and 
year.as.random 
If set to TRUE and 
juvestimate 
An optional variable denoting the stage name of the
juvenile stage in the vertical dataset. If not NA, and 
juvsize 
A logical variable denoting whether size should be used as a
term in models involving transition from the juvenile stage. Defaults to
FALSE, and is only used if 
jsize.zero 
A logical variable indicating whether size distribution of juveniles should be zero-inflated. Only applies to Poisson and negative binomial distributions. Defaults to FALSE. 
jsize.trunc 
A logical variable indicating whether size distribution in
juveniles is zero-truncated. Defaults to FALSE. Cannot be TRUE if

fectime 
A variable indicating which year of fecundity to use as the
response term in fecundity models. Options include 
censor 
A vector denoting the names of censoring variables in the dataset, in order from time t+1, followed by time t, and lastly followed by time t-1. Defaults to NA. 
age 
Designates the name of the variable corresponding to age in the vertical dataset. Defaults to NA, in which case age is not included in linear models. Should only be used if building age x stage matrices. 
indcova 
Vector designating the names in times t+1, t, and t-1 of an individual covariate. Defaults to NA. 
indcovb 
Vector designating the names in times t+1, t, and t-1 of an individual covariate. Defaults to NA. 
indcovc 
Vector designating the names in times t+1, t, and t-1 of an individual covariate. Defaults to NA. 
show.model.tables 
If set to TRUE, then includes full modeling tables in the output. Defaults to TRUE. 
global.only 
If set to TRUE, then only global models will be built and evaluated. Defaults to FALSE. 
quiet 
If set to TRUE, then model building and selection will proceed without warnings and diagnostic messages being issued. Note that this will not affect warnings and messages generated as models themselves are tested. Defaults to FALSE. 
This function yields an object of class lefkoMod, which is a
list in which the first 9 elements are the best-fit models for survival,
observation status, size, reproductive status, fecundity, juvenile survival,
juvenile observation, juvenile size, and juvenile transition to reproduction,
respectively, followed by 9 elements corresponding to the model tables for
each of these vital rates, in order, followed by a single character element
denoting the criterion used for model selection, and ending on a quality
control vector:
survival_model 
Best-fit model of the binomial probability of survival from time t to time t+1. Defaults to 1. 
observation_model 
Best-fit model of the binomial probability of observation in time t+1 given survival to that time. Defaults to 1. 
size_model 
Best-fit model of size in time t+1 given survival to and observation in that time. Defaults to 1. 
repstatus_model 
Best-fit model of the binomial probability of reproduction in time t+1, given survival to and observation in that time. Defaults to 1. 
fecundity_model 
Best-fit model of fecundity in time t+1 given survival to, and observation and reproduction in that time. Defaults to 1. 
juv_survival_model 
Best-fit model of the binomial probability of survival from time t to time t+1 of an immature individual. Defaults to 1. 
juv_observation_model 
Best-fit model of the binomial probability of observation in time t+1 given survival to that time of an immature individual. Defaults to 1. 
juv_size_model 
Best-fit model of size in time t+1 given survival to and observation in that time of an immature individual. Defaults to 1. 
juv_reproduction_model 
Best-fit model of the binomial probability of reproduction in time t+1, given survival to and observation in that time of an individual that was immature in time t. Strictly, this is not a model of reproduction probability for immature individuals. Rather, it gives the reproduction probability of individuals that are mature in time t+1 but were immature in time t. Defaults to 1. 
survival_table 
Full dredge model table of survival probability. 
observation_table 
Full dredge model table of observation probability. 
size_table 
Full dredge model table of size. 
repstatus_table 
Full dredge model table of reproduction probability. 
fecundity_table 
Full dredge model table of fecundity. 
juv_survival_table 
Full dredge model table of immature survival probability. 
juv_observation_table 
Full dredge model table of immature observation probability. 
juv_size_table 
Full dredge model table of immature size. 
juv_reproduction_table 
Full dredge model table of immature reproduction probability. 
criterion 
Character variable denoting the criterion used to determine the best-fit model. 
qc 
Data frame with three variables: 1) Name of vital rate, 2) number of individuals used to model that vital rate, and 3) number of individual transitions used to model that vital rate. 
The mechanics governing model building are fairly robust to errors and
exceptions. The function attempts to build global models, and simplifies
models automatically should model building fail. Model building proceeds
through the functions lm() (GLM with Gaussian response), glm() (GLM with
Poisson or binomial response), glm.nb() (GLM with negative binomial
response), zeroinfl() (zero-inflated Poisson or negative binomial response),
lmer() (mixed model with Gaussian response), glmer() (mixed model with
binomial or Poisson response), and glmmTMB() (mixed model with negative
binomial, or zero-truncated or zero-inflated Poisson or negative binomial
response). See documentation related to these functions for further
information. Any response term that is invariable in the dataset will lead to
a best-fit model for that response represented by a single constant value.
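The fallback behaviour described above can be pictured with a small base R sketch. It is not the package's implementation: fit_with_fallback(), is_invariant(), and the toy data frame are hypothetical names made up for illustration, and only the binomial glm() case is shown.

```r
# Hypothetical helper, not part of the package: try the most complex
# (global) formula first, then fall back to simpler ones if fitting errors.
fit_with_fallback <- function(formulas, data) {
  for (f in formulas) {
    fit <- tryCatch(
      glm(f, data = data, family = binomial()),
      error = function(e) NULL  # treat any fitting error as "try the next one"
    )
    if (!is.null(fit)) return(fit)
  }
  NULL
}

# Toy data: survival status in time t+1 and size in time t
set.seed(42)
toy <- data.frame(
  alive3 = rbinom(50, 1, 0.7),
  sizea2 = rnorm(50, mean = 5)
)

# The first formula refers to a column that is absent from the toy data,
# so glm() errors and the helper moves on to the simpler formula.
candidates <- list(
  alive3 ~ sizea2 + sizea1,
  alive3 ~ sizea2
)
fit <- fit_with_fallback(candidates, toy)
formula(fit)

# Hypothetical check for an invariable response: a response with a single
# unique non-missing value leaves nothing to model beyond a constant.
is_invariant <- function(x) length(unique(x[!is.na(x)])) <= 1
is_invariant(rep(1, 10))
```

Only errors trigger the fallback in this sketch. Warnings, such as convergence warnings, are left to surface as usual.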
Exhaustive model building and selection proceeds via the dredge() function
in package MuMIn. This function is verbose, so that any errors and warnings
developed during model building, model analysis, and model selection can be
found and dealt with. Interpretations of errors during global model analysis
may be found in the documentation for the functions and packages mentioned.
Package MuMIn is used for model dredging (see dredge()), and errors and
warnings during dredging can be interpreted using the documentation for that
package. Errors occurring during dredging lead to the adoption of the global
model as the best-fit model, and the user should view all logged errors and
warnings to determine the best way to proceed. The quiet = TRUE option can be
used to silence dredge warnings, but users should note that automated model
selection can be viewed as a black box, and so care should be taken to ensure
that the models run make biological sense, and that model quality is
prioritized.
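To make the selection step less of a black box, the sketch below computes AICc by hand for three candidate models in base R, using the built-in mtcars data rather than demographic data. The AICc formula is the standard small-sample correction. The final rule, which picks the model with the fewest parameters among those within 2 AICc units of the minimum, is only an assumption about how a criterion such as "AICc&k" might combine AICc with parameter count, and aicc() is a hypothetical helper name.

```r
# Hypothetical helper: small-sample corrected AIC,
# AICc = -2 logLik + 2k + 2k(k + 1) / (n - k - 1)
aicc <- function(fit) {
  ll <- logLik(fit)
  k <- as.numeric(attr(ll, "df"))  # number of estimated parameters
  n <- nobs(fit)
  -2 * as.numeric(ll) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# Three nested candidate models on a built-in dataset
fits <- list(
  null  = lm(mpg ~ 1, data = mtcars),
  wt    = lm(mpg ~ wt, data = mtcars),
  wt_hp = lm(mpg ~ wt + hp, data = mtcars)
)

tab <- data.frame(
  model = names(fits),
  k = vapply(fits, function(f) as.numeric(attr(logLik(f), "df")), numeric(1)),
  AICc = vapply(fits, aicc, numeric(1)),
  stringsAsFactors = FALSE
)
tab$delta <- tab$AICc - min(tab$AICc)

# Assumed parsimony rule: among models within 2 AICc units of the best,
# keep the one with the fewest parameters.
close_models <- tab[tab$delta <= 2, ]
best <- close_models$model[which.min(close_models$k)]
tab
best
```

Inspecting the full table alongside the chosen model is a quick way to see how close the competing models are before accepting the automated choice.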
Exhaustive model selection through dredging works best with larger datasets
and fewer tested parameters. Setting suite = "full" may initiate a dredge
that takes a very long time, particularly if the model is historical,
individual covariates are used, or a zero-inflated distribution is assumed.
In such cases, the number of models built and tested will run at least in the
millions. Small datasets will also increase the error associated with these
tests, leading to adoption of simpler models overall. We do not yet offer a
parallelization option for function modelsearch(), but plan to offer one in
the future to speed this process up for particularly large global models.
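A rough count shows why the number of models grows so quickly. If a global model has k terms that can each be kept or dropped independently, there are 2^k candidate submodels. This is an upper bound under that simplifying assumption: it ignores any constraints applied during dredging, such as keeping main effects whenever their interaction is present. The helper n_submodels() and the three-term example are hypothetical.

```r
# Hypothetical helper: upper bound on the number of submodels when each of
# k optional terms can be included or excluded independently.
n_submodels <- function(k) 2^k

n_submodels(c(5, 10, 20))  # 32, 1024, 1048576

# The same count by explicit enumeration for a small set of terms
terms <- c("sizea2", "repstatus2", "sizea2:repstatus2")
grid <- expand.grid(rep(list(c(FALSE, TRUE)), length(terms)))
names(grid) <- terms
nrow(grid)  # 8, one row per include/exclude combination
```

With 20 optional terms the bound already passes one million, which is consistent with the warning above about historical models with individual covariates.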
Care must be taken to build models that test the impacts of state in time
t-1 for historical models, and that do not test these impacts for
ahistorical models. Ahistorical matrix modeling particularly will yield
biased transition estimates if historical terms from models are ignored. This
can be dealt with at the start of modeling by setting historical = FALSE for
the ahistorical case, and historical = TRUE for the historical case.
This function handles generalized linear models (GLMs) under zero-inflated
distributions using the zeroinfl() function, and zero-truncated
distributions using the vglm() function. Model dredging may fail with these
functions, leading to the global model being accepted as the best-fit model.
However, model dredges of mixed models work for all distributions. We
encourage the use of mixed models in all cases.
The negative binomial and truncated negative binomial distributions use the quadratic structure emphasized in Hardin and Hilbe (2018, 4th Edition of Generalized Linear Models and Extensions). The truncated negative binomial distribution may fail to predict size probabilities correctly when dispersion is near that expected of the Poisson distribution. To prevent this problem, we have integrated a cap on the overdispersion parameter. However, when using this distribution, please check the matrix column sums to make sure that they do not predict survival greater than 1.0. If they do, then please use either the negative binomial distribution or the zero-truncated Poisson distribution.
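The column-sum check suggested above can be sketched in base R. The matrix below is entirely hypothetical and is assumed to hold only survival-transition probabilities, with columns giving the stage in time t and no fecundity terms. The stage names "Small" and "Large" and the numerical tolerance are made up for illustration.

```r
# Hypothetical 3-stage survival-transition matrix.
# Columns: stage in time t. Rows: stage in time t+1.
stages <- c("Sdl", "Small", "Large")
U <- matrix(
  c(0.10, 0.30, 0.00,
    0.00, 0.40, 0.35,
    0.00, 0.25, 0.70),
  nrow = 3, byrow = TRUE,
  dimnames = list(stages, stages)
)

# Each column sum is the total survival probability of that stage
stage_survival <- colSums(U)
stage_survival

# Flag stages whose predicted survival exceeds 1 (small tolerance for
# floating-point error)
bad <- names(stage_survival)[stage_survival > 1 + 1e-8]
bad
```

Here the "Large" column sums to 1.05, so it would be flagged. In a real analysis the same check would be repeated for every matrix produced, since the problem may appear only in some years or patches.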
# Lathyrus example
data(lathyrus)

# Stage descriptions used to build the stageframe
sizevector <- c(0, 4.6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8,
  9)
stagevector <- c("Sd", "Sdl", "Dorm", "Sz1nr", "Sz2nr", "Sz3nr", "Sz4nr",
  "Sz5nr", "Sz6nr", "Sz7nr", "Sz8nr", "Sz9nr", "Sz1r", "Sz2r", "Sz3r",
  "Sz4r", "Sz5r", "Sz6r", "Sz7r", "Sz8r", "Sz9r")
repvector <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
obsvector <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
immvector <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0)
indataset <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 4.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
  0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)

lathframeln <- sf_create(sizes = sizevector, stagenames = stagevector,
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector,
  immstatus = immvector, indataset = indataset, binhalfwidth = binvec,
  propstatus = propvector)

# Convert the dataset to vertical format
lathvertln <- verticalize3(lathyrus, noyears = 4, firstyear = 1988,
  patchidcol = "SUBPLOT", individcol = "GENET", blocksize = 9,
  juvcol = "Seedling1988", sizeacol = "lnVol88", repstracol = "Intactseed88",
  fecacol = "Intactseed88", deadacol = "Dead1988",
  nonobsacol = "Dormant1988", stageassign = lathframeln, stagesize = "sizea",
  censorcol = "Missing1988", censorkeep = NA, NAas0 = TRUE, censor = TRUE)

# Round fecundity to integers so that it can be modeled as a Poisson response
lathvertln$feca2 <- round(lathvertln$feca2)
lathvertln$feca1 <- round(lathvertln$feca1)
lathvertln$feca3 <- round(lathvertln$feca3)

# Build and select historical vital rate models
lathmodelsln3 <- modelsearch(lathvertln, historical = TRUE,
  approach = "glm", suite = "main",
  vitalrates = c("surv", "obs", "size", "repst", "fec"), juvestimate = "Sdl",
  bestfit = "AICc&k", sizedist = "gaussian", fecdist = "poisson",
  indiv = "individ", patch = "patchid", year = "year2", year.as.random = TRUE,
  patch.as.random = TRUE, show.model.tables = TRUE, quiet = TRUE)

# Here we use supplemental() to provide overwrite and reproductive info
lathsupp3 <- supplemental(stage3 = c("Sd", "Sd", "Sdl", "Sdl", "mat", "Sd", "Sdl"),
  stage2 = c("Sd", "Sd", "Sd", "Sd", "Sdl", "rep", "rep"),
  stage1 = c("Sd", "rep", "Sd", "rep", "Sd", "mat", "mat"),
  eststage3 = c(NA, NA, NA, NA, "mat", NA, NA),
  eststage2 = c(NA, NA, NA, NA, "Sdl", NA, NA),
  eststage1 = c(NA, NA, NA, NA, "Sdl", NA, NA),
  givenrate = c(0.345, 0.345, 0.054, 0.054, NA, NA, NA),
  multiplier = c(NA, NA, NA, NA, NA, 0.345, 0.054),
  type = c(1, 1, 1, 1, 1, 3, 3), type_t12 = c(1, 2, 1, 2, 1, 1, 1),
  stageframe = lathframeln, historical = TRUE)

# Use the selected models to build the matrices
lathmat3ln <- flefko3(year = "all", patch = "all", stageframe = lathframeln,
  modelsuite = lathmodelsln3, data = lathvertln, supplement = lathsupp3,
  patchcol = "patchid", yearcol = "year2", year.as.random = FALSE,
  patch.as.random = FALSE, reduce = FALSE)

summary(lathmat3ln)
