# Model Selection and Multimodel Inference Based on (Q)AIC(c)

### Description

This package includes functions to create model selection
tables based on Akaike's information criterion (AIC) and the
second-order AIC (AICc), as well as their quasi-likelihood counterparts
(QAIC, QAICc). The package also features functions to conduct classic
model averaging (multimodel inference) for a given parameter of interest
or predicted values, as well as a shrinkage version of model averaging
parameter estimates. Other handy functions enable the computation of
relative variable importance, evidence ratios, and confidence sets for
the best model. The present version works with Cox proportional hazards
models and conditional logistic regression (`coxph` and `coxme`
classes), linear models (`lm` class), generalized linear models (`glm`,
`vglm`, `hurdle`, and `zeroinfl` classes), linear models fit by
generalized least squares (`gls` class), linear mixed models (`lme`
class), generalized linear mixed models (`mer` and `merMod` classes),
multinomial and ordinal logistic regressions (`multinom`, `polr`, `clm`,
and `clmm` classes), robust regression models (`rlm` class), beta
regression models (`betareg` class), parametric survival models
(`survreg` class), nonlinear models (`nls` and `gnls` classes),
nonlinear mixed models (`nlme` and `nlmerMod` classes), univariate
models (`fitdist` and `fitdistr` classes), and certain types of latent
variable models (`lavaan` class). The package also supports various
models of `unmarkedFit` and `maxLikeFit` classes estimating demographic
parameters after accounting for imperfect detection probabilities. Some
functions also allow the creation of model selection tables for Bayesian
models of the `bugs` and `rjags` classes. Objects following model
selection and multimodel inference can be formatted to LaTeX using
`xtable` methods included in the package.

### Details

| Field | Value |
|---|---|
| Package | AICcmodavg |
| Type | Package |
| Version | 2.1-0 |
| Date | 2016-11-17 |
| License | GPL (>= 2) |
| LazyLoad | yes |

This package contains several useful functions for model selection and multimodel inference:

`AICc`

Computes AIC, AICc, and their quasi-likelihood counterparts (QAIC, QAICc).

`aictab`

Constructs model selection tables with number of parameters, AIC, delta AIC, and Akaike weights, or their variants based on AICc, QAIC, and QAICc, for a set of candidate models.

`bictab`

Constructs model selection tables with number of parameters, BIC, delta BIC, and BIC weights for a set of candidate models.

`boot.wt`

Computes model selection relative frequencies based on the nonparametric bootstrap.

`confset`

Determines the confidence set for the best model based on one of three criteria.

`DIC`

Extracts DIC.

`dictab`

Constructs model selection tables with number of parameters, DIC, delta DIC, and DIC weights for a set of candidate models.

`evidence`

Computes the evidence ratio between the highest-ranked model based on the information criterion selected and a lower-ranked model.

`importance`

Computes importance values (w+) for the support of a given parameter among a set of candidate models.

`modavg`

Computes the model-averaged estimate, unconditional standard error, and unconditional confidence interval of a parameter of interest among a set of candidate models.

`modavgEffect`

Computes model-averaged effect sizes between groups based on the entire candidate model set.

`modavgShrink`

Computes the shrinkage version of the model-averaged estimate, unconditional standard error, and unconditional confidence interval of a parameter of interest among the entire set of candidate models.

`modavgPred`

Computes model-averaged predictions, unconditional SEs, and confidence intervals among the entire set of candidate models.

`multComp`

Performs multiple comparisons across levels of a factor in a model selection framework.

`useBIC`

Computes BIC or its quasi-likelihood counterpart (QBIC).
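The quantities these functions report can be reproduced by hand, which may help when reading a model selection table. The following is a minimal sketch in base R only, assuming the standard definitions AICc = -2 log(L) + 2K + 2K(K + 1)/(n - K - 1) and Akaike weights proportional to exp(-0.5 × delta). The built-in `mtcars` data, the three candidate models, and the helper `aicc_by_hand` are illustrative assumptions, not part of the package.

```r
## Illustrative sketch (base R only): AICc, delta AICc, Akaike weights and an
## evidence ratio computed by hand. Data and models are assumptions chosen
## for the example; aicc_by_hand() is a hypothetical helper.
cand <- list(
  wt_hp = lm(mpg ~ wt + hp, data = mtcars),
  wt    = lm(mpg ~ wt, data = mtcars),
  null  = lm(mpg ~ 1, data = mtcars)
)

aicc_by_hand <- function(mod) {
  ll <- logLik(mod)
  K  <- attr(ll, "df")  # estimated parameters (includes residual variance for lm)
  n  <- nobs(mod)
  -2 * as.numeric(ll) + 2 * K + (2 * K * (K + 1)) / (n - K - 1)
}

aicc      <- sapply(cand, aicc_by_hand)
delta     <- aicc - min(aicc)                          # difference from best model
akaike_wt <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

## assemble a small table sorted from best to worst model
tab <- data.frame(AICc = aicc, Delta_AICc = delta, AICcWt = akaike_wt)
tab <- tab[order(tab$AICc), ]
print(round(tab, 3))

## evidence ratio: weight of top-ranked model over weight of second-ranked model
ev_ratio <- tab$AICcWt[1] / tab$AICcWt[2]
ev_ratio
```

With the package loaded, `aictab` and `evidence` should return the same quantities for the same candidate set, along with additional columns such as cumulative weights and log-likelihoods.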

A number of functions for model diagnostics are available:

`c_hat`

Estimates variance inflation factor for binomial or Poisson GLMs based on various estimators.

`checkConv`

Checks the convergence information of the algorithm for the model.

`checkParms`

Checks the occurrence of parameter estimates with high standard errors in a model.

`countDist`

Computes summary statistics from distance sampling data.

`countHist`

Computes summary statistics from count history data.

`covDiag`

Computes covariance diagnostics for lambda in *N*-mixture models.

`detHist`

Computes summary statistics from detection histories.

`extractCN`

Extracts condition number from models of certain classes.

`mb.gof.test`

Computes the MacKenzie and Bailey goodness-of-fit test for single season and dynamic occupancy models using the Pearson chi-square statistic.

`Nmix.gof.test`

Computes goodness-of-fit test for *N*-mixture models based on the Pearson chi-square statistic.
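As a rough illustration of what a variance inflation factor measures, the sketch below computes the Pearson chi-square statistic divided by the residual degrees of freedom for a Poisson GLM, then uses it in a QAICc. It relies on base R only; the built-in `warpbreaks` data and the model are assumptions chosen for the example, and the QAICc line assumes the usual convention of dividing the log-likelihood by c-hat and counting c-hat as one extra parameter.

```r
## Illustrative sketch (base R only): Pearson chi-square / residual df as an
## estimate of c-hat for a Poisson GLM. Data set and model are assumptions.
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

pearson_chisq <- sum(residuals(fit, type = "pearson")^2)
c_hat_est     <- pearson_chisq / df.residual(fit)
c_hat_est  # values well above 1 suggest overdispersion

## QAICc under the assumed convention: log-likelihood scaled by c-hat,
## and K increased by 1 to account for the estimation of c-hat
ll    <- as.numeric(logLik(fit))
K     <- attr(logLik(fit), "df") + 1
n     <- nobs(fit)
qaicc <- -2 * ll / c_hat_est + 2 * K + (2 * K * (K + 1)) / (n - K - 1)
qaicc
```

The same ratio is what `summary()` reports as the dispersion of a `quasipoisson` fit, which gives an independent check on the hand computation.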

Other utility functions include:

`extractLL`

Extracts log-likelihood from models of certain classes.

`extractSE`

Extracts standard errors from models of certain classes and adds the labels.

`fam.link.mer`

Extracts the distribution family and link function from a generalized linear mixed model of classes `mer` and `merMod`.

`predictSE`

Computes predictions and associated standard errors for models of certain classes.

`xtable`

Formats various objects resulting from model selection and multimodel inference to LaTeX or HTML tables.

### Author(s)

Marc J. Mazerolle <marc.mazerolle@uqat.ca>.

### References

Anderson, D. R. (2008) *Model-based inference in the life sciences:
a primer on evidence*. Springer: New York.

Burnham, K. P., and Anderson, D. R. (2002) *Model selection and
multimodel inference: a practical information-theoretic approach*. Second
edition. Springer: New York.

Burnham, K. P., and Anderson, D. R. (2004) Multimodel inference:
understanding AIC and BIC in model selection. *Sociological
Methods and Research* **33**, 261–304.

Mazerolle, M. J. (2006) Improving data analysis in herpetology: using
Akaike's Information Criterion (AIC) to assess the strength of
biological hypotheses. *Amphibia-Reptilia* **27**, 169–180.

### Examples

```
##anuran larvae example from Mazerolle (2006) - Poisson GLM with offset
data(min.trap)
##assign "UPLAND" as the reference level as in Mazerolle (2006)
min.trap$Type <- relevel(min.trap$Type, ref = "UPLAND")
##set up candidate models
Cand.mod <- list()
##global model
Cand.mod[[1]] <- glm(Num_anura ~ Type + log.Perimeter + Num_ranatra,
                     family = poisson, offset = log(Effort),
                     data = min.trap)
Cand.mod[[2]] <- glm(Num_anura ~ Type + log.Perimeter, family = poisson,
                     offset = log(Effort), data = min.trap)
Cand.mod[[3]] <- glm(Num_anura ~ Type + Num_ranatra, family = poisson,
                     offset = log(Effort), data = min.trap)
Cand.mod[[4]] <- glm(Num_anura ~ Type, family = poisson,
                     offset = log(Effort), data = min.trap)
Cand.mod[[5]] <- glm(Num_anura ~ log.Perimeter + Num_ranatra,
                     family = poisson, offset = log(Effort),
                     data = min.trap)
Cand.mod[[6]] <- glm(Num_anura ~ log.Perimeter, family = poisson,
                     offset = log(Effort), data = min.trap)
Cand.mod[[7]] <- glm(Num_anura ~ Num_ranatra, family = poisson,
                     offset = log(Effort), data = min.trap)
Cand.mod[[8]] <- glm(Num_anura ~ 1, family = poisson,
                     offset = log(Effort), data = min.trap)
##check c-hat for global model
c_hat(Cand.mod[[1]], method = "pearson") #uses Pearson's chi-square/df
##note the very low overdispersion: in this case, the analysis could be
##conducted without correcting for c-hat as its value is reasonably close
##to 1
##assign names to each model
Modnames <- c("type + logperim + invertpred", "type + logperim",
              "type + invertpred", "type", "logperim + invertpred",
              "logperim", "invertpred", "intercept only")
##model selection table based on AICc
aictab(cand.set = Cand.mod, modnames = Modnames)
##compute evidence ratio
evidence(aictab(cand.set = Cand.mod, modnames = Modnames))
##compute confidence set based on 'raw' method
confset(cand.set = Cand.mod, modnames = Modnames, second.ord = TRUE,
        method = "raw")
##compute importance value for "TypeBOG" - same number of models
##with vs without variable
importance(cand.set = Cand.mod, modnames = Modnames, parm = "TypeBOG")
##compute model-averaged estimate of "TypeBOG"
modavg(cand.set = Cand.mod, modnames = Modnames, parm = "TypeBOG")
##compute model-averaged estimate of "TypeBOG" with shrinkage
##same number of models with vs without variable
modavgShrink(cand.set = Cand.mod, modnames = Modnames,
             parm = "TypeBOG")
##compute model-average predictions for two types of ponds
##create a data set for predictions
dat.pred <- data.frame(Type = factor(c("BOG", "UPLAND")),
                       log.Perimeter = mean(min.trap$log.Perimeter),
                       Num_ranatra = mean(min.trap$Num_ranatra),
                       Effort = mean(min.trap$Effort))
##model-averaged predictions across entire model set
modavgPred(cand.set = Cand.mod, modnames = Modnames,
           newdata = dat.pred, type = "response")
##compute model-averaged effect size between two groups
##works when data set has two rows
modavgEffect(cand.set = Cand.mod, modnames = Modnames,
             newdata = dat.pred, type = "link")
##single-season occupancy model example modified from ?occu
## Not run:
require(unmarked)
##single season
data(frogs)
pferUMF <- unmarkedFrameOccu(pfer.bin)
## add some fake covariates for illustration
siteCovs(pferUMF) <- data.frame(sitevar1 = rnorm(numSites(pferUMF)),
                                sitevar2 = rnorm(numSites(pferUMF)))
## observation covariates are in site-major, observation-minor order
obsCovs(pferUMF) <- data.frame(obsvar1 = rnorm(numSites(pferUMF) *
                                               obsNum(pferUMF)))
##check detection history data from data object
detHist(pferUMF)
##set up candidate model set
fm1 <- occu(~ obsvar1 ~ sitevar1, pferUMF)
##check detection history data from model object
detHist(fm1)
fm2 <- occu(~ 1 ~ sitevar1, pferUMF)
fm3 <- occu(~ obsvar1 ~ sitevar2, pferUMF)
fm4 <- occu(~ 1 ~ sitevar2, pferUMF)
Cand.models <- list(fm1, fm2, fm3, fm4)
Modnames <- c("fm1", "fm2", "fm3", "fm4")
##compute table
print(aictab(cand.set = Cand.models, modnames = Modnames,
             second.ord = TRUE), digits = 4)
##compute evidence ratio
evidence(aictab(cand.set = Cand.models, modnames = Modnames))
##evidence ratio between two specified models (here fm2 vs fm3)
evidence(aictab(cand.set = Cand.models, modnames = Modnames),
         model.high = "fm2", model.low = "fm3")
##compute confidence set based on 'raw' method
confset(cand.set = Cand.models, modnames = Modnames, second.ord = TRUE,
        method = "raw")
##compute importance value for "sitevar1" on occupancy
##same number of models with vs without variable
importance(cand.set = Cand.models, modnames = Modnames, parm = "sitevar1",
           parm.type = "psi")
##compute model-averaged estimate of "sitevar1" on occupancy
modavg(cand.set = Cand.models, modnames = Modnames, parm = "sitevar1",
       parm.type = "psi")
##compute model-averaged estimate of "sitevar1" with shrinkage
##same number of models with vs without variable
modavgShrink(cand.set = Cand.models, modnames = Modnames,
             parm = "sitevar1", parm.type = "psi")
##compute model-averaged predictions across a range of sitevar1 values
##create a data set for predictions
dat.pred <- data.frame(sitevar1 = seq(from = min(siteCovs(pferUMF)$sitevar1),
                                      to = max(siteCovs(pferUMF)$sitevar1),
                                      by = 0.5),
                       sitevar2 = mean(siteCovs(pferUMF)$sitevar2))
##model-averaged predictions of psi across range of values
##of sitevar1 and entire model set
modavgPred(cand.set = Cand.models, modnames = Modnames,
           newdata = dat.pred, parm.type = "psi")
detach(package:unmarked)
## End(Not run)
```
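The model-averaging calls in the examples above (`modavg`, `modavgShrink`) can be easier to interpret alongside a hand computation. The sketch below uses base R only and the built-in `mtcars` data, both assumptions made for illustration. It averages the slope of `wt` in two ways: over the models that contain the term, with weights rescaled to sum to 1, and over all models with the estimate and variance taken as 0 where the term is absent (the shrinkage idea). The unconditional SE uses one common form, sqrt(sum(w × (var + (beta - mean)^2))), which may differ from the estimator a given package option selects.

```r
## Illustrative sketch (base R only): model-averaged slope of 'wt' with an
## unconditional SE. Data, models and the SE formula are assumptions.
cand <- list(
  lm(mpg ~ wt + hp, data = mtcars),
  lm(mpg ~ wt, data = mtcars),
  lm(mpg ~ hp, data = mtcars),
  lm(mpg ~ 1, data = mtcars)
)

## AICc and Akaike weights for the candidate set
aicc <- sapply(cand, function(m) {
  K <- attr(logLik(m), "df")
  n <- nobs(m)
  AIC(m) + (2 * K * (K + 1)) / (n - K - 1)
})
akaike_wt <- exp(-0.5 * (aicc - min(aicc)))
akaike_wt <- akaike_wt / sum(akaike_wt)

## estimate and SE of 'wt' in each model; 0 where the term is absent
has_wt <- sapply(cand, function(m) "wt" %in% names(coef(m)))
beta <- sapply(cand, function(m) {
  if ("wt" %in% names(coef(m))) coef(m)[["wt"]] else 0
})
se <- sapply(cand, function(m) {
  if ("wt" %in% names(coef(m))) sqrt(diag(vcov(m)))[["wt"]] else 0
})

## average over models containing the term, weights rescaled to sum to 1
w_sub   <- akaike_wt[has_wt] / sum(akaike_wt[has_wt])
est_nat <- sum(w_sub * beta[has_wt])
se_nat  <- sqrt(sum(w_sub * (se[has_wt]^2 + (beta[has_wt] - est_nat)^2)))

## shrinkage average over all models, zeros where the term is absent
est_shrink <- sum(akaike_wt * beta)
se_shrink  <- sqrt(sum(akaike_wt * (se^2 + (beta - est_shrink)^2)))

c(natural = est_nat, shrinkage = est_shrink)
```

The shrinkage estimate equals the first estimate multiplied by the summed weight of the models containing the term, so it is pulled toward 0 whenever models without the term carry some weight.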