Generate a set of models with combinations (subsets) of fixed effect terms in the global model, with optional rules for model inclusion.
dredge(global.model, beta = c("none", "sd", "partial.sd"), evaluate = TRUE,
    rank = "AICc", fixed = NULL, m.lim = NULL, m.min, m.max, subset,
    trace = FALSE, varying, extra, ct.args = NULL, ...)
## S3 method for class 'model.selection'
print(x, abbrev.names = TRUE, warnings = getOption("warn") != 1L, ...)

global.model 
a fitted ‘global’ model object. See ‘Details’ for a list of supported types. 
beta 
indicates whether and how the coefficient estimates should be standardized, and must be one of "none", "sd" or "partial.sd".
evaluate 
whether to evaluate and rank the models. If 
rank 
optional custom rank function (returning an information
criterion) to be used instead 
fixed 
optional, either a one-sided formula or a character vector giving names of terms to be included in all models. See ‘Subsetting’. 
m.lim, m.max, m.min 
optionally, the limits 
subset 
logical expression describing models to keep in the resulting set. See ‘Subsetting’. 
trace 
if 
varying 
optionally, a named list describing the additional arguments
to vary between the generated models. Item names correspond to the
arguments, and each item provides a list of choices (i.e. 
extra 
optional additional statistics to include in the result,
provided as functions, function names or a list of such (best if named
or quoted). Similarly as in 
x 
a 
abbrev.names 
should printed term names be abbreviated? (useful with complex models). 
warnings 
if 
ct.args 
optional list of arguments to be passed to

... 
optional arguments for the 
Models are fitted through repeated evaluation of a modified call extracted from the global.model (in a similar fashion as with update). This approach, while robust in that it can be applied to most model types, is inefficient because of its considerable computational overhead.
Note that the number of combinations grows exponentially with the number of predictors (2ⁿ, less when interactions are present, see below).
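To give a feel for this growth, the sketch below counts the candidate models for a given number of main-effect predictors. It assumes no interactions and no fixed, m.lim or subset constraints; the helper name n_models is hypothetical and not part of the package.

```r
# Hypothetical helper: number of term subsets for n main-effect predictors
# (assumes no interactions and no 'fixed', 'm.lim' or 'subset' constraints).
n_models <- function(n) 2^n

# Four predictors give 16 candidate models, twenty give over a million.
data.frame(predictors = c(4, 10, 20), models = n_models(c(4, 10, 20)))
```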
The fitted model objects are not stored in the result. To get (possibly a subset of) the models, use get.models on the object returned by dredge. Another way of getting all the models is to run lapply(dredge(..., evaluate = FALSE), eval), which avoids fitting the models twice.
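The two approaches might look as follows. This is a minimal sketch that assumes the MuMIn package (which provides dredge, get.models and the Cement data used in ‘Examples’) is installed; the object names are illustrative only.

```r
# Sketch only: runs when the MuMIn package is available.
if (requireNamespace("MuMIn", quietly = TRUE)) {
  library(MuMIn)
  old <- options(na.action = "na.fail")   # required by dredge, restored below
  fm <- lm(y ~ ., data = Cement)
  dd <- dredge(fm)

  # Refit only the models within 4 AICc units of the best one
  top <- get.models(dd, subset = delta < 4)

  # Alternatively, build the calls without fitting, then evaluate each once
  calls <- dredge(fm, evaluate = FALSE)
  all_models <- lapply(calls, eval)

  options(old)
}
```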
For a list of model types that can be used as a global.model, see the list of supported models. Modelling functions that do not store the call in their result should be evaluated via the wrapper function created by updateable.
rank is found by a call to match.fun and may be specified as a function, a symbol, or a character string specifying a function to be searched for from the environment of the call to dredge. The function rank must accept a model object as its first argument and always return a scalar.
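A custom rank function satisfying these requirements could be sketched as below. The name rank_bic and the use of the built-in mtcars data are assumptions made for illustration; the commented call shows how such a function might be supplied.

```r
# Hypothetical rank function: takes a fitted model as its first argument
# and returns a single number (here BIC from the stats package).
rank_bic <- function(model, ...) {
  as.numeric(BIC(model))
}

fm <- lm(mpg ~ wt + hp, data = mtcars)
rank_bic(fm)   # a length-one numeric value

# It could then be supplied as a function or by name, for example:
# dredge(fm, rank = rank_bic)
# dredge(fm, rank = "rank_bic")
```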
By default, marginality constraints are respected, so “all possible combinations” include only those containing interactions with their respective main effects and all lower order terms. However, if global.model makes an exception to this principle (e.g. due to a nested design such as a / (b + d)), this will be reflected in the subset models.
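The effect of the marginality constraint can be illustrated in base R by enumerating every subset of the terms a, b and their interaction, then keeping only those in which the interaction appears together with both main effects. This is an illustration of the principle, not the package's internal algorithm; the column ab stands for the term a:b.

```r
# All subsets of the terms a, b and a:b (column 'ab' stands for a:b)
g <- expand.grid(a = c(FALSE, TRUE), b = c(FALSE, TRUE), ab = c(FALSE, TRUE))

# Marginality: the interaction requires both of its main effects
ok <- !g$ab | (g$a & g$b)

nrow(g)    # all 2^3 subsets
g[ok, ]    # the subsets that respect marginality
```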
There are three ways to constrain the resulting set of models: setting limits to the number of terms in a model with m.lim, binding term(s) to all models with fixed, and applying more complex rules using the argument subset. To be included in the selection table, the model formulation must satisfy all these conditions.
subset can take the form of either an expression or a matrix. The latter should be a lower triangular matrix with logical values, where columns and rows correspond to global.model terms. A value of subset["a", "b"] == FALSE will exclude any model containing both terms a and b. demo(dredge.subset) has examples of using the subset matrix in conjunction with correlation matrices to exclude models containing collinear predictors.
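A matrix of this kind could be built from a correlation matrix roughly as follows. The sketch uses the built-in mtcars data, and the 0.7 cut-off is an arbitrary assumption chosen for illustration; the commented lines show how the matrix might then be passed to dredge.

```r
# Predictors from the built-in mtcars data; the 0.7 cut-off is arbitrary.
preds <- mtcars[, c("disp", "hp", "wt", "qsec")]

smat <- abs(cor(preds)) <= 0.7     # TRUE = the pair may appear together
smat[!lower.tri(smat)] <- NA       # keep only the lower triangle
smat

# With a global model containing these terms, it could be used as, e.g.:
# fm <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars, na.action = na.fail)
# dredge(fm, subset = smat)
```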
In the form of an expression, the argument subset acts in a similar fashion to that in the function subset for data.frames: model terms can be referred to by name as variables in the expression, with the difference being that they are interpreted as logical values (i.e. equal to TRUE if the term exists in the model).
There is also .(x) and .(+x) notation indicating, respectively, any and all interactions including a term x. It is only useful with marginality exceptions.
The expression can contain any of the global.model terms (getAllTerms(global.model) lists them), as well as names of the varying argument items. Names of global.model terms take precedence when identical to names of varying, so to avoid ambiguity varying variables in the subset expression should be enclosed in V() (e.g. subset = V(family) == "Gamma", assuming that varying is something like list(family = c(..., "Gamma"))).
If item names in varying are missing, the items themselves are coerced to names. Call and symbol elements are represented as character values (via deparse), and everything except numeric, logical, character and NULL values is replaced by item numbers (e.g. varying = list(family = list(..., Gamma)) should be referred to as subset = V(family) == 2). This can quickly become confusing, therefore it is recommended to use named lists. demo(dredge.varying) provides examples.
The subset expression can also contain the variable `*nvar*` (backtick-quoted), equal to the number of terms in the model (not the number of estimated parameters).
To make inclusion of a model term conditional on the presence of another model term, the function dc (“dependency chain”) can be used in the subset expression. dc takes any number of term names as arguments, and allows a term to be included only if all preceding ones are also present (e.g. subset = dc(a, b, c) allows for models a, a+b and a+b+c but not b, c, b+c or a+c).
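The rule can be mimicked in base R to see which term combinations it admits. The function dc_sketch below is a hypothetical re-implementation written for illustration, not the package's dc; it also admits the combination with none of the terms, which is an assumption about the rule rather than something verified against the package.

```r
# Hypothetical re-implementation of the rule, for illustration only:
# TRUE if no absent term is followed by a present one.
dc_sketch <- function(...) {
  present <- c(...)
  !any(diff(as.integer(present)) > 0)
}

# Evaluate the rule on every combination of three terms a, b, c
g <- expand.grid(a = c(FALSE, TRUE), b = c(FALSE, TRUE), c = c(FALSE, TRUE))
g$allowed <- mapply(dc_sketch, g$a, g$b, g$c)
g[g$allowed, ]   # none, a, a+b, a+b+c
```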
The subset expression can have the form of an unevaluated call, an expression object, or a one-sided formula. See ‘Examples’.
Compound model terms (such as interactions, ‘as-is’ expressions within I() or smooths in gam) should be enclosed within curly brackets (e.g. {s(x,k=2)}), or backticks (like non-syntactic names, e.g. `s(x, k = 2)`). Backtick-quoted names must match exactly (including whitespace) the term names as given by getAllTerms.
subset expression syntax summary
a & b
indicates that model terms a and b must be present (see Logical Operators)
{log(x,2)} or `log(x, 2)`
represent a complex model term log(x, 2)
V(x)
represents a varying variable x
.(x)
indicates that at least one term containing the term x must be present
.(+x)
indicates that all the terms containing the term x must be present
dc(a, b, c,...)
‘dependency chain’: b is allowed only if a is present, and c only if both a and b are present, etc.
`*nvar*`
number of terms.
To simply keep certain terms in all models, use of the argument fixed is much more efficient. The fixed formula is interpreted in the same manner as a model formula, so the terms need not be quoted.
Use of na.action = "na.omit" (R's default) or "na.exclude" in global.model must be avoided, as it results in submodels fitted to different data sets if there are missing values. An error is thrown if it is detected.
It is a common mistake to give na.action as an argument in the call to dredge (typically resulting in an error from the rank function to which the argument is passed through ‘...’), while the correct way is either to pass na.action in the call to the global model or to set it as a global option.
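The problem can be seen in base R alone. The sketch below uses the built-in airquality data, which has missing values in Ozone and Solar.R; the model formulas are chosen for illustration only.

```r
# With na.omit, a model and its submodel are fitted to different rows
full <- lm(Ozone ~ Wind + Solar.R, data = airquality, na.action = na.omit)
sub  <- lm(Ozone ~ Wind, data = airquality, na.action = na.omit)
c(full = nobs(full), sub = nobs(sub))   # different numbers of observations

# Passing na.action = na.fail to the global model makes the problem explicit
res <- tryCatch(
  lm(Ozone ~ Wind + Solar.R, data = airquality, na.action = na.fail),
  error = function(e) conditionMessage(e)
)
res   # the error message raised because of the missing values

# Alternatively, set it globally before fitting:
# options(na.action = "na.fail")
```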
There are subset and plot methods; the latter creates a graphical representation of model weights and variable relative importance. Coefficients can be extracted with coef or coefTable.
An object of class c("model.selection", "data.frame"), being a data.frame, where each row represents one model. See model.selection.object for its structure.
Users should keep in mind the hazards that a “thoughtless approach” of evaluating all possible models poses. Although this procedure is in certain cases useful and justified, it may result in selecting a spurious “best” model, due to the model selection bias.
“Let the computer find out” is a poor strategy and usually reflects the fact that the researcher did not bother to think clearly about the problem of interest and its scientific setting (Burnham and Anderson, 2002).
Kamil Bartoń
pdredge is a parallelized version of this function (uses a cluster).
get.models, model.avg. model.sel for manual model selection tables.
Possible alternatives: glmulti in package glmulti and bestglm (bestglm). regsubsets in package leaps also performs all-subsets regression.
Lasso variable selection provided by various packages, e.g. glmnet, lars or glmmLasso.
# Example from Burnham and Anderson (2002), page 100:
# prevent fitting submodels to different datasets
options(na.action = "na.fail")
fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
subset(dd, delta < 4)
# Visualize the model selection table:
par(mar = c(3,5,6,4))
plot(dd, labAsExpr = TRUE)
# Model average models with delta AICc < 4
model.avg(dd, subset = delta < 4)
# or as a 95% confidence set:
model.avg(dd, subset = cumsum(weight) <= .95) # get averaged coefficients
# 'Best' model
summary(get.models(dd, 1)[[1]])
## Not run:
# Examples of using 'subset':
# keep only models containing X3
dredge(fm1, subset = ~ X3) # subset as a formula
dredge(fm1, subset = expression(X3)) # subset as expression object
# the same, but more effective:
dredge(fm1, fixed = "X3")
# exclude models containing both X1 and X2 at the same time
dredge(fm1, subset = !(X1 && X2))
# Fit only models containing either X3 or X4 (but not both);
# include X3 only if X2 is present, and X2 only if X1 is present.
dredge(fm1, subset = dc(X1, X2, X3) && xor(X3, X4))
# the same as above, without "dc"
dredge(fm1, subset = (X1 | !X2) && (X2 | !X3) && xor(X3, X4))
# Include only models with up to 2 terms (and intercept)
dredge(fm1, m.lim = c(0, 2))
## End(Not run)
# Add R^2 and F-statistics, use the 'extra' argument
dredge(fm1, m.lim = c(NA, 1), extra = c("R^2", F = function(x)
    summary(x)$fstatistic[[1]]))
# with summary statistics:
dredge(fm1, m.lim = c(NA, 1), extra = list(
    "R^2", "*" = function(x) {
        s <- summary(x)
        c(Rsq = s$r.squared, adjRsq = s$adj.r.squared,
            F = s$fstatistic[[1]])
    })
)
# Add other information criteria (but rank with AICc):
dredge(fm1, m.lim = c(NA, 1), extra = alist(AIC, BIC, ICOMP, Cp))
