pdredge: Automated model selection using parallel computation

Description Usage Arguments Details Value Author(s) See Also Examples

Description

Parallelized version of dredge.

Usage

1
2
3
4
pdredge(global.model, cluster = NA, 
    beta = c("none", "sd", "partial.sd"), evaluate = TRUE, rank = "AICc", 
    fixed = NULL, m.lim = NULL, m.min, m.max, subset, trace = FALSE, 
    varying, extra, ct.args = NULL, check = FALSE, ...)

Arguments

global.model, beta, evaluate, rank

see dredge.

fixed, m.lim, m.max, m.min, subset, varying, extra, ct.args, ...

see dredge.

trace

displays the generated calls, but may not work as expected since the models are evaluated in batches rather than one by one.

cluster

either a valid "cluster" object, or NA for a single threaded execution.

check

either integer or logical value controlling how much checking for existence and correctness of dependencies is done on the cluster nodes. See ‘Details’.

Details

All the dependencies for fitting the global.model, including the data and any objects the modelling function will use must be exported into the cluster worker nodes (e.g. via clusterExport). The required packages must be also loaded thereinto (e.g. via clusterEvalQ(..., library(package)), before the cluster is used by pdredge.

If check is TRUE or positive, pdredge tries to check whether all the variables and functions used in the call to global.model are present in the cluster nodes' .GlobalEnv before proceeding further. This causes false errors if some arguments of the model call (other than subset) would be evaluated in data environment. In that case using check = FALSE (the default) is desirable.

If check is TRUE or greater than one, pdredge will compare the global.model updated at the cluster nodes with the one given as argument.

Value

See dredge.

Author(s)

Kamil Bartoń

See Also

makeCluster and other cluster related functions in packages parallel or snow.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
# One of these packages is required:
## Not run: require(parallel) || require(snow)

# From example(Beetle)

Beetle100 <- Beetle[sample(nrow(Beetle), 100, replace = TRUE),]

fm1 <- glm(Prop ~ dose + I(dose^2) + log(dose) + I(log(dose)^2),
    data = Beetle100, family = binomial, na.action = na.fail)

msubset <- expression(xor(dose, `log(dose)`) & (dose | !`I(dose^2)`)
    & (`log(dose)` | !`I(log(dose)^2)`))
varying.link <- list(family = alist(logit = binomial("logit"),
    probit = binomial("probit"), cloglog = binomial("cloglog") ))

# Set up the cluster
clusterType <- if(length(find.package("snow", quiet = TRUE))) "SOCK" else "PSOCK"
clust <- try(makeCluster(getOption("cl.cores", 2), type = clusterType))

clusterExport(clust, "Beetle100")

# noticeable gain only when data has about 3000 rows (Windows 2-core machine)
print(system.time(dredge(fm1, subset = msubset, varying = varying.link)))
print(system.time(pdredge(fm1, cluster = FALSE, subset = msubset,
    varying = varying.link)))
print(system.time(pdd <- pdredge(fm1, cluster = clust, subset = msubset,
    varying = varying.link)))

print(pdd)

## Not run: 
# Time consuming example with 'unmarked' model, based on example(pcount).
# Having enough patience you can run this with 'demo(pdredge.pcount)'.
library(unmarked)
data(mallard)
mallardUMF <- unmarkedFramePCount(mallard.y, siteCovs = mallard.site,
    obsCovs = mallard.obs)
(ufm.mallard <- pcount(~ ivel + date + I(date^2) ~ length + elev + forest,
    mallardUMF, K = 30))
clusterEvalQ(clust, library(unmarked))
clusterExport(clust, "mallardUMF")

# 'stats4' is needed for AIC to work with unmarkedFit objects but is not
# loaded automatically with 'unmarked'.
require(stats4)
invisible(clusterCall(clust, "library", "stats4", character.only = TRUE))

#system.time(print(pdd1 <- pdredge(ufm.mallard,
#   subset = `p(date)` | !`p(I(date^2))`, rank = AIC)))

system.time(print(pdd2 <- pdredge(ufm.mallard, clust,
    subset = `p(date)` | !`p(I(date^2))`, rank = AIC, extra = "adjR^2")))


# best models and null model
subset(pdd2, delta < 2 | df == min(df))

# Compare with the model selection table from unmarked
# the statistics should be identical:
models <- get.models(pdd2, delta < 2 | df == min(df), cluster = clust)

modSel(fitList(fits = structure(models, names = model.names(models,
    labels = getAllTerms(ufm.mallard)))), nullmod = "(Null)")

## End(Not run)

stopCluster(clust)

MuMIn documentation built on Jan. 31, 2018, 1:03 a.m.