Experimental function for fitting mixed-effects prediction rule ensembles. It estimates a random intercept in addition to a prediction rule ensemble, which allows for analysing clustered, multilevel or longitudinal datasets. Because the function is experimental, use it at your own risk.
formula |
a formula with three-part right-hand side, like
|
cluster |
optional character string supplying the name of the cluster
indicator. If specified, |
data |
dataframe containing the variables specified in |
penalty.par.val |
as usual. |
learnrate |
as usual. |
use.grad |
as usual. |
conv.thresh |
numeric vector of length 1, specifies the convergence
criterion for estimation of the model. If |
family |
as usual. Note: should be a character vector! |
ridge.ranef |
logical vector of length 1. Should random effects be
estimated through a ridge regression? If set to |
max.iter |
numeric vector of length 1. Maximum number of iterations performed to re-estimate fixed and random effects parameters. |
... |
further arguments to be passed to |
Function premixed() allows for taking into account a random intercept in I) rule induction and/or II) coefficient estimation. To take the random intercept into account in both rule induction and coefficient estimation, see Example 1 below. To take it into account only in coefficient estimation, see Example 2 below. Alternatively, it has been suggested that random effects need not be modelled explicitly, but can be accounted for by employing a blocked bootstrap or subsampling approach; see Examples 3a and 3b below.
Note that approaches 1 and 2 can be combined with the third approach (Example 3). However, whether a cluster bootstrap or cluster subsampling approach is actually sufficient to take into account the clustered structure remains an open question.
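As a rough illustration of what "blocked" sampling means here, the following base-R sketch draws whole clusters with replacement and checks that each cluster ends up either entirely inside or entirely outside the sample. The helper name sample_whole_clusters is hypothetical and not part of package pre; it only mirrors the idea behind the sampling functions in Example 3a.

```r
## Hypothetical helper (not part of 'pre'): draw whole clusters with
## replacement and return the row indices of all observations that
## belong to the drawn clusters.
sample_whole_clusters <- function(cluster) {
  ids <- unique(cluster)
  ## sample.int() on the number of clusters avoids the pitfall that
  ## sample(x) treats a single number x as the range 1:x
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  unlist(lapply(drawn, function(id) which(cluster == id)))
}

airq <- airquality[complete.cases(airquality), ]
set.seed(1)
in_bag <- sample_whole_clusters(airq$Month)

## Proportion of each month's rows that appear in the sample:
## for a blocked sample this is 0 or 1 for every month
frac_in_bag <- tapply(seq_len(nrow(airq)) %in% in_bag, airq$Month, mean)
stopifnot(all(frac_in_bag %in% c(0, 1)))
frac_in_bag
```

Because clusters are never split, the observations left out of a given sample all come from clusters that were not drawn at all, which is the property that distinguishes this scheme from ordinary row-wise bootstrap sampling.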
Note that only random-intercept models are currently supported; random slopes cannot currently be specified.
An object of class 'premixed'.
## Example 1: Take into account clustered structure in rule induction
## as well as coefficient estimation:
library(pre)
set.seed(42)
airq <- airquality[complete.cases(airquality),]
airq.ens1 <- premixed(Ozone ~ 1 | Month | Solar.R + Wind + Temp + Day, data = airq, ntrees = 10)
airq.ens1
## Example 2: Take into account clustered structure in coefficient estimation
## only:
set.seed(42)
airq <- airquality[complete.cases(airquality),]
airq.ens2 <- premixed(Ozone ~ Solar.R + Wind + Temp + Day, cluster = "Month", data = airq,
                      ntrees = 10)
airq.ens2
## Example 3a: Take into account clustered structure in rule induction through
## bootstrap- or subsampling:
## Create a sampling function that bootstrap samples whole clusters:
bb_sampfunc <- function(cluster = airq$Month) {
  result <- c()
  for(i in sample(unique(cluster), replace = TRUE)) {
    result <- c(result, which(cluster == i))
  }
  result
}
## Employ blocked bootstrap sampling function in fitting PRE:
library(pre)
set.seed(42)
airq.ens3a.bs <- pre(Ozone ~ Solar.R + Wind + Temp + Day, data = airq, sampfrac = bb_sampfunc)
airq.ens3a.bs
## Create a sampling function that subsamples ~75% of the clusters:
ss_sampfunc <- function(cluster = airq$Month, sampfrac = .75) {
  result <- c()
  n_clusters <- round(length(unique(cluster)) * sampfrac)
  for(i in sample(unique(cluster), size = n_clusters, replace = FALSE)) {
    result <- c(result, which(cluster == i))
  }
  result
}
## Employ cluster subsampling in fitting PRE:
library(pre)
set.seed(42)
airq.ens3a.ss <- pre(Ozone ~ Solar.R + Wind + Temp + Day, data = airq, sampfrac = ss_sampfunc)
airq.ens3a.ss
## Example 3b: Take into account clustered structure in both rule induction and
## coefficient estimation:
## Generate fold ids:
airq <- airquality[complete.cases(airquality),]
foldids <- vector("numeric", length = nrow(airq))
counter <- 0
for (i in unique(airq$Month)) {
  counter <- counter + 1
  foldids[airq$Month == i] <- counter
}
foldids
## Employ cluster subsampling function for rule induction, as well as
## cluster-specific fold ids for estimating coefficients:
set.seed(42)
airq.ens3b.ss <- pre(Ozone ~ Solar.R + Wind + Temp + Day, data = airq, sampfrac = ss_sampfunc,
                     foldid = foldids)
airq.ens3b.ss
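The fold ids built in Example 3b give every cluster its own fold, so that no month is split across folds. The base-R sketch below rebuilds them with the loop from the example, compares them with a vectorised alternative based on match(), and checks the one-fold-per-cluster property. The name foldids2 is introduced here for illustration only.

```r
airq <- airquality[complete.cases(airquality), ]

## Loop-based fold ids, as in Example 3b
foldids <- vector("numeric", length = nrow(airq))
counter <- 0
for (i in unique(airq$Month)) {
  counter <- counter + 1
  foldids[airq$Month == i] <- counter
}

## Vectorised alternative: position of each month among the unique months
foldids2 <- match(airq$Month, unique(airq$Month))
stopifnot(all(foldids == foldids2))

## Each month maps to exactly one fold
folds_per_month <- tapply(foldids, airq$Month, function(f) length(unique(f)))
stopifnot(all(folds_per_month == 1))

## Cross-tabulate months against folds to inspect the assignment
table(Month = airq$Month, fold = foldids)
```

Both versions number the folds in order of first appearance of each month in the data, so either can be supplied wherever the example uses foldids.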