Function to perform the indirect bagging and subagging.
data frame of explanatory, intermediate and response variables.
list of lists, which describe models for the intermediate variables, details are given below.
either a fixed function with argument
number of bootstrap samples.
proportion of sample to be drawn from the learning sample. By default, subagging with 50% is performed, i.e. draw 0.5*n out of n without replacement.
logical. Draw with or without replacement.
additional arguments (e.g.
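The effect of the sampling proportion and of drawing with or without replacement can be sketched in base R. This is a minimal illustration under stated assumptions, not the internal code of the function: the names n, prop, bag, oob, boot and oob_boot are chosen for this example only, and the with-replacement case shown is the standard bootstrap draw of n out of n.

```r
# Illustrative sketch: drawing one bag and its out-of-bag (oob) complement.
# All object names here are hypothetical and chosen for the example.
set.seed(1)
n    <- 100   # size of the learning sample
prop <- 0.5   # proportion of the sample to be drawn

# Subagging: draw prop * n out of n observations without replacement
bag <- sample(seq_len(n), size = floor(prop * n), replace = FALSE)
oob <- setdiff(seq_len(n), bag)   # observations that were not drawn

# Standard bootstrap for comparison: draw n out of n with replacement
boot     <- sample(seq_len(n), size = n, replace = TRUE)
oob_boot <- setdiff(seq_len(n), boot)
```

Without replacement, the bag and the out-of-bag sample partition the learning sample; with replacement, some observations appear several times in the bag and the out-of-bag sample consists of those never drawn.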
A given data set is subdivided into three types of variables: explanatory, intermediate and response variables.
Here, each specified intermediate variable is modelled separately following pFUN, a list of lists with elements specifying an arbitrary number of models for the intermediate variables and an element training.set = c("oob", "bag", "all"). The element training.set determines whether the predictive models for the intermediate variables are calculated based on the out-of-bag sample ("oob"), the default, on the bag sample ("bag") or on all available observations ("all"). The elements of pFUN specifying the models for the intermediate variables are themselves lists.
Note that if no formula is given in these elements, the functional relationship specified in formula is used.
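The choice of training sample for the intermediate models can be sketched in base R as follows. This is a hedged illustration of the idea, not the package's implementation: the simulated data, the use of lm for every intermediate variable, and the names dat, training_set, rows, fits and W_hat are assumptions made for the example.

```r
# Illustrative sketch: fit one model per intermediate variable on the chosen
# training sample, then predict the intermediates for the bag sample.
# Data and object names are hypothetical.
set.seed(2)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
W <- data.frame(w1 = X$x1 + rnorm(n, sd = 0.1),
                w2 = X$x2 - X$x1 + rnorm(n, sd = 0.1))
dat <- cbind(W, X)

bag <- sample(seq_len(n), size = n / 2)   # drawn without replacement
oob <- setdiff(seq_len(n), bag)

# Select the rows used to fit the intermediate models
training_set <- "oob"
rows <- switch(training_set, oob = oob, bag = bag, all = seq_len(n))

# Each intermediate variable is modelled separately from the explanatory ones
formulas <- list(w1 = w1 ~ x1 + x2, w2 = w2 ~ x1 + x2)
fits <- lapply(formulas, function(f) lm(f, data = dat[rows, ]))

# Predicted intermediate variables for the observations in the bag
W_hat <- as.data.frame(lapply(fits, predict, newdata = dat[bag, ]))
```

With "oob", the intermediate models are fitted on observations that are disjoint from the bag, so the predicted intermediates for the bag are not in-sample fits.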
The response variable is modelled following cFUN.
This can either be a fixed classifying function as described in Peters et al. (2003) or a list, which specifies the modelling technique to be applied. The list contains the arguments model (which model to be fitted), predict (optional, how to predict), formula (optional; a formula of the form y~w1+w2+w3+x1+x2 determines the variables the classifying function is based on) and the optional argument training.set = c("fitted.bag", "original", "fitted.subset"), specifying whether the classifying function is trained on the predicted observations of the bag sample ("fitted.bag"), on the original observations ("original") or on the predicted observations not included in a defined subset ("fitted.subset"). By default, the formula specified in formula determines the variables the classifying function is based on.
Note that the default of cFUN = list(model = NULL, training.set = "fitted.bag") uses a default model function together with the predict function predict(object, newdata, type = "class").
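The two-stage idea behind the classifying function can be sketched in base R. This is a simplified illustration under stated assumptions, not the function's actual code: it uses a single bag, one intermediate variable modelled by lm, and a logistic regression (glm) as a stand-in classifier. Reading "original" as training on the observed intermediate values is an interpretation made for this example, and all object names are hypothetical.

```r
# Illustrative sketch: indirect classification for a single bag.
# Stage 1 predicts the intermediate variable from the explanatory variables;
# stage 2 classifies the response from the intermediate variable.
set.seed(3)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
w1 <- x1 + x2 + rnorm(n, sd = 0.5)
y  <- factor(ifelse(w1 + rnorm(n, sd = 0.5) > 0, "pos", "neg"))
dat <- data.frame(y, w1, x1, x2)

bag <- sample(seq_len(n), size = n / 2)
oob <- setdiff(seq_len(n), bag)

# Stage 1: intermediate model fitted on the out-of-bag sample
fit_w <- lm(w1 ~ x1 + x2, data = dat[oob, ])

# Stage 2 training data: predicted intermediates of the bag ("fitted.bag")
# versus observed intermediates (assumed meaning of "original")
fitted_bag <- data.frame(y  = dat$y[bag],
                         w1 = predict(fit_w, newdata = dat[bag, ]))
original   <- dat[, c("y", "w1")]

clf_fitted   <- glm(y ~ w1, family = binomial, data = fitted_bag)
clf_original <- glm(y ~ w1, family = binomial, data = original)

# A new observation is classified from its explanatory variables only:
# first predict the intermediate, then apply the classifier
new_x <- data.frame(x1 = 1, x2 = 1)
new_w <- data.frame(w1 = predict(fit_w, newdata = new_x))
p     <- predict(clf_fitted, newdata = new_w, type = "response")
pred_class <- levels(dat$y)[as.integer(p > 0.5) + 1]
```

In a bagged version, these two stages would be repeated for every bag and the resulting class predictions aggregated.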
An object of class "inbagg", that is a list with elements:
a list of length
vector of response values.
data frame of intermediate variables.
data frame of explanatory variables.
David J. Hand, Hua Gui Li, Niall M. Adams (2001), Supervised classification with structured class definitions. Computational Statistics & Data Analysis 36, 209–225.
Andrea Peters, Berthold Lausen, Georg Michelson and Olaf Gefeller (2003), Diagnosis of glaucoma by indirect classifiers. Methods of Information in Medicine 1, 99–103.
library("ipred")  # provides inbagg and mypredict.lm
library("MASS")   # provides mvrnorm
library("rpart")  # provides rpart

# y has length 100; data.frame() recycles it to the 200 rows of W and X
y <- as.factor(sample(1:2, 100, replace = TRUE))
W <- mvrnorm(n = 200, mu = rep(0, 3), Sigma = diag(3))
X <- mvrnorm(n = 200, mu = rep(2, 3), Sigma = diag(3))
colnames(W) <- c("w1", "w2", "w3")
colnames(X) <- c("x1", "x2", "x3")
DATA <- data.frame(y, W, X)

# One list per model for the intermediate variables
pFUN <- list(list(formula = w1~x1+x2, model = lm, predict = mypredict.lm),
             list(model = rpart))
inbagg(y~w1+w2+w3~x1+x2+x3, data = DATA, pFUN = pFUN)