bag.s: Subsampled Model Bagging


View source: R/MSE_Test_File.R

Description

An implementation of bagging over different subsampled base models, to then be used in the MSE comparison procedure.

Usage

bag.s(X, y, base.learner = "rpart", ntree, k, mtry = ncol(X),
  form = as.formula("y~."), alpha = if (base.learner == "lm") 1 else NULL,
  glm_cv = if (base.learner == "lm") "external" else "none",
  lambda = if (glm_cv == "none" & base.learner == "lm") 1 else NULL, ranger = F)

Arguments

X

Data frame of covariates.

y

Response vector. Currently only numeric responses (regression) are supported.

base.learner

One of "rpart", "ctree", "rtree", or "lm". Base model to be used in the bagging.

ntree

Number of base learners.

k

Subsample size - each model is trained on k < n observations drawn without replacement.

mtry

"mtry" parameter associated with random forest models.

form

A "formula" object - no need to provide this by default.

alpha

Elastic net mixing parameter, used when base.learner = "lm"; alpha = 1 corresponds to the LASSO penalty and alpha = 0 to the ridge penalty.

glm_cv

Should internal cross-validation be performed on each elastic net model? Defaults to "external" when base.learner = "lm"; if "none", a fixed lambda is used instead (see the example after the argument descriptions).

lambda

Regularization parameter, used when base.learner = "lm" and glm_cv = "none".

ranger

If base.learner = "rtree" or base.learner = "ctree", should the models be ranger objects or randomForest objects (if rtree is chosen) or cforest objects (if ctree is chosen.)

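For instance, the elastic net arguments interact as follows. This is a sketch only; the calls assume the simulated training data X (with response column Y) built in the Examples section below.

# cross-validated lambda for each elastic net model (alpha = 1 gives the LASSO)
b.cv  <- bag.s(X = X, y = X$Y, base.learner = "lm", ntree = 10, k = 500,
               alpha = 1, glm_cv = "external")

# fixed regularization instead: no cross-validation, lambda supplied directly
b.fix <- bag.s(X = X, y = X$Y, base.learner = "lm", ntree = 10, k = 500,
               alpha = 0.5, glm_cv = "none", lambda = 0.1)
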
Details

This function is not intended to be used as a standalone function; rather, it is called by the MSE_Test function.

Value

A list of length ntree, each element containing a fitted base learner model.

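For example, with base.learner = "rpart" the result is simply a list of ntree fitted trees. A minimal sketch, assuming the simulated training data X (with response column Y) from the Examples section below:

fits <- bag.s(X = X, y = X$Y, base.learner = "rpart", ntree = 25, k = nrow(X)^0.9,
              form = Y~.)
length(fits)      # 25, one fitted model per base learner
class(fits[[1]])  # expected to be an rpart fit
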
Author(s)

Tim Coleman

See Also

MSE_Test

Examples

N <- 1250
Nvar <- 10
N_test <- 150
name_vec <- paste("X", 1:(2*Nvar), sep = "")

library(dplyr)   # for mutate() and the %>% pipe

# training data:
X <- data.frame(replicate(Nvar, runif(N)),
                replicate(Nvar, cut(runif(N), 3,
                                    labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N, sd = .05))
names(X) <- c(name_vec, "Y")

# some testing data:
X.t1 <- data.frame(replicate(Nvar, runif(N_test)),
                   replicate(Nvar, cut(runif(N_test), 3,
                                       labels = as.character(1:3)))) %>%
  mutate(Y = 5*(X3) + .5*X2^2 + ifelse(X6 > 10*X1*X8*X9, 1, 0) + rnorm(N_test, sd = .05))
names(X.t1) <- c(name_vec, "Y")


## Trying each base learner
## (each call supplies the covariate frame X and the response X$Y)
b.rpart <- bag.s(X = X, y = X$Y,
                 base.learner = "rpart", ntree = 10, k = N^.85, mtry = 10, form = Y~.)
b.ctree <- bag.s(X = X, y = X$Y,
                 base.learner = "ctree", ntree = 10, k = N^.95, mtry = 2)
b.rf <- bag.s(X = X, y = X$Y,
              base.learner = "rtree", ntree = 10, k = N^.95, mtry = 2, form = Y~., ranger = F)
b.glmnet <- bag.s(X = X, y = X$Y,
                  base.learner = "lm", ntree = 10, k = N^.95, mtry = 2)
