screen.rf.fuzzy: Random forest screener selects pre-specified variables


Description

Random forest screener for SuperLearner() that selects user-specified variables in addition to the variables it chooses data-adaptively.

Usage

screen.rf.fuzzy(Y, X, family, nVar = 10, ntree = 500,
  mtry = ifelse(family$family == "gaussian", floor(sqrt(ncol(X))),
    max(floor(ncol(X)/3), 1)),
  nodesize = ifelse(family$family == "gaussian", 5, 1), ...)

Arguments

Y

outcome variable (specified in SuperLearner())

X

data frame of candidate predictor variables (specified in SuperLearner())

nVar

number of top-ranked variables for the screener to select (default 10)

var.index

indices of variables to always be included by the screener

Details

Use this function rather than screen.rf.fix.exact if you do not need the screener to return an exact number of variables. This function is faster, but it will not necessarily pass exactly nVar variables to SuperLearner(). screen.rf.fuzzy selects the top nVar variables and then makes sure the user-specified variables are also passed to SuperLearner(). If all of the user-specified variables are already among the top nVar variables, exactly nVar variables are passed to SuperLearner(). If any user-specified variable falls outside the top nVar, more than nVar variables are passed.
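
The selection rule described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the importance scores are made-up numbers standing in for a random forest's variable importance, and the assumption that "top" means highest importance is ours. The result is a logical vector with one entry per column of X, which is the form SuperLearner() screeners return.

```r
# Hypothetical importance scores for six predictors (made-up numbers,
# standing in for a random forest's variable importance).
importance <- c(x1 = 0.9, x2 = 0.1, x3 = 0.7, x4 = 0.3, x5 = 0.05, x6 = 0.5)
nVar <- 3
var.index <- c(2, 3)  # columns the user always wants to keep

# Flag the nVar predictors with the highest importance.
top <- rank(-importance) <= nVar

# Force the user-specified columns in as well.
keep <- top
keep[var.index] <- TRUE

names(importance)[keep]  # "x1" "x2" "x3" "x6"
sum(keep)                # 4, one more than nVar
```

Here x3 is already in the top three, so forcing it in changes nothing, while x2 is not, so four variables are kept instead of three.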

Super Learner

See SuperLearner() documentation for information on additional arguments and instructions on implementing SuperLearner().

See Also

screen.glmnet.fix for the lasso screener, screen.rf.exact for the exact random forest screener.

Examples

# If you do not know the indices of the variables you always want to include,
# you can get them from the variable names, where newdat is the name of the
# data frame:

var.index <- c(which(colnames(newdat) == "sex"), which(colnames(newdat) == "age"))
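
As an alternative sketch, base R's match() looks up several names in one call and returns NA for a name that is not found, which makes typos easier to catch than which(), which silently returns an empty result. The toy data frame below is invented for illustration; only the column names "sex" and "age" come from the example above.

```r
# Toy data frame (invented values) with the two columns of interest.
newdat <- data.frame(
  id  = 1:3,
  age = c(34, 51, 27),
  bmi = c(22.1, 30.4, 25.0),
  sex = c("F", "M", "F")
)

# Column positions, in the order the names are given.
var.index <- match(c("sex", "age"), colnames(newdat))
var.index  # 4 2

# Stop early if any requested column name was not found.
stopifnot(!anyNA(var.index))
```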

sl-bergquist/SLscreeners documentation built on Dec. 2, 2019, 1:29 a.m.