screen.ranger: Screen features via a fast implementation of Random Forest
In saraemoore/SLScreenExtra: A Collection of Additional Feature Selection Algorithms and Utilities for SuperLearner

screen.ranger

R Documentation

Screen features via a fast implementation of Random Forest

Description

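A faster alternative to screen.randomForest or screen.randomForest.imp, built on ranger. Features are ranked by random forest variable importance and then selected with one of the FSelector cutoff selectors.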

Usage

screen.ranger(
  Y,
  X,
  family,
  selector = c("cutoff.biggest.diff", "cutoff.k", "cutoff.k.percent"),
  k = switch(selector, cutoff.k = ceiling(0.5 * ncol(X)), cutoff.k.percent = 0.5, NULL),
  nTree = 1000,
  mTry = ifelse(family$family == "gaussian", floor(sqrt(ncol(X))), max(floor(ncol(X)/3),
    1)),
  nodeSize = ifelse(family$family == "gaussian", 5, 1),
  importanceType = c("permutation", "impurity"),
  scalePermutationImportance = TRUE,
  probabilityTrees = FALSE,
  numThreads = 1,
  verbose = FALSE,
  ...
)

Arguments

`Y`	Outcome (numeric vector). See `SuperLearner` for specifics.
`X`	Predictor variable(s) (data.frame or matrix). See `SuperLearner` for specifics.
`family`	Error distribution to be used in the model: `gaussian` or `binomial`. Determines the defaults for `mTry` and `nodeSize`. See `SuperLearner` for specifics.
`selector`	A string corresponding to a subset selecting function implemented in the FSelector package. One of: `cutoff.biggest.diff` (default), `cutoff.k`, or `cutoff.k.percent`.
`k`	Passed through to the `selector` in the case where `selector` is `cutoff.k` or `cutoff.k.percent`. Otherwise, should remain NULL (the default). For `cutoff.k`, this is an integer indicating the number of features to keep from `X`. For `cutoff.k.percent`, this is instead the proportion of features to keep.
`nTree`	Integer. Number of trees. Default: 1000.
`mTry`	Integer. Number of columns of `X` sampled at each split. Default: square root (`gaussian()` family) or one third (`binomial()` family, with a minimum of 1) of the total number of features, rounded down.
`nodeSize`	Integer. Minimum number of observations in terminal nodes. Default: 5 (`gaussian()` family) or 1 (`binomial()` family).
`importanceType`	Importance type. `"permutation"` (default) indicates mean decrease in accuracy (for `binomial()` family) or percent increase in mean squared error (for `gaussian()` family) when comparing predictions using the original variable versus a permuted version of the variable (column of `X`). `"impurity"` indicates increase in node purity achieved by splitting on that column of `X` (for `binomial()` family, measured by Gini index; for `gaussian()`, measured by variance of the responses). See `ranger` for more details.
`scalePermutationImportance`	Scale permutation importance by standard error. Ignored if `importanceType = "impurity"`. See `ranger` for more details.
`probabilityTrees`	Logical. If family is `binomial()` and `probabilityTrees` is FALSE (the default), classification trees are grown. If family is `binomial()` and `probabilityTrees` is TRUE, probability trees are grown (Malley et al., 2012). Ignored if family is `gaussian()`, for which regression trees are always grown. See `ranger` for more details.
`numThreads`	Number of threads. Default: 1.
`verbose`	Should debugging messages be printed? Default: `FALSE`.
`...`	Currently unused.
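
The default expressions for k, mTry and nodeSize shown under Usage can be evaluated by hand to see what a given call would use. The sketch below copies those expressions into a hypothetical helper, default_args, which is not part of SLScreenExtra, and assumes selector has already been reduced to a single string.

```r
# Hypothetical helper (not part of SLScreenExtra): evaluates the default
# expressions from the Usage section for a given X, family and selector.
default_args <- function(X, family, selector) {
  list(
    # k is only defined for the two selectors that take it; otherwise NULL.
    k = switch(selector,
               cutoff.k = ceiling(0.5 * ncol(X)),
               cutoff.k.percent = 0.5,
               NULL),
    mTry = ifelse(family$family == "gaussian",
                  floor(sqrt(ncol(X))),
                  max(floor(ncol(X) / 3), 1)),
    nodeSize = ifelse(family$family == "gaussian", 5, 1)
  )
}

# mtcars without mpg has 10 predictors: k = 5, mTry = 3, nodeSize = 5.
reg_defaults <- default_args(mtcars[, -1], gaussian(), "cutoff.k")

# iris has 4 predictors: k is NULL, mTry = 1, nodeSize = 1.
cls_defaults <- default_args(iris[, 1:4], binomial(), "cutoff.biggest.diff")

str(reg_defaults)
str(cls_defaults)
```

Passing k, mTry or nodeSize explicitly to screen.ranger overrides these values.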

Value

A logical vector with length equal to ncol(X), with TRUE marking the columns of X that the selector keeps.
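
To make the shape of the return value concrete, the sketch below builds such a logical vector by hand in base R. The importance scores are made-up numbers, and the top-k rule only illustrates the idea behind "cutoff.k" (keep k features); it is not the FSelector implementation, and the names on the result are added here for readability only.

```r
# Predictors and hypothetical importance scores, one per column of X.
X <- iris[, 1:4]
importance <- c(Sepal.Length = 0.10, Sepal.Width = 0.02,
                Petal.Length = 0.45, Petal.Width = 0.43)

# Keep the k highest-scoring features.
k <- 2
selected <- names(sort(importance, decreasing = TRUE))[seq_len(k)]

# One logical per column of X, in column order.
keep <- colnames(X) %in% selected
names(keep) <- colnames(X)
keep

# A logical vector of this shape can be used directly to subset X.
X_screened <- X[, keep, drop = FALSE]
colnames(X_screened)
```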

References

http://dx.doi.org/10.18637/jss.v077.i01
http://dx.doi.org/10.1023/A:1010933404324
http://dx.doi.org/10.3414/ME00-01-0052

Examples

data(iris)
Y <- as.numeric(iris$Species=="setosa")
X <- iris[,-which(colnames(iris)=="Species")]
screen.ranger(Y, X, binomial(), selector = "cutoff.k.percent", k = 0.75)

data(mtcars)
Y <- mtcars$mpg
X <- mtcars[,-which(colnames(mtcars)=="mpg")]
screen.ranger(Y, X, gaussian(), importanceType = "impurity")

# based on examples in SuperLearner package
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- X[, 1] + sqrt(abs(X[, 2] * X[, 3])) + X[, 2] - X[, 3] + rnorm(n)

library(SuperLearner)
sl <- SuperLearner(Y, X, family = gaussian(), cvControl = list(V = 2),
                   SL.library = list(c("SL.glm", "All"),
                                     c("SL.glm", "screen.ranger")))
sl
sl$whichScreen

saraemoore/SLScreenExtra documentation built on Nov. 4, 2023, 9:31 p.m.
