screen.FSelector.cfs: Correlation Feature Selection (CFS) screening algorithm

View source: R/fselector.R

screen.FSelector.cfs    R Documentation

Correlation Feature Selection (CFS) screening algorithm

Description

CFS (Hall, 1999) uses best.first.search (from the FSelector package) to find a subset of the columns of X that are correlated with Y but not with one another (i.e., not redundant). Because CFS scores whole subsets through a search algorithm rather than ranking individual features, this screening function does not accept either the number of features to be chosen (k) or the method by which they should be chosen (selector).
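
The subset-scoring idea described above can be sketched in base R. This is an illustration only, not the FSelector implementation: it assumes absolute Pearson correlation as the correlation measure and swaps best-first search for a simpler greedy forward search. The helper names cfs_merit and greedy_cfs are hypothetical and do not exist in any package.

```r
# Sketch of the CFS merit heuristic with absolute Pearson correlation
# (an assumption; FSelector may use a different correlation measure).
# cfs_merit() and greedy_cfs() are hypothetical helper names.
cfs_merit <- function(X, Y, subset) {
  k <- length(subset)
  if (k == 0) return(0)
  Xs <- X[, subset, drop = FALSE]
  # Mean feature-outcome correlation
  r_cf <- mean(abs(cor(Xs, Y)))
  # Mean pairwise feature-feature correlation (0 for a single feature)
  r_ff <- 0
  if (k > 1) {
    cm <- abs(cor(Xs))
    r_ff <- mean(cm[upper.tri(cm)])
  }
  (k * r_cf) / sqrt(k + k * (k - 1) * r_ff)
}

# Greedy forward search: a simplified stand-in for best-first search.
# Assumes no column of X is constant (otherwise cor() returns NA).
greedy_cfs <- function(X, Y) {
  selected <- character(0)
  best <- 0
  repeat {
    candidates <- setdiff(colnames(X), selected)
    if (length(candidates) == 0) break
    merits <- vapply(candidates,
                     function(v) cfs_merit(X, Y, c(selected, v)),
                     numeric(1))
    # Stop when no candidate improves the merit of the current subset
    if (max(merits) <= best) break
    best <- max(merits)
    selected <- c(selected, names(which.max(merits)))
  }
  # Logical vector, one element per column of X
  colnames(X) %in% selected
}

data(mtcars)
Y <- mtcars$mpg
X <- mtcars[, colnames(mtcars) != "mpg"]
keep <- greedy_cfs(X, Y)
colnames(X)[keep]
```

The merit rises with the average correlation between the chosen columns and Y and falls with the average correlation among the chosen columns, which is why redundant predictors tend to be left out.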

Usage

screen.FSelector.cfs(Y, X, family, verbose = FALSE, ...)

Arguments

Y

Outcome (numeric vector). See SuperLearner for specifics.

X

Predictor variable(s) (data.frame or matrix). See SuperLearner for specifics.

family

Error distribution to be used in the model: gaussian or binomial. Currently unused. See SuperLearner for specifics.

verbose

Should debugging messages be printed? Default: FALSE.

...

Currently unused.

Value

A logical vector with length equal to ncol(X). TRUE marks a column of X that was selected; FALSE marks a column that was screened out.
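
A minimal sketch of how such a logical vector can be used to subset the predictors. The vector keep below is made up for illustration and merely stands in for the result of a call such as screen.FSelector.cfs(Y, X, gaussian()); it is not actual output of the function.

```r
# `keep` is a hypothetical stand-in for the screening result;
# the selected columns here are chosen arbitrarily for illustration.
data(mtcars)
X <- mtcars[, colnames(mtcars) != "mpg"]
keep <- colnames(X) %in% c("cyl", "wt")

# The vector has one element per column of X
stopifnot(is.logical(keep), length(keep) == ncol(X))

# drop = FALSE keeps a data.frame even if only one column is selected
X_screened <- X[, keep, drop = FALSE]
colnames(X_screened)
```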

References

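Hall, M. A. (1999). Correlation-based Feature Selection for Machine Learning. PhD thesis, University of Waikato. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.4643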

Examples

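library(SLScreenExtra)  # provides screen.FSelector.cfs

# Binary outcome: is the species "setosa"?
data(iris)
Y <- as.numeric(iris$Species == "setosa")
X <- iris[, -which(colnames(iris) == "Species")]
screen.FSelector.cfs(Y, X, binomial())

# Continuous outcome: miles per gallon
data(mtcars)
Y <- mtcars$mpg
X <- mtcars[, -which(colnames(mtcars) == "mpg")]
screen.FSelector.cfs(Y, X, gaussian())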

# based on examples in SuperLearner package
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- X[, 1] + sqrt(abs(X[, 2] * X[, 3])) + X[, 2] - X[, 3] + rnorm(n)

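library(SuperLearner)
# Pair each learner with a screening algorithm: "All" keeps every column
sl <- SuperLearner(Y, X, family = gaussian(), cvControl = list(V = 2),
                   SL.library = list(c("SL.glm", "All"),
                                     c("SL.glm.interaction", "screen.FSelector.cfs")))
sl
# Columns retained by each screening algorithm
sl$whichScreen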

saraemoore/SLScreenExtra documentation built on Nov. 4, 2023, 9:31 p.m.