screen.FSelector.entropy: Entropy-based screening algorithms

View source: R/fselector.R

screen.FSelector.entropy    R Documentation

Entropy-based screening algorithms

Description

Information gain, gain ratio, and symmetrical uncertainty scores are calculated from the Shannon entropy of X and Y. Information gain (information.gain), equivalently the mutual information, measures the reduction in the entropy of the outcome Y achieved by a feature. The information gain ratio (gain.ratio) is information gain normalized by the entropy of the feature. Symmetrical uncertainty (symmetrical.uncertainty) is a normalized and bias-corrected version of information gain. Implemented for the binomial() family only and designed for binary or categorical X; continuous X will be discretized by FSelector and Discretize using the MDL method (Fayyad & Irani, 1993).
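The three scores can be written in terms of the marginal entropies H(X), H(Y) and the joint entropy H(X, Y). A minimal base-R sketch of the definitions for categorical vectors follows; it is shown only to illustrate the formulas and is not the FSelector implementation (which also handles discretization and ties):

```r
# Shannon entropy of a categorical vector (natural log)
shannon_entropy <- function(v) {
  p <- table(v) / length(v)
  -sum(p * log(p))
}

# Information gain = mutual information: H(X) + H(Y) - H(X, Y)
information_gain <- function(x, y) {
  shannon_entropy(x) + shannon_entropy(y) - shannon_entropy(paste(x, y))
}

# Gain ratio: information gain normalized by the entropy of the feature
gain_ratio <- function(x, y) {
  information_gain(x, y) / shannon_entropy(x)
}

# Symmetrical uncertainty: 2 * IG / (H(X) + H(Y))
symmetrical_uncertainty <- function(x, y) {
  2 * information_gain(x, y) / (shannon_entropy(x) + shannon_entropy(y))
}

x <- c("a", "a", "b", "b")
y <- c(0, 0, 1, 1)
information_gain(x, y)         # log(2): x fully determines y
symmetrical_uncertainty(x, y)  # 1: x and y carry identical information
```

Because symmetrical uncertainty divides by the sum of both entropies, it lies in [0, 1] and compensates for information gain's bias toward features with many levels.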

Usage

screen.FSelector.entropy(
  Y,
  X,
  family,
  filter = c("symmetrical.uncertainty", "gain.ratio", "information.gain"),
  unit = formals(information.gain)$unit,
  selector = c("cutoff.biggest.diff", "cutoff.k", "cutoff.k.percent"),
  k = switch(selector, cutoff.k = ceiling(0.5 * ncol(X)), cutoff.k.percent = 0.5, NULL),
  verbose = FALSE,
  ...
)

Arguments

Y

Outcome (numeric vector). See SuperLearner for specifics.

X

Predictor variable(s) (data.frame or matrix). See SuperLearner for specifics.

family

Error distribution to be used in the model: gaussian or binomial. Currently unused. See SuperLearner for specifics.

filter

Character string. One of: "symmetrical.uncertainty" (default), "gain.ratio", or "information.gain".

unit

Character string giving the unit in which entropy is measured, passed through to FSelector's entropy-based filters. One of: "log" (default, natural logarithm), "log2", or "log10".

selector

A string corresponding to a subset-selecting function implemented in the FSelector package. One of: "cutoff.biggest.diff", "cutoff.k", "cutoff.k.percent", or "all". Note that "all" is not a function but indicates that pass-through should be performed in the case of a filter which selects rather than ranks features. Default: "cutoff.biggest.diff".

k

Passed through to the selector in the case where selector is cutoff.k or cutoff.k.percent. Otherwise, should remain NULL (the default). For cutoff.k, this is an integer indicating the number of features to keep from X. For cutoff.k.percent, this is instead the proportion of features to keep.

verbose

Should debugging messages be printed? Default: FALSE.

...

Currently unused.

Value

A logical vector with length equal to ncol(X).
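A TRUE element indicates that the corresponding column of X passed the screen, so the vector can be used directly to subset the predictors. A small base-R illustration with a made-up screening result:

```r
X <- data.frame(a = 1:3, b = 4:6, c = 7:9)

keep <- c(TRUE, FALSE, TRUE)  # hypothetical output of a screening algorithm
X[, keep, drop = FALSE]       # retains columns a and c
```

The drop = FALSE guards against the result collapsing to a plain vector when only one column survives screening.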

References

Fayyad, U. M. & Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.4643
http://hdl.handle.net/2014/35171

Examples

data(iris)
Y <- as.numeric(iris$Species=="setosa")
X <- iris[,-which(colnames(iris)=="Species")]
screen.FSelector.entropy(Y, X, binomial(), selector = "cutoff.k.percent", k = 0.75)

# based on example in SuperLearner package
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- rbinom(n, 1, plogis(.2*X[, 1] + .1*X[, 2] - .2*X[, 3] + .1*X[, 3]*X[, 4] - .2*abs(X[, 4])))

library(SuperLearner)
sl <- SuperLearner(Y, X, family = binomial(), cvControl = list(V = 2),
                   SL.library = list(c("SL.lm", "All"),
                                     c("SL.lm", "screen.FSelector.entropy")))
sl
sl$whichScreen

saraemoore/SLScreenExtra documentation built on Nov. 4, 2023, 9:31 p.m.