screen.FSelector.entropy (R Documentation)
Description

Information gain, gain ratio, and symmetrical uncertainty scores are calculated from the Shannon entropy of X and Y. Information gain (information.gain), or equivalently mutual information, measures the reduction in entropy of the outcome Y achieved by a feature. The information gain ratio (gain.ratio) is information gain normalized by the entropy of the feature. Symmetrical uncertainty (symmetrical.uncertainty) is a normalized and bias-corrected version of information gain. Implemented for the binomial() family only and designed to be used with binary or categorical X. Continuous X will be discretized by FSelector and Discretize using the MDL method (Fayyad & Irani, 1993).
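The three scores above are simple functions of Shannon entropies. The sketch below (base R only, hand-rolled for illustration; the package itself delegates to FSelector) computes them for a binary feature x and binary outcome y, using information gain IG = H(X) + H(Y) - H(X,Y), gain ratio IG / H(X), and symmetrical uncertainty 2 * IG / (H(X) + H(Y)).

```r
# Illustrative only: entropy-based filter scores in bits (log2).
# FSelector computes these internally; this just shows the arithmetic.
entropy <- function(v) {
  p <- table(v) / length(v)
  -sum(p * log2(p))
}
joint_entropy <- function(x, y) {
  p <- table(x, y) / length(x)
  p <- p[p > 0]                      # drop empty cells (0 * log 0 = 0)
  -sum(p * log2(p))
}
info_gain <- function(x, y) entropy(x) + entropy(y) - joint_entropy(x, y)

x <- c(0, 0, 1, 1)
y <- c(0, 0, 1, 1)                   # x determines y perfectly
ig <- info_gain(x, y)                # equals H(Y) = 1 bit here
gr <- ig / entropy(x)                # gain ratio: normalized by H(X)
su <- 2 * ig / (entropy(x) + entropy(y))  # symmetrical uncertainty
```

For a perfectly informative feature, as here, all three scores equal 1; for a feature independent of y they are all 0.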
Usage

screen.FSelector.entropy(
  Y,
  X,
  family,
  filter = c("symmetrical.uncertainty", "gain.ratio", "information.gain"),
  unit = formals(information.gain)$unit,
  selector = c("cutoff.biggest.diff", "cutoff.k", "cutoff.k.percent"),
  k = switch(selector,
             cutoff.k = ceiling(0.5 * ncol(X)),
             cutoff.k.percent = 0.5,
             NULL),
  verbose = FALSE,
  ...
)
Arguments

Y: Outcome (numeric vector).

X: Predictor variable(s) (data.frame or matrix).

family: Error distribution to be used in the model. Only binomial() is supported.

filter: Character string; one of "symmetrical.uncertainty", "gain.ratio", or "information.gain".

unit: Unit in which entropy is measured; passed through to the FSelector scoring function. Defaults to the default of information.gain.

selector: A string naming a subset-selecting function implemented in the FSelector package; one of "cutoff.biggest.diff", "cutoff.k", or "cutoff.k.percent".

k: Passed through to the selector function: the number of features to keep for cutoff.k (default ceiling(0.5 * ncol(X))), or the proportion of features to keep for cutoff.k.percent (default 0.5); unused otherwise.

verbose: Should debugging messages be printed? Default: FALSE.

...: Currently unused.
Value

A logical vector with length equal to ncol(X), indicating which columns of X pass the screen.
References

Fayyad & Irani (1993):
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.4643
http://hdl.handle.net/2014/35171
Examples

data(iris)
Y <- as.numeric(iris$Species=="setosa")
X <- iris[,-which(colnames(iris)=="Species")]
screen.FSelector.entropy(Y, X, binomial(), selector = "cutoff.k.percent", k = 0.75)
# based on example in SuperLearner package
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
X <- data.frame(X)
Y <- rbinom(n, 1, plogis(.2*X[, 1] + .1*X[, 2] - .2*X[, 3] + .1*X[, 3]*X[, 4] - .2*abs(X[, 4])))
library(SuperLearner)
sl <- SuperLearner(Y, X, family = binomial(), cvControl = list(V = 2),
                   SL.library = list(c("SL.lm", "All"),
                                     c("SL.lm", "screen.FSelector.entropy")))
sl
sl$whichScreen