View source: R/information_gain.R
| information_gain | R Documentation | 
Algorithms that rank the importance of discrete attributes based on their entropy with a continuous class attribute. This function is a reimplementation of FSelector's information.gain, gain.ratio and symmetrical.uncertainty.
information_gain(
  formula,
  data,
  x,
  y,
  type = c("infogain", "gainratio", "symuncert"),
  equal = FALSE,
  discIntegers = TRUE,
  nbins = 5,
  threads = 1
)
| formula | An object of class formula with model description. | 
| data | A data.frame accompanying formula. | 
| x | A data.frame or sparse matrix with attributes. | 
| y | A vector with response variable. | 
| type | Method name: one of "infogain", "gainratio" or "symuncert". | 
| equal | A logical. Whether to discretize dependent variable with the | 
| discIntegers | A logical. If TRUE (default), integers are treated as numeric vectors and discretized. If FALSE, integers are treated as factors and left as is. | 
| nbins | Number of bins used for discretization. Only used if 'equal = TRUE' and the response is numeric. | 
| threads | Defunct. Number of threads for the parallel backend; now turned off for safety reasons. | 
type = "infogain" is 
H(Class) + H(Attribute) - H(Class, Attribute)
type = "gainratio" is 
\frac{H(Class) + H(Attribute) - H(Class, Attribute)}{H(Attribute)}
type = "symuncert" is 
2\frac{H(Class) + H(Attribute) - H(Class, Attribute)}{H(Attribute) + H(Class)}
where H(X) is the Shannon entropy of a variable X and H(X, Y) is the joint Shannon entropy of the variables X and Y.
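The three measures can be checked by hand on a toy example. The sketch below uses base R only and is not the package's implementation: the helper `entropy()` and the vectors `class_var` and `attribute` are hypothetical names introduced for illustration, and the use of the natural logarithm is an assumption.

```r
# Shannon entropy (natural log, assumed) of the empirical distribution.
# Passing two vectors gives the joint entropy H(X, Y).
entropy <- function(...) {
  counts <- table(...)
  p <- counts[counts > 0] / sum(counts)  # drop empty cells, normalise
  -sum(p * log(p))
}

# Toy data: the attribute determines the class exactly.
class_var <- c("a", "a", "b", "b")
attribute <- c("x", "x", "y", "y")

h_class <- entropy(class_var)
h_attr  <- entropy(attribute)
h_joint <- entropy(class_var, attribute)

infogain  <- h_class + h_attr - h_joint          # type = "infogain"
gainratio <- infogain / h_attr                   # type = "gainratio"
symuncert <- 2 * infogain / (h_attr + h_class)   # type = "symuncert"
```

Because the attribute fully determines the class here, all three entropies equal log(2), so `infogain` is log(2) while `gainratio` and `symuncert` are both 1.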
A data.frame with the following columns:
attributes - variable names.
importance - worth of the attributes.
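A result of this shape is typically sorted to pick the strongest attributes. The sketch below builds a stand-in data.frame with the two documented columns by hand, in base R, rather than calling information_gain; the helper `entropy()`, the choice of `mtcars` columns and the names `scores`, `ranked` and `top2` are assumptions made for illustration only.

```r
# Hypothetical helper: Shannon entropy (natural log) of one or more vectors.
entropy <- function(...) {
  counts <- table(...)
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Discrete columns of the built-in mtcars data, with `am` as the class.
disc <- mtcars[c("cyl", "vs", "gear", "carb")]
cls  <- mtcars$am

# Stand-in with the documented columns: attributes and importance.
scores <- data.frame(
  attributes = names(disc),
  importance = vapply(
    disc,
    function(a) entropy(cls) + entropy(a) - entropy(cls, a),
    numeric(1),
    USE.NAMES = FALSE
  ),
  stringsAsFactors = FALSE
)

# Rank attributes from most to least informative and keep the top two.
ranked <- scores[order(scores$importance, decreasing = TRUE), ]
top2   <- head(ranked$attributes, 2)
```

The same `order()` and `head()` steps should apply unchanged to the data.frame returned by information_gain, since they rely only on the `attributes` and `importance` columns.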
Zygmunt Zawadzki zygmunt@zstat.pl
library(FSelectorRcpp)

irisX <- iris[-5]
y <- iris$Species
# data.frame interface
information_gain(x = irisX, y = y)
# formula interface
information_gain(formula = Species ~ ., data = iris)
information_gain(formula = Species ~ ., data = iris, type = "gainratio")
information_gain(formula = Species ~ ., data = iris, type = "symuncert")
# sparse matrix interface
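if (require("Matrix")) {
  # require() already attaches Matrix, so no separate library() call is needed
  i <- c(1, 3:8)       # row indices of the non-zero entries
  j <- c(2, 9, 6:10)   # column indices of the non-zero entries
  vals <- 7 * (1:7)    # non-zero values
  x <- sparseMatrix(i, j, x = vals)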
  y <- c(1, 1, 1, 1, 2, 2, 2, 2)
  information_gain(x = x, y = y)
  information_gain(x = x, y = y, type = "gainratio")
  information_gain(x = x, y = y, type = "symuncert")
}