ComputeMaxInfoGainsDiscrete: Max information gains (discrete)

View source: R/information_gain.R

ComputeMaxInfoGainsDiscrete    R Documentation

Max information gains (discrete)

Description

Computes the max information gain (IG) of each variable in discrete data.

Usage

ComputeMaxInfoGainsDiscrete(
  data,
  decision = NULL,
  dimensions = 1,
  pc.xi = 0.25,
  return.tuples = FALSE,
  return.min = FALSE,
  interesting.vars = vector(mode = "integer"),
  require.all.vars = FALSE
)

Arguments

data

input data where columns are variables and rows are observations (all discrete with the same number of categories)

decision

decision variable as a binary sequence of length equal to number of observations

dimensions

number of dimensions (a positive integer; 5 max)

pc.xi

parameter xi used to compute pseudocounts (changing the default is not recommended)

return.tuples

whether to return the tuples where max IG was observed (one tuple per variable) - not supported with CUDA or in 1D

return.min

whether to return min instead of max (per tuple) - not supported with CUDA

interesting.vars

variables for which to check the IGs (none = all) - not supported with CUDA

require.all.vars

whether to require each tuple to consist only of interesting.vars (boolean)

Details

If decision is omitted, this function calculates either the variable entropy (in 1D) or the mutual information (in higher dimensions). In that case, read "IG" as entropy or mutual information, respectively, throughout the rest of this function's description.
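
To make the two quantities concrete, here is a minimal base R sketch of entropy and mutual information for discrete vectors. The helper names entropy and mutual_information are hypothetical and are not part of MDFS. The sketch uses plain frequency estimates in bits, whereas this function uses pseudocounts (controlled by pc.xi), so the values it reports may differ from the ones below.

```r
# Illustrative helpers only (not MDFS functions): plain frequency estimates,
# no pseudocounts, results in bits.

# Entropy of one variable, or joint entropy of several, from a contingency table
entropy <- function(...) {
  counts <- as.vector(table(...))
  p <- counts[counts > 0] / sum(counts)  # drop empty cells: 0 * log2(0) is taken as 0
  -sum(p * log2(p))
}

# Mutual information: I(X; Y) = H(X) + H(Y) - H(X, Y)
mutual_information <- function(x, y) {
  entropy(x) + entropy(y) - entropy(x, y)
}

x <- c(0, 0, 1, 1)
y <- c(0, 1, 0, 1)

entropy(x)                # 1 (a balanced binary variable carries 1 bit)
mutual_information(x, y)  # 0 (x and y are independent in this sample)
mutual_information(x, x)  # 1 (a variable shares all of its entropy with itself)
```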

Value

A data.frame with the following columns:

  • IG – max information gain (of each variable)

  • Tuple.1, Tuple.2, ... – corresponding tuple (up to dimensions columns, available only when return.tuples == TRUE)

  • Discretization.nr – always 1 (for compatibility with the non-discrete function; available only when return.tuples == TRUE)

Additionally, an attribute named run.params holding the run parameters is set on the result.

Examples


ComputeMaxInfoGainsDiscrete(madelon$data > 500, madelon$decision, dimensions = 2)
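
A slightly longer sketch of how the documented options and return value might be used together. It relies only on the arguments and result structure described on this page. The requireNamespace guard and the variable names result and mi are additions for illustration, and no specific output values are implied.

```r
if (requireNamespace("MDFS", quietly = TRUE)) {
  library(MDFS)

  # As above, but also return the tuple where each variable's max IG occurred
  # (return.tuples is documented as unsupported in 1D, hence dimensions = 2)
  result <- ComputeMaxInfoGainsDiscrete(
    madelon$data > 500,
    madelon$decision,
    dimensions = 2,
    return.tuples = TRUE
  )

  head(result)                # IG plus the Tuple.* and Discretization.nr columns
  attr(result, "run.params")  # run parameters stored as an attribute on the result

  # With decision omitted, the IG column holds mutual information in 2D (see Details)
  mi <- ComputeMaxInfoGainsDiscrete(madelon$data > 500, dimensions = 2)
  head(mi)
}
```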


MDFS documentation built on April 19, 2022, 5:05 p.m.