ComputeInterestingTuples | R Documentation |
Interesting tuples
ComputeInterestingTuples(
  data,
  decision = NULL,
  dimensions = 2,
  divisions = NULL,
  discretizations = 1,
  seed = NULL,
  range = NULL,
  pc.xi = 0.25,
  ig.thr = 0,
  I.lower = NULL,
  interesting.vars = vector(mode = "integer"),
  require.all.vars = FALSE,
  return.matrix = FALSE
)
data |
input data where columns are variables and rows are observations (all numeric) |
decision |
decision variable as a binary sequence of length equal to number of observations |
dimensions |
number of dimensions (a positive integer; 5 max) |
divisions |
number of divisions (from 1 to 15; |
discretizations |
number of discretizations |
seed |
seed for PRNG used during discretizations ( |
range |
discretization range (from 0.0 to 1.0; |
pc.xi |
parameter xi used to compute pseudocounts (changing the default is not recommended) |
ig.thr |
IG threshold above which the tuple is interesting (0 or a negative value means no filtering) |
I.lower |
IG values computed for lower dimension (1D for 2D, etc.) |
interesting.vars |
variables for which to check the IGs (an empty vector means all variables) |
require.all.vars |
boolean whether to require the tuple to consist only of interesting.vars |
return.matrix |
boolean whether to return a matrix instead of a list (ignored if not using the optimised method variant) |
When running in 2D with no filtering applied, this function can use an optimised method variant. It is recommended to avoid filtering in 2D whenever feasible.
If decision is omitted, this function calculates mutual information. Read "IG" as mutual information throughout the rest of this function's description, except for I.lower, where it means entropy.
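To make the mutual information reading concrete, the following base R sketch computes mutual information between two already discretised vectors from their empirical joint distribution. It only illustrates the quantity: the helper name mutual_information is hypothetical, the result is expressed in bits by choice of this sketch, and no pseudocounts (pc.xi) or random discretisations are applied, so the numbers are not expected to match this function's output.

```r
# Mutual information (in bits) between two discrete vectors, computed from
# their empirical joint distribution. Illustrative only: no pseudocounts.
mutual_information <- function(x, y) {
  joint <- table(x, y) / length(x)   # empirical joint probabilities
  px <- rowSums(joint)               # marginal distribution of x
  py <- colSums(joint)               # marginal distribution of y
  expected <- outer(px, py)          # joint distribution under independence
  nonzero <- joint > 0               # skip empty cells (0 * log 0 is taken as 0)
  sum(joint[nonzero] * log2(joint[nonzero] / expected[nonzero]))
}

x <- c(0, 0, 1, 1, 0, 0, 1, 1)

# A variable shares all of its information with a copy of itself.
mi_dependent <- mutual_information(x, x)

# Here every (x, y) combination is equally frequent, so nothing is shared.
mi_independent <- mutual_information(x, c(0, 1, 0, 1, 0, 1, 0, 1))
```
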
A data.frame, or NULL (following a warning) if no tuples are found.
The following columns are present in the data.frame:
Var – interesting variable index
Tuple.1, Tuple.2, ... – corresponding tuple (up to dimensions columns)
IG – information gain achieved by var in Tuple.*
Additionally, an attribute named run.params containing the run parameters is set on the result.
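As a rough illustration of working with a result of this shape, the sketch below builds a stand-in data.frame by hand and then ranks and filters it with base R. Every value in it is made up, and the contents of run.params shown here are placeholders, since the actual fields stored in that attribute are not listed above.

```r
# Hypothetical stand-in for a returned data.frame; all values are made up.
res <- data.frame(
  Var     = c(1L, 2L, 3L),
  Tuple.1 = c(1L, 1L, 2L),
  Tuple.2 = c(2L, 2L, 3L),
  IG      = c(12.5, 30.0, 7.25)
)
# Placeholder contents; the real attribute is set by the function itself.
attr(res, "run.params") <- list(dimensions = 2, divisions = 1)

# Rank rows by decreasing IG and keep those above a chosen cut-off.
ranked <- res[order(res$IG, decreasing = TRUE), ]
strong <- ranked[ranked$IG > 10, ]

# Read back the run parameters stored on the original result.
params <- attr(res, "run.params")
```

Because the function may return NULL when no tuples are found, a check such as is.null() on the result is worth doing before any post-processing like this.
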
# 1D information gains, passed below as I.lower
ig.1d <- ComputeMaxInfoGains(madelon$data, madelon$decision, dimensions = 1,
                             divisions = 1, range = 0, seed = 0)

# 2D tuples whose IG exceeds the threshold ig.thr = 100
ComputeInterestingTuples(madelon$data, madelon$decision, dimensions = 2,
                         divisions = 1, range = 0, seed = 0, ig.thr = 100,
                         I.lower = ig.1d$IG)
