SDIGA: Subgroup Discovery Iterative Genetic Algorithm (SDIGA)

View source: R/SDIGA.R

Description

Performs a subgroup discovery task by executing the SDIGA algorithm.

Usage

SDIGA(
  parameters_file = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  nEval = 10000,
  popLength = 100,
  mutProb = 0.01,
  RulesRep = "can",
  Obj1 = "CSUP",
  w1 = 0.7,
  Obj2 = "CCNF",
  w2 = 0.3,
  Obj3 = "null",
  w3 = 0,
  minConf = 0.6,
  lSearch = "yes",
  targetVariable = NA,
  targetClass = "null"
)

Arguments

parameters_file

The path of the parameters file. Use NULL if you want to use the training and test SDEFSR_Dataset variables instead.

training

A SDEFSR_Dataset class variable with training data.

test

A SDEFSR_Dataset class variable with test data.

output

A character vector with the paths where the information file, the rules file and the test quality measures file are stored, respectively.

seed

An integer to set the seed used to generate random numbers.

nLabels

Number of linguistic labels used to represent numerical variables.

nEval

An integer to set the maximum number of evaluations in the evolutionary process.

popLength

An integer to set the number of individuals in the population.

mutProb

Sets the mutation probability. A number in [0,1].

RulesRep

Representation used in the rules. "can" for canonical rules, "dnf" for DNF rules.

Obj1

Sets the Objective number 1. See Objective values for more information about the possible values.

w1

Sets the weight of Obj1.

Obj2

Sets the Objective number 2. See Objective values for more information about the possible values.

w2

Sets the weight of Obj2.

Obj3

Sets the Objective number 3. See Objective values for more information about the possible values.

w3

Sets the weight of Obj3.

minConf

Sets the minimum confidence that the rule returned by the genetic algorithm must have after the local optimization phase. A number in [0,1].

lSearch

Sets whether the local optimization phase must be performed. A string with "yes" or "no".

targetVariable

A string with the name or an integer with the index position of the target variable (or class). It must be a categorical one.

targetClass

A string specifying the value of the target variable. Use "null" to search for all possible values.
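The nLabels argument above sets how many linguistic labels describe each numerical variable. As a rough illustration only, the sketch below assumes uniformly spaced triangular fuzzy sets over the range of a variable; this page does not state the exact partition used, and fuzzy_memberships is a hypothetical helper, not part of SDEFSR.

```r
# Hypothetical helper (not part of SDEFSR): membership degrees of a single
# value `x` in `n_labels` uniformly spaced triangular fuzzy sets over [lo, hi].
# Assumption: uniform triangular partition.
fuzzy_memberships <- function(x, lo, hi, n_labels = 3) {
  stopifnot(length(x) == 1, n_labels >= 2, hi > lo)
  centers <- seq(lo, hi, length.out = n_labels)  # peak of each label
  width <- (hi - lo) / (n_labels - 1)            # distance between peaks
  # Membership is 1 at the peak and falls linearly to 0 at the neighbouring peaks.
  pmax(0, 1 - abs(x - centers) / width)
}

# With 3 labels over [0, 10] the peaks sit at 0, 5 and 10.
m_mid <- fuzzy_memberships(5, lo = 0, hi = 10)      # fully in the middle label
m_half <- fuzzy_memberships(2.5, lo = 0, hi = 10)   # split between first and middle
```

With a larger n_labels the same range is split into narrower sets, so each label describes a smaller interval of the variable.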

Details

By default, this function uses the last variable in the SDEFSR_Dataset object as the target variable. To use a different one, set the targetVariable argument. The target variable MUST be categorical; if it is not, an error is thrown. Also, the default behaviour is to find rules for all possible values of the target variable. Setting targetClass restricts the algorithm to finding rules for that value only.

If you set parameters_file to something other than NULL, the rest of the parameters are ignored and the algorithm tries to read the specified file. See "Parameters file structure" below if you want to use a parameters file.

Value

The algorithm shows the following results in the console:

  1. The parameters used in the algorithm.

  2. The rules generated.

  3. The quality measures for test of every rule and the global results.

The algorithm also saves those results in the files specified in the output parameter of the algorithm or in the outputData parameter of the parameters file.

How does this algorithm work?

This algorithm has a genetic algorithm at its core. This genetic algorithm returns only the best rule of the population, and it is executed repeatedly until a stop condition is reached. The process continues while the rule returned covers at least one new example (not covered by previous rules) and has a confidence greater than a minimum; it stops when either condition fails.

After returning the rule, a local improvement can be applied to make the rule more general. This local improvement is done by means of a hill-climbing local search.

The genetic algorithm crosses only the two best individuals, but the mutation operator is applied over the whole population, including the individuals produced by the crossover.
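The outer iterative scheme described in this section can be sketched as follows. This is a toy illustration, not the SDEFSR implementation: run_ga is a hypothetical stand-in for one genetic algorithm run, and a rule is represented here simply as a list holding the examples it covers and its confidence. The local search step is omitted.

```r
# Toy sketch of the outer iterative loop (not the SDEFSR implementation).
# `run_ga` is a hypothetical function standing in for one GA run; it returns
# the best rule found as list(covers = <logical vector>, confidence = <number>).
iterative_rule_extraction <- function(run_ga, n_examples, min_conf = 0.6) {
  covered <- rep(FALSE, n_examples)  # examples covered by accepted rules
  rules <- list()
  repeat {
    rule <- run_ga(covered)
    covers_new <- any(rule$covers & !covered)
    # Stop when the rule adds no new example or is not confident enough.
    if (!covers_new || rule$confidence < min_conf) break
    rules[[length(rules) + 1]] <- rule
    covered <- covered | rule$covers
  }
  rules
}

# Stand-in "GA" that replays a fixed sequence of candidate rules.
candidates <- list(
  list(covers = c(TRUE, TRUE, FALSE, FALSE), confidence = 0.9),
  list(covers = c(FALSE, TRUE, TRUE, FALSE), confidence = 0.7),
  list(covers = c(FALSE, FALSE, FALSE, TRUE), confidence = 0.4)
)
i <- 0
toy_ga <- function(covered) {
  i <<- i + 1
  candidates[[min(i, length(candidates))]]
}

rules <- iterative_rule_extraction(toy_ga, n_examples = 4)
```

In this toy run the first two candidates are accepted, and the loop ends at the third one because its confidence is below min_conf even though it covers a new example.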

Parameters file structure

The parameters_file argument points to a file that contains the parameters SDIGA needs to work. This file must contain, at least, the required parameters, separated by carriage returns (one per line).

An example of parameter file could be:

algorithm = SDIGA
inputData = "irisD-10-1tra.dat" "irisD-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 0
nLabels = 3
nEval = 500
popLength = 100
mutProb = 0.01
minConf = 0.6
RulesRep = can
Obj1 = Comp
Obj2 = Unus
Obj3 = null
w1 = 0.7
w2 = 0.3
w3 = 0.0
lSearch = yes
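A parameters file like the one above could be generated from R with base functions only. The sketch below copies the lines of the example; the file name sdiga_params.txt and the use of a temporary directory are arbitrary choices made for illustration.

```r
# Parameter lines copied from the example parameters file.
params <- c(
  'algorithm = SDIGA',
  'inputData = "irisD-10-1tra.dat" "irisD-10-1tst.dat"',
  'outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"',
  'seed = 0',
  'nLabels = 3',
  'nEval = 500',
  'popLength = 100',
  'mutProb = 0.01',
  'minConf = 0.6',
  'RulesRep = can',
  'Obj1 = Comp',
  'Obj2 = Unus',
  'Obj3 = null',
  'w1 = 0.7',
  'w2 = 0.3',
  'w3 = 0.0',
  'lSearch = yes'
)

# Illustrative location; any writable path would do.
param_path <- file.path(tempdir(), "sdiga_params.txt")
writeLines(params, param_path)  # one parameter per line

# Read the file back to check what was written.
written <- readLines(param_path)
```

The resulting path could then be passed as SDIGA(parameters_file = param_path), provided the data files named in inputData can be found.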

Objective values

The ObjX values of the parameters file select the quality measures used as objectives.

If you do not want to use an objective, you must specify null.
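To give an intuition of how the ObjX measures and their wX weights might interact, the sketch below assumes that the objectives are combined as a weighted sum normalized by the total weight. This page does not state the exact aggregation, so treat the formula as an assumption; weighted_fitness is a hypothetical helper, not an SDEFSR function.

```r
# Hypothetical helper (not part of SDEFSR): combine objective values with a
# normalized weighted sum. Assumption: fitness = sum(w * obj) / sum(w).
weighted_fitness <- function(obj_values, weights) {
  stopifnot(length(obj_values) == length(weights), sum(weights) > 0)
  sum(weights * obj_values) / sum(weights)
}

# A rule scoring 0.5 on Obj1 and 0.8 on Obj2, with Obj3 unused (weight 0).
fit <- weighted_fitness(c(0.5, 0.8, 0), c(0.7, 0.3, 0))
```

Under this assumption, an objective set to null with weight 0 has no influence on the result, and raising w1 relative to w2 favours rules that score well on Obj1.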

References

M. J. del Jesus, P. Gonzalez, F. Herrera, and M. Mesonero, "Evolutionary Fuzzy Rule Induction Process for Subgroup Discovery: A case study in marketing," IEEE Transactions on Fuzzy Systems, vol. 15, no. 4, pp. 578-592, 2007.

Examples

SDIGA(parameters_file = NULL, 
      training = habermanTra, 
      test = habermanTst, 
      output = c(NA, NA, NA),
      seed = 0, 
      nLabels = 3,
      nEval = 300, 
      popLength = 100, 
      mutProb = 0.01, 
      RulesRep = "can",
      Obj1 = "CSUP", 
      w1 = 0.7,
      Obj2 = "CCNF",
      w2 = 0.3,
      Obj3 = "null",
      w3 = 0,
      minConf = 0.6,
      lSearch = "yes",
      targetClass = "positive")
## Not run: 
SDIGA(parameters_file = NULL, 
      training = habermanTra, 
      test = habermanTst, 
      output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
      seed = 0, 
      nLabels = 3,
      nEval = 300, 
      popLength = 100, 
      mutProb = 0.01, 
      RulesRep = "can",
      Obj1 = "CSUP", 
      w1 = 0.7,
      Obj2 = "CCNF",
      w2 = 0.3,
      Obj3 = "null",
      w3 = 0,
      minConf = 0.6,
      lSearch = "yes",
      targetClass = "positive")
      
## End(Not run)

SDEFSR documentation built on April 30, 2021, 9:10 a.m.