In RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories

D.global.discernibility.heuristic.RST

R Documentation

Supervised discretization based on the maximum discernibility heuristic

Description

Computes globally semi-optimal cuts using the maximum discernibility heuristic.

Usage

D.global.discernibility.heuristic.RST(
  decision.table,
  maxNOfCuts = 2 * ncol(decision.table),
  attrSampleSize = ncol(decision.table) - 1,
  cutCandidatesList = NULL,
  discFunction = global.discernibility,
  ...
)

Arguments

`decision.table`	an object inheriting from the `"DecisionTable"` class, which represents a decision system. See `SF.asDecisionTable`. Note that this method requires all conditional attributes to be numeric.
`maxNOfCuts`	a positive integer indicating the maximum number of allowed cuts.
`attrSampleSize`	an integer between 1 and the number of conditional attributes (the default). It indicates the attribute sample size for the Monte Carlo selection of candidate cuts.
`cutCandidatesList`	an optional list containing candidates for optimal cut values. By default the candidate cuts are determined automatically.
`discFunction`	a function used for computation of cuts. Currently only one implementation of the maximum discernibility heuristic is available (the default). However, this parameter can be used to integrate custom implementations of discretization functions with the `RoughSets` package.
`...`	additional parameters to the `discFunction` (currently unsupported).
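
As an illustration of the arguments above, the following sketch calls the function with non-default values for `maxNOfCuts` and `attrSampleSize`. The specific values (5 and 4) and the seed are arbitrary choices made for this example, not recommendations; the block is skipped when the `RoughSets` package is not installed.

```r
## Requires the RoughSets package; skipped when it is not installed.
if (requireNamespace("RoughSets", quietly = TRUE)) {
  data("RoughSetData", package = "RoughSets")
  wine.data <- RoughSetData$wine.dt

  set.seed(123)  # the attribute sampling is random, so fix the seed
  cuts <- RoughSets::D.global.discernibility.heuristic.RST(
    wine.data,
    maxNOfCuts = 5,      # maximum number of allowed cuts (arbitrary value)
    attrSampleSize = 4   # attribute sample size for the Monte Carlo selection
  )
  str(cuts)              # inspect the returned "Discretization" object
}
```

Lowering `attrSampleSize` below the number of conditional attributes makes the result depend on the random sample, which is why the seed is set before the call.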

Details

A complete description of the implemented algorithm can be found in (Nguyen, 2001).
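
To give an intuition for the heuristic, here is a minimal base R sketch of a greedy maximum discernibility search. It is not the package's implementation: the helper names (`candidate_cuts`, `count_conflicts`, `greedy_cuts`) and the toy data are hypothetical, candidate cuts are assumed to be midpoints between consecutive distinct values, and no Monte Carlo attribute sampling is performed. At each step the sketch picks the cut that discerns the largest number of still-undiscerned pairs of objects with different decisions.

```r
## Toy data: two numeric conditional attributes and a decision attribute.
toy <- data.frame(
  a = c(1.0, 1.5, 2.0, 3.0, 3.5, 4.0),
  b = c(0.2, 0.9, 0.4, 0.8, 0.3, 0.7),
  d = c("x", "x", "x", "y", "y", "y")
)

## Candidate cuts: midpoints between consecutive distinct values (an assumption).
candidate_cuts <- function(x) {
  v <- sort(unique(x))
  if (length(v) < 2) return(numeric(0))
  (head(v, -1) + tail(v, -1)) / 2
}

## Number of pairs of objects that share a block of the current partition
## but have different decisions (pairs that are not yet discerned).
count_conflicts <- function(block, decision) {
  total <- 0
  for (idx in split(seq_along(decision), block)) {
    n <- table(decision[idx])
    total <- total + (sum(n)^2 - sum(n^2)) / 2
  }
  total
}

greedy_cuts <- function(data, decision_col, max_cuts = 10) {
  attrs <- setdiff(names(data), decision_col)
  decision <- data[[decision_col]]
  block <- rep(1L, nrow(data))  # all objects start in one block
  chosen <- lapply(setNames(attrs, attrs), function(a) numeric(0))
  for (i in seq_len(max_cuts)) {
    current <- count_conflicts(block, decision)
    if (current == 0) break     # every conflicting pair is discerned
    best <- list(gain = 0)
    for (a in attrs) {
      for (cv in setdiff(candidate_cuts(data[[a]]), chosen[[a]])) {
        new_block <- interaction(block, data[[a]] > cv, drop = TRUE)
        gain <- current - count_conflicts(new_block, decision)
        if (gain > best$gain) best <- list(gain = gain, attr = a, cut = cv)
      }
    }
    if (best$gain == 0) break   # no remaining cut discerns anything new
    chosen[[best$attr]] <- sort(c(chosen[[best$attr]], best$cut))
    block <- as.integer(
      interaction(block, data[[best$attr]] > best$cut, drop = TRUE)
    )
  }
  chosen
}

cuts <- greedy_cuts(toy, "d")
print(cuts)
```

On this toy table a single cut on attribute `a` separates the two decision classes, so the search stops after one step and selects no cut on `b`.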

Note that the output of this function is an object of class "Discretization", which contains the cut values. To generate the new (discretized) decision table, pass this object to SF.applyDecTable.
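
Conceptually, applying cuts to a numeric attribute replaces each value with the interval it falls into. The base R sketch below illustrates that idea on made-up values and cuts; it does not reproduce the internals of SF.applyDecTable, and the interval boundary convention shown (right-closed labels from `cut`) is an assumption of this example.

```r
## Hypothetical attribute values and cut points, for illustration only.
values   <- c(11.2, 12.8, 13.1, 14.4)
cut.pts  <- c(12.0, 13.5)

## Index of the interval each value falls into (1 = below the first cut).
interval.id <- findInterval(values, cut.pts) + 1L
interval.id

## The same partition expressed as labelled intervals.
interval.label <- cut(values, breaks = c(-Inf, cut.pts, Inf))
interval.label
```

With k cuts on an attribute, the discretized attribute takes at most k + 1 distinct values, which is why an attribute with no cuts ends up with a single value.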

Value

An object of class "Discretization", which stores the cuts for each conditional attribute. See D.discretization.RST.

Author(s)

Andrzej Janusz

References

S. H. Nguyen, "On Efficient Handling of Continuous Attributes in Large Data Bases", Fundamenta Informaticae, vol. 48, pp. 61-81 (2001).

Jan G. Bazan, Hung Son Nguyen, Sinh Hoa Nguyen, Piotr Synak, and Jakub Wroblewski, "Rough Set Algorithms in Classification Problem", Chapter 2 in: L. Polkowski, S. Tsumoto and T.Y. Lin (eds.): Rough Set Methods and Applications, Physica-Verlag, Heidelberg, New York, pp. 49-88 (2000).

Examples

#################################################################
## Example: Determine cut values and generate new decision table
#################################################################
data(RoughSetData)
wine.data <- RoughSetData$wine.dt
cut.values <- D.global.discernibility.heuristic.RST(wine.data)

## generate a new decision table:
wine.discretized <- SF.applyDecTable(wine.data, cut.values)
dim(wine.discretized)
lapply(wine.discretized, unique)

## remove attributes with only one possible value:
to.rm.idx <- which(sapply(lapply(wine.discretized, unique), function(x) length(x) == 1))
to.rm.idx
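## guard against an empty index, since x[-integer(0)] would drop every column:
wine.discretized.reduced <- if (length(to.rm.idx) > 0) wine.discretized[-to.rm.idx] else wine.discretized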
dim(wine.discretized.reduced)

## check whether the attributes in the reduced data are a super-reduct of the original data:
colnames(wine.discretized.reduced)
class.idx <- which(colnames(wine.discretized.reduced) == "class")
sum(duplicated(wine.discretized.reduced)) == sum(duplicated(wine.discretized.reduced[-class.idx]))
## yes it is

RoughSets documentation built on May 29, 2024, 7:34 a.m.