discretize: Discretize data to learn discrete Bayesian networks


Description

Discretize data to learn discrete Bayesian networks.

Usage

  discretize(x, method, breaks = 3, ..., debug = FALSE)

Arguments

x

a data frame containing either numeric or factor columns.

method

a character string, either interval for interval discretization, quantile for quantile discretization (the default) or hartemink for Hartemink's pairwise mutual information method.

breaks

if method is set to hartemink, an integer number, the number of levels the variables are to be discretized into. Otherwise, a vector of integer numbers, one for each column of the data set, specifying the number of levels for each variable.

...

additional tuning parameters, such as ibreaks and idisc for Hartemink's method; see the Note section.

debug

a boolean value. If TRUE, a lot of debugging output is printed; otherwise the function is completely silent.
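To make the difference between the interval and quantile methods concrete, here is a minimal sketch in base R. It illustrates the two ideas only and is not bnlearn's internal code; the variable names (x, by_interval, by_quantile) and the simulated data are assumptions made for the example.

```r
# Illustration only: base R, not bnlearn's implementation.
set.seed(42)
x <- rnorm(200)  # hypothetical continuous variable

# Interval discretization: 3 bins of equal width over the range of x.
by_interval <- cut(x, breaks = 3)

# Quantile discretization: 3 bins holding roughly the same number of points.
cuts <- quantile(x, probs = seq(0, 1, length.out = 4))
by_quantile <- cut(x, breaks = cuts, include.lowest = TRUE)

# Interval bins usually have uneven counts; quantile bins are balanced.
table(by_interval)
table(by_quantile)
```

Equal-width bins preserve the shape of the distribution but can leave some levels nearly empty, while quantile bins keep every level populated at the cost of uneven bin widths.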

Value

discretize returns a data frame with the same structure (number of columns, column names, etc.) as x, containing the discretized variables.

Note

Hartemink's algorithm is designed for sets of homogeneous, continuous variables. For this reason the variables are first transformed into discrete variables that all have the same number of levels, given by the ibreaks argument. The idisc argument specifies which of the other two methods (interval or quantile) performs this initial discretization; quantile is the default. The implementation in bnlearn also handles sets of discrete variables with the same number of levels, which are treated as adjacent interval identifiers. This allows users to perform the initial discretization with the algorithm of their choice, as long as all variables end up with the same number of levels.
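The two ways of supplying the initial discretization described in this note can be sketched as follows. This assumes the bnlearn package is installed and uses the gaussian.test data set from the Examples section; the object names (d1, pre, d2) and the specific values chosen for breaks and ibreaks are arbitrary choices for illustration.

```r
library(bnlearn)
data(gaussian.test)

# Option 1: let discretize() perform the initial step itself, here with
# interval discretization into 10 levels instead of the default quantile.
d1 <- discretize(gaussian.test, method = "hartemink",
                 breaks = 3, ibreaks = 10, idisc = "interval")

# Option 2: perform the initial discretization separately, giving every
# variable the same number of levels, then pass the result to Hartemink's
# method, which treats the levels as adjacent interval identifiers.
pre <- discretize(gaussian.test, method = "quantile",
                  breaks = rep(10, ncol(gaussian.test)))
d2 <- discretize(pre, method = "hartemink", breaks = 3)

# Both results should keep the columns of the input and have 3 levels each.
sapply(d1, nlevels)
sapply(d2, nlevels)
```

Option 2 is mainly useful when the initial discretization comes from a method that discretize() does not provide, as long as every variable ends up with the same number of levels.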

Author(s)

Marco Scutari

References

Hartemink A (2001). Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks. Ph.D. thesis, School of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.

Examples

data(gaussian.test)
d = discretize(gaussian.test, method = 'hartemink', breaks = 4, ibreaks = 20)
plot(hc(d))

vspinu/bnlearn documentation built on May 3, 2019, 7:08 p.m.