preprocessing: Pre-process data to better learn Bayesian networks
In jtrecenti/bnlearn: Bayesian Network Structure Learning, Parameter Learning and Inference

Screen and transform the data to make them more suitable for structure and parameter learning.

  # discretize continuous data into factors.
  discretize(data, method, breaks = 3, ordered = FALSE, ..., debug = FALSE)
  # screen continuous data for highly correlated pairs of variables.
  dedup(data, threshold, debug = FALSE)

`data`	a data frame containing numeric columns (for `dedup`) or a combination of numeric or factor columns (for `discretize`).
`threshold`	a numeric value between zero and one, the absolute correlation used as a threshold in screening highly correlated pairs.
`method`	a character string, either `interval` for interval discretization, `quantile` for quantile discretization (the default) or `hartemink` for Hartemink's pairwise mutual information method.
`breaks`	if `method` is set to `hartemink`, an integer number, the number of levels the variables are to be discretized into. Otherwise, a vector of integer numbers, one for each column of the data set, specifying the number of levels for each variable.
`ordered`	a boolean value. If `TRUE` the discretized variables are returned as ordered factors instead of unordered ones.
`...`	additional tuning parameters, see below.
`debug`	a boolean value. If `TRUE` a lot of debugging output is printed; otherwise the function is completely silent.

discretize takes a data frame of continuous variables as its first argument and returns a second data frame of discrete variables, transformed using one of three methods: interval, quantile or hartemink.
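
As a rough illustration of how the interval and quantile methods differ, the sketch below bins a single simulated variable with base R only. It is not the bnlearn implementation: the variable `x`, the number of levels `k` and the use of `cut()` and `quantile()` are assumptions made for the example.

```r
# Illustrative base-R sketch; not how bnlearn implements discretize().
set.seed(42)
x <- rnorm(300)   # a simulated continuous variable (assumption)
k <- 3            # number of levels, playing the role of `breaks`

# Interval discretization: cut() with a single number splits the range
# of x into k intervals of equal width.
x_interval <- cut(x, breaks = k)

# Quantile discretization: the cut points are sample quantiles, so each
# level holds roughly the same number of observations.
quantile_breaks <- quantile(x, probs = seq(0, 1, length.out = k + 1))
x_quantile <- cut(x, breaks = quantile_breaks, include.lowest = TRUE)

table(x_interval)   # counts differ across levels
table(x_quantile)   # counts are (nearly) equal across levels
```

Equal-width intervals preserve the shape of the distribution but can leave sparse levels in the tails, while quantile-based intervals balance the counts at the cost of unequal widths.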

dedup screens the data for pairs of highly correlated variables, and discards one in each pair.
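
The screening idea can be sketched in base R as follows. `dedup_sketch` is a hypothetical helper written for this example, not the function shipped with the package; in particular, keeping the earlier column of each correlated pair is an assumption, since this page does not say which variable is discarded.

```r
# Illustrative base-R sketch; dedup_sketch() is a hypothetical helper.
dedup_sketch <- function(data, threshold) {
  stopifnot(is.data.frame(data),
            all(vapply(data, is.numeric, logical(1))))
  vars <- names(data)
  keep <- vars
  abs_cor <- abs(cor(data))   # absolute pairwise correlations
  for (i in seq_along(vars)) {
    for (j in seq_len(i - 1)) {
      a <- vars[j]
      b <- vars[i]
      # Assumption: when a retained pair exceeds the threshold,
      # drop the later column and keep the earlier one.
      if (a %in% keep && b %in% keep && abs_cor[a, b] > threshold)
        keep <- setdiff(keep, b)
    }
  }
  data[, keep, drop = FALSE]
}

set.seed(1)
x <- rnorm(200)
toy <- data.frame(A = x,
                  B = x + rnorm(200, sd = 0.01),  # near-copy of A
                  C = rnorm(200))                 # unrelated to A and B
names(dedup_sketch(toy, threshold = 0.95))        # "A" "C"
```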

discretize returns a data frame with the same structure (number of columns, column names, etc.) as data, containing the discretized variables.

dedup returns a data frame with a subset of the columns of data.

Hartemink's algorithm has been designed to deal with sets of homogeneous, continuous variables; this is the reason why they are initially transformed into discrete variables, all with the same number of levels (given by the `ibreaks` argument). Which of the other algorithms is used for this initial discretization is specified by the `idisc` argument (`quantile` is the default). The implementation in bnlearn also handles sets of discrete variables with the same number of levels, which are treated as adjacent interval identifiers. This allows the user to perform the initial discretization with the algorithm of their choice, as long as all variables have the same number of levels in the end.

Marco Scutari

Hartemink A (2001). Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks. Ph.D. thesis, School of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.

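library(bnlearn)
data(gaussian.test)
# Hartemink's method: 20 initial levels per variable, reduced to 4.
d = discretize(gaussian.test, method = 'hartemink', breaks = 4, ibreaks = 20)
# learn a network structure from the discretized data and plot it.
plot(hc(d))
# discard one variable from each pair of highly correlated variables.
d2 = dedup(gaussian.test)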

jtrecenti/bnlearn documentation built on May 20, 2019, 3:16 a.m.
