data_disc: Data set discretization and formatting

Description

Removes rows containing missing data, and discretizes the data set using Minimum Description Length Partitioning (MDLP).

Usage

data_disc(data, n_train = NULL, missing = "?")

Arguments

data

Data frame, where the last column must be the class variable.

n_train

Number of data frame rows to use as the training set; the remaining rows form the test set. If NULL (the default), all rows are used for training and no test set is created.

missing

Label that denotes missing values in your data frame (default='?').
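As a minimal sketch of how the three arguments fit together, the following splits the built-in iris data into training and test sets. The shuffling step and the name iris_shuffled are illustrative assumptions, not part of the package: because the first n_train rows become the training set, and iris is sorted by species, the rows are shuffled first so that every class can appear in both sets. The call is guarded so the snippet also runs where sbfc is not installed.

```r
# iris already has its class variable (Species) in the last column,
# which is the layout data_disc expects.
data(iris)

# Assumption for illustration: shuffle the rows so that the first
# n_train rows are a random sample rather than the first two species.
set.seed(1)
iris_shuffled <- iris[sample(nrow(iris)), ]

# Only call data_disc if the sbfc package is available.
if (requireNamespace("sbfc", quietly = TRUE)) {
  # First 100 rows for training, remaining 50 for testing.
  # iris has no missing values, so the default label "?" is simply unused.
  iris_split <- sbfc::data_disc(iris_shuffled, n_train = 100, missing = "?")
}
```

If your own data marks missing values differently (for example with "NA" stored as text), pass that label through the missing argument instead.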

Value

A discretized data set:

TrainX

Matrix containing the training data.

TrainY

Vector containing the class labels for the training data.

TestX

Matrix containing the test data (returned only when n_train is specified).

TestY

Vector containing the class labels for the test data (returned only when n_train is specified).
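To check what a call returned, it can help to count the rows in each element. The sketch below assumes the result is a list whose elements carry the names listed above; count_rows is a hypothetical helper written for this example, not a function from sbfc.

```r
# Hypothetical helper: number of rows (matrices) or entries (vectors)
# in each element of a list such as the one data_disc is documented to return.
count_rows <- function(d) vapply(d, NROW, integer(1))

if (requireNamespace("sbfc", quietly = TRUE)) {
  data(iris)
  iris_disc <- sbfc::data_disc(iris)

  # Which elements are present, and how many observations each holds.
  print(names(iris_disc))
  print(count_rows(iris_disc))
}
```

For each pair, the number of rows in the X matrix should match the number of labels in the corresponding Y vector.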

Examples

library(sbfc)
data(iris)
iris_disc <- data_disc(iris)

