disc_train_data: Discretization for train dataset


View source: R/nb.R

Description

Discretization tends to perform better when the number of samples is large, because more samples allow the distribution of the data to be learned. This function performs supervised discretization of the training dataset.

Usage

disc_train_data(x, y, alpha = 0.05)

Arguments

x

A data frame of training data with some numeric columns; x must have dim larger than 1

y

A data frame or vector of categorical labels; should be a factor

alpha

Significance level value, default is 0.05

Details

Discretizes the training data with a supervised method and returns the cut points together with the discretized training dataset.
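
This page does not say which supervised algorithm is used or how alpha enters it. As a rough illustration only, the sketch below shows how a significance level is used in one common family of supervised methods (ChiMerge-style merging): two adjacent intervals are kept separate when a chi-square statistic on their class counts exceeds the critical value for alpha. The helper chi2_adjacent and the interval boundaries are hypothetical and are not part of this package.

```r
# Hypothetical illustration; not the package's implementation.
alpha <- 0.05
x <- iris$Petal.Length
y <- iris$Species                      # factor with 3 classes

# Critical value: chi-square quantile with (number of classes - 1) degrees of freedom.
threshold <- qchisq(1 - alpha, df = nlevels(y) - 1)

# Chi-square statistic comparing class counts in [lo, mid) and [mid, hi).
chi2_adjacent <- function(x, y, lo, mid, hi) {
  keep <- x >= lo & x < hi
  interval <- factor(x[keep] < mid, levels = c(TRUE, FALSE))
  observed <- unclass(table(interval, y[keep]))
  # Drop classes that do not occur in either interval.
  observed <- observed[, colSums(observed) > 0, drop = FALSE]
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

stat <- chi2_adjacent(x, y, lo = 1, mid = 2.5, hi = 5)
keep_boundary <- stat > threshold      # TRUE means the cut at 2.5 is retained
```

In this scheme a smaller alpha raises the threshold, so more adjacent intervals are merged and fewer cut points remain.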

Value

A list containing the cut points and the discretized x data frame
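
To make the relationship between cut points and discretized data concrete, here is a minimal base R sketch that maps one numeric column to interval codes. The cut points below are made up for illustration; the exact structure of the cut points returned by disc_train_data is not documented here, so this only assumes a plain numeric vector per column.

```r
# Hypothetical cut points for a single column (not produced by disc_train_data).
cuts <- c(2.5, 4.8)
petal <- iris$Petal.Length

# findInterval() returns 0 below the first cut, 1 between the cuts, 2 above the last;
# adding 1 gives interval codes 1, 2, 3.
codes <- findInterval(petal, cuts) + 1L
petal_disc <- factor(codes, levels = 1:3)

# Cross-tabulate the discretized column against the class labels.
table(petal_disc, iris$Species)
```

The same idea could be used to discretize held-out rows with cut points learned on the training rows, so that both sets share identical intervals.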

Examples

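# Training split: the first 40 rows of each species, without the label column
x <- iris[c(1:40, 51:90, 101:140), -5]
y <- iris[c(1:40, 51:90, 101:140), 5]
# Held-out rows (not passed to disc_train_data)
testx <- iris[c(41:50, 91:100, 141:150), -5]
v <- disc_train_data(x, y)
v$discredata  # discretized training data
v$cutp        # cut points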

sharechanxd/myNBpackage documentation built on Dec. 23, 2021, 1:21 a.m.