Weka_filters: R/Weka Filters

Description

R interfaces to Weka filters.

Usage

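Normalize(formula, data, subset, na.action, control = NULL)
Discretize(formula, data, subset, na.action, control = NULL)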

Arguments

formula

a symbolic description of a model. Note that for unsupervised filters the response can be omitted.

data

an optional data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. See model.frame for details.

control

an object of class Weka_control, a character vector of control options, or NULL (the default). Available options can be listed online with the Weka Option Wizard WOW, or looked up in the Weka documentation.
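As a hedged illustration of the control argument, the sketch below lists a filter's options with WOW and then passes two of them via Weka_control. The option names S (scale of the output range) and T (translation of the output range) are taken from Weka's documentation of its attribute Normalize filter and are an assumption here, not something this page states. The code also assumes RWeka and a working Java installation, and skips itself otherwise.

```r
## Sketch only: the RWeka part runs when the package (and Java) can be loaded.
ok <- requireNamespace("RWeka", quietly = TRUE)

if (ok) {
  library("RWeka")
  w <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))

  ## List the options understood by the Weka filter behind Normalize.
  print(WOW("Normalize"))

  ## Assumed options: S = scale, T = translation of the output range.
  ctl <- Weka_control(S = 2, T = -1)
  m <- Normalize(~ ., data = w, control = ctl)
  print(range(m$temperature))
}
```

If the assumed options behave as documented by Weka, the numeric columns would be rescaled to a range other than the default one; check the WOW listing before relying on them.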

Details

Normalize implements an unsupervised filter that normalizes the numeric attributes of a dataset; by default each one is rescaled to the interval [0, 1], as in the example output. Only numeric values are considered, and the class attribute is ignored.
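The example output for Normalize is consistent with per-column min-max rescaling to [0, 1]. The following is a minimal base-R sketch of that arithmetic, not the Weka implementation. The helper name minmax, the toy data frame, and the handling of constant columns are all assumptions made for illustration.

```r
## Rescale one numeric vector to [0, 1]; anything non-numeric is returned as is.
minmax <- function(x) {
  if (!is.numeric(x)) return(x)
  rng <- range(x, na.rm = TRUE)
  ## Constant column: map to 0 (a choice of this sketch, not taken from Weka).
  if (rng[1] == rng[2]) return(x - rng[1])
  (x - rng[1]) / (rng[2] - rng[1])
}

## Hypothetical toy data with numeric, logical, and factor columns.
toy <- data.frame(
  temp    = c(64, 70, 85),
  windy   = c(TRUE, FALSE, TRUE),
  outlook = factor(c("sunny", "rainy", "sunny"))
)

## Apply column by column; logical and factor columns pass through unchanged.
toy_scaled <- as.data.frame(lapply(toy, minmax))
print(toy_scaled)
```

Only the temp column changes; the factor and logical columns are left alone, mirroring how the filter output keeps outlook, windy, and play as they were.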

Discretize implements a supervised instance filter that discretizes a range of numeric attributes in the dataset into nominal attributes. By default, discretization uses Fayyad & Irani's MDL method (Fayyad and Irani, 1993).

Note that these methods ignore nominal attributes, i.e., variables of class factor.
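To make the MDL method more concrete, the sketch below checks, in base R, whether the single best binary cut point of a numeric attribute passes the acceptance criterion described by Fayyad and Irani (1993). It is a one-level illustration under stated assumptions, not Weka's code: the function names entropy and mdl_split are hypothetical, and the full method applies the test recursively. The temperature values are the normalized ones from the example output on this page; rescaling does not change the ordering, so the decision is unaffected.

```r
## Class entropy (in bits) of a label vector.
entropy <- function(y) {
  p <- table(y) / length(y)
  p <- p[p > 0]
  -sum(p * log2(p))
}

## Find the cut point with maximal information gain and test it against the
## MDL acceptance threshold: gain > (log2(n - 1) + delta) / n.
mdl_split <- function(x, y) {
  n <- length(x)
  vals <- sort(unique(x))
  if (length(vals) < 2) return(list(accept = FALSE))
  cuts <- (head(vals, -1) + tail(vals, -1)) / 2   # midpoints between values
  ent_s <- entropy(y)
  k <- length(unique(y))
  best <- list(accept = FALSE, gain = -Inf)
  for (cp in cuts) {
    left <- y[x <= cp]
    right <- y[x > cp]
    e1 <- entropy(left)
    e2 <- entropy(right)
    gain <- ent_s - (length(left) / n) * e1 - (length(right) / n) * e2
    if (gain > best$gain) {
      k1 <- length(unique(left))
      k2 <- length(unique(right))
      delta <- log2(3^k - 2) - (k * ent_s - k1 * e1 - k2 * e2)
      threshold <- (log2(n - 1) + delta) / n
      best <- list(accept = gain > threshold, cut = cp,
                   gain = gain, threshold = threshold)
    }
  }
  best
}

## Normalized temperature and play, copied from the example output.
temperature <- c(1, 0.76190476, 0.90476190, 0.28571429, 0.19047619,
                 0.04761905, 0, 0.38095238, 0.23809524, 0.52380952,
                 0.52380952, 0.38095238, 0.80952381, 0.33333333)
play <- c("no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no")
res_weather <- mdl_split(temperature, play)

## Hypothetical, cleanly separable data for contrast.
res_toy <- mdl_split(1:20, rep(c("a", "b"), each = 10))

print(res_weather$accept)
print(res_toy$accept)
```

With only 14 instances, no cut on temperature gains enough information to pay for itself under the criterion, which is consistent with the single 'All' level that Discretize reports in the example output.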

Value

A data frame.

References

U. M. Fayyad and K. B. Irani (1993). Multi-interval discretization of continuous-valued attributes for classification learning. Thirteenth International Joint Conference on Artificial Intelligence, 1022–1027. Morgan Kaufmann.

I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

Examples

library("RWeka")

## Using a Weka data set ...
w <- read.arff(system.file("arff", "weather.arff",
                           package = "RWeka"))

## Normalize (response irrelevant)
m1 <- Normalize(~., data = w)
m1

## Discretize
m2 <- Discretize(play ~., data = w)
m2

Example output

    outlook temperature  humidity windy play
1     sunny  1.00000000 0.6451613 FALSE   no
2     sunny  0.76190476 0.8064516  TRUE   no
3  overcast  0.90476190 0.6774194 FALSE  yes
4     rainy  0.28571429 1.0000000 FALSE  yes
5     rainy  0.19047619 0.4838710 FALSE  yes
6     rainy  0.04761905 0.1612903  TRUE   no
7  overcast  0.00000000 0.0000000  TRUE  yes
8     sunny  0.38095238 0.9677419 FALSE   no
9     sunny  0.23809524 0.1612903 FALSE  yes
10    rainy  0.52380952 0.4838710 FALSE  yes
11    sunny  0.52380952 0.1612903  TRUE  yes
12 overcast  0.38095238 0.8064516  TRUE  yes
13 overcast  0.80952381 0.3225806 FALSE  yes
14    rainy  0.33333333 0.8387097  TRUE   no
    outlook temperature humidity windy play
1     sunny       'All'    'All' FALSE   no
2     sunny       'All'    'All'  TRUE   no
3  overcast       'All'    'All' FALSE  yes
4     rainy       'All'    'All' FALSE  yes
5     rainy       'All'    'All' FALSE  yes
6     rainy       'All'    'All'  TRUE   no
7  overcast       'All'    'All'  TRUE  yes
8     sunny       'All'    'All' FALSE   no
9     sunny       'All'    'All' FALSE  yes
10    rainy       'All'    'All' FALSE  yes
11    sunny       'All'    'All'  TRUE  yes
12 overcast       'All'    'All'  TRUE  yes
13 overcast       'All'    'All' FALSE  yes
14    rainy       'All'    'All'  TRUE   no

RWeka documentation built on Feb. 3, 2020, 1:10 a.m.