Weka_interfaces: R/Weka interfaces


Description

Create an R interface to an existing Weka learner, attribute evaluator or filter, or show the available interfaces.

Usage


Arguments

name

a character string giving the fully qualified name of a Weka learner/filter class in JNI notation.

class

NULL (default), or a character vector giving the names of R classes the objects returned by the interface function should inherit from in addition to the default ones (for representing associators, classifiers, and clusterers).

handlers

a named list of special handler functions, see Details.

init

NULL, or a function with no arguments to be called when the interface is used for building the learner/filter, or queried for available options via WOW. Typically, this is used for loading Weka packages when interfacing functionality in these.

package

NULL (default), or a character string giving the name of the external Weka package providing the learner/filter class specified by name.

p

a character string naming a Weka package to be loaded via WPM.

Details

make_Weka_associator and make_Weka_clusterer create an R function providing an interface to a Weka association learner or a Weka clusterer, respectively. This interface function has formals x and control = NULL, representing the training instances and control options to be employed. Objects created by these interface functions always inherit from classes Weka_associator and Weka_clusterer, respectively, and have at least suitable print methods. Fitted clusterers also have a predict method.
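As a minimal sketch of the clusterer case, the following creates an interface to Weka's EM clusterer and fits it to the numeric columns of iris. The names EM, fit and ids, and the use of the -N option for the number of clusters, are illustrative choices rather than part of this help page; the sketch assumes RWeka and a working Java setup are available.

```r
library(RWeka)

## Create an interface to Weka's EM clusterer (class name in JNI notation).
EM <- make_Weka_clusterer("weka/clusterers/EM")

## Fit on the four numeric iris columns, asking for 3 clusters via -N.
fit <- EM(iris[, -5], control = Weka_control(N = 3))

## Fitted clusterers inherit from 'Weka_clusterer' and have
## print and predict methods.
print(fit)
ids <- predict(fit, iris[1:5, -5])
```

The first argument corresponds to the formal x and the second to control, as described above.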

make_Weka_classifier creates an interface function for a Weka classifier, with formals formula, data, subset, na.action, and control (default: Weka_control(), i.e., no options set), where the first four have the “usual” meanings for statistical modeling functions in R, and the last again specifies the control options to be employed by the Weka learner. Objects created by these interfaces always inherit from class Weka_classifier, and have at least suitable print and predict methods.
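To illustrate the class argument together with the classifier interface, here is a small sketch using Weka's ZeroR baseline classifier. The extra class name "my_baseline" and the object names ZR, fit and pred are hypothetical and chosen only for this example.

```r
library(RWeka)

## Interface to Weka's ZeroR classifier; fitted objects should additionally
## inherit from the (hypothetical) R class "my_baseline".
ZR <- make_Weka_classifier("weka/classifiers/rules/ZeroR",
                           class = "my_baseline")

## Usual modeling-function call: formula plus data.
fit <- ZR(Species ~ ., data = iris)

## Inspect the classes of the fitted object, then predict for a few rows.
class(fit)
pred <- predict(fit, iris[1:5, ])
```

Adding a class of your own is mainly useful if you intend to write extra methods (for example a summary or plot method) for models fitted through the interface.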

make_Weka_filter creates an interface function for a Weka filter, with formals formula, data, subset, na.action, and control = NULL, where the first four have the “usual” meanings for statistical modeling functions in R, and the last again specifies the control options to be employed by the Weka filter. Note that the response variable can be omitted from formula if the filter is “unsupervised”. Objects created by these interface functions are (currently) always of class data.frame.
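A short sketch of the filter case, assuming Weka's unsupervised Standardize filter is available in the installed Weka version. Because the filter is unsupervised, the formula omits the response. The names Std and out are illustrative.

```r
library(RWeka)

## Interface to an unsupervised Weka filter.
Std <- make_Weka_filter("weka/filters/unsupervised/attribute/Standardize")

## No response on the left-hand side, since the filter is unsupervised.
out <- Std(~ ., data = iris)

## The result is a plain data frame with the filtered attributes.
str(out)
```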

make_Weka_attribute_evaluator creates an interface function for a Weka attribute evaluation class which implements the AttributeEvaluator interface, with formals as for the classifier interface functions.
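For example, an interface to an attribute evaluator can be sketched as follows. The choice of SymmetricalUncertAttributeEval (assumed to be present in the installed Weka version) and the names SU and merits are illustrative.

```r
library(RWeka)

## Interface to a Weka class implementing the AttributeEvaluator interface.
SU <- make_Weka_attribute_evaluator(
    "weka/attributeSelection/SymmetricalUncertAttributeEval")

## Same calling convention as the classifier interfaces:
## evaluate each predictor with respect to the response.
merits <- SU(Species ~ ., data = iris)
merits
```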

Certain aspects of the interface function can be customized by providing handlers. Currently, only control handlers (functions given as the control component of the list of handlers) are used for processing the given control arguments before passing them to the Weka classifier. This is used, e.g., by the meta learners to allow the specification of registered base learners by their “base names” (rather than their full Weka/Java class names).

In addition to creating interface functions, the interfaces are registered (under the name of the Weka class interfaced), which in particular allows the Weka Option Wizard (WOW) to conveniently give on-line information about available control options for the interfaces.

list_Weka_interfaces lists the available interfaces.
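The registry can be inspected along these lines. This is only a sketch: the exact structure of the value returned by list_Weka_interfaces is not described on this page, so the code simply prints it, and the variable name ifs is illustrative.

```r
library(RWeka)

## Register a new interface; creating it also adds it to the registry.
NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")

## Show all interfaces currently registered in this session.
ifs <- list_Weka_interfaces()
print(ifs)

## Query the Weka Option Wizard for the control options of the interface.
WOW(NB)
```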

Finally, make_Weka_package_loader generates init hooks for loading required and already installed Weka packages.
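A hedged sketch of how such a hook might be wired up, using the external Weka package lazyBayesianRules as an example. It assumes that package provides the class weka.classifiers.lazy.LBR and has already been installed; the names load_lbr and LBR2 are hypothetical. Creating the interface does not itself load the package, since the hook only runs when the learner is built or queried via WOW.

```r
library(RWeka)

## Hook that loads the (already installed) Weka package when called.
load_lbr <- make_Weka_package_loader("lazyBayesianRules")

## Pass the hook as 'init' so it runs before the learner is built
## or its options are queried via WOW.
LBR2 <- make_Weka_classifier("weka/classifiers/lazy/LBR",
                             init = load_lbr)

## One-time installation, if needed (requires network access):
## WPM("install-package", "lazyBayesianRules")
```

Supplying the package argument instead of init is the more compact way to express the same dependency on an external Weka package.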

It is straightforward to register new interfaces in addition to the ones package RWeka provides by default.

References

K. Hornik, C. Buchta, and A. Zeileis (2009). Open-source machine learning: R meets Weka. Computational Statistics, 24/2, 225–232. doi: 10.1007/s00180-008-0119-7.

Examples

## Create an interface to Weka's Naive Bayes classifier.
NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayes")
## Note that this has a very useful print method:
NB
## And we can use the Weka Option Wizard for finding out more:
WOW(NB)
## And actually use the interface ...
if(require("e1071", quietly = TRUE) &&
   require("mlbench", quietly = TRUE)) {
    data("HouseVotes84", package = "mlbench")
    model <- NB(Class ~ ., data = HouseVotes84)
    predict(model, HouseVotes84[1:10, -1])
    predict(model, HouseVotes84[1:10, -1], type = "prob")
}
## (Compare this to David Meyer's naiveBayes() in package 'e1071'.)

Example output

An R interface to Weka class 'weka.classifiers.bayes.NaiveBayes', which
has information

  Class for a Naive Bayes classifier using estimator classes. Numeric
  estimator precision values are chosen based on analysis of the
  training data. For this reason, the classifier is not an
  UpdateableClassifier (which in typical usage are initialized with
  zero training instances) -- if you need the UpdateableClassifier
  functionality, use the NaiveBayesUpdateable classifier. The
  NaiveBayesUpdateable classifier will use a default precision of 0.1
  for numeric attributes when buildClassifier is called with zero
  training instances.

  For more information on Naive Bayes classifiers, see

  George H. John, Pat Langley: Estimating Continuous Distributions in
  Bayesian Classifiers. In: Eleventh Conference on Uncertainty in
  Artificial Intelligence, San Mateo, 338-345, 1995.

  BibTeX:

  @INPROCEEDINGS{John1995,
    publisher = {Morgan Kaufmann},
    pages = {338-345},
    year = {1995},
    booktitle = {Eleventh Conference on Uncertainty in Artificial
      Intelligence},
    title = {Estimating Continuous Distributions in Bayesian
      Classifiers},
    address = {San Mateo},
    author = {George H. John and Pat Langley},
  }

Argument list:
  x(formula, data, subset, na.action, control = Weka_control(),
  options = NULL)

Returns objects inheriting from classes:
  Weka_classifier
-K      Use kernel density estimator rather than normal distribution
        for numeric attributes
-D      Use supervised discretization to process numeric attributes
-O      Display model in old format (good when there are many classes)
-output-debug-info
        If set, classifier is run in debug mode and may output
        additional info to the console
-do-not-check-capabilities
        If set, classifier capabilities are not checked before
        classifier is built (use with caution).
-num-decimal-places
        The number of decimal places for the output of numbers in the
        model (default 2).
	Number of arguments: 1.
-batch-size
        The desired batch size for batch prediction (default 100).
	Number of arguments: 1.
       democrat   republican
1  1.361394e-07 9.999999e-01
2  6.699596e-08 9.999999e-01
3  2.250184e-03 9.977498e-01
4  9.929477e-01 7.052312e-03
5  8.221067e-01 1.778933e-01
6  4.901860e-01 5.098140e-01
7  1.036287e-04 9.998964e-01
8  7.224667e-06 9.999928e-01
9  9.454311e-08 9.999999e-01
10 1.000000e+00 7.070261e-10

RWeka documentation built on Aug. 23, 2020, 5:07 p.m.