imptree: Classification Trees with Imprecise Probabilities


Description

imptree implements Abellán and Moral's tree algorithm (based on Quinlan's ID3) for classification. It employs either the imprecise Dirichlet model (IDM) or nonparametric predictive inference (NPI) to generate the imprecise probability distribution of the classification variable within a node.
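
To illustrate what an imprecise probability distribution within a node looks like, the following minimal sketch computes class-probability intervals under the standard IDM formulas, lower = n_i / (N + s) and upper = (n_i + s) / (N + s). The helper idm_intervals and the class counts are hypothetical and are not part of imptree; the package's internal C++ implementation may differ.

```r
# Hypothetical helper, not part of imptree: class-probability intervals
# under the imprecise Dirichlet model with hyperparameter s.
idm_intervals <- function(counts, s = 1) {
  stopifnot(is.numeric(counts), all(counts >= 0), s > 0)
  n <- sum(counts)                       # number of observations in the node
  cbind(lower = counts / (n + s),        # lower probability of each class
        upper = (counts + s) / (n + s))  # upper probability of each class
}

# Made-up class counts for a single node
counts <- c(a = 6, b = 3, c = 1)
iv <- idm_intervals(counts, s = 1)
print(iv)
```

Each interval has width s / (N + s), so the intervals narrow as a node contains more observations.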

Usage

## S3 method for class 'formula'
imptree(formula, data = NULL, weights, control,
  method = c("IDM", "NPI", "NPIapprox"), method.param, ...)

## Default S3 method:
imptree(x, y, ...)

imptree(x, ...)

Arguments

formula

Formula describing the structure (class variable ~ feature variables). Any interaction terms trigger an error.

data

Data.frame on which to evaluate the supplied formula. If not provided, the formula is evaluated in the calling environment.

weights

Individual weights of the observations (default: 1 for each). This argument is currently ignored.

control

A named (partial) list according to the result of imptree_control.

method

Method used to calculate the probability intervals of the class probabilities: "IDM" for the imprecise Dirichlet model (default), "NPI" for the nonparametric predictive inference approach, and "NPIapprox" for the approximate algorithm that obtains the maximal entropy of NPI-generated probability intervals.

method.param

Named list providing the method specific parameters. See imptree_params.

...

Optional parameters passed on to the main function imptree.formula or to the call of imptree_control.

x

A data.frame or a matrix of feature variables. The columns are required to be named.

y

The classification variable as a factor.
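
The formula argument rejects interaction terms. As a rough illustration of how such terms can be detected in base R, the sketch below inspects the "order" attribute of a terms object. The helper has_interaction is hypothetical; it is not the check imptree performs internally, which is not shown in this documentation.

```r
# Hypothetical helper, not part of imptree: TRUE if the formula
# contains any term of order greater than 1 (an interaction).
has_interaction <- function(formula) {
  any(attr(terms(formula), "order") > 1)
}

main_only  <- has_interaction(y ~ a + b)    # main effects only
with_colon <- has_interaction(y ~ a + a:b)  # explicit interaction a:b
with_star  <- has_interaction(y ~ a * b)    # a * b expands to a + b + a:b
print(c(main_only, with_colon, with_star))
```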

Value

An object of class imptree, which is a list with the following components:

call

Original call to imptree

tree

Object reference to the underlying C++ tree object.

train

Training data in the form required by the workhorse C++ function.
It is an integer matrix containing the internal factor representations, adjusted for C++ indexing, which starts at 0 rather than at 1 as in R. Further attributes of the matrix hold the names of the variables, the C++-adjusted index of the classification variable, and the levels and number of levels of each variable.

formula

The formula describing the data structure
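
The layout of the train component described above can be pictured with a small base R sketch that converts factors to a 0-based integer matrix. The function to_zero_based, the toy data and the attribute names "classidx", "labels" and "nlevels" are all assumptions chosen for illustration; the names actually used by imptree are not given in this documentation.

```r
# Hypothetical sketch, not the package's internal code.
to_zero_based <- function(df, class_var) {
  stopifnot(all(vapply(df, is.factor, logical(1))), class_var %in% names(df))
  # Internal factor codes start at 1 in R; shift them to start at 0
  m <- do.call(cbind, lapply(df, function(f) as.integer(f) - 1L))
  attr(m, "classidx") <- match(class_var, names(df)) - 1L   # 0-based column index
  attr(m, "labels")   <- lapply(df, levels)                 # levels per variable
  attr(m, "nlevels")  <- vapply(df, nlevels, integer(1))    # number of levels
  m
}

# Toy data, made up for this example
toy <- data.frame(
  colour = factor(c("red", "blue", "red")),
  size   = factor(c("S", "L", "M"), levels = c("S", "M", "L")),
  class  = factor(c("yes", "no", "yes"))
)
m <- to_zero_based(toy, "class")
print(m)
```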

Author(s)

Paul Fink <Paul.Fink@stat.uni-muenchen.de>, based on algorithms by J. Abellán and S. Moral for the IDM and R. M. Baker for the NPI approach.

References

Abellán, J. and Moral, S. (2005), Upper entropy of credal sets. Applications to credal classification, International Journal of Approximate Reasoning 39, 235–255.

Strobl, C. (2005), Variable Selection in Classification Trees Based on Imprecise Probabilities, ISIPTA'05: Proceedings of the Fourth International Symposium on Imprecise Probabilities and Their Applications, 339–348.

Baker, R. M. (2010), Multinomial Nonparametric Predictive Inference: Selection, Classification and Subcategory Data.

See Also

predict.imptree for prediction, summary.imptree for summary information, imptree_params and imptree_control for arguments controlling the tree creation, and node_imptree for accessing a specific node in the tree.

Examples

data("carEvaluation")

## create a tree with IDM (s=1) to full size on
## carEvaluation, leaving the first 10 observations out
imptree(acceptance~., data = carEvaluation[-(1:10),], 
  method="IDM", method.param = list(splitmetric = "globalmax", s = 1), 
  control = list(depth = NULL, minbucket = 1)) # control args as list

## same setting as above, now passing control args in '...'
imptree(acceptance~., data = carEvaluation[-(1:10),], 
  method="IDM", method.param = list(splitmetric = "globalmax", s = 1), 
  depth = NULL, minbucket = 1)

imptree documentation built on May 1, 2019, 8:18 p.m.