cmilb: Cumulative probabilities Multiple Isotonic LogitBoost
In isoboost: Isotonic Boosting Classification Rules

Train and predict with a LogitBoost-based classification algorithm that uses multivariate isotonic regression as weak learners (linear regression for features without a monotonicity constraint), based on the cumulative probabilities logistic model (see Agresti (2010)). For full details on this algorithm, see Conde et al. (2020).
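As a rough illustration of what a cumulative probabilities model works with, the sketch below converts cumulative probabilities P(Y <= j | x) into per-class probabilities and cumulative logits. The matrix `cum_prob` is made-up data, and the code only illustrates the general model described in Agresti (2010); it is not the package's internal implementation.

```r
# Hypothetical cumulative probabilities P(Y <= j | x):
# 2 observations (rows), 4 ordered classes (columns)
cum_prob <- rbind(c(0.10, 0.40, 0.85, 1.00),
                  c(0.60, 0.90, 0.97, 1.00))

# Class probabilities are successive differences of the cumulative ones:
# P(Y = j) = P(Y <= j) - P(Y <= j - 1)
class_prob <- t(apply(cum_prob, 1, function(p) diff(c(0, p))))

# Cumulative logits log(P(Y <= j) / P(Y > j)) for the first K - 1 classes
# (the last cumulative probability is always 1, so its logit is dropped)
cum_logit <- qlogis(cum_prob[, -ncol(cum_prob)])
```

Each row of `class_prob` sums to 1, which is what makes the result usable as a vector of posterior probabilities.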

cmilb(xlearn, ...)

## S3 method for class 'formula'
cmilb(formula, data, ...)

## Default S3 method:
cmilb(xlearn, ylearn, xtest = xlearn, mfinal = 100, 
monotone_constraints = rep(0, dim(xlearn)[2]), prior = NULL, ...)

`formula`	A formula of the form `groups ~ x1 + x2 + ...`. That is, the response is the class variable and the right hand side specifies the explanatory variables.
`data`	Data frame from which variables specified in `formula` are to be taken.
`xlearn`	(Required if no formula is given as the principal argument.) A data frame or matrix containing the explanatory variables.
`ylearn`	(Required if no formula is given as the principal argument.) A numeric vector or factor with numeric levels specifying the class for each observation.
`xtest`	A data frame or matrix of cases to be classified, containing the features used in `formula` or `xlearn`.
`mfinal`	Maximum number of iterations of the algorithm.
`monotone_constraints`	Numerical vector with values 1, 0 and -1, of length equal to the number of features in `xlearn`. 1 means increasing, -1 means decreasing and 0 means no constraint.
`prior`	The prior probabilities of class membership. If unspecified, equal prior probabilities are used. If present, the probabilities must be specified in the order of the factor levels.
`...`	Arguments passed to or from other methods.
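One way to build `monotone_constraints` without miscounting positions is to start from the feature names. This is a minimal sketch; the names in `feature_names` are hypothetical and should be replaced with the columns of your own `xlearn`, in the same order.

```r
# Hypothetical feature names, in the column order of xlearn
feature_names <- c("amp1", "amp2", "temperature", "load")

# Start with no constraint (0) on any feature
monotone_constraints <- setNames(rep(0, length(feature_names)), feature_names)
monotone_constraints[c("amp1", "amp2")] <- -1  # decreasing
monotone_constraints["load"] <- 1              # increasing

# Drop the names so that a plain numeric vector is passed on
monotone_constraints <- unname(monotone_constraints)

# Sanity check: only -1, 0 and 1 are allowed
stopifnot(all(monotone_constraints %in% c(-1, 0, 1)))
```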

A list containing the following components:

`call`	The (matched) function call.
`trainset`	Matrix with the training set used (first columns) and the class for each observation (last column).
`prior`	Prior probabilities of class membership used.
`apparent`	Apparent error rate.
`mfinal`	Number of iterations of the algorithm.
`loglikelihood`	Log-likelihood.
`posterior`	Posterior probabilities of class membership for `xtest` set.
`class`	Labels of the class with maximal probability for `xtest` set.
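The relation between `posterior` and `class` can be sketched as follows. The layout assumed here (one row per test case, one column per class) and all values are hypothetical; the snippet does not call the package.

```r
# Hypothetical posterior matrix: rows are test cases, columns are classes 1..3
posterior <- rbind(c(0.7, 0.2, 0.1),
                   c(0.1, 0.3, 0.6),
                   c(0.2, 0.5, 0.3))

# Label of the class with maximal probability for each case
pred_class <- apply(posterior, 1, which.max)

# Error rate in percent against hypothetical true labels
truth <- c(1, 3, 3)
error_rate <- 100 * mean(pred_class != truth)
```

Here the third case is assigned to class 2 although its true label is 3, so one case out of three is misclassified.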

This function may be called using either a formula and data frame, or a data frame and grouping variable, or a matrix and grouping variable as the first two arguments. All other arguments are optional.

Classes must be identified, either in a column of data or in the ylearn vector, by natural numbers varying from 1 to the number of classes. The number of classes must be greater than 1.
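If the class labels are stored as text, they can be recoded to the integers 1 to K before calling the function. A minimal sketch, with made-up labels and an assumed ordering in `ordered_levels`:

```r
# Hypothetical text labels and their intended order, from lowest to highest
raw_labels <- c("mild", "severe", "none", "mild", "none")
ordered_levels <- c("none", "mild", "severe")

# Recode to natural numbers 1..K following the given order
ylearn <- as.integer(factor(raw_labels, levels = ordered_levels))

# Check the documented requirements: codes in 1..K and more than one class
stopifnot(all(ylearn %in% seq_along(ordered_levels)),
          length(unique(ylearn)) > 1)
```

Passing `levels` explicitly matters because `factor()` otherwise sorts the labels alphabetically, which would not respect the intended ordering of the classes.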

If data, xlearn or ylearn contain missing values, the corresponding observations are deleted.
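To see in advance which observations would be affected, the training data can be filtered by hand. This sketch uses base R's `complete.cases()` on made-up data; it mirrors the documented behaviour and is not taken from the package source.

```r
# Hypothetical training data with some missing values
xlearn <- data.frame(x1 = c(0.2, NA, 0.5, 0.9),
                     x2 = c(1.1, 0.7, 0.3, NA))
ylearn <- c(1, 2, NA, 2)

# TRUE for observations with no missing value in any feature or in the class
keep <- complete.cases(xlearn, ylearn)

xlearn_clean <- xlearn[keep, , drop = FALSE]
ylearn_clean <- ylearn[keep]

# Number of observations that would be dropped
n_dropped <- sum(!keep)
```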

David Conde

Agresti, A. (2010). Analysis of Ordinal Categorical Data, 2nd edition. John Wiley and Sons. New Jersey.

Conde, D., Fernandez, M. A., Rueda, C., and Salvador, B. (2020). Isotonic boosting classification rules. Advances in Data Analysis and Classification, 1-25.

asilb, amilb, csilb

data(motors)
table(motors$condition)
##  1  2  3  4 
## 83 67 70 60 

## Let us consider the first three variables as predictors
data <- motors[, 1:3]
grouping <- motors$condition

## Lower values of the amplitudes are expected to be 
## related to higher levels of damage severity, so 
## we can consider the following monotone constraints
monotone_constraints <- rep(-1, 3)

set.seed(7964)
values <- runif(dim(data)[1])
trainsubset <- values < 0.2
obj <- cmilb(data[trainsubset, ], grouping[trainsubset], 
               data[!trainsubset, ], 20, monotone_constraints)

## Apparent error
obj$apparent
## 4.761905

## Error rate on the held-out cases; trainsubset is logical,
## so the complement is !trainsubset (a minus sign would only drop row 1)
100*mean(obj$class != grouping[!trainsubset])
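
Beyond a single error rate, a cross-tabulation of true against predicted classes shows where the misclassifications fall. The sketch below uses made-up `truth` and `pred` vectors rather than output from `cmilb`; with a fitted object, `pred` would be the `class` component and `truth` the labels of the test cases.

```r
# Hypothetical true and predicted labels for 8 test cases, 4 classes
truth <- c(1, 1, 2, 2, 3, 3, 4, 4)
pred  <- c(1, 2, 2, 2, 3, 4, 4, 4)

# Fix the levels so that every class appears even if it is never predicted
confusion <- table(truth = factor(truth, levels = 1:4),
                   predicted = factor(pred, levels = 1:4))

# Error rate in percent: share of cases off the diagonal
error_rate <- 100 * (1 - sum(diag(confusion)) / sum(confusion))
```

Because the classes are ordered, cells far from the diagonal correspond to more severe errors than cells next to it.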