ddalpha.train: Train DD-Classifier

Description

Trains the DD-classifier using a training sample according to given parameters. The DD-classifier is a non-parametric procedure that first transforms the training sample into the depth space by calculating the depth of each point w.r.t. each class (the dimension of this space equals the number of classes in the training sample), and then constructs a separating rule in this depth space. If, in the classification phase, an object does not belong to the convex hull of any class (we refer to such an object as an 'outsider'), it is mapped to the origin of the depth space and hence cannot be classified there. For these objects, once 'outsiderness' has been established, an outsider treatment is applied, i.e. a classification procedure that works outside the convex hulls of the classes; it has to be trained too.
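The mapping into the depth space can be illustrated in base R. The sketch below is not the package's implementation: it assumes the moment-based Mahalanobis depth, 1 / (1 + squared Mahalanobis distance), and the helper name mah_depth is hypothetical.

```r
# Mahalanobis depth of the rows of x w.r.t. a sample (moment estimates).
mah_depth <- function(x, sample) {
  d2 <- mahalanobis(x, center = colMeans(sample), cov = cov(sample))
  1 / (1 + d2)
}

set.seed(1)
class1 <- matrix(rnorm(200), ncol = 2)            # centred at (0, 0)
class2 <- matrix(rnorm(200, mean = 3), ncol = 2)  # centred at (3, 3)
x <- rbind(class1, class2)

# Depth space: one row per object, one column per class.
dd <- cbind(D1 = mah_depth(x, class1), D2 = mah_depth(x, class2))
dim(dd)
```

With two classes the depth space is two-dimensional, so any separating rule works on the pairs (D1, D2) rather than on the original coordinates.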

The current implementation of the DD-classifier allows for several alternative outsider treatments; they involve different traditional classification methods, see 'Details' and 'Arguments' for the parameters needed.

The function allows for classification with q ≥ 2 classes, see aggregation.method in 'Arguments'.

Usage

ddalpha.train(formula, data, subset,
              depth = "halfspace", 
              separator = "alpha", 
              outsider.methods = "LDA", 
              outsider.settings = NULL, 
              aggregation.method = "majority",
              pretransform = NULL,
              use.convex = FALSE,     
              seed = 0,
              ...)

Arguments

formula

an object of class “formula” (or one that can be coerced to that class): a symbolic description of the model. If not found in data, the variables of the model are taken from environment.

data

Matrix or data.frame containing the training sample: each of the n rows is one object, where the first d entries are inputs and the last entry is the output (class label).
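As an illustration of this layout, the following base R sketch builds such a matrix from the built-in iris data. Coding the class labels as integers is a choice made here for the example, not a requirement stated in this help page.

```r
# Inputs in the first d columns, class label in the last column.
inputs <- as.matrix(iris[, 1:4])
labels <- as.integer(iris$Species)   # integer codes for the three species
train  <- cbind(inputs, class = labels)
dim(train)
```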

A pre-calculated DD-plot may be used as data with depth="ddplot".

subset

an optional vector specifying a subset of observations to be used in training the classifier.

depth

Character string determining which depth notion to use; the default value is "halfspace". The list of the supported depths is given in section Depths. To use a custom depth, see topic Custom Methods. To use an outsider treatment only, set depth = NULL.

separator

The method used for separation on the DD-plot; can be "alpha" (the default), "polynomial", "knnlm" or "maxD". See section Separators for the description of the separators and additional parameters. To use a custom separator, see topic Custom Methods.

outsider.methods

Vector of character strings each being a name of a basic outsider method for eventual classification; possible names are: "LDA" (the default), "QDA", "kNN", "kNNAff", "depth.Mahalanobis", "RandProp", "RandEqual" and "Ignore". Each method can be specified only once, replications are ignored. By specifying treatments in such a way only a basic treatment method can be chosen (by the name), and the default settings for each of the methods are applied, see 'Details'.

outsider.settings

List containing outsider treatments, each described by a list of parameters including a name; see 'Details' and 'Examples'. A method can be used several times with (not necessarily) different parameters, but each name should be unique; entries with repeated names are ignored.

aggregation.method

Character string determining which method to apply to aggregate binary classification results during multiclass classification; can be "majority" (the default) or "sequent". If "majority", q(q-1)/2 binary classifiers are trained (with q being the number of classes in the training sample) and the classification results are aggregated by majority voting, where ties are resolved in favour of classes with larger proportions in the training sample (possibly those with earlier entries in the data). If "sequent", q binary 'one against all' classifiers are trained, and ties during classification are resolved as before.
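The two aggregation schemes can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper names n_classifiers and majority_vote are hypothetical, and the tie-breaking shown uses class proportions only.

```r
# Number of binary classifiers trained under each aggregation scheme.
n_classifiers <- function(q, method = c("majority", "sequent")) {
  method <- match.arg(method)
  if (method == "majority") q * (q - 1) / 2 else q
}

# Majority vote; ties go to the class with the larger training proportion.
majority_vote <- function(votes, proportions) {
  counts <- table(factor(votes, levels = names(proportions)))
  tied <- names(counts)[counts == max(counts)]
  tied[which.max(proportions[tied])]
}

n_classifiers(4, "majority")   # 6 pairwise classifiers
n_classifiers(4, "sequent")    # 4 one-against-all classifiers

props <- c(a = 0.5, b = 0.3, c = 0.2)
majority_vote(c("a", "c", "b"), props)  # three-way tie, resolved to "a"
majority_vote(c("b", "c", "b"), props)  # "b" wins outright
```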

pretransform

Indicates whether the data has to be scaled before the learning procedure. If the depth used is affine-invariant and pretransform does not influence the result, the data is not transformed (the parameter is ignored).

NULL

applies no transformation to the data

"1Mom", "1MCD"

the data is transformed with the common covariance matrix of the whole data

"NMom", "NMCD"

the data is transformed w.r.t. each class using its covariance matrix. The depths w.r.t. each class are calculated using the transformed data.

For the values "1MCD" and "NMCD", covMcd is used to calculate the covariance matrix, and the parameter mah.parMcd is used.
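A transformation with a common covariance matrix can be sketched as a whitening step in base R. The exact transformation the package applies is not specified here, so the centring and the use of the inverse Cholesky factor are assumptions, and the helper name whiten is hypothetical. For the MCD variants, robust estimates of centre and covariance would be passed in place of the moment estimates.

```r
set.seed(2)
# Correlated 3-dimensional data.
x <- matrix(rnorm(300), ncol = 3) %*% matrix(c(2, 1, 0, 0, 1, 1, 0, 0, 3), 3)

# Centre the data, then multiply by the inverse Cholesky factor of S.
whiten <- function(x, center = colMeans(x), S = cov(x)) {
  sweep(x, 2, center) %*% solve(chol(S))
}

y <- whiten(x)
round(cov(y), 10)   # identity matrix up to rounding
```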

use.convex

Logical variable indicating whether outsiders should be determined exactly, i.e. as the points not contained in any of the convex hulls of the classes from the training sample (TRUE), or those having zero depth w.r.t. each class from the training sample (FALSE). For depth = "zonoid" both values give the same result.
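The exact (convex hull) notion of an outsider can be illustrated in two dimensions with base R. This sketch is not the package's routine (see is.in.convex for that); it works for bivariate data only, and the helper names in_hull_2d and is_outsider are hypothetical.

```r
# TRUE if p lies strictly inside the convex hull of the rows of X (2-D only):
# a strictly interior point never becomes a hull vertex when it is appended.
in_hull_2d <- function(p, X) {
  hull <- chull(rbind(X, p))
  !((nrow(X) + 1L) %in% hull)
}

# An outsider lies in none of the class hulls.
is_outsider <- function(p, classes) {
  !any(vapply(classes, function(X) in_hull_2d(p, X), logical(1)))
}

square <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
other  <- square + 5   # a second class, shifted by (5, 5)

is_outsider(c(0.5, 0.5), list(square, other))  # FALSE
is_outsider(c(3, 3), list(square, other))      # TRUE
```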

seed

The random seed. The default value seed = 0 makes no changes.

...

Parameters for the depth calculation and separation methods.

Details

Depths

For depth="ddplot" the pre-calculated DD-plot shall be passed as data.

To use a custom depth, see topic Custom Methods.

To use an outsider treatment only, set depth = NULL.

The following depths are supported:

depth.halfspace for calculation of the Tukey depth.

depth.Mahalanobis for calculation of Mahalanobis depth.

depth.projection for calculation of projection depth.

depth.simplicial for calculation of simplicial depth.

depth.simplicialVolume for calculation of simplicial volume depth.

depth.spatial for calculation of spatial depth.

depth.zonoid for calculation of zonoid depth.

The additional parameters are described in the corresponding topics.

Separators

The separators classify data on the 2-dimensional space of a DD-plot built using the depths.

To use a custom separator, see topic Custom Methods.

alpha

Trains the DDα-classifier (Lange, Mosler and Mozharovskyi, 2014; Mozharovskyi, Mosler and Lange, 2015). The DDα-classifier constructs a linear separating rule in the polynomial extension of the depth space with the α-procedure (Vasil'ev, 2003); maximum degree of the polynomial products is determined via cross-validation (in the depth space).

The additional parameters:

max.degree

Maximum of the range of degrees of the polynomial depth space extension over which the α-procedure is to be cross-validated; can be 1, 2 or 3 (default).

num.chunks

Number of chunks to split data into when cross-validating the α-procedure; should be >0, and smaller than the total number of points in the two smallest classes when aggregation.method = "majority" and smaller than the total number of points in the training sample when aggregation.method = "sequent". The default value is 10.
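The idea of a polynomial extension of a two-dimensional depth space can be sketched in base R. Which monomials the package actually generates is an assumption here (all products of total degree 1 up to max.degree), and the helper name poly_extend is hypothetical.

```r
# Extend depth pairs (D1, D2) by all monomials D1^i * D2^j, 1 <= i + j <= degree.
poly_extend <- function(D, degree) {
  stopifnot(ncol(D) == 2, degree >= 1)
  cols <- list()
  for (total in seq_len(degree)) {
    for (i in total:0) {
      j <- total - i
      cols[[sprintf("D1^%d*D2^%d", i, j)]] <- D[, 1]^i * D[, 2]^j
    }
  }
  do.call(cbind, cols)
}

D <- cbind(c(0.2, 0.5), c(0.4, 0.1))
ncol(poly_extend(D, 1))   # 2 features
ncol(poly_extend(D, 2))   # 5 features
ncol(poly_extend(D, 3))   # 9 features
```

A linear rule in the extended space corresponds to a polynomial rule in the original depth space, which is why the degree is the quantity being cross-validated.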

polynomial

Trains the polynomial DD-classifier (Li, Cuesta-Albertos and Liu, 2012). The DD-classifier constructs a polynomial separating rule in the depth space; the degree of the polynomial is determined via cross-validation (in the depth space).

The additional parameters:

max.degree

Maximum of the range of degrees of the polynomial over which the separator is to be cross-validated; can be in [1:10], the default value is 3.

num.chunks

Number of chunks to split data into when cross-validating the separator; should be >0, and smaller than the total number of points in the two smallest classes when aggregation.method = "majority" and smaller than the total number of points in the training sample when aggregation.method = "sequent". The default value is 10.
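Splitting a sample into num.chunks parts for cross-validation can be sketched in base R as below. How the package assigns points to chunks is not described here, so the random, nearly balanced assignment is an assumption, and make_chunks is a hypothetical helper.

```r
# Assign n observations to num.chunks folds of (nearly) equal size.
make_chunks <- function(n, num.chunks = 10) {
  stopifnot(num.chunks > 0, num.chunks < n)
  sample(rep_len(seq_len(num.chunks), n))
}

set.seed(3)
fold <- make_chunks(95, num.chunks = 10)
table(fold)

# Indices held out in the first cross-validation round.
held_out <- which(fold == 1)
```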

knnlm

Trains the k-nearest neighbours classifier in the depth space.

The additional parameters:

knnrange

The maximal number of neighbours for kNN separation. The value is bounded by 2 and n/2.

NULL for the default value 10*(n^{1/q})+1, where n is the number of objects, q is the number of classes.

"MAX" for the maximum value n/2

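The default for knnrange can be worked out with a few lines of base R. The formula and the bounds are taken from the description above; rounding down to an integer is an assumption, and knnrange_default is a hypothetical helper.

```r
# Default 10 * n^(1/q) + 1, clamped to the interval [2, n/2].
knnrange_default <- function(n, q) {
  k <- 10 * n^(1 / q) + 1
  floor(min(max(k, 2), n / 2))
}

knnrange_default(200, 2)    # capped at n/2 = 100
knnrange_default(1000, 2)   # 317
knnrange_default(150, 3)    # 54
```
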
maxD

The maximum depth separator classifies an object to the class that provides it the largest depth value.
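The maximum depth rule amounts to a row-wise arg max over the depth space, as in this base R sketch. Resolving ties in favour of the first class is an assumption made for the example, and maxD_classify is a hypothetical helper.

```r
# Assign each object (row) to the class (column) with the largest depth.
maxD_classify <- function(dd, labels = seq_len(ncol(dd))) {
  labels[max.col(dd, ties.method = "first")]
}

dd <- rbind(c(0.30, 0.10, 0.05),
            c(0.00, 0.20, 0.25),
            c(0.10, 0.10, 0.02))
maxD_classify(dd, labels = c("A", "B", "C"))   # "A" "C" "A"
```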

Outsider treatment

An outsider treatment is a supplementary classifier for data that lie outside the convex hulls of all q training classes. Available methods are:

Linear Discriminant Analysis ("LDA"), see lda.

k-Nearest-Neighbor Classifier ("kNN"), see knn, knn.cv.

Affine-Invariant kNN ("kNNAff"), an affine-invariant version of the kNN, suited only for binary classification (some aggregation is used with multiple classes) and not accounting for ties at all, but very fast as a result.

Maximum Mahalanobis Depth Classifier ("depth.Mahalanobis"): the outsider is assigned to the class w.r.t. which it has the highest depth value scaled by (approximated) priors.

Proportional Randomization ("RandProp"): the outsider is assigned to a class randomly with probability equal to its (approximated) prior.

Equal Randomization ("RandEqual"): the outsider is assigned to a class randomly, with equal chances for each class.

Ignoring ("Ignore"): the outsider is not classified; the string "Ignored" is returned instead.
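The two randomization treatments can be sketched in base R. This is an illustration of the idea only, not the package's code; rand_treatment is a hypothetical helper, and the priors are approximated by class proportions as described above.

```r
# Assign n outsiders to classes at random; equal chances if priors is NULL.
rand_treatment <- function(n, classes, priors = NULL) {
  if (is.null(priors)) priors <- rep(1 / length(classes), length(classes))
  sample(classes, size = n, replace = TRUE, prob = priors)
}

set.seed(4)
labels <- rep(c(1, 2), times = c(150, 50))
priors <- as.numeric(table(labels)) / length(labels)   # 0.75 and 0.25

prop  <- rand_treatment(10000, c(1, 2), priors)   # "RandProp"-style
equal <- rand_treatment(10000, c(1, 2))           # "RandEqual"-style
mean(prop == 1)    # close to 0.75
mean(equal == 1)   # close to 0.5
```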

An outsider treatment is specified by a list containing a name and parameters:

name is a character string, name of the outsider treatment to be freely specified; should be unique; is obligatory.

method is a character string, name of the method to use, can be "LDA", "kNN", "kNNAff", "depth.Mahalanobis", "RandProp", "RandEqual" and "Ignore"; is obligatory.

priors is a numerical vector specifying prior probabilities of classes; class portions in the training sample are used by default. priors is used in methods "LDA", "depth.Mahalanobis" and "RandProp".

knn.k is the number of the nearest neighbors taken into account; can be between 1 and the number of points in the training sample. Set to -1 (the default) to be determined by the leave-one-out cross-validation. knn.k is used in method "kNN".

knn.range is the upper bound on the range over which the leave-one-out cross-validation is performed (the lower bound is 1); can be between 2 and the number of points in the training sample -1. Set to -1 (the default) to be calculated automatically accounting for number of points and dimension. knn.range is used in method "kNN".

knnAff.methodAggregation is a character string specifying the aggregation technique for method "kNNAff"; works in the same way as the function argument aggregation.method. knnAff.methodAggregation is used in method "kNNAff".

knnAff.k is the number of the nearest neighbors taken into account; should be at least 1 and up to the number of points in the training sample when knnAff.methodAggregation = "sequent", and up to the total number of points in the training sample when knnAff.methodAggregation = "majority". Set to -1 (the default) to be determined by the leave-one-out cross-validation. knnAff.k is used in method "kNNAff".

knnAff.range is the upper bound on the range over which the leave-one-out cross-validation is performed (the lower bound is 1); should be >1 and smaller than the total number of points in the two smallest classes when knnAff.methodAggregation = "majority", and >1 and smaller than the total number of points in the training sample when knnAff.methodAggregation = "sequent". Set to -1 to be calculated automatically accounting for number of points and dimension. knnAff.range is used in method "kNNAff".

mah.estimate is a character string specifying which estimates to use when calculating the Mahalanobis depth; can be "moment" or "MCD", determining whether traditional moment or Minimum Covariance Determinant (MCD) (see covMcd) estimates for mean and covariance are used. mah.estimate is used in method "depth.Mahalanobis".

mcd.alpha is the value of the argument alpha for the function covMcd; is used in method "depth.Mahalanobis" when mah.estimate = "MCD".
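A list of treatments in this format can be assembled in plain R, using only the fields described above. The treatment names are free-form examples. The last lines sketch the rule that entries with repeated names are ignored; keeping the first occurrence is an assumption made here.

```r
treatments <- list(
  list(name = "knn5",   method = "kNN", knn.k = 5),
  list(name = "knnCV",  method = "kNN", knn.k = -1, knn.range = 20),
  list(name = "mahMCD", method = "depth.Mahalanobis",
       mah.estimate = "MCD", mcd.alpha = 0.75, priors = c(1, 1) / 2),
  list(name = "knn5",   method = "LDA")   # repeated name
)

# Drop entries whose name has already been used.
nms  <- vapply(treatments, function(tr) tr$name, character(1))
kept <- treatments[!duplicated(nms)]
vapply(kept, function(tr) tr$name, character(1))
```

Such a list would be passed as outsider.settings, and a treatment is later selected by its name when classifying.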

Value

Trained DDα-classifier containing, among others, the following informative fields:

num.points

Total number of points in the training sample.

dimension

Dimension of the original space.

depth

Character string determining which depth notion to use.

methodAggregation

Character string determining which method to apply to aggregate binary classification results.

num.chunks

Number of chunks data has been split into when cross-validating the α-procedure.

num.directions

Number of directions used for approximating the Tukey depth (when it is used).

use.convex

Logical variable indicating whether outsiders should be determined exactly when classifying.

max.degree

Maximum of the range of degrees of the polynomial depth space extension over which the α-procedure has been cross-validated.

patterns

Classes of the training sample.

num.classifiers

Number of binary classifiers trained.

outsider.methods

Treatments to be used to classify outsiders.

References

Dyckerhoff, R., Koshevoy, G., and Mosler, K. (1996). Zonoid data depth: theory and computation. In: Prat A. (ed), COMPSTAT 1996. Proceedings in computational statistics, Physica-Verlag (Heidelberg), 235–240.

Lange, T., Mosler, K., and Mozharovskyi, P. (2014). Fast nonparametric classification based on data depth. Statistical Papers 55 49–69.

Li, J., Cuesta-Albertos, J.A., and Liu, R.Y. (2012). DD-classifier: Nonparametric classification procedure based on DD-plot. Journal of the American Statistical Association 107 737–753.

Mozharovskyi, P. (2015). Contributions to Depth-based Classification and Computation of the Tukey Depth. Verlag Dr. Kovac (Hamburg).

Mozharovskyi, P., Mosler, K., and Lange, T. (2015). Classifying real-world data with the DDα-procedure. Advances in Data Analysis and Classification 9 287–314.

Vasil'ev, V.I. (2003). The reduction principle in problems of revealing regularities I. Cybernetics and Systems Analysis 39 686–694.

See Also

ddalpha.classify for classification using DD-classifier, depth. for calculation of depths, depth.space. for calculation of depth spaces, is.in.convex to check whether a point is not an outsider.

Examples

library(MASS)     # provides mvrnorm
library(ddalpha)  # provides ddalpha.train and ddalpha.classify

# Generate a bivariate normal location-shift classification task
# containing 200 training objects and 200 to test with
class1 <- mvrnorm(200, c(0,0), 
                  matrix(c(1,1,1,4), nrow = 2, ncol = 2, byrow = TRUE))
class2 <- mvrnorm(200, c(2,2), 
                  matrix(c(1,1,1,4), nrow = 2, ncol = 2, byrow = TRUE))
trainIndices <- c(1:100)
testIndices <- c(101:200)
propertyVars <- c(1:2)
classVar <- 3
trainData <- rbind(cbind(class1[trainIndices,], rep(1, 100)), 
                   cbind(class2[trainIndices,], rep(2, 100)))
testData <- rbind(cbind(class1[testIndices,], rep(1, 100)), 
                  cbind(class2[testIndices,], rep(2, 100)))
data <- list(train = trainData, test = testData)

# Train 1st DDalpha-classifier (default settings) 
# and get the classification error rate
ddalpha1 <- ddalpha.train(data$train)
classes1 <- ddalpha.classify(ddalpha1, data$test[,propertyVars])
cat("1. Classification error rate (defaults): ", 
    sum(unlist(classes1) != data$test[,classVar])/200, ".\n", sep = "")

# Train 2nd DDalpha-classifier (zonoid depth, maximum Mahalanobis 
# depth classifier with defaults as outsider treatment) 
# and get the classification error rate
ddalpha2 <- ddalpha.train(data$train, depth = "zonoid", 
                          outsider.methods = "depth.Mahalanobis")
classes2 <- ddalpha.classify(ddalpha2, data$test[,propertyVars], 
                               outsider.method = "depth.Mahalanobis")
cat("2. Classification error rate (depth.Mahalanobis): ", 
    sum(unlist(classes2) != data$test[,classVar])/200, ".\n", sep = "")

# Train 3rd DDalpha-classifier (100 random directions for the Tukey depth, 
# adjusted maximum Mahalanobis depth classifier 
# and equal randomization as outsider treatments) 
# and get the classification error rates
treatments <- list(list(name = "mahd1", method = "depth.Mahalanobis", 
                        mah.estimate = "MCD", mcd.alpha = 0.75, priors = c(1, 1)/2), 
                   list(name = "rand1", method = "RandEqual"))
ddalpha3 <- ddalpha.train(data$train, outsider.settings = treatments, 
                          num.directions = 100)
classes31 <- ddalpha.classify(ddalpha3, data$test[,propertyVars], 
                              outsider.method = "mahd1")
classes32 <- ddalpha.classify(ddalpha3, data$test[,propertyVars], 
                              outsider.method = "rand1")
cat("3. Classification error rate (by treatments):\n")
cat("   Error (mahd1): ", 
    sum(unlist(classes31) != data$test[,classVar])/200, ".\n", sep = "")
cat("   Error (rand1): ", 
    sum(unlist(classes32) != data$test[,classVar])/200, ".\n", sep = "")
    
# Train using a formula with transformed terms
ddalpha = ddalpha.train(
    I(mpg >= 19.2) ~ log(disp) + I(disp^2) + disp + I(disp * drat),
    data = mtcars, subset = (carb!=1), 
    depth = "Mahalanobis", separator = "alpha")
print(ddalpha) # make sure that the resulting table is what you wanted
CC = ddalpha.classify(ddalpha, mtcars)
sum((mtcars$mpg>=19.2)!= unlist(CC))/nrow(mtcars) # error rate
    
# Use the pre-calculated DD-plot
data = cbind(rbind(mvrnorm(n = 50, mu = c(0,0), Sigma = diag(2)),
                   mvrnorm(n = 50, mu = c(5,10), Sigma = diag(2)),
                   mvrnorm(n = 50, mu = c(10,0), Sigma = diag(2))),
             rep(c(1,2,3), each = 50))
plot(data[,1:2], col = (data[,3]+1))

ddplot = depth.space.Mahalanobis(data = data[,1:2], cardinalities = c(50,50,50))
ddplot = cbind(ddplot, data[,3])
ddalphaD = ddalpha.train(data = ddplot, depth = "ddplot", separator = "alpha")
# 'pred' is used instead of 'c' to avoid masking base::c
pred = ddalpha.classify(ddalphaD, ddplot[,1:3])
errors = sum(unlist(pred) != data[,3])/nrow(data)
print(paste("Error rate: ",errors))

ddalpha = ddalpha.train(data = data, depth = "Mahalanobis", separator = "alpha")
pred = ddalpha.classify(ddalpha, data[,1:2])
errors = sum(unlist(pred) != data[,3])/nrow(data)
print(paste("Error rate: ",errors))

ddalpha documentation built on March 23, 2022, 9:07 a.m.