ddalpha-package: Depth-Based Classification and Calculation of Data Depth

ddalpha-package

R Documentation

Depth-Based Classification and Calculation of Data Depth

Description

The package provides procedures for calculating the depth of points with respect to an empirical distribution, for many notions of data depth. It also implements depth-based classification of multivariate and functional data.

The package implements the DDα-classifier (Lange, Mosler and Mozharovskyi, 2014), a nonparametric procedure for supervised classification with q ≥ 2 classes. In the training step, the sample is first mapped into the q-dimensional unit cube of depth vectors (the depth space); a linear separation rule is then constructed in a polynomial extension of this space with the α-procedure. The classification step offers alternative treatments of 'outsiders', i.e. points that cannot be classified by their depth values.
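The depth transform described above can be sketched in base R. This is an illustration only, not the package's implementation: it assumes the Mahalanobis depth with moment estimates, uses a hypothetical helper mah_depth and simulated data, and replaces the α-procedure with the simpler maximum-depth rule.

```r
# Base-R sketch of the depth transform (not the package's implementation).
# Assumption: Mahalanobis depth with moment estimates,
#   D(x | X) = 1 / (1 + (x - mean)' S^{-1} (x - mean)).
mah_depth <- function(x, sample) {
  x <- matrix(x, ncol = ncol(sample))
  # mahalanobis() returns squared distances
  1 / (1 + mahalanobis(x, colMeans(sample), cov(sample)))
}

set.seed(1)
# Two hypothetical bivariate classes of 100 points each
class1 <- matrix(rnorm(200), ncol = 2)
class2 <- matrix(rnorm(200, mean = 3), ncol = 2)
train  <- rbind(class1, class2)
labels <- rep(1:2, each = 100)

# Depth space: each row is (depth w.r.t. class 1, depth w.r.t. class 2),
# i.e. a point of the unit square, since q = 2
dspace <- cbind(mah_depth(train, class1), mah_depth(train, class2))

# Simplest rule in the depth space: assign to the class of maximal depth.
# The DDalpha-classifier instead fits a separating rule in this space.
predicted <- max.col(dspace, ties.method = "first")
cat("Training error of the max-depth rule:", mean(predicted != labels), "\n")
```

Plotting the two columns of dspace against each other, coloured by labels, gives the DD-plot on which the separation rule operates.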

Details

Package:	ddalpha
Type:	Package
Version:	1.3.16
Date:	2024-09-29
License:	GPL-2

Use ddalpha.train to train the DDα-classifier and ddalpha.classify to classify with it. Load sample classification problems using getdata. The package contains 50 classification problems built from 33 real data sets.

The implemented multivariate depths are listed in topic depth.; for functional depths, see topic depthf. The depth representations of multivariate data are obtained with the depth.space. family of functions. Functions depth.contours and depth.contours.ddalpha build depth contours, and depth.graph builds depth graphs for two-dimensional data. Function draw.ddplot draws the DD-plot of an existing DD-classifier or of a pre-calculated depth space.

The package supports user-defined depths and classifiers; see topic Custom Methods. A pre-calculated DD-plot may also be used as data; see topic ddalpha.train.

is.in.convex indicates whether an object is not an 'outsider', i.e. whether it can be classified by its depth values. Outsiders can alternatively be classified by LDA, kNN or maximum Mahalanobis depth, or assigned to a class at random.
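The notion of an outsider can be illustrated for two-dimensional data in base R. The sketch below assumes that an outsider is a point lying outside the convex hull of every training class; in_hull is a hypothetical helper written for this illustration, not a package function, and the data are simulated.

```r
# Base-R sketch for two-dimensional data (not the package's implementation).
# in_hull() is a hypothetical helper: TRUE if point p lies in the convex
# hull of the rows of `sample`.
in_hull <- function(p, sample) {
  hull <- sample[chull(sample), , drop = FALSE]      # hull vertices, in order
  nxt  <- hull[c(2:nrow(hull), 1), , drop = FALSE]   # successor of each vertex
  # Cross product of each edge with the vector from the edge start to p;
  # p is inside (or on the border) iff it lies on the same side of every edge
  cross <- (nxt[, 1] - hull[, 1]) * (p[2] - hull[, 2]) -
           (nxt[, 2] - hull[, 2]) * (p[1] - hull[, 1])
  all(cross >= 0) || all(cross <= 0)
}

set.seed(2)
# Two hypothetical training classes and three objects to check
class1 <- matrix(rnorm(200), ncol = 2)
class2 <- matrix(rnorm(200, mean = 2), ncol = 2)
test   <- rbind(c(0, 0), c(2, 2), c(10, -10))

# One column per class: is the object inside that class's convex hull?
inside <- cbind(apply(test, 1, in_hull, sample = class1),
                apply(test, 1, in_hull, sample = class2))

# An outsider belongs to no class hull
outsider <- rowSums(inside) == 0
cat(sum(outsider), "outsider(s) among", nrow(test), "objects\n")
```

Only the third object, which lies far from both classes, is flagged here; objects of this kind are the ones that need one of the alternative treatments listed above.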

Use compclassf.train and ddalphaf.train to train the functional DD-classifiers, and compclassf.classify and ddalphaf.classify to classify with them. Load sample functional classification problems with dataf.*. The package contains 4 functional data sets and 2 data set generators. Functional data are visualized with plot.functional.

Author(s)

Oleksii Pokotylo, <alexey.pokotylo@gmail.com>

Pavlo Mozharovskyi, <pavlo.mozharovskyi@telecom-paris.fr>

Rainer Dyckerhoff, <rainer.dyckerhoff@statistik.uni-koeln.de>

Stanislav Nagy, <nagy@karlin.mff.cuni.cz>

References

Pokotylo, O., Mozharovskyi, P., Dyckerhoff, R. (2019). Depth and depth-based classification with R-package ddalpha. Journal of Statistical Software 91 1–46.

Lange, T., Mosler, K., and Mozharovskyi, P. (2014). Fast nonparametric classification based on data depth. Statistical Papers 55 49–69.

Lange, T., Mosler, K., and Mozharovskyi, P. (2014). DDα-classification of asymmetric and fat-tailed data. In: Spiliopoulou, M., Schmidt-Thieme, L., Janning, R. (eds), Data Analysis, Machine Learning and Knowledge Discovery, Springer (Berlin), 71–78.

Mosler, K. and Mozharovskyi, P. (2017). Fast DD-classification of functional data. Statistical Papers 58 1055–1089.

Mozharovskyi, P. (2015). Contributions to Depth-based Classification and Computation of the Tukey Depth. Verlag Dr. Kovac (Hamburg).

Mozharovskyi, P., Mosler, K., and Lange, T. (2015). Classifying real-world data with the DDα-procedure. Advances in Data Analysis and Classification 9 287–314.

Nagy, S., Gijbels, I. and Hlubinka, D. (2017). Depth-based recognition of shape outlying functions. Journal of Computational and Graphical Statistics. To appear.

Examples

library(MASS)     # provides mvrnorm
library(ddalpha)

# Generate a bivariate normal location-shift classification task
# containing 200 training objects and 200 to test with (100 per class each)
class1 <- mvrnorm(200, c(0,0), 
                  matrix(c(1,1,1,4), nrow = 2, ncol = 2, byrow = TRUE))
class2 <- mvrnorm(200, c(2,2), 
                  matrix(c(1,1,1,4), nrow = 2, ncol = 2, byrow = TRUE))
trainIndices <- c(1:100)
testIndices <- c(101:200)
propertyVars <- c(1:2)
classVar <- 3
trainData <- rbind(cbind(class1[trainIndices,], rep(1, 100)), 
                   cbind(class2[trainIndices,], rep(2, 100)))
testData <- rbind(cbind(class1[testIndices,], rep(1, 100)), 
                  cbind(class2[testIndices,], rep(2, 100)))
data <- list(train = trainData, test = testData)

# Train the DDalpha-classifier
ddalpha <- ddalpha.train(data$train)

# Classify by means of DDalpha-classifier
classes <- ddalpha.classify(ddalpha, data$test[,propertyVars])
cat("Classification error rate:", 
    sum(unlist(classes) != data$test[,classVar])/200, "\n")

# Calculate the zonoid depth of the first 10 testing objects w.r.t. the 1st class
depths.zonoid <- depth.zonoid(data$test[1:10,propertyVars], 
                              data$train[trainIndices,propertyVars])
cat("Zonoid depths:", depths.zonoid, "\n")

# Calculate the random Tukey depth of the first 10 testing objects w.r.t. the 1st class
depths.halfspace <- depth.halfspace(data$test[1:10,propertyVars], 
                                    data$train[trainIndices,propertyVars])
cat("Random Tukey depths:", depths.halfspace, "\n")

# Calculate depth space with zonoid depth
dspace.zonoid <- depth.space.zonoid(data$train[,propertyVars], c(100, 100))

# Calculate depth space with the exact Tukey depth
dspace.halfspace <- depth.space.halfspace(data$train[,propertyVars], c(100, 100), exact = TRUE)

# Count outsiders, i.e. testing objects outside the convex hull of every class
numOutsiders <- sum(rowSums(is.in.convex(data$test[,propertyVars], 
                                 data$train[,propertyVars], c(100, 100))) == 0)
cat(numOutsiders, "outsiders found in the testing sample.\n")

ddalpha documentation built on Oct. 1, 2024, 1:07 a.m.