vcr.knn.newdata    R Documentation

Carry out a k-nearest neighbor classification on new data, and prepare to visualize its results.

Description

Predicts class labels for new data by k nearest neighbors, using the output of vcr.knn.train on the training data. For cases in the new data whose given label ynew is not NA, additional output is produced for constructing graphical displays such as the classmap.

Usage

vcr.knn.newdata(Xnew, ynew = NULL, vcr.knn.train.out, LOO = FALSE)

Arguments

Xnew

If the training data was a matrix of coordinates, Xnew must be such a matrix with the same number of columns. If the training data was a set of dissimilarities, Xnew must be a rectangular matrix of dissimilarities, with each row containing the dissimilarities of a new case to all training cases. Missing values are not allowed.

ynew

factor with class membership of each new case. Can be NA for some or all cases. If NULL, it is assumed to be NA for all cases.

vcr.knn.train.out

output of vcr.knn.train on the training data.

LOO

leave one out. Only used when testing this function on a subset of the training data. Default is LOO=FALSE.
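
As a rough illustration of the rectangular dissimilarity layout described for Xnew, the base R sketch below computes distances from two new cases to five training cases. The toy data and the names Ztrain, Znew and Dnew are assumptions made for this illustration only, and Euclidean distance is just one possible dissimilarity; the dissimilarity should be the same one used for the training data.

```r
# Toy coordinates (hypothetical): 5 training cases and 2 new cases in 3 dimensions.
set.seed(1)
Ztrain <- matrix(rnorm(15), nrow = 5, ncol = 3)
Znew   <- matrix(rnorm(6),  nrow = 2, ncol = 3)

# Rectangular dissimilarity matrix: one row per new case,
# one column per training case.
Dnew <- matrix(NA_real_, nrow = nrow(Znew), ncol = nrow(Ztrain))
for (i in seq_len(nrow(Znew))) {
  for (j in seq_len(nrow(Ztrain))) {
    # Euclidean distance between new case i and training case j.
    Dnew[i, j] <- sqrt(sum((Znew[i, ] - Ztrain[j, ])^2))
  }
}

dim(Dnew)    # rows = new cases, columns = training cases
anyNA(Dnew)  # missing values are not allowed in Xnew
```

A matrix shaped like Dnew is what would be supplied as Xnew when the training data was given as dissimilarities.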

Value

A list with components:

yintnew

number of the given class of each case. Can contain NA's.

ynew

given class label of each case. Can contain NA's.

levels

levels of the response, from vcr.knn.train.out.

predint

predicted class number of each case. Always exists.

pred

predicted label of each case.

altint

number of the alternative class. Among the classes different from the given class, it is the one with the highest posterior probability. Is NA for cases whose ynew is missing.

altlab

label of the alternative class. Is NA for cases whose ynew is missing.

PAC

probability of the alternative class. Is NA for cases whose ynew is missing.

fig

distance of each case i from each class g. Always exists.

farness

farness of each case from its given class. Is NA for cases whose ynew is missing.

ofarness

for each case i, its lowest fig[i,g] to any class g. Always exists.

k

the requested number of nearest neighbors, from vcr.knn.train.out.

ktrues

for each case this contains the actual number of elements in its neighborhood. This can be higher than k due to ties.

counts

a matrix with 3 columns, each row representing a case. For the neighborhood of each case it says how many members it has from the given class, the predicted class, and the alternative class. The first and third entries are NA for cases whose ynew is missing.
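
The relation between fig, farness and ofarness can be sketched in base R. This is only an illustration on a made-up fig matrix, not the package's internal code. It assumes that farness is the entry of fig in the column of the given class, which is how the descriptions above read. The values and the object names below are hypothetical.

```r
# Hypothetical fig matrix: 3 cases (rows) by 3 classes (columns).
fig <- matrix(c(0.10, 0.80, 0.95,
                0.70, 0.20, 0.90,
                0.60, 0.85, 0.30), nrow = 3, byrow = TRUE)

# Given class numbers; the second case has no given label.
yintnew <- c(1L, NA, 2L)

# ofarness: lowest fig[i, g] over all classes g, defined for every case.
ofarness <- apply(fig, 1, min)

# farness (assumed definition): fig[i, g] for the given class g of case i.
# Matrix indexing returns NA for rows whose class number is NA.
farness <- fig[cbind(seq_len(nrow(fig)), yintnew)]

ofarness
farness
```

In this toy example the third case lies closer to class 3 than to its given class 2, so its ofarness is lower than its farness.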

Author(s)

Raymaekers J., Rousseeuw P.J.

References

Raymaekers J., Rousseeuw P.J., Hubert M. (2021). Class maps for visualizing classification results. Technometrics, appeared online. doi:10.1080/00401706.2021.1927849 (link to open access pdf)

See Also

vcr.knn.train, classmap, silplot, stackedplot

Examples

data("data_floralbuds")
X <- data_floralbuds[, 1:6]; y <- data_floralbuds[, 7]
set.seed(12345); trainset <- sample(1:550, 275)
vcr.train <- vcr.knn.train(X[trainset, ], y[trainset], k = 5)
vcr.test <- vcr.knn.newdata(X[-trainset, ], y[-trainset], vcr.train)
confmat.vcr(vcr.train) # for comparison
confmat.vcr(vcr.test)
cols <- c("saddlebrown", "orange", "olivedrab4", "royalblue3")
stackedplot(vcr.train, classCols = cols) # for comparison
stackedplot(vcr.test, classCols = cols)
classmap(vcr.train, "bud", classCols = cols) # for comparison
classmap(vcr.test, "bud", classCols = cols)

# For more examples, we refer to the vignette:
## Not run: 
vignette("K_nearest_neighbors_examples")

## End(Not run)
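
The following sketch shows the case of partially missing labels mentioned in the Description. It reuses the floral buds split from the examples above and sets the first ten given labels to NA. The names ynew.part and vcr.part are introduced here for illustration, and the code assumes the classmap package is installed.

```r
library(classmap)  # assumption: the classmap package is installed

data("data_floralbuds")
X <- data_floralbuds[, 1:6]; y <- data_floralbuds[, 7]
set.seed(12345); trainset <- sample(1:550, 275)
vcr.train <- vcr.knn.train(X[trainset, ], y[trainset], k = 5)

# Hide the given label of the first ten new cases.
ynew.part <- y[-trainset]
ynew.part[1:10] <- NA
vcr.part <- vcr.knn.newdata(X[-trainset, ], ynew.part, vcr.train)

vcr.part$pred[1:10]      # predictions are documented to exist for all cases
vcr.part$PAC[1:10]       # documented as NA for cases whose ynew is missing
vcr.part$counts[1:10, ]  # first and third columns documented as NA here
```

Cases without a given label still receive a prediction, but they carry no PAC or farness and so cannot be placed in a class map.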

classmap documentation built on April 23, 2023, 5:09 p.m.