kknn (R Documentation)
Performs k-nearest neighbor classification of a test set using a training set. For each row of the test set, the k nearest training set vectors (according to Minkowski distance) are found, and the classification is done via the maximum of summed kernel densities. Ordinal and continuous variables can be predicted as well.
kknn(
formula = formula(train),
train,
test,
na.action = na.omit(),
k = 7,
distance = 2,
kernel = "optimal",
ykernel = NULL,
scale = TRUE,
contrasts = c(unordered = "contr.dummy", ordered = "contr.ordinal")
)
kknn.dist(learn, valid, k = 10, distance = 2)
## S3 method for class 'kknn'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'kknn'
summary(object, ...)
## S3 method for class 'kknn'
predict(object, type = c("raw", "prob"), ...)
formula: A formula object.
train: Matrix or data frame of training set cases.
test: Matrix or data frame of test set cases.
na.action: A function which indicates what should happen when the data contain NAs.
k: Number of neighbors considered.
distance: Parameter of Minkowski distance.
kernel: Kernel to use. Possible choices are "rectangular" (which is standard unweighted knn), "triangular", "epanechnikov" (or beta(2,2)), "biweight" (or beta(3,3)), "triweight" (or beta(4,4)), "cos", "inv", "gaussian", "rank" and "optimal".
ykernel: Window width of a y-kernel, especially for prediction of ordinal classes.
scale: Logical; if TRUE, variables are scaled to have equal standard deviation.
contrasts: A vector containing the 'unordered' and 'ordered' contrasts to use.
learn: Matrix or data frame of training set cases.
valid: Matrix or data frame of test set cases.
x: An object used to select a method.
digits: Minimal number of significant digits.
...: Further arguments passed to or from other methods.
object: A model object for which prediction is desired.
type: Defines the output: 'raw' returns the estimates, 'prob' returns a matrix containing the proportions of each class.
This nearest neighbor method extends knn in several directions. First, it can be used not only for classification, but also for regression and ordinal classification. Second, it uses kernel functions to weight the neighbors according to their distances. In fact, not only kernel functions but every monotonically decreasing function f(x), x > 0, will work fine.
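The weighting idea can be illustrated in a few lines of base R (an illustrative sketch, not package code; the standardization of distances by the (k+1)-th neighbor follows Hechenbichler and Schliep (2004)):

```r
## Sketch of kernel weighting: standardize neighbour distances by the
## distance to the (k+1)-th neighbour, then apply a decreasing function.
k <- 7
d <- sort(runif(k + 1))             # toy sorted neighbour distances
d_std <- d[1:k] / d[k + 1]          # standardized to [0, 1)
w_triangular <- pmax(1 - d_std, 0)  # triangular kernel weights
w_rectangular <- rep(1, k)          # rectangular kernel = unweighted knn
```

Any monotonically decreasing function of the standardized distance yields admissible weights; the rectangular kernel recovers ordinary unweighted knn.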
The number of neighbours used for the "optimal" kernel should be [(2(d+4)/(d+2))^(d/(d+4)) k], where k is the number that would be used for unweighted knn classification (i.e. kernel = "rectangular") and d is the number of dimensions. This factor (2(d+4)/(d+2))^(d/(d+4)) is between 1.2 and 2 (see Samworth (2012) for more details).
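The factor is easy to compute directly; the helper below is a small illustration, not part of the package:

```r
## Multiplicative adjustment of k for the "optimal" kernel
## as a function of the data dimension d (Samworth, 2012).
optimal_k_factor <- function(d) (2 * (d + 4) / (d + 2))^(d / (d + 4))

optimal_k_factor(1)  # ~1.27 for a single predictor
optimal_k_factor(4)  # ~1.63 for, e.g., the four iris measurements
```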
kknn returns a list object of class kknn including the components:
fitted.values: Vector of predictions.
CL: Matrix of classes of the k nearest neighbors.
W: Matrix of weights of the k nearest neighbors.
D: Matrix of distances of the k nearest neighbors.
C: Matrix of indices of the k nearest neighbors.
prob: Matrix of predicted class probabilities.
response: Type of response variable, one of continuous, nominal or ordinal.
distance: Parameter of Minkowski distance.
call: The matched call.
terms: The 'terms' object used.
Klaus P. Schliep <klaus.schliep@gmail.com> and Klaus Hechenbichler
Hechenbichler K. and Schliep K.P. (2004) Weighted k-Nearest-Neighbor Techniques and Ordinal Classification, Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich. doi:10.5282/ubm/epub.1769
Hechenbichler K. (2005) Ensemble-Techniken und ordinale Klassifikation, PhD thesis.
Samworth, R.J. (2012) Optimal weighted nearest neighbour classifiers. Annals of Statistics, 40, 2733-2763. (Available from http://www.statslab.cam.ac.uk/~rjs57/Research.html)
train.kknn
library(kknn)
data(iris)
m <- dim(iris)[1]
val <- sample(1:m, size = round(m/3), replace = FALSE,
prob = rep(1/m, m))
iris.learn <- iris[-val,]
iris.valid <- iris[val,]
iris.kknn <- kknn(Species~., iris.learn, iris.valid, distance = 1,
kernel = "triangular")
summary(iris.kknn)
fit <- fitted(iris.kknn)
table(iris.valid$Species, fit)
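Continuing from the iris fit above, the type argument of predict returns class-membership proportions instead of the raw class predictions (a small illustrative addition; this is the same matrix stored in the prob component):

```r
## Class-membership proportions for the validation rows;
## one column per class, rows summing to one.
prob <- predict(iris.kknn, type = "prob")
head(prob, 3)
```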
pcol <- as.character(as.numeric(iris.valid$Species))
pairs(iris.valid[1:4], pch = pcol, col = c("green3", "red")
[(iris.valid$Species != fit)+1])
data(ionosphere)
ionosphere.learn <- ionosphere[1:200,]
ionosphere.valid <- ionosphere[-c(1:200),]
fit.kknn <- kknn(class ~ ., ionosphere.learn, ionosphere.valid)
table(ionosphere.valid$class, fit.kknn$fit)
(fit.train1 <- train.kknn(class ~ ., ionosphere.learn, kmax = 15,
kernel = c("triangular", "rectangular", "epanechnikov", "optimal"), distance = 1))
table(predict(fit.train1, ionosphere.valid), ionosphere.valid$class)
(fit.train2 <- train.kknn(class ~ ., ionosphere.learn, kmax = 15,
kernel = c("triangular", "rectangular", "epanechnikov", "optimal"), distance = 2))
table(predict(fit.train2, ionosphere.valid), ionosphere.valid$class)
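kknn.dist, shown in the usage section above but not exercised here, computes the nearest neighbours directly from two data sets. A minimal sketch reusing the iris split from above; the returned object bundles neighbour indices and distances, and its exact layout is best inspected with str() in the installed version:

```r
## Nearest neighbours of each validation row among the training rows,
## computed on the four numeric iris measurements only.
iris.dist <- kknn.dist(iris.learn[, 1:4], iris.valid[, 1:4],
                       k = 10, distance = 2)
str(iris.dist)
```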