Description

K-Nearest Neighbor prediction method which uses the distances calculated by knn.dist.
Usage

knn.predict(train, test, y, dist.matrix, k = 1,
            agg.meth = if (is.factor(y)) "majority" else "mean",
            ties.meth = "min")
Arguments

train        indexes which specify the rows of x provided to knn.dist to be used as the training cases
test         indexes which specify the rows of x provided to knn.dist to be used as the test cases
y            responses; see Details below
dist.matrix  the output from a call to knn.dist
k            the number of nearest neighbors to consider
agg.meth     method used to combine the responses of the nearest neighbors; defaults to "majority" for classification (factor y) and "mean" for continuous responses
ties.meth    method used to handle ties for the kth neighbor. The default "min" uses all ties; "max" uses none if there are ties for the k-th nearest neighbor; "random" selects among the ties randomly; "first" uses the ties in their order in the data
Details

Predictions are calculated for each test case by aggregating the responses of its k nearest neighbors among the training cases. k may be specified to be any positive integer less than the number of training cases, but is generally between 1 and 10.

The indexes for the training and test cases refer to the order of the entire data set as it was passed to knn.dist.

Only the responses of the training cases are used. The responses provided in y may be those for the entire data set (training and test cases), or just for the training cases.

The aggregation may be any named function. By default, classification (factor responses) will use the "majority" class function and non-factor responses will use "mean". Other options to consider include "min", "max" and "median".
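As a sketch of what the aggregation step amounts to (the values and names below are illustrative, not knnflex internals): the responses of the k nearest training cases are collected and passed to the named function.

```r
# Responses of a test case's 3 nearest training neighbors (made-up values).
neighbor_y <- c(2.0, 3.5, 3.0)

# Continuous response: any named summary function can aggregate them.
do.call("mean", list(neighbor_y))    # 2.833...
do.call("median", list(neighbor_y))  # 3

# Factor response: a simple majority vote (knnflex supplies its own
# "majority" function; this one-liner just illustrates the idea).
votes <- factor(c("s", "c", "s"))
names(which.max(table(votes)))       # "s"
```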
Ties are handled using the rank function; further information may be found under its ties.method argument.
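The effect of the ties.meth options can be seen directly with base R's rank function (a sketch; d is a made-up distance vector):

```r
# Distances from one test case to five training cases; cases 2 and 3 tie
# for nearest.
d <- c(0.9, 0.4, 0.4, 1.2, 0.7)

# "min" gives both tied cases rank 1, so with k = 1 all ties are kept:
which(rank(d, ties.method = "min") <= 1)   # 2 3

# "max" gives both tied cases rank 2, so with k = 1 neither is used:
which(rank(d, ties.method = "max") <= 1)   # integer(0)
```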
Value

a vector of predictions whose length is the number of test cases.
Note

For the traditional scenario, classification using the Euclidean distance on a fixed set of training cases and a fixed set of test cases, the method knn is ideal. The functions knn.dist and knn.predict are intended to be used when something beyond the traditional case is desired. For example, prediction on a continuous y (non-classification), cross-validation for the selection of k, or the use of an alternate distance method are well handled.
Author(s)

Atina Dunlap Brooks
Examples

# a quick classification example
x1 <- c(rnorm(20,mean=1),rnorm(20,mean=5))
x2 <- c(rnorm(20,mean=5),rnorm(20,mean=1))
x <- cbind(x1,x2)
y <- c(rep(1,20),rep(0,20))
train <- sample(1:40,30)
# plot the training cases
plot(x1[train],x2[train],col=y[train]+1,xlab="x1",ylab="x2")
# predict the other cases
test <- (1:40)[-train]
kdist <- knn.dist(x)
preds <- knn.predict(train,test,y,kdist,k=3,agg.meth="majority")
# add the predictions to the plot
points(x1[test],x2[test],col=as.integer(preds)+1,pch="+")
# display the confusion matrix
table(y[test],preds)
# the iris example used by knn(class)
library(class)
data(iris3)
train <- rbind(iris3[1:25,,1], iris3[1:25,,2], iris3[1:25,,3])
test <- rbind(iris3[26:50,,1], iris3[26:50,,2], iris3[26:50,,3])
cl <- factor(c(rep("s",25), rep("c",25), rep("v",25)))
# how to get predictions from knn(class)
pred <- knn(train, test, cl, k = 3)
# display the confusion matrix
table(pred,cl)
# how to get predictions with knn.dist and knn.predict
x <- rbind(train,test)
kdist <- knn.dist(x)
pred <- knn.predict(1:75, 76:150, cl, kdist, k=3)
# display the confusion matrix
table(pred,cl)
# note any small differences are a result of both methods
# breaking ties in majority class randomly
# 5-fold cross-validation to select k for above example
fold <- sample(1:5,75,replace=TRUE)
cvpred <- matrix(NA,nrow=75,ncol=10)
for (k in 1:10)
for (i in 1:5)
cvpred[which(fold==i),k] <- knn.predict(train=which(fold!=i),test=which(fold==i),cl,kdist,k=k)
# display the number of misclassifications for k=1:10
apply(cvpred,2,function(x) sum(cl!=x))