kNN_MLE: MLE of k in kNN



Description

Uses the profile pseudolikelihood to estimate k, the number of nearest neighbors in kNN.

Usage

kNN_MLE(X, Y, kmax = ceiling(length(Y) * 0.5), plot = TRUE)

Arguments

X

An n-by-p matrix of covariates

Y

The outputs (class labels), one per row of X, taking Q classes

kmax

The largest value of k to consider. The default is half the number of observations, rounded up.

plot

If TRUE, plot the profile deviance; otherwise no plot is produced.

Details

When Q = 2, glm is used to compute the profile pseudo-loglikelihood; when Q > 2, the function multinom in nnet is used.
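The two-class case can be sketched in base R. This is an illustration only, not the gencve source. The helper name profile_deviance_knn, the use of the neighbour class proportion as the single glm covariate, and the treatment of k = 0 as the intercept-only model are all assumptions made for the sketch.

```r
# Hypothetical helper (not part of gencve): profile deviance over k, two classes.
profile_deviance_knn <- function(X, Y, kmax = 10) {
  X <- as.matrix(X)
  y <- as.integer(factor(Y)) - 1L        # recode the two classes as 0/1
  n <- length(y)
  D <- as.matrix(dist(X))                # pairwise Euclidean distances
  diag(D) <- Inf                         # a point is not its own neighbour
  ord <- t(apply(D, 1, order))           # row i: indices sorted by distance from i
  dev <- numeric(kmax + 1)
  names(dev) <- 0:kmax
  # Assumption: k = 0 corresponds to the intercept-only (null) model.
  dev[1] <- deviance(glm(y ~ 1, family = binomial))
  for (k in seq_len(kmax)) {
    # Proportion of class-1 labels among the k nearest neighbours of each point
    z <- rowMeans(matrix(y[ord[, 1:k, drop = FALSE]], nrow = n))
    dev[k + 1] <- deviance(glm(y ~ z, family = binomial))
  }
  dev
}

# Simulated two-class data with overlapping classes
set.seed(1)
X <- rbind(matrix(rnorm(100), ncol = 2),
           matrix(rnorm(100, mean = 1.5), ncol = 2))
Y <- rep(c("a", "b"), each = 50)

dev <- profile_deviance_knn(X, Y, kmax = 15)
# Minimizing the deviance is equivalent to maximizing the pseudolikelihood
k_hat <- as.integer(names(dev)[which.min(dev)])
k_hat
```

Under these assumptions, plotting dev against 0:15 would give a profile deviance curve of the kind the plot argument refers to.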

Value

The estimate of k obtained by maximizing the pseudolikelihood is returned. It can take any value from k=0 to k=kmax.

The result is returned invisibly if plot is TRUE.

Author(s)

A. I. McLeod Maintainer: <[email protected]>

References

Holmes, C. C. and Adams, N. M. (2003). Likelihood inference in nearest-neighbour classification models. Biometrika, 90(1), 99-112. http://biomet.oxfordjournals.org/cgi/content/abstract/90/1/99

See Also

multinom

Examples

# Two-class example
X <- MASS::synth.tr[,1:2]
Y <- MASS::synth.tr[,3]
kNN_MLE(X=X, Y=Y, plot=FALSE)

## Not run: 
# Three-class example
library("MASS")  # needed for lda
Y <- iris[, 5]
X <- iris[, 1:4]
kopt <- kNN_MLE(X, Y)
kopt
# Misclassification rates on the training data.
# Of course FLDA does better in this case.
y <- factor(Y)
ans <- class::knn(train = X, test = X, k = kopt, cl = y)
etaKNN <- sum(ans != y) / length(y)
iris.ldf <- MASS::lda(X, y)
# predict.lda is not exported from MASS, so dispatch through the predict() generic
yfitFLDA <- predict(iris.ldf, newdata = X, dimen = 1)$class
etaFLDA <- sum(yfitFLDA != y) / length(y)
eta <- c(etaFLDA, etaKNN)
names(eta) <- c("FLDA", "kNN")
eta

## End(Not run)

gencve documentation built on May 29, 2017, 7:12 p.m.