kNN_MLE: MLE of k in kNN

Description

Estimates k, the number of nearest neighbors in kNN, by maximizing the profile pseudolikelihood.

Usage

kNN_MLE(X, Y, kmax = ceiling(length(Y) * 0.5), plot = TRUE)

Arguments

X

An n-by-p matrix of covariates

Y

The response: a vector of class labels taking Q distinct classes

kmax

The largest value of k to consider. The default is half the number of observations, rounded up.

plot

If TRUE, the profile deviance is plotted; otherwise no plot is produced.

Details

When Q=2, glm is used to compute the profile pseudo-log-likelihood. When Q>2, the function multinom in nnet is used instead.
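The Q=2 case can be illustrated with a minimal sketch in base R. It is not the package source. It assumes that, for each candidate k, the predictor is the proportion of each point's k nearest neighbors (the point itself excluded) that fall in the second class, that a logistic regression is fitted to that predictor with glm, and that the deviance is profiled over k. The helper name profile_deviance_sketch and the simulated data are hypothetical, and the sketch scans k = 1, ..., kmax only.

```r
# Hypothetical helper (not part of gencve): profile pseudo-deviance over k
# for a two-class problem, using base R only.
profile_deviance_sketch <- function(X, Y, kmax = 10) {
  X <- as.matrix(X)
  y <- as.integer(factor(Y)) - 1L        # recode the two classes as 0/1
  n <- length(y)
  D <- as.matrix(dist(X))                # pairwise Euclidean distances
  diag(D) <- Inf                         # a point is never its own neighbor
  nbr <- t(apply(D, 1, order))           # row i: indices sorted by distance from i
  dev <- numeric(kmax)
  for (k in seq_len(kmax)) {
    idx <- nbr[, seq_len(k), drop = FALSE]
    # proportion of the k nearest neighbors of each point that are in class 1
    p1 <- rowMeans(matrix(y[idx], nrow = n))
    fit <- glm(y ~ p1, family = binomial)
    dev[k] <- deviance(fit)
  }
  dev
}

# Simulated two-class data, for illustration only
set.seed(1)
X <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 2), ncol = 2))
Y <- rep(c("a", "b"), each = 50)

dev <- profile_deviance_sketch(X, Y, kmax = 15)
khat <- which.min(dev)                   # smallest deviance = largest pseudolikelihood
```

Minimizing the deviance over k is equivalent to maximizing the pseudolikelihood, so khat plays the role of the estimate in this sketch. The value returned by kNN_MLE may differ, since the package may construct the predictor differently.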

Value

The estimate of k obtained by maximizing the pseudolikelihood is returned. It can take any value from k=0 to k=kmax.

The result is returned invisibly if plot is TRUE.
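An invisible return value is not printed at the top level, but it can still be assigned or printed explicitly. The sketch below shows this with a hypothetical stand-in function, demo_estimate, which is not the package code and returns a placeholder value.

```r
# Hypothetical stand-in (not the gencve source): returns its value invisibly
# when plot = TRUE, mirroring the behavior described above.
demo_estimate <- function(plot = TRUE) {
  khat <- 7L                             # placeholder value, for illustration only
  if (plot) invisible(khat) else khat
}

demo_estimate()                          # prints nothing at the top level
k1 <- demo_estimate()                    # assignment still captures the value
(demo_estimate())                        # wrapping in parentheses forces printing
```

The same pattern applies to kNN_MLE: assign the result, as in kopt <- kNN_MLE(X, Y), to keep the estimate when plot is TRUE.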

Author(s)

A. I. McLeod

Maintainer: <aimcleod@uwo.ca>

References

Holmes, C. C. and Adams, N. M. (2003). Likelihood inference in nearest-neighbour classification models, Biometrika, 90(1), 99-112. http://biomet.oxfordjournals.org/cgi/content/abstract/90/1/99

See Also

multinom

Examples

#Two classes example
X <- MASS::synth.tr[,1:2]
Y <- MASS::synth.tr[,3]
kNN_MLE(X=X, Y=Y, plot=FALSE)

## Not run: 
#Three classes example
library("MASS") #need lda
Y<- iris[,5]
X<- iris[,1:4]
kopt <- kNN_MLE(X, Y)
kopt
#Mis-classification rates on training data.
#Of course FLDA does better in this case.
y <- factor(Y)
ans <- class::knn(train=X, test=X, k=kopt, cl=y)
etaKNN <- sum(ans!=y)/length(y)
iris.ldf <- MASS::lda(X, y)
#predict.lda is not exported from MASS, so call the predict generic
yfitFLDA <- predict(iris.ldf, newdata=X, dimen=1)$class
etaFLDA <- sum(yfitFLDA!=y)/length(y)
eta<-c(etaFLDA, etaKNN)
names(eta)<-c("FLDA", "kNN")
eta

## End(Not run)

gencve documentation built on May 2, 2019, 6:08 a.m.