Kernel discriminant analysis
Description
Kernel discriminant analysis for 1- to 6-dimensional data.
Usage
kda(x, x.group, Hs, hs, prior.prob=NULL, gridsize, xmin, xmax, supp=3.7,
eval.points, binned=FALSE, bgridsize, w, compute.cont=FALSE, approx.cont=TRUE,
kde.flag=TRUE)
Hkda(x, x.group, Hstart, bw="plugin", ...)
Hkda.diag(x, x.group, bw="plugin", ...)
hkda(x, x.group, bw="plugin", ...)
## S3 method for class 'kda'
predict(object, ..., x)
compare(x.group, est.group, by.group=FALSE)
compare.kda.cv(x, x.group, bw="plugin", prior.prob=NULL, Hstart, by.group=FALSE,
verbose=FALSE, recompute=FALSE, ...)
compare.kda.diag.cv(x, x.group, bw="plugin", prior.prob=NULL, by.group=FALSE,
verbose=FALSE, recompute=FALSE, ...)

Arguments
x 
matrix of training data values 
x.group 
vector of group labels for training data 
Hs,hs 
(stacked) matrix of bandwidth matrices/vector of scalar bandwidths. If these are missing, the plug-in selectors Hkda or hkda are called by default (see Details). 
prior.prob 
vector of prior probabilities 
gridsize 
vector of grid sizes 
xmin,xmax 
vector of minimum/maximum values for grid 
supp 
effective support for standard normal 
eval.points 
points at which estimate is evaluated 
binned 
flag for binned estimation. Default is FALSE. 
bgridsize 
vector of binning grid sizes 
w 
vector of weights. Not yet implemented. 
compute.cont 
flag for computing 1% to 99% probability contour levels. Default is FALSE. 
approx.cont 
flag for computing approximate probability contour levels. Default is TRUE. 
kde.flag 
flag for computing KDE on grid. Default is TRUE. 
object 
object of class kda 
bw 
bandwidth selector: "plugin" = plug-in, "lscv" = least squares cross validation (LSCV), "scv" = smoothed cross validation (SCV) 
Hstart 
(stacked) matrix of initial bandwidth matrices, used in numerical optimisation 
est.group 
vector of estimated group labels 
by.group 
flag for also giving results within each group. Default is FALSE. 
verbose 
flag for printing progress information. Default is FALSE. 
recompute 
flag for recomputing the bandwidth matrix after excluding the ith data item 
... 
other optional parameters for bandwidth selection, see Hpi, Hlscv and Hscv 

Details
If the bandwidths Hs are missing from kda, then the default bandwidths are the plug-in selectors Hkda(, bw="plugin"). Likewise for missing hs. Valid options for bw are "plugin", "lscv" and "scv", which in turn call Hpi, Hlscv and Hscv.
The effective support, binning, grid size, grid range and positive parameters are the same as for kde.
If the prior probabilities are known, then set prior.prob to these. Otherwise prior.prob=NULL uses the sample proportions as estimates of the prior probabilities.
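The role of the prior probabilities can be illustrated with a small base R sketch. It is not the ks implementation: the data, the helper names kde1 and classify, and the fixed bandwidth h = 0.4 are all assumptions made for illustration. The sketch estimates the priors by the sample proportions and assigns each point to the group with the largest prior-weighted kernel density estimate.

```r
# Hypothetical 1-d training data: two groups of unequal size
set.seed(1)
x <- c(rnorm(60, mean = 1), rnorm(40, mean = -1))
x.group <- rep(c(1, 2), times = c(60, 40))

# Sample proportions, the estimate used when prior.prob=NULL
groups <- sort(unique(x.group))
prior.hat <- as.vector(table(x.group)) / length(x.group)

# Gaussian kernel density estimate at points y, from data xi with bandwidth h
kde1 <- function(y, xi, h) {
  sapply(y, function(p) mean(dnorm(p, mean = xi, sd = h)))
}

# Assign each point in y to the group with the largest prior * density.
# h = 0.4 is an arbitrary illustrative bandwidth, not a ks selector.
classify <- function(y, prior, h = 0.4) {
  scores <- sapply(seq_along(groups), function(j) {
    prior[j] * kde1(y, x[x.group == groups[j]], h)
  })
  scores <- matrix(scores, ncol = length(groups))
  groups[max.col(scores, ties.method = "first")]
}

est <- classify(c(-2, 2), prior.hat)
```

Passing a different prior vector to classify (for example c(0.5, 0.5)) shifts the decision boundary, which is the effect that a user-supplied prior.prob is meant to have.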
As of ks 1.8.11, kda.kde has been subsumed into kda, so all prior calls to kda.kde can be replaced by kda. To reproduce the previous behaviour of kda, the command is kda(, kde.flag=FALSE).
Value
A kernel discriminant analysis is an object of class kda, which is a list with fields
x 
list of data points, one for each group label 
estimate 
list of density estimates at eval.points 
eval.points 
points that the estimate is evaluated at, one for each group label 
h 
vector of bandwidths (1d only) 
H 
stacked matrix of bandwidth matrices or vector of bandwidths 
gridded 
flag for estimation on a grid 
binned 
flag for binned estimation 
w 
weights 
prior.prob 
prior probabilities 
x.group 
group labels, same as input 
x.group.estimate 
estimated group labels. If the test data eval.points are given then these are classified. Otherwise the training data x are classified. 

The result from Hkda and Hkda.diag is a stacked matrix of bandwidth matrices, one for each training data group. The result from hkda is a vector of bandwidths, one for each training group.
The compare functions create a comparison between the true group labels x.group and the estimated ones. They return a list with fields
cross 
cross-classification table with the rows indicating the true group and the columns the estimated group 
error 
misclassification rate (MR) 
In the case where the test data are independent of the training data, compare computes MR = (number of points wrongly classified)/(total number of points). In the case where the test data are not independent, e.g. we are classifying the training data set itself, then the cross-validated estimate of MR is more appropriate. These are implemented as compare.kda.cv (full bandwidth selectors) and compare.kda.diag.cv (for diagonal bandwidth selectors). These functions are only available for d > 1.
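The idea behind a cross-validated MR can be sketched in base R as a leave-one-out loop. This is only an illustration under stated assumptions, not the code behind compare.kda.cv: it uses hypothetical 1-d data and a fixed bandwidth h that is not recomputed after each exclusion, whereas the ks functions apply to d > 1 and use the ks bandwidth selectors.

```r
# Hypothetical 1-d data: two groups of 50 points
set.seed(2)
x <- c(rnorm(50, mean = 1), rnorm(50, mean = -1))
x.group <- rep(c(1, 2), each = 50)
h <- 0.4  # fixed illustrative bandwidth (an assumption, not a ks selector)

groups <- sort(unique(x.group))
est.cv <- numeric(length(x))
for (i in seq_along(x)) {
  # Leave out the i-th point, then score it against each group's remaining data
  xi <- x[-i]
  gi <- x.group[-i]
  scores <- sapply(groups, function(g) {
    mean(gi == g) * mean(dnorm(x[i], mean = xi[gi == g], sd = h))
  })
  est.cv[i] <- groups[which.max(scores)]
}

# Cross-validated misclassification rate
mr.cv <- mean(est.cv != x.group)
```

Because each point is classified by a rule fitted without it, mr.cv avoids the optimism of classifying the training data with a rule built from the same data.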
If by.group=FALSE then only the total MR is given. If it is set to TRUE, then the MRs for each class are also given (estimated number in group divided by true number).
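The quantities described above can be reproduced by hand in base R. The labels below are made up, and the per-group rate shown (proportion of each true group that is wrongly classified) is one plausible reading of the by.group output, assumed here for illustration; the exact definition used by compare may differ.

```r
# Hypothetical true and estimated labels for six test points
x.group   <- c(1, 1, 1, 2, 2, 2)
est.group <- c(1, 1, 2, 2, 2, 2)

# Cross-classification table: rows = true group, columns = estimated group
cross <- table(true = x.group, estimated = est.group)

# Total misclassification rate: wrongly classified / total number of points
mr <- sum(x.group != est.group) / length(x.group)

# Per-group rate (assumed definition): wrongly classified within each
# true group / number of points in that group
mr.by.group <- tapply(x.group != est.group, x.group, mean)
```

Here one of the six points is misclassified, so mr is 1/6, and the single error falls in the first group.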
References
Simonoff, J. S. (1996) Smoothing Methods in Statistics. Springer-Verlag. New York.
See Also
plot.kda
Examples
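library(ks)
set.seed(8192)
## Training data: two groups of 100 points, centred at 1 and -1
x <- c(rnorm.mixt(n=100, mus=1), rnorm.mixt(n=100, mus=-1))
x.gr <- rep(c(1,2), times=c(100,100))
## Test data drawn from the same two distributions
y <- c(rnorm.mixt(n=100, mus=1), rnorm.mixt(n=100, mus=-1))
## Classify the test points y using the training data
kda.gr <- kda(x, x.gr, eval.points=y)
compare(kda.gr$x.group, kda.gr$x.group.est, by.group=TRUE)
## Classify a single new point
predict(kda.gr, x=0)
## See other examples in ?plot.kda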
