kda | R Documentation |
Kernel discriminant analysis (kernel classification) for 1- to d-dimensional data.
kda(x, x.group, Hs, hs, prior.prob=NULL, gridsize, xmin, xmax, supp=3.7,
eval.points, binned, bgridsize, w, compute.cont=TRUE, approx.cont=TRUE,
kde.flag=TRUE)
Hkda(x, x.group, Hstart, bw="plugin", ...)
Hkda.diag(x, x.group, bw="plugin", ...)
hkda(x, x.group, bw="plugin", ...)
## S3 method for class 'kda'
predict(object, ..., x)
compare(x.group, est.group, by.group=FALSE)
compare.kda.cv(x, x.group, bw="plugin", prior.prob=NULL, Hstart, by.group=FALSE,
verbose=FALSE, recompute=FALSE, ...)
compare.kda.diag.cv(x, x.group, bw="plugin", prior.prob=NULL, by.group=FALSE,
verbose=FALSE, recompute=FALSE, ...)
x |
matrix of training data values |
x.group |
vector of group labels for training data |
Hs , hs |
(stacked) matrix of bandwidth matrices/vector of scalar
bandwidths. If these are missing, |
prior.prob |
vector of prior probabilities |
gridsize |
vector of grid sizes |
xmin , xmax |
vector of minimum/maximum values for grid |
supp |
effective support for standard normal |
eval.points |
vector or matrix of points at which estimate is evaluated |
binned |
flag for binned estimation |
bgridsize |
vector of binning grid sizes |
w |
vector of weights. Not yet implemented. |
compute.cont |
flag for computing 1% to 99% probability contour levels. Default is TRUE. |
approx.cont |
flag for computing approximate probability contour levels. Default is TRUE. |
kde.flag |
flag for computing KDE on grid. Default is TRUE. |
object |
object of class |
bw |
bandwidth: "plugin" = plug-in, "lscv" = LSCV, "scv" = SCV |
Hstart |
(stacked) matrix of initial bandwidth matrices, used in numerical optimisation |
est.group |
vector of estimated group labels |
by.group |
flag to give results also within each group |
verbose |
flag for printing progress information. Default is FALSE. |
recompute |
flag for recomputing the bandwidth matrix after excluding the i-th data item |
... |
other optional parameters for bandwidth selection, see
|
If the bandwidths Hs are missing from kda, then the default bandwidths are the plug-in selectors Hkda(, bw="plugin"). Likewise for missing hs. Valid options for bw are "plugin", "lscv" and "scv", which in turn call Hpi, Hlscv and Hscv.
The effective support, binning, grid size, grid range and positive parameters are the same as for kde.
If prior probabilities are known then set prior.prob to these. Otherwise prior.prob=NULL uses the sample proportions as estimates of the prior probabilities.
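As a rough illustration of how the priors enter the classifier, the base R sketch below estimates them from the sample proportions and assigns each new point to the group with the largest value of prior times kernel density estimate. It is a hand-rolled 1-d example, not the ks implementation: the helpers kde1d and classify are hypothetical names, and the bandwidths come from the rule of thumb bw.nrd0 rather than from the plug-in selectors.

```r
set.seed(1)
# Toy 1-d training data: 30 points in group 1, 70 points in group 2
x <- c(rnorm(30, mean = 2), rnorm(70, mean = -2))
x.group <- rep(c(1, 2), times = c(30, 70))

# Sample proportions as prior estimates (the role played by prior.prob=NULL)
prior.est <- as.vector(table(x.group)) / length(x.group)

# Hypothetical helper: Gaussian kernel density estimate of `data` at points y
kde1d <- function(y, data, h) {
  sapply(y, function(y0) mean(dnorm(y0, mean = data, sd = h)))
}

# Hypothetical helper: pick the group maximising prior * density estimate
classify <- function(y, x, x.group, prior, h) {
  labels <- sort(unique(x.group))
  scores <- sapply(seq_along(labels), function(j) {
    prior[j] * kde1d(y, x[x.group == labels[j]], h[j])
  })
  scores <- matrix(scores, nrow = length(y))  # rows: points, columns: groups
  labels[max.col(scores, ties.method = "first")]
}

# Rule-of-thumb bandwidth per group (an assumption, not the ks plug-in selector)
h <- sapply(split(x, x.group), bw.nrd0)

est <- classify(c(-3, 3), x, x.group, prior.est, h)
```

Passing a different prior vector to classify shifts the decision boundary towards the group with the smaller prior, which is the effect of supplying prior.prob explicitly.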
For ks >= 1.8.11, kda.kde has been subsumed into kda, so all prior calls to kda.kde can be replaced by kda. To reproduce the previous behaviour of kda, the command is kda(, kde.flag=FALSE).
For kde.flag=TRUE, a kernel discriminant analysis is an object of class kda, which is a list with fields
x |
list of data points, one for each group label |
estimate |
list of density estimates at |
eval.points |
vector or list of points that the estimate is evaluated at, one for each group label |
h |
vector of bandwidths (1-d only) |
H |
stacked matrix of bandwidth matrices or vector of bandwidths |
gridded |
flag for estimation on a grid |
binned |
flag for binned estimation |
w |
vector of weights |
prior.prob |
vector of prior probabilities |
x.group |
vector of group labels - same as input |
x.group.estimate |
vector of estimated group labels. If the test data
|
For kde.flag=FALSE, which is always the case for d > 3, only the vector of estimated group labels is returned.
The result from Hkda and Hkda.diag is a stacked matrix of bandwidth matrices, one for each training data group. The result from hkda is a vector of bandwidths, one for each training group.
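To make the stacked layout concrete, the base R sketch below builds a stacked matrix from two 2 x 2 bandwidth matrices and pulls out the block for one group. It assumes the blocks are stacked row-wise in group order, so that m groups in d dimensions give an (m*d) x d matrix; the helper get.H and the numeric values are hypothetical and purely illustrative.

```r
d <- 2
# Illustrative bandwidth matrices for two groups (values are made up)
H1 <- matrix(c(1.0, 0.2, 0.2, 0.5), nrow = d)
H2 <- matrix(c(0.8, -0.1, -0.1, 0.3), nrow = d)

# Stacked matrix: group blocks bound row-wise, giving an (m*d) x d matrix
Hs <- rbind(H1, H2)

# Hypothetical helper: extract the d x d block belonging to group j
get.H <- function(Hs, j) {
  d <- ncol(Hs)
  Hs[((j - 1) * d + 1):(j * d), , drop = FALSE]
}

H2.again <- get.H(Hs, 2)
```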
The compare functions create a comparison between the true group labels x.group and the estimated ones. They return a list with fields
cross |
cross-classification table with the rows indicating the true group and the columns the estimated group |
error |
misclassification rate (MR) |
In the case where the test data are independent of the training data, compare computes MR = (number of points wrongly classified)/(total number of points). In the case where the test data are not independent, e.g. when we are classifying the training data set itself, the cross-validated estimate of MR is more appropriate. These estimates are implemented as compare.kda.cv (for unconstrained bandwidth selectors) and compare.kda.diag.cv (for diagonal bandwidth selectors). These functions are only available for d > 1.
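The idea behind a cross-validated MR can be sketched in base R as a leave-one-out loop: each point is classified by a rule built from all the other points, and the MR is the fraction of points that end up mislabelled. The sketch is 1-d for brevity and is not the ks implementation; the helper loo.label is a hypothetical name, the bandwidths are rule-of-thumb values from bw.nrd0, and they are held fixed across folds rather than recomputed.

```r
set.seed(1)
# Toy 1-d data: two well-separated groups of 50 points each
x <- c(rnorm(50, mean = 2), rnorm(50, mean = -2))
x.group <- rep(c(1, 2), each = 50)
labels <- sort(unique(x.group))

# Bandwidths computed once from the full data (assumption: not recomputed per fold)
h <- sapply(split(x, x.group), bw.nrd0)

# Hypothetical helper: classify point i using every point except i
loo.label <- function(i) {
  xi <- x[-i]
  gi <- x.group[-i]
  scores <- sapply(seq_along(labels), function(j) {
    xj <- xi[gi == labels[j]]
    # prior (sample proportion) times Gaussian kernel density estimate at x[i]
    (length(xj) / length(xi)) * mean(dnorm(x[i], mean = xj, sd = h[j]))
  })
  labels[which.max(scores)]
}

est.cv <- sapply(seq_along(x), loo.label)
mr.cv <- mean(est.cv != x.group)  # leave-one-out misclassification rate
```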
If by.group=FALSE then only the total MR is given. If it is set to TRUE, then the MR for each class is also given (estimated number in group divided by true number).
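The quantities described above can be reproduced by hand in base R, which may help when reading the output. The sketch below builds a cross-classification table from made-up label vectors and derives a total MR and a per-group MR. The per-group definition used here (fraction of each true group that is misclassified) is an assumption for illustration and may not match exactly what compare reports.

```r
# Made-up true and estimated labels for ten test points
y.gr     <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
y.gr.est <- c(1, 1, 1, 2, 2, 2, 2, 2, 1, 1)

# Use common factor levels so rows and columns of the table line up
lev <- sort(unique(y.gr))
cross <- table(true = factor(y.gr, levels = lev),
               estimated = factor(y.gr.est, levels = lev))

# Total MR: proportion of points off the diagonal
mr.total <- 1 - sum(diag(cross)) / sum(cross)

# Per-group MR (assumed definition): misclassified fraction within each true group
mr.group <- 1 - diag(cross) / rowSums(cross)
```

Here the table has 3 and 4 correct classifications on its diagonal out of 10 points, so the total MR is 0.3.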
Simonoff, J. S. (1996) Smoothing Methods in Statistics. Springer-Verlag. New York
plot.kda
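library(ks)
set.seed(8192)
## training data: two 1-d normal samples centred at 1 and -1
x <- c(rnorm.mixt(n=100, mus=1), rnorm.mixt(n=100, mus=-1))
x.gr <- rep(c(1,2), times=c(100,100))
## independent test data drawn from the same two distributions
y <- c(rnorm.mixt(n=100, mus=1), rnorm.mixt(n=100, mus=-1))
y.gr <- rep(c(1,2), times=c(100,100))
## fit on the training data, then classify the test data
kda.gr <- kda(x, x.gr)
y.gr.est <- predict(kda.gr, x=y)
## cross-classification table and misclassification rate
compare(y.gr, y.gr.est)
## See other examples in ?plot.kda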