kda: Kernel Classification Rules

Description

Classification using the moving window and kernel classification rules.

Usage

kda(x, ...)

## S3 method for class 'formula'
kda(formula, data, ..., subset, na.action)

## S3 method for class 'data.frame'
kda(x, ...)

## S3 method for class 'matrix'
kda(x, grouping, ..., subset, na.action = na.fail)

## Default S3 method:
kda(x, grouping, wf = c("biweight", "cauchy", "cosine",
  "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular",
  "triangular"), bw, k, nn.only = TRUE, ...)

Arguments

x

(Required if no formula is given as principal argument.) A matrix or data.frame or Matrix containing the explanatory variables.

formula

A formula of the form groups ~ x1 + x2 + ..., that is, the response is the grouping factor and the right hand side specifies the (usually non-factor) discriminators.

data

A data.frame from which variables specified in formula are to be taken.

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function to specify the action to be taken if NAs are found. The default action is first the na.action setting of options and second na.fail if that is unset. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)

grouping

(Required if no formula is given as principal argument.) A factor specifying the class membership for each observation.

wf

A window function which is used to calculate weights that are introduced into the fitting process. Either a character string or a function, e.g. wf = function(x) exp(-x). For details see the documentation for wfs.

bw

(Required only if wf is a string.) The bandwidth parameter of the window function. (See wfs.)

k

(Required only if wf is a string.) The number of nearest neighbors of the decision boundary to be used in the fitting process. (See wfs.)

nn.only

(Required only if wf is a string indicating a window function with infinite support and if k is specified.) Should only the k nearest neighbors or all observations receive positive weights? (See wfs.)

...

Further arguments. Currently unused.
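The interplay of bw, k and nn.only can be pictured with a small base R sketch. It is only one plausible reading of the argument descriptions above, not the package's implementation: knn_weights() is a hypothetical helper, and the authoritative definition of the weighting schemes is the documentation for wfs.

```r
# Hypothetical helper (not part of locClass): weights for a vector of
# distances d, using a window function wf with fixed bandwidth bw.
# If nn.only is TRUE, only the k nearest neighbors keep a positive weight.
knn_weights <- function(d, wf, bw, k, nn.only = TRUE) {
  w <- wf(d / bw)
  if (nn.only) {
    # Distance of the k-th nearest neighbor; everything farther gets weight 0
    cutoff <- sort(d)[k]
    w[d > cutoff] <- 0
  }
  w
}

d <- c(0.1, 0.4, 0.9, 1.5, 2.0)

# Gaussian window (infinite support): all observations get positive weight
w_all <- knn_weights(d, dnorm, bw = 1, k = 3, nn.only = FALSE)

# Same window, but restricted to the 3 nearest neighbors
w_knn <- knn_weights(d, dnorm, bw = 1, k = 3, nn.only = TRUE)
```

With nn.only = FALSE the Gaussian window assigns a positive weight to every observation; with nn.only = TRUE the two farthest observations are zeroed out.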

Details

The kernel classification rule is given as

\hat{g} = \arg\max_g \sum_{n: y_n = g} wf\left(\frac{x - x_n}{bw}\right).

If wf is the rectangular kernel, the rule is also called the moving window rule.
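The rule above can be sketched in a few lines of base R. This is an illustration only, not the code used by kda or predict.kda: kernel_classify(), gaussian_wf() and rectangular_wf() are hypothetical names, and the sketch assumes that the window function is applied to the Euclidean distance between x and x_n, scaled by bw.

```r
# Hypothetical helper (not part of locClass): classify one point x_new by
# summing window weights per class and returning the class with the largest sum.
kernel_classify <- function(x_new, x_train, y_train, wf, bw) {
  x_train <- as.matrix(x_train)
  y_train <- as.factor(y_train)
  # Euclidean distance from x_new to every training point, scaled by bw
  # (assumption: wf acts on scaled distances)
  d <- sqrt(rowSums(sweep(x_train, 2, x_new)^2)) / bw
  # Sum of weights within each class
  scores <- tapply(wf(d), y_train, sum)
  names(scores)[which.max(scores)]
}

# Two illustrative window functions
gaussian_wf <- function(d) dnorm(d)
rectangular_wf <- function(d) as.numeric(d <= 1)  # moving window rule

x_train <- iris[, c("Sepal.Length", "Sepal.Width")]
y_train <- iris$Species
x_new <- c(5.0, 3.5)

label_gauss <- kernel_classify(x_new, x_train, y_train, gaussian_wf, bw = 0.5)
label_rect <- kernel_classify(x_new, x_train, y_train, rectangular_wf, bw = 0.5)
```

With the rectangular window the rule reduces to a majority vote among the training points that lie within distance bw of x_new, which is why it is called the moving window rule.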

The name of the window function (wf) can be specified as a character string. In this case the window function is generated internally in predict.kda. Currently supported are "biweight", "cauchy", "cosine", "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular" and "triangular".

Moreover, it is possible to generate the window functions mentioned above in advance (see wfs) and pass them to kda.

Any other function implementing a window function can also be passed as the wf argument, which allows users to try their own window functions. See the help on wfs for details.
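A user-defined window function might look as follows. The function my_wf is a hypothetical name; its body is the example given in the description of the wf argument. The call to kda mirrors the Examples section and is only attempted if locClass is installed.

```r
# A user-defined window function: maps non-negative distances to weights.
# (my_wf is an illustrative name, not part of locClass.)
my_wf <- function(x) exp(-x)

# Weights decrease as the distance grows
my_wf(c(0, 1, 2))

# Only runs if locClass is available; bw and k are not supplied because,
# according to the Arguments section, they are required only if wf is a string.
if (requireNamespace("locClass", quietly = TRUE)) {
  fit <- locClass::kda(Species ~ Sepal.Length + Sepal.Width, data = iris,
      wf = my_wf)
  pred <- predict(fit)
  mean(pred$class != iris$Species)
}
```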

If the predictor variables include factors, the formula interface must be used in order to get a correct model matrix.
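The reason can be seen with base R alone. In the sketch below (toy data, hypothetical column names), model.matrix expands a factor into dummy columns, whereas a plain numeric conversion of the data.frame turns the factor into arbitrary integer codes. How kda treats the intercept column internally is not shown here.

```r
# Toy data with one numeric and one factor predictor
d <- data.frame(
  y = factor(c("a", "b", "a", "b")),
  size = c(1.2, 3.4, 2.1, 0.7),
  colour = factor(c("red", "green", "blue", "red"))
)

# Formula route: the factor is expanded into dummy (contrast) columns
mm <- model.matrix(y ~ size + colour, data = d)
colnames(mm)

# Naive numeric conversion: the factor becomes integer codes 1, 2, 3,
# which imposes an ordering and spacing the levels do not have
codes <- data.matrix(d[, c("size", "colour")])
codes[, "colour"]
```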

Value

An object of class "kda", a list containing the following components:

x

A matrix containing the explanatory variables.

grouping

A factor specifying the class membership for each observation.

counts

The number of observations per class.

lev

The class labels (levels of grouping).

N

The number of observations.

wf

The window function used. Always a function, even if the input was a string.

bw

(Only if wf is a string or was generated by means of one of the functions documented in wfs.) The bandwidth used, NULL if bw was not specified.

k

(Only if wf is a string or was generated by means of one of the functions documented in wfs.) The number of nearest neighbors used, NULL if k was not specified.

nn.only

(Logical. Only if wf is a string or was generated by means of one of the functions documented in wfs and if k was specified.) TRUE if only the k nearest neighbors receive a positive weight, FALSE otherwise.

adaptive

(Logical.) TRUE if the bandwidth of wf is adaptive to the local density of data points, FALSE if the bandwidth is fixed.

variant

(Only if wf is a string or if one of the window functions documented in wfs is used; for internal use only.) An integer indicating which weighting scheme is implied by bw, k and nn.only.

call

The (matched) function call.

References

Devroye, L., Györfi, L. and Lugosi, G. (1996), A Probabilistic Theory of Pattern Recognition. Springer, New York.

See Also

predict.kda.

Other observation_specific majority: predict.kda

Examples

fit <- kda(Species ~ Sepal.Length + Sepal.Width, data = iris,
    wf = "gaussian", bw = 0.5)
pred <- predict(fit)
mean(pred$class != iris$Species)

schiffner/locClass documentation built on May 29, 2019, 3:39 p.m.