loclda: Localized Linear Discriminant Analysis (LocLDA)

Description

A localized version of Linear Discriminant Analysis.

Usage

loclda(x, ...)

## S3 method for class 'formula'
loclda(formula, data, ..., subset, na.action)

## Default S3 method:
loclda(x, grouping, weight.func = function(x) 1/exp(x), 
    k = nrow(x), weighted.apriori = TRUE, ...)

## S3 method for class 'data.frame'
loclda(x, ...)

## S3 method for class 'matrix'
loclda(x, grouping, ..., subset, na.action)

Arguments

formula

Formula of the form ‘groups ~ x1 + x2 + ...’.

data

Data frame from which variables specified in formula are to be taken.

x

Matrix or data frame containing the explanatory variables (required if formula is not given).

grouping

A factor specifying the class for each observation (required if formula is not given).

weight.func

Function used to compute local weights. Must be finite over the interval [0,1]. See Details below.

k

Number of nearest neighbours used to construct localized classification rules. See Details below.

weighted.apriori

Logical: if TRUE, class prior probabilities are computed using local weights (see Details below). If FALSE, equal priors are used for all classes that actually occur in the training data.

subset

An index vector specifying the cases to be used in the training sample.

na.action

A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit which leads to rejection of cases with missing values on any required variable.

...

Further arguments to be passed to loclda.default.
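
As a minimal sketch of how these arguments fit together, the following fits a model through the formula interface and predicts held-out rows. It assumes the klaR package is installed and uses the built-in iris data, which is not part of this page. The random split, k = 50 and the Gaussian-type weight function are illustrative choices only, not recommendations.

```r
library(klaR)

# Illustrative train/test split of the built-in iris data (an assumption,
# not taken from this help page).
set.seed(1)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# k, weight.func and weighted.apriori are passed on to loclda.default via '...'.
fit <- loclda(Species ~ ., data = train,
              k = 50,
              weight.func = function(x) exp(-x^2),  # finite on [0, 1]
              weighted.apriori = TRUE)

# The local decision rules are built here, one per test observation.
pred <- predict(fit, test)
table(predicted = pred$class, actual = test$Species)
```

Smaller values of k make the rule more local, while the default k = nrow(x) uses all training observations and localizes through the weights alone.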

Details

This function applies the concept of localization described by Tutz and Binder (2005) to Linear Discriminant Analysis. The function loclda generates an object of class loclda (see Value below). Because localization requires an individual decision rule for each test observation, the rule is constructed by predict.loclda rather than by loclda itself. For convenience, the rule-building procedure is nevertheless described here.

To classify a test observation x_s, only the k nearest neighbours of x_s within the training data are used. Each of these k training observations x_i, i=1,...,k, is assigned a weight w_i according to

w_i := K ( ||x_i - x_s|| / d_k ), i=1,...,k,

where K is the weighting function given by weight.func, ||x_i - x_s|| is the Euclidean distance between x_i and x_s, and d_k is the Euclidean distance from x_s to its k-th nearest neighbour. Using these weights, the weighted empirical mean mu_g_hat and the weighted empirical covariance matrix are computed for each class A_g, g=1,...,G. The estimated pooled (weighted) covariance matrix Sigma_hat is then calculated from the individual weighted empirical class covariance matrices. If weighted.apriori is TRUE (the default), prior class probabilities are estimated according to:

prior_g := [ Sum_{i=1,..,k} ( w_i * I(x_i in A_g) ) ] / [ Sum_{i=1,...,k} ( w_i ) ], g = 1,...,G,

where I is the indicator function. If weighted.apriori is FALSE, equal priors are used for all classes. In analogy to Linear Discriminant Analysis, the decision rule for x_s is

A_hat := argmax_{g in 1,...,G} (posterior_g),

where

posterior_g := prior_g * exp [ (-1/2) * t( x_s - mu_g_hat ) * Sigma_hat^(-1) * ( x_s - mu_g_hat ) ] .

If posterior_g < 1e-150 for all g in 1,...,G, posterior_g is set to 1/G for all g in 1,...,G, and the test observation x_s is simply assigned to the class whose weighted mean has the smallest Euclidean distance to x_s.
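
The procedure above can be sketched for a single test observation in base R. This is an illustration under stated assumptions, not the code used by predict.loclda. The function name local_lda_rule and its return structure are hypothetical. The pooled covariance is assumed to be the weighted within-class scatter divided by the total weight. Only classes that occur among the k neighbours are considered.

```r
# Hypothetical helper: one local LDA rule for a single test point x_s.
local_lda_rule <- function(x_s, X, grouping, k = nrow(X),
                           weight.func = function(x) 1 / exp(x),
                           weighted.apriori = TRUE) {
  X <- as.matrix(X)
  x_s <- as.numeric(x_s)

  # Euclidean distances to x_s, k nearest neighbours, and their weights
  # w_i = K(||x_i - x_s|| / d_k).
  d  <- sqrt(rowSums(sweep(X, 2, x_s)^2))
  nn <- order(d)[seq_len(k)]
  w  <- weight.func(d[nn] / d[nn[k]])
  Xk <- X[nn, , drop = FALSE]
  gk <- droplevels(as.factor(grouping)[nn])
  classes <- levels(gk)   # assumption: only classes present locally
  G <- length(classes)

  # Weighted class means and pooled weighted covariance
  # (normalization by sum(w) is an assumption).
  mu <- matrix(0, G, ncol(X), dimnames = list(classes, colnames(X)))
  S  <- matrix(0, ncol(X), ncol(X))
  for (g in classes) {
    idx <- gk == g
    mu[g, ] <- colSums(Xk[idx, , drop = FALSE] * w[idx]) / sum(w[idx])
    R <- sweep(Xk[idx, , drop = FALSE], 2, mu[g, ])
    S <- S + crossprod(R * sqrt(w[idx]))
  }
  Sigma <- S / sum(w)

  # Weighted or equal priors.
  prior <- if (weighted.apriori) {
    vapply(classes, function(g) sum(w[gk == g]), numeric(1)) / sum(w)
  } else {
    setNames(rep(1 / G, G), classes)
  }

  # posterior_g = prior_g * exp(-1/2 * squared Mahalanobis distance).
  maha <- apply(mu, 1, function(m) mahalanobis(x_s, m, Sigma))
  post <- prior * exp(-0.5 * maha)

  if (all(post < 1e-150)) {
    # Fallback: equal posteriors, assign to the nearest weighted mean.
    post[] <- 1 / G
    cls <- classes[which.min(sqrt(rowSums(sweep(mu, 2, x_s)^2)))]
  } else {
    cls <- classes[which.max(post)]
  }
  list(class = cls, posterior = post, prior = prior)
}

# Usage on the built-in iris data, holding out rows 1, 51 and 101.
train <- iris[-c(1, 51, 101), ]
res <- local_lda_rule(unlist(iris[51, 1:4]), train[, 1:4],
                      train$Species, k = 50)
res$class
```

The returned posterior values are unnormalized, as in the formula above. Dividing them by their sum would give probabilities that add up to one.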

Value

A list of class loclda containing the following components:

call

The (matched) function call.

learn

Matrix containing the values of the explanatory variables for all training observations.

grouping

Factor specifying the class for each training observation.

weight.func

Value of the argument weight.func.

k

Value of the argument k.

weighted.apriori

Value of the argument weighted.apriori.

Author(s)

Marc Zentgraf and Karsten Luebke

References

Tutz, G. and Binder, H. (2005): Localized classification. Statistics and Computing 15, 155-166.

See Also

predict.loclda, lda

Examples

benchB3("lda")$l1co.error
benchB3("loclda")$l1co.error

Example output

Loading required package: MASS

Error Rate in 1 th cycle:  0.667
Error Rate in 2 th cycle:  0.438
Error Rate in 3 th cycle:  0.294
Error Rate in 4 th cycle:  0.667
Error Rate in 5 th cycle:  0.344
Error Rate in 6 th cycle:  0.562
------------------------------------------
Mean Error Rate of method lda : 0.495 
[1] 0.4952002

Error Rate in 1 th cycle:  0.667
Error Rate in 2 th cycle:  0.438
Error Rate in 3 th cycle:  0.118
Error Rate in 4 th cycle:  0.583
Error Rate in 5 th cycle:  0.281
Error Rate in 6 th cycle:  0.542
------------------------------------------
Mean Error Rate of method loclda : 0.438 
[1] 0.4380106

klaR documentation built on March 19, 2018, 5:03 p.m.