wqda: Weighted Quadratic Discriminant Analysis


View source: R/wqda.R

Description

A version of Quadratic Discriminant Analysis that supports observation weights.

Usage

wqda(x, ...)

## S3 method for class 'formula'
wqda(formula, data,
    weights = rep(1, nrow(data)), ..., subset, na.action)

## S3 method for class 'data.frame'
wqda(x, ...)

## S3 method for class 'matrix'
wqda(x, grouping,
    weights = rep(1, nrow(x)), ..., subset,
    na.action = na.fail)

## Default S3 method:
wqda(x, grouping,
    weights = rep(1, nrow(x)),
    method = c("unbiased", "ML"), ...)

Arguments

formula

A formula of the form groups ~ x1 + x2 + ..., that is, the response is the grouping factor and the right-hand side specifies the (non-factor) discriminators.

data

A data.frame from which variables specified in formula are to be taken.

x

(Required if no formula is given as principal argument.) A matrix, data.frame, or Matrix containing the explanatory variables.

grouping

(Required if no formula is given as principal argument.) A factor specifying the class membership for each observation.

weights

Observation weights to be used in the fitting process. They must be greater than or equal to zero.

method

Method for scaling the weighted class covariance matrices, either "unbiased" or maximum-likelihood ("ML"). Defaults to "unbiased".

...

Further arguments.

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function specifying the action to be taken if NAs are found. By default, the na.action setting of options is used, falling back to na.fail if that is unset. An alternative is na.omit, which drops every case with a missing value in any required variable. (NOTE: If given, this argument must be named.)
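The difference between the two na.action choices can be seen with base R alone. This is a minimal sketch on a made-up data frame (the names d, d_omit and res are illustrative, not part of locClass); it only shows what na.omit and na.fail do to the data, not how wqda calls them internally.

```r
# Toy data (illustrative only): one missing value in predictor x1.
d <- data.frame(
  g  = factor(c("a", "b", "a", "b")),
  x1 = c(1, NA, 3, 4)
)

# na.omit drops every row that contains a missing value.
d_omit <- na.omit(d)
nrow(d_omit)

# na.fail stops with an error as soon as a missing value is present.
res <- tryCatch(na.fail(d), error = function(e) "error")
res
```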

Details

The formulas for the weighted estimates of the class means, the covariance matrices and the class priors are as follows:

Normalized weights (for an observation x_n in class g, i.e. y_n = g):

w_n* = w_n/sum_{n:y_n=g} w_n

Weighted class means:

bar x_g = sum_{n:y_n=g} w_n* x_n

Weighted class covariance matrices:

method = "ML":

S_g = sum_{n:y_n=g} w_n* (x_n - bar x_g)(x_n - bar x_g)'

method = "unbiased":

S_g = sum_{n:y_n=g} w_n* (x_n - bar x_g)(x_n - bar x_g)'/(1 - sum_{n:y_n=g} w_n*^2)

Weighted prior probabilities:

p_g = sum_{n:y_n=g} w_n/sum_n w_n
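The estimates above can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: the helper weighted_class_estimates is hypothetical, and the iris data and random weights are used only for illustration. With equal weights the "unbiased" scaling reduces to the usual sample covariance, because 1 - sum(w_n*^2) = (n_g - 1)/n_g.

```r
# Hypothetical helper (not part of locClass): weighted mean and covariance
# for the observations of a single class.
weighted_class_estimates <- function(x, w, method = c("unbiased", "ML")) {
  method <- match.arg(method)
  x <- as.matrix(x)
  w_star <- w / sum(w)                    # normalized weights w_n*
  center <- colSums(w_star * x)           # weighted class mean bar x_g
  xc <- sweep(x, 2, center)               # centered observations
  S <- crossprod(sqrt(w_star) * xc)       # ML estimate of S_g
  if (method == "unbiased") S <- S / (1 - sum(w_star^2))
  list(mean = center, cov = S)
}

set.seed(1)                               # arbitrary seed
x <- as.matrix(iris[, 1:4])
y <- iris$Species
w <- runif(nrow(x))                       # illustrative non-negative weights

# One set of estimates per class, plus the weighted priors p_g.
est <- lapply(split(seq_len(nrow(x)), y), function(idx)
  weighted_class_estimates(x[idx, , drop = FALSE], w[idx]))
prior <- tapply(w, y, sum) / sum(w)
```

For a single class, the result can be compared with stats::cov.wt, which applies the same two scalings.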

If the predictor variables include factors, the formula interface must be used in order to get a correct model matrix.
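Why factors need the formula interface can be illustrated with base R. This sketch uses made-up data, and the coercion via data.matrix is an assumption chosen to show one way a factor can be mishandled; it is not a statement about how wqda coerces its input.

```r
# Toy data (illustrative only): one numeric and one factor predictor.
d <- data.frame(
  x1 = c(1.2, 0.7, 3.1, 2.4),
  f  = factor(c("a", "b", "c", "a"))
)

# Plain coercion turns the factor into its integer codes, which imposes
# an arbitrary ordering and spacing on the levels.
m_codes <- data.matrix(d)

# model.matrix(), as used by formula interfaces, expands the factor into
# dummy columns instead.
m_mm <- model.matrix(~ ., data = d)
colnames(m_mm)
```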

Value

An object of class "wqda", a list containing the following components:

prior

Weighted class prior probabilities.

counts

The number of observations per class.

means

Weighted estimates of class means.

covs

Weighted estimates of the class covariance matrices.

lev

The class labels (levels of grouping).

N

The number of observations.

weights

The observation weights used in the fitting process.

method

The method used for scaling the weighted class covariance matrices.

call

The (matched) function call.

See Also

predict.wqda, wlda.

Examples

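library(locClass)
library(mlbench)
data(PimaIndiansDiabetes)

# draw a random training sample of 500 observations
set.seed(123)  # arbitrary seed, for reproducibility only
train <- sample(nrow(PimaIndiansDiabetes), 500)

# weight each observation by the inverse frequency of its class
# (pos or neg) in the data set:
ws <- as.numeric(1/table(PimaIndiansDiabetes$diabetes)
    [PimaIndiansDiabetes$diabetes])

# fit on the training sample, then predict the held-out observations
fit <- wqda(diabetes ~ ., data = PimaIndiansDiabetes, weights = ws,
    subset = train)
pred <- predict(fit, newdata = PimaIndiansDiabetes[-train,])

# misclassification rate on the held-out observations
mean(pred$class != PimaIndiansDiabetes$diabetes[-train])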
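The effect of the inverse-frequency weights in the example can be checked on a small made-up factor using base R only (the names y, ws, class_sums and prior below are illustrative). Each class receives a total weight of 1, so the weighted priors p_g computed on the full data become equal. With subset = train the priors are computed on the training sample and will generally be close to, but not exactly, equal.

```r
# Toy class labels (illustrative only): three "neg" and one "pos".
y <- factor(c("neg", "neg", "neg", "pos"))

# Inverse class frequency as observation weight, looked up by level name.
ws <- as.numeric(1 / table(y)[as.character(y)])

# Total weight per class and the resulting weighted priors p_g.
class_sums <- tapply(ws, y, sum)
prior <- class_sums / sum(ws)
prior
```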

locClass documentation built on May 2, 2019, 5:21 p.m.
