majorityGenerative: Generative Majority Classifier
In schiffner/locClass: Collection of Local Classification Methods


Computes a classifier that always predicts the class with the highest weighted prior probability and also fits a multivariate normal distribution to the data.

majorityGenerative(x, ...)

## S3 method for class 'formula'
majorityGenerative(formula, data, weights = rep(1,
  nrow(data)), ..., subset, na.action)

## S3 method for class 'data.frame'
majorityGenerative(x, ...)

## S3 method for class 'matrix'
majorityGenerative(x, grouping, weights = rep(1, nrow(x)),
  ..., subset, na.action = na.fail)

## Default S3 method:
majorityGenerative(x, grouping, weights = rep(1, nrow(x)),
  method = c("unbiased", "ML"), ...)

`x`	(Required if no `formula` is given as principal argument.) A `matrix` or `data.frame` or `Matrix` containing the explanatory variables.
`...`	Further arguments.
`formula`	A `formula` of the form `groups ~ x1 + x2 + ...`, that is, the response is the grouping `factor` and the right-hand side specifies the discriminators.
`data`	A `data.frame` from which variables specified in `formula` are to be taken.
`weights`	Observation weights to be used in the fitting process, must be non-negative.
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	A function specifying the action to be taken if `NA`s are found. By default, the `na.action` setting of `options` is used, falling back to `na.fail` if that is unset. An alternative is `na.omit`, which drops all cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
`grouping`	(Required if no `formula` is given as principal argument.) A `factor` specifying the class membership for each observation.
`method`	Method for scaling the weighted covariance matrix, either `"unbiased"` or maximum-likelihood (`"ML"`). Defaults to `"unbiased"`.
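
The two `na.action` choices named above behave as follows in base R. This is a minimal sketch on hypothetical toy data (`d`, `kept` and `failed` are illustrative names); it only shows what `na.omit` and `na.fail` do to a data frame and does not call `majorityGenerative` itself.

```r
# Hypothetical toy data with one missing predictor value.
d <- data.frame(g = factor(c("a", "b", "a")), x1 = c(1.2, NA, 0.7))

# na.omit drops every row that has a missing value in any column.
kept <- na.omit(d)
nrow(kept)

# na.fail refuses to proceed and signals an error instead.
failed <- inherits(try(na.fail(d), silent = TRUE), "try-error")
failed
```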

This is a helper function to integrate the majority classifier into mixture models. The formulas for the weighted estimates of the mean, the covariance matrix and the class priors are as follows:

Normalized weights:

w_n* = w_n/(sum_n w_n)

Weighted means:

bar x = sum_n w_n* x_n

Weighted covariance matrix:

method = "ML":

S = sum_n w_n* (x_n - bar x)(x_n - bar x)'

method = "unbiased":

S = (sum_n w_n* (x_n - bar x)(x_n - bar x)')/(1 - sum_n (w_n*)^2)

Weighted prior probabilities:

p_g = (sum_{n: y_n = g} w_n)/(sum_n w_n)
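
The formulas above can be reproduced in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: the helper `weighted_estimates` and the toy data are hypothetical, and `stats::cov.wt` serves only as a cross-check, since it applies the same scaling for its `"unbiased"` and `"ML"` methods.

```r
# Hypothetical helper (not part of locClass): weighted estimates
# following the formulas given above.
weighted_estimates <- function(x, grouping, weights = rep(1, nrow(x)),
                               method = c("unbiased", "ML")) {
  method <- match.arg(method)
  x <- as.matrix(x)
  w <- weights / sum(weights)               # normalized weights w_n*
  xbar <- colSums(w * x)                    # weighted mean
  xc <- sweep(x, 2, xbar)                   # centered observations
  S <- crossprod(sqrt(w) * xc)              # ML estimate of the covariance
  if (method == "unbiased") {
    S <- S / (1 - sum(w^2))                 # rescale for the unbiased estimate
  }
  # weighted class priors p_g: sum of normalized weights per class
  prior <- vapply(split(w, grouping), sum, numeric(1))
  list(prior = prior, mean = xbar, cov = S)
}

# Toy data (assumed for illustration): 15 cases in class "a", 5 in class "b".
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)
g <- factor(rep(c("a", "b"), times = c(15, 5)))
w <- as.numeric(1 / table(g)[g])            # inverse class-frequency weights

est <- weighted_estimates(x, g, w, method = "unbiased")
est_ml <- weighted_estimates(x, g, w, method = "ML")

# Cross-check against stats::cov.wt, which uses the same scaling.
ref <- cov.wt(x, wt = w / sum(w), method = "unbiased")
ref_ml <- cov.wt(x, wt = w / sum(w), method = "ML")
all.equal(est$cov, ref$cov, check.attributes = FALSE)
all.equal(est_ml$cov, ref_ml$cov, check.attributes = FALSE)

est$prior
```

With inverse class-frequency weights, as in this sketch, the weights of each class sum to the same total, so the weighted priors come out equal even though the class sizes differ.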

If the predictor variables include factors, the formula interface must be used in order to get a correct model matrix.
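
To see why, consider how a factor predictor is expanded. The following is a minimal sketch on hypothetical toy data, using only base R: `model.matrix` turns the factor into indicator columns, whereas `data.matrix` replaces its levels by integer codes. That the non-formula methods would treat a factor like the latter is an assumption made for illustration, not something checked against the package source.

```r
# Hypothetical toy data with one numeric and one factor predictor.
d <- data.frame(y = factor(c("neg", "pos", "neg", "pos")),
                age = c(31, 45, 27, 52),
                site = factor(c("A", "B", "C", "A")))

# Formula-style expansion: the factor becomes indicator (dummy) columns.
mm <- model.matrix(y ~ ., data = d)
colnames(mm)

# Naive conversion: factor levels are replaced by their integer codes,
# which imposes an arbitrary numeric ordering on the categories.
naive <- data.matrix(d[, c("age", "site")])
naive[, "site"]
```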

An object of class "majorityGenerative", a list containing the following components:

`prior`	Weighted class prior probabilities.
`counts`	The number of observations per class.
`mean`	Weighted estimate of the mean.
`cov`	Weighted estimate of the covariance matrix.
`lev`	The class labels (levels of `grouping`).
`N`	The number of observations.
`weights`	The observation weights used in the fitting process.
`method`	The method used for scaling the weighted covariance matrix estimate.
`call`	The (matched) function call.

Other majority: majority, predict.majorityGenerative, predict.majority

library(locClass)
library(mlbench)
data(PimaIndiansDiabetes)

train <- sample(nrow(PimaIndiansDiabetes), 500)

# weight observations from classes pos and neg inversely to their class
# frequency in the data set:
ws <- as.numeric(1/table(PimaIndiansDiabetes$diabetes)
    [PimaIndiansDiabetes$diabetes])

fit <- majorityGenerative(diabetes ~ ., data = PimaIndiansDiabetes, weights = ws,
    subset = train)
pred <- predict(fit, newdata = PimaIndiansDiabetes[-train,])
# misclassification rate on the held-out observations
mean(pred$class != PimaIndiansDiabetes$diabetes[-train])