daprob: Probabilistic DA (LDA and QDA)

View source: R/daprob.R

daprobR Documentation

Probabilistic DA (LDA and QDA)

Description

For each observation to predict, the function calculates the posterior probability that the observation belongs to a given class, using the Bayes' formula. For each of the classes, the posterior probability is computed from given priors (proportional or uniform) for the class membership and an estimate (parametric or not) of the probability density at the point of the observation conditionnally to the class. The final predicted class corresponds to the class with the highest posterior probability.

Usage


daprob(Xr, Yr, Xu, Yu = NULL, dens = dmnorm, 
  lda = TRUE, prior = c("uniform", "proportional"), ...)

Arguments

Xr

A n x p matrix or data frame of reference (= training) observations.

Yr

A vector of length n, or a n x 1 matrix, of reference (= training) responses (class membership).

Xu

A m x p matrix or data frame of new (= test) observations to be predicted.

Yu

A vector of length m, or a m x 1 matrix, of the true response (class membership). Default to NULL.

dens

A function returning the probability density of the observation conditionnally to the class. Default to dmnorm.

lda

Logical, only considered if dens = dmnorm. If TRUE (default), a gaussian LDA is implemented, otherwise a gaussian QDA is implemented.

prior

The prior probabilities of class membership. Possible values are "uniform" (default; probabilities are set equal for all the classes), "proportional" (probabilities are set equal to the observed proportions of the classes in Yr), or a vector of values defining the probabilities for each class.

...

Optionnal arguments to pass in function defined in dens.

Value

A list of outputs, such as:

y

Responses for the test data.

fit

Predictions for the test data.

r

Residuals for the test data.

References

Saporta, G., 2011. Probabilités analyse des données et statistique. Editions Technip, Paris, France.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Examples


data(iris)

X <- iris[, 1:4]
y <- iris[, 5]
N <- nrow(X)

m <- round(.25 * N) # Test
n <- N - m          # Training
s <- sample(1:N, m)
Xr <- X[-s, ]
yr <- y[-s]
Xu <- X[s, ]
yu <- y[s]

##### LDA (homogeneous covariances)

fm <- daprob(Xr, yr, Xu, yu, dens = dmnorm)
names(fm)
headm(fm$y)
headm(fm$fit)
headm(fm$r)
fm$ni

err(fm)

##### QDA (heterogeneous covariances)

fm <- daprob(Xr, yr, Xu, yu, dens = dmnorm, lda = FALSE)
err(fm)

##### Nonparametric DA

fm <- daprob(Xr, yr, Xu, yu, dens = dkerngauss, h = .2)
err(fm)


mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.