dalm: DA using Linear Regression on the Y-Dummy table

View source: R/dalm.R


DA using Linear Regression on the Y-Dummy table

Description

DA-LM

1- The class membership y (a univariate variable) for the reference (= training) observations is first transformed (with function dummy) to a table Ydummy containing nclas dummy variables, where nclas is the number of classes in y.

2- Then, a linear regression model is fitted between the X-data and each of the dummy variables (i.e. columns of the dummy table Ydummy).

3- For a given new observation, the final prediction (a class) corresponds to the dummy variable for which the prediction is the highest.
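Outside the package, these three steps can be sketched in a few lines of base R. This is a minimal illustration, not the package implementation; the helper name dalm_sketch and its internals are hypothetical.

```r
## Minimal base-R sketch of the three steps above (hypothetical helper,
## not the package implementation).
dalm_sketch <- function(Xr, yr, Xu) {
  yr <- as.factor(yr)
  ## Step 1: build the Y-dummy table (one 0/1 column per class)
  Ydummy <- model.matrix(~ yr - 1)
  colnames(Ydummy) <- levels(yr)
  ## Step 2: least-squares regression of each dummy column on X
  Z <- cbind(1, as.matrix(Xr))                   # add an intercept column
  B <- solve(crossprod(Z), crossprod(Z, Ydummy))
  ## Step 3: predict the dummies for Xu and return, for each row,
  ## the class whose predicted dummy value is the highest
  fit <- cbind(1, as.matrix(Xu)) %*% B
  levels(yr)[max.col(fit)]
}
```

For example, dalm_sketch(iris[, 1:4], iris$Species, iris[, 1:4]) returns a vector of predicted species for the iris data.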

When the number of classes is higher than two, this method can be affected by a masking effect (see e.g. Hastie et al. 2009, Section 4.2): some class(es) can be masked (and therefore not well predicted) if more than two classes are aligned in the X-space. Caution should therefore be taken regarding such possible masking effects.
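The masking effect can be reproduced on simulated data. The following hypothetical one-dimensional illustration (base R, not package code) uses three classes whose means are aligned: the fitted dummy line of the middle class is nearly flat and is almost never the maximum.

```r
## Hypothetical 1-D illustration of the masking effect (base R).
set.seed(1)
x <- c(rnorm(50, 0), rnorm(50, 3), rnorm(50, 6))  # three aligned classes
y <- factor(rep(c("A", "B", "C"), each = 50))
Y <- model.matrix(~ y - 1)                        # Y-dummy table
Z <- cbind(1, x)                                  # intercept + predictor
B <- solve(crossprod(Z), crossprod(Z, Y))         # regression on the dummies
pred <- levels(y)[max.col(Z %*% B)]
table(pred)  # the middle class "B" is almost never the argmax
```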

Row observations can optionally be weighted with a priori weights (using argument weights).

When argument weights = NULL (default), dalm is strictly equivalent to daglm(..., family = gaussian) but faster.

Usage

dalm(Xr, Yr, Xu, Yu = NULL, weights = NULL)

Arguments

Xr

A n x p matrix or data frame of reference (= training) observations.

Yr

A vector of length n, or a n x 1 matrix, of reference (= training) responses (class membership).

Xu

A m x p matrix or data frame of new (= test) observations to be predicted.

Yu

A vector of length m, or a m x 1 matrix, of the true responses (class membership). Defaults to NULL.

weights

A vector of length n defining a priori weights to apply to the training observations. Internally, weights are "normalized" to sum to 1. Defaults to NULL (weights are set to 1 / n).

Value

A list of outputs, such as:

y

Responses for the test data.

fit

Predictions for the test data.

r

Residuals for the test data.

References

Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer.

Examples


data(iris)

X <- iris[, 1:4]
y <- iris[, 5]
N <- nrow(X)

## Split the data into reference (= training) and test sets
m <- round(.25 * N)
n <- N - m
s <- sample(1:N, m)
Xr <- X[-s, ]
yr <- y[-s]
Xu <- X[s, ]
yu <- y[s]

fm <- dalm(Xr, yr, Xu, yu)
names(fm)
headm(fm$y)    ## observed classes for the test data
headm(fm$fit)  ## predicted classes for the test data
headm(fm$r)    ## residuals for the test data
headm(fm$dummyfit)
fm$ni

err(fm)        ## prediction error rate


mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.