daglm    R Documentation
DA-GLM
Description

The function fits a discriminant analysis based on generalized linear regression models (DA-GLM), in three steps (a minimal base-R sketch of these steps is given after the list):

1- The class membership y (a unidimensional variable) for the reference (= training) observations is first transformed (with function dummy) into a table Ydummy containing nclas dummy variables, where nclas is the number of classes in y.

2- A generalized linear regression model (GLIM, fitted with function glm) is then built between the X-data and each of the dummy variables (i.e. each column of the table Ydummy).

3- For a given new observation, the final prediction (a class) corresponds to the dummy variable for which the prediction is the highest.
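Purely as an illustration of these three steps (not the package's internal code), here is a minimal base-R sketch; daglm_sketch is a hypothetical helper name, and model.matrix stands in for the package's dummy function:

## Minimal base-R sketch of the three steps above (illustration only;
## the package's dummy() and fitting code may differ in details)
daglm_sketch <- function(Xr, yr, Xu, family = binomial(link = "logit")) {
  yr <- factor(yr)
  Ydummy <- model.matrix(~ yr - 1)      # step 1: one 0/1 column per class
  Xr <- as.data.frame(Xr)
  Xu <- as.data.frame(Xu)
  scores <- sapply(seq_len(ncol(Ydummy)), function(k) {
    fm <- glm(Ydummy[, k] ~ ., data = Xr, family = family)   # step 2: one GLIM per dummy
    predict(fm, newdata = Xu, type = "response")
  })
  levels(yr)[apply(scores, 1, which.max)]   # step 3: class with highest prediction
}

On the iris split used in the Examples below, daglm_sketch(Xr, yr, Xu) would return a vector of predicted species for the test rows.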
When the number of classes is higher than two, this method can be affected by a masking effect (see e.g. Hastie et al. 2009, Section 4.2): one or more classes can be masked (and therefore poorly predicted) if more than two classes are aligned in the X-space. Caution should therefore be taken regarding such potential masking effects, as illustrated by the sketch below.
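The following sketch (an illustration with simulated data, not taken from the package) constructs three classes aligned along a single direction in X and fits one Gaussian-family GLIM per dummy variable; the middle class is typically (almost) never the one with the highest prediction, which is the masking effect described above:

## Illustration of the masking effect with three aligned classes
## (simulated data; not part of the package examples)
set.seed(1)
n <- 50
x <- data.frame(x1 = c(rnorm(n, -3), rnorm(n, 0), rnorm(n, 3)))
y <- factor(rep(c("A", "B", "C"), each = n))
Ydummy <- model.matrix(~ y - 1)
scores <- sapply(1:3, function(k)
  fitted(glm(Ydummy[, k] ~ x1, data = x, family = gaussian)))
pred <- levels(y)[apply(scores, 1, which.max)]
table(y, pred)   # the middle class "B" is expected to be rarely, if ever, predicted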
Usage

daglm(Xr, Yr, Xu, Yu = NULL, family = binomial(link = "logit"), weights = NULL)
Arguments

Xr
A matrix or data frame of the reference (= training) X-data.

Yr
A vector of the class membership y for the reference (= training) observations (same length as the number of rows of Xr).

Xu
A matrix or data frame of the new (= test) X-data to be predicted.

Yu
A vector of the class membership for the new (= test) observations (same length as the number of rows of Xu). Default to NULL.

family
Specify the GLIM model used by function glm. Default to binomial(link = "logit").

weights
A vector of a priori weights for the reference (= training) observations (same length as the number of rows of Xr). Default to NULL.
Value

A list of outputs, such as:

y
Responses for the test data.

fit
Predictions for the test data.

r
Residuals for the test data.
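As a sketch of how these outputs might be inspected (fm being an object returned by daglm, as in the Examples below; the assumption that fm$y and fm$fit store the observed and predicted class in their last column may not hold in every version of the package):

## Hypothetical post-processing of a daglm output object fm
## (the exact column layout of fm$y and fm$fit is an assumption here)
obs  <- fm$y[, ncol(fm$y)]
pred <- fm$fit[, ncol(fm$fit)]
table(obs, pred)    # confusion matrix for the test data
mean(obs != pred)   # overall error rate, as summarized by err(fm)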
Examples

data(iris)
X <- iris[, 1:4]
y <- iris[, 5]
N <- nrow(X)
m <- round(.25 * N) # Test
n <- N - m # Training
s <- sample(1:N, m)
Xr <- X[-s, ]
yr <- y[-s]
Xu <- X[s, ]
yu <- y[s]
## Binomial model with logit link (logistic regression)
fm <- daglm(Xr, yr, Xu, yu)
names(fm)
headm(fm$y)
headm(fm$fit)
headm(fm$r)
fm$ni
err(fm)
## Gaussian model with identity link (= usual linear model)
fm <- daglm(Xr, yr, Xu, yu, family = gaussian)
err(fm)
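## The weights argument can also be supplied; the sketch below assumes it
## expects one a priori weight per training observation (uniform weights here,
## which should be equivalent to the default behaviour)
w <- rep(1, nrow(Xr))
fm <- daglm(Xr, yr, Xu, yu, weights = w)
err(fm)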