dalr: Discriminant Adaptive Logistic Regression
In locClass: Collection of Local Classification Methods


A local version of logistic regression for classification that puts increased emphasis on a good model fit near the decision boundary.

dalr(X, ...)

## S3 method for class 'formula'
dalr(formula, data, weights, ...,
    subset, na.action)

## S3 method for class 'data.frame'
dalr(X, ...)

## S3 method for class 'matrix'
dalr(X, Y, weights = rep(1, nrow(X)),
    intercept = TRUE, ..., subset, na.action)

## Default S3 method:
dalr(X, Y, thr = 0.5,
    wf = c("biweight", "cauchy", "cosine", "epanechnikov", "exponential",
        "gaussian", "optcosine", "rectangular", "triangular"),
    bw, k, nn.only = TRUE, itr = 3, intercept = TRUE,
    weights = rep(1, nrow(X)), ...)

`formula`	A formula of the form `groups ~ x1 + x2 + ...`, that is, the response is the grouping `factor` and the right hand side specifies the (non-`factor`) discriminators. Details concerning model specification are given in the documentation of `glm`.
`data`	A `data.frame` from which variables specified in `formula` are to be taken.
`X`	(Required if no `formula` is given as principal argument.) A `matrix` or `data.frame` or `Matrix` containing the explanatory variables.
`Y`	(Required if no `formula` is given as principal argument.) A `factor` specifying the class membership for each observation.
`thr`	The threshold value used to predict class membership, defaults to 0.5. See Details.
`wf`	A window function which is used to calculate weights that are introduced into the fitting process. Either a character string or a function, e.g. `wf = function(x) exp(-x)`. For details see the documentation for `wfs`.
`bw`	(Required only if `wf` is a string.) The bandwidth parameter of the window function. (See `wfs`.)
`k`	(Required only if `wf` is a string.) The number of nearest neighbors of the decision boundary to be used in the fitting process. (See `wfs`.)
`nn.only`	(Required only if `wf` is a string indicating a window function with infinite support and if `k` is specified.) Should only the `k` nearest neighbors or all observations receive positive weights? (See `wfs`.)
`itr`	Number of iterations for model fitting, defaults to 3. See also the Details section.
`intercept`	Logical. Should the model contain an intercept? Passed to `glm.fit`, where it determines whether the null model includes an intercept.
`weights`	Initial observation weights (defaults to a vector of 1s).
`...`	Further arguments to `glm`. Currently `offset`, `control`, `model`, `x`, `y`, `contrasts`, `start`, `etastart` and `mustart` are supported. The `family` is always `binomial`. Note that some of these arguments only make sense when using the formula method.
`subset`	An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
`na.action`	The default is, first, any `na.action` attribute of `data`; second, the `na.action` setting of `options`; and third, `na.fail` if that is unset. The default is, first, the `na.action` setting of `options`, and second, `na.fail` if that is unset.

Local logistic regression (Hand and Vinciotti, 2003) is a modification of the standard logistic regression approach to discrimination. For discrimination, a good fit of the model is required especially near the true decision boundary. Therefore, weights that reflect the proximity of training points to the decision boundary are introduced into the fitting process. Let the class levels be 0 and 1. The distance of a training observation x to the decision boundary is measured by means of the difference P(1 | x) - thr, where thr is a threshold in [0,1]. Since P(1 | x) is not known in advance, an iterative procedure is required. We start by fitting an unweighted logistic regression model to the data in order to obtain initial estimates of P(1 | x). These estimates are used to calculate the observation weights. Model fitting and calculation of weights are then repeated in turn. By default, the number of iterations is limited to 3.
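The iterative scheme described above can be sketched in a few lines of base R. This is only an illustration, not the implementation used by dalr: the helper name `local_logit_sketch`, the use of the absolute difference |P(1 | x) - thr| as distance, and the Gaussian-shaped window with bandwidth `bw` are assumptions made for this example.

```r
# Hypothetical helper (not part of locClass): alternate between fitting a
# weighted logistic regression and recomputing weights from the fitted
# probabilities' distance to the threshold.
local_logit_sketch <- function(X, y, thr = 0.5, bw = 0.2, itr = 3) {
  X <- cbind(Intercept = 1, as.matrix(X))   # explicit intercept column
  w <- rep(1, nrow(X))                      # start with an unweighted fit
  fit <- suppressWarnings(glm.fit(X, y, weights = w, family = binomial()))
  for (i in seq_len(itr)) {
    p <- fit$fitted.values                  # current estimates of P(1 | x)
    d <- abs(p - thr)                       # assumed distance to the boundary
    w <- exp(-0.5 * (d / bw)^2)             # assumed Gaussian-shaped window
    fit <- suppressWarnings(glm.fit(X, y, weights = w, family = binomial()))
  }
  list(coefficients = fit$coefficients, weights = w,
       fitted.values = fit$fitted.values)
}

# Toy data in the style of Hand and Vinciotti (2003)
set.seed(1)
x1 <- x2 <- seq(0.1, 1, 0.05)
grid <- expand.grid(x1 = x1, x2 = x2)
y <- rbinom(nrow(grid), size = 1, prob = grid$x2 / (grid$x1 + grid$x2))

res <- local_logit_sketch(grid, y, thr = 0.3)
res$coefficients
summary(res$weights)   # observations near the 0.3 contour get weights near 1
```

The warnings are suppressed here because non-integer weights in a binomial fit are expected in this setting.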

The name of the window function (wf) can be specified as a character string. In this case the window function is generated internally in dalr. Currently supported are "biweight", "cauchy", "cosine", "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular" and "triangular".

Moreover, it is possible to generate the window functions mentioned above in advance (see wfs) and pass them to dalr.

Any other function implementing a window function can also be used as the wf argument. This allows users to try their own window functions. See the help on wfs for details.

Internally, glm.fit with family = binomial() is used and the weights produced using wf are passed to glm.fit via its weights argument.

If the predictor variables include factors, the formula interface must be used in order to get a correct model matrix.
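The reason can be illustrated with base R alone. In the sketch below (the data frame `df` is made up for the example), `model.matrix` expands a factor into dummy columns as a formula interface would, whereas a plain numeric conversion such as `data.matrix` replaces the factor by its integer codes. That a non-formula interface behaves like `data.matrix` is an assumption used only for illustration.

```r
# Made-up example data with one numeric and one factor predictor
df <- data.frame(x1 = c(0.1, 0.5, 0.9, 0.3),
                 g  = factor(c("a", "b", "c", "a")))

# Formula route: the factor is coded by contrasts (dummy columns gb, gc)
mm <- model.matrix(~ x1 + g, data = df)
colnames(mm)

# Plain numeric conversion: the factor collapses to integer codes 1, 2, 3,
# which would wrongly be treated as a single numeric predictor
dm <- data.matrix(df)
dm[, "g"]
```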

Warnings about non-integer #successes in a binomial glm are expected.
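These warnings come from the binomial family, which checks whether weight times response is an integer. A minimal sketch with simulated data (all object names here are made up for the example) shows how fractional weights passed to `glm.fit` trigger the warning, and how it can be collected instead of printed:

```r
set.seed(42)
X <- cbind(Intercept = 1, x = rnorm(50))
y <- rbinom(50, size = 1, prob = 0.5)
w <- runif(50)                      # fractional observation weights

warn <- NULL
fit <- withCallingHandlers(
  glm.fit(X, y, weights = w, family = binomial()),
  warning = function(cond) {
    warn <<- c(warn, conditionMessage(cond))  # record the warning text
    invokeRestart("muffleWarning")            # and continue fitting
  }
)
print(warn)
```

The fit itself is unaffected; the warning only signals that the weighted responses are not counts.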

An object of class "dalr" inheriting from class "glm", a list containing at least the following components:

Values of glm:

`coefficients`	A named vector of coefficients.
`residuals`	The working residuals, that is the residuals in the final iteration of the IWLS fit. Since cases with zero weights are omitted, their working residuals are `NA`.
`fitted.values`	The fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.
`rank`	The numeric rank of the fitted linear model.
`family`	The `family` object used.
`linear.predictors`	The linear fit on link scale.
`deviance`	Up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.
`aic`	A version of Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of parameters, computed by the `aic` component of the family. For binomial and Poisson families the dispersion is fixed at one and the number of parameters is the number of coefficients. For gaussian, Gamma and inverse gaussian families the dispersion is estimated from the residual deviance, and the number of parameters is the number of coefficients plus one. For a gaussian family the MLE of the dispersion is used so this is a valid value of AIC, but for Gamma and inverse gaussian families it is not. For families fitted by quasi-likelihood the value is `NA`.
`null.deviance`	The deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model. Note that this will be incorrect if the link function depends on the data other than through the fitted mean: specify a zero offset to force a correct calculation.
`iter`	The number of iterations of IWLS used.
`weights`	A list of length `itr + 1`. The working weights, that is the observation weights in the final iteration of the IWLS fit.
`prior.weights`	A list of length `itr + 1`. The observation weights initially supplied, the first list element is a vector of 1s if none were.
`df.residual`	The residual degrees of freedom.
`df.null`	The residual degrees of freedom for the null model.
`y`	If requested (the default) the y vector used. (It is a vector even for a binomial model.)
`x`	If requested, the model matrix.
`model`	If requested (the default), the model frame.
`converged`	Logical. Was the IWLS algorithm judged to have converged?
`boundary`	Logical. Is the fitted value on the boundary of the attainable values?
`call`	The (matched) function call.
`formula`	The formula supplied.
`terms`	The `terms` object used.
`data`	The data argument.
`offset`	The offset vector used.
`control`	The value of the control argument used.
`method`	The name of the fitter function used, currently always `"glm.fit"`.
`contrasts`	(Where relevant) the contrasts used.
`xlevels`	(Where relevant) a record of the levels of the factors used in fitting.
`na.action`	(Where relevant) information returned by `model.frame` on the special handling of NAs.

Additionally, dalr returns

`lev`	The class labels (the levels of `grouping`).
`thr`	The threshold used.
`itr`	The number of iterations used.
`wf`	The window function used. Always a function, even if the input was a string.
`bw`	(Only if `wf` is a string or was generated by means of one of the functions documented in `wfs`.) The bandwidth used, `NULL` if `bw` was not specified.
`k`	(Only if `wf` is a string or was generated by means of one of the functions documented in `wfs`.) The number of nearest neighbors used, `NULL` if `k` was not specified.
`nn.only`	(Logical. Only if `wf` is a string or was generated by means of one of the functions documented in `wfs` and if `k` was specified.) `TRUE` if only the `k` nearest neighbors receive a positive weight, `FALSE` otherwise.
`adaptive`	(Logical.) `TRUE` if the bandwidth of `wf` is adaptive to the local density of data points, `FALSE` if the bandwidth is fixed.

Hand, D. J., Vinciotti, V. (2003), Local versus global models for classification problems: Fitting models where it matters, The American Statistician, 57(2), 124–130.

predict.dalr, glm, predict.glm.

# generate toy data set of Hand and Vinciotti (2003):
x1 <- x2 <- seq(0.1,1,0.05)
train <- expand.grid(x1 = x1, x2 = x2)
posterior <- train$x2/(train$x1 + train$x2)
y <- as.factor(sapply(posterior, function(x) sample(0:1, size = 1,
    prob = c(1-x,x))))
train <- data.frame(train, y = y)

par(mfrow = c(1,3))

# contours of true class posterior probabilities:
plot(train$x1, train$x2, col = y, pch = 19, main = "true posteriors")
contour(x1, x2, matrix(posterior, length(x1)), add = TRUE)

# 0.3-contour line fit of logistic regression:
glob.fit <- glm(y ~ ., data = train, family = "binomial")
plot(train$x1, train$x2, col = y, pch = 19, main = "global fit")
contour(x1, x2, matrix(glob.fit$fitted.values, length(x1)),
    levels = 0.3, add = TRUE)

# 0.3-contour line fit of local logistic regression:
loc.fit <- dalr(y ~ ., data = train, thr = 0.3, wf = "gaussian", bw = 0.2)
plot(train$x1, train$x2, col = y, pch = 19, main = "local fit")
contour(x1, x2, matrix(loc.fit$fitted.values, length(x1)),
    levels = 0.3, add = TRUE)


# specify wf as a character string:
dalr(y ~ ., data = train, thr = 0.3, wf = "rectangular", k = 50)

# use window function generating function:
rect <- rectangular(100)
dalr(y ~ ., data = train, thr = 0.3, wf = rect)

# specify own window function:
dalr(y ~ ., data = train, thr = 0.3, wf = function(x) exp(-10*x^2))


# generate test data set:
x1 <- runif(200, min = 0, max = 1)
x2 <- runif(200, min = 0, max = 1)
test <- data.frame(x1 = x1, x2 = x2)

pred <- predict(loc.fit, test)

prob <- test$x2/(test$x1 + test$x2)
y <- as.factor(sapply(prob, function(x) sample(0:1, size = 1,
    prob = c(1-x,x))))

mean(y != pred$class)