dalr: Discriminant Adaptive Logistic Regression

View source: R/dalr.R

Description

A local version of logistic regression for classification that puts increased emphasis on a good model fit near the decision boundary.

Usage

dalr(X, ...)

## S3 method for class 'formula'
dalr(formula, data, weights, ..., subset, na.action)

## S3 method for class 'data.frame'
dalr(X, ...)

## S3 method for class 'matrix'
dalr(X, Y, weights = rep(1, nrow(X)), intercept = TRUE,
  ..., subset, na.action)

## Default S3 method:
dalr(X, Y, thr = 0.5, wf = c("biweight", "cauchy",
  "cosine", "epanechnikov", "exponential", "gaussian", "optcosine",
  "rectangular", "triangular"), bw, k, nn.only = TRUE, itr = 3,
  intercept = TRUE, weights = rep(1, nrow(X)), ...)

Arguments

X

(Required if no formula is given as principal argument.) A matrix or data.frame or Matrix containing the explanatory variables.

formula

A formula of the form groups ~ x1 + x2 + ..., that is, the response is the grouping factor and the right hand side specifies the (non-factor) discriminators. Details concerning model specification are given in the documentation of glm.

data

A data.frame from which variables specified in formula are to be taken.

weights

Initial observation weights (defaults to a vector of 1s).

subset

An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)

na.action

A function specifying how NAs are handled. The default is, in order: any na.action attribute of data, then the na.action setting of options, and finally na.fail if that is unset.

Y

(Required if no formula is given as principal argument.) A factor specifying the class membership for each observation.

intercept

Should the model contain an intercept? Passed to glm.fit, null.model.

thr

The threshold value used to predict class membership, defaults to 0.5. See Details.

wf

A window function which is used to calculate weights that are introduced into the fitting process. Either a character string or a function, e.g. wf = function(x) exp(-x). For details see the documentation for wfs.

bw

(Required only if wf is a string.) The bandwidth parameter of the window function. (See wfs.)

k

(Required only if wf is a string.) The number of nearest neighbors of the decision boundary to be used in the fitting process. (See wfs.)

nn.only

(Required only if wf is a string indicating a window function with infinite support and if k is specified.) Should only the k nearest neighbors or all observations receive positive weights? (See wfs.)

itr

Number of iterations for model fitting, defaults to 3. See also the Details section.

...

Further arguments to glm. Currently offset, control, model, x, y, contrasts, start, etastart, mustart are supported.

Details

Local logistic regression (Hand and Vinciotti, 2003) is a modification of the standard logistic regression approach to discrimination. For discrimination, a good fit of the model is required especially near the true decision boundary. Therefore, weights that reflect the proximity of training points to the decision boundary are introduced into the fitting process.

Let the class levels be 0 and 1. The distance of a training observation x to the decision boundary is measured by the difference P(1 | x) - thr, where thr is a threshold in [0,1]. Since P(1 | x) is not known in advance, an iterative procedure is required. It starts by fitting an unweighted logistic regression model to the data in order to obtain initial estimates of P(1 | x). These are used to calculate the observation weights. Model fitting and calculation of weights are then alternated several times. By default, the number of iterations is limited to 3 (argument itr).
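The iterative scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by dalr: the helper name fit_local_sketch, the Gaussian-shaped weight and the bandwidth value bw = 0.2 are all hypothetical choices, and the sketch ignores prior weights, nearest-neighbor selection and the formula interface.

```r
# Sketch of the iterative reweighting idea using base R only.
# NOT the locClass implementation: fit_local_sketch, the Gaussian-shaped
# weight and the bandwidth bw are assumptions made for illustration.
fit_local_sketch <- function(X, y, thr = 0.5, bw = 0.2, itr = 3) {
  Xd <- cbind(1, as.matrix(X))   # design matrix with an intercept column
  w <- rep(1, nrow(Xd))          # start with an unweighted fit
  fit <- suppressWarnings(glm.fit(Xd, y, weights = w, family = binomial()))
  for (i in seq_len(itr)) {
    p <- fit$fitted.values                 # current estimates of P(1 | x)
    w <- exp(-0.5 * ((p - thr) / bw)^2)    # large weight near the boundary
    # non-integer weights trigger a harmless binomial warning, silenced here
    fit <- suppressWarnings(glm.fit(Xd, y, weights = w, family = binomial()))
  }
  list(coefficients = fit$coefficients, weights = w,
       fitted = fit$fitted.values)
}

# Toy data with the same posterior as in the Examples section.
set.seed(1)
x1 <- runif(300)
x2 <- runif(300)
y <- rbinom(300, 1, x2 / (x1 + x2))
res <- fit_local_sketch(cbind(x1, x2), y, thr = 0.3)
print(res$coefficients)
```

Observations whose estimated posterior is close to thr receive weights near 1, while observations far from the boundary are down-weighted, so the final fit concentrates on the region that matters for classification at that threshold.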

The name of the window function (wf) can be specified as a character string. In this case the window function is generated internally in dalr. Currently supported are "biweight", "cauchy", "cosine", "epanechnikov", "exponential", "gaussian", "optcosine", "rectangular" and "triangular".

Moreover, it is possible to generate the window functions mentioned above in advance (see wfs) and pass them to dalr.

Any other function implementing a window function can also be used as the wf argument. This allows users to try their own window functions. See the help on wfs for details.
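As a rough illustration of what a user-supplied window function does, the base R sketch below evaluates two candidates on a few distances. It assumes that wf is called with the distances to the decision boundary and must return non-negative weights; the names gauss_like and tri_like and the half-width 0.5 are hypothetical, and the exact calling convention should be checked in the help on wfs.

```r
# Distances to the decision boundary, e.g. |P(1 | x) - thr| (assumed input).
d <- c(0, 0.1, 0.25, 0.5, 0.8)

# Infinite support: every observation keeps a positive weight.
# Same form as the custom function used in the Examples section.
gauss_like <- function(x) exp(-10 * x^2)

# Finite support: weight is exactly 0 beyond the hypothetical half-width 0.5.
tri_like <- function(x, width = 0.5) pmax(0, 1 - abs(x) / width)

# Compare the weights each function assigns to the same distances.
weights_tab <- data.frame(d = d, gauss = gauss_like(d), tri = tri_like(d))
print(weights_tab)
```

With a finite-support function, observations far from the boundary drop out of the fit entirely, whereas an infinite-support function only down-weights them.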

Internally, glm.fit with family = binomial() is used and the weights produced using wf are passed to glm.fit via its weights argument.

If the predictor variables include factors, the formula interface must be used in order to get a correct model matrix.

Warnings about non-integer #successes in a binomial glm are expected, because the weights produced by the window function are generally not integers.
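The warning can be reproduced with a plain weighted glm call, independently of dalr. The sketch below uses simulated data and arbitrary non-integer weights (both assumptions for illustration) and shows one way to record and silence warnings around the fit.

```r
# Simulated data and arbitrary non-integer weights (illustration only).
set.seed(42)
x <- runif(100)
y <- rbinom(100, 1, plogis(2 * x - 1))
w <- runif(100, 0.1, 1)

# Record whether a warning occurred, then muffle it so the fit continues.
# Note that this handler muffles every warning raised during the fit.
warned <- FALSE
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial(), weights = w),
  warning = function(cond) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)
print(warned)
print(coef(fit))
```

The coefficients are still estimated; the warning only signals that the weighted responses are not integer counts.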

Value

An object of class "dalr" inheriting from class "glm", a list containing at least the following components:

Values of glm:

coefficients

A named vector of coefficients.

residuals

The working residuals, that is the residuals in the final iteration of the IWLS fit. Since cases with zero weights are omitted, their working residuals are NA.

fitted.values

The fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.

rank

The numeric rank of the fitted linear model.

family

The family object used.

linear.predictors

The linear fit on link scale.

deviance

Up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero.

aic

A version of Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of parameters, computed by the aic component of the family. For binomial and poisson families the dispersion is fixed at one and the number of parameters is the number of coefficients. For gaussian, Gamma and inverse gaussian families the dispersion is estimated from the residual deviance, and the number of parameters is the number of coefficients plus one. For a gaussian family the MLE of the dispersion is used so this is a valid value of AIC, but for Gamma and inverse gaussian families it is not. For families fitted by quasi-likelihood the value is NA.

null.deviance

The deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model. Note that this will be incorrect if the link function depends on the data other than through the fitted mean: specify a zero offset to force a correct calculation.

iter

The number of iterations of IWLS used.

weights

A list of length itr + 1. The working weights, that is the observation weights in the final iteration of the IWLS fit.

prior.weights

A list of length itr + 1. The observation weights initially supplied; the first list element is a vector of 1s if none were specified.

df.residual

The residual degrees of freedom.

df.null

The residual degrees of freedom for the null model.

y

If requested (the default) the y vector used. (It is a vector even for a binomial model.)

x

If requested, the model matrix.

model

If requested (the default), the model frame.

converged

Logical. Was the IWLS algorithm judged to have converged?

boundary

Logical. Is the fitted value on the boundary of the attainable values?

call

The (matched) function call.

formula

The formula supplied.

terms

The terms object used.

data

The data argument.

offset

The offset vector used.

control

The value of the control argument used.

method

The name of the fitter function used, currently always "glm.fit".

contrasts

(Where relevant) the contrasts used.

xlevels

(Where relevant) a record of the levels of the factors used in fitting.

na.action

(Where relevant) information returned by model.frame on the special handling of NAs.

Additionally, dalr returns

lev

The class labels (the levels of grouping).

thr

The threshold used.

itr

The number of iterations used.

wf

The window function used. Always a function, even if the input was a string.

bw

(Only if wf is a string or was generated by means of one of the functions documented in wfs.) The bandwidth used, NULL if bw was not specified.

k

(Only if wf is a string or was generated by means of one of the functions documented in wfs.) The number of nearest neighbors used, NULL if k was not specified.

nn.only

(Logical. Only if wf is a string or was generated by means of one of the functions documented in wfs and if k was specified.) TRUE if only the k nearest neighbors receive a positive weight, FALSE otherwise.

adaptive

(Logical.) TRUE if the bandwidth of wf is adaptive to the local density of data points, FALSE if the bandwidth is fixed.

References

Hand, D. J., Vinciotti, V. (2003), Local versus global models for classification problems: Fitting models where it matters, The American Statistician, 57(2) 124–130.

See Also

predict.dalr, glm, predict.glm.

Examples

# generate toy data set of Hand and Vinciotti (2003):
x1 <- x2 <- seq(0.1,1,0.05)
train <- expand.grid(x1 = x1, x2 = x2)
posterior <- train$x2/(train$x1 + train$x2)
y <- as.factor(sapply(posterior, function(x) sample(0:1, size = 1, 
    prob = c(1-x,x))))
train <- data.frame(train, y = y)

par(mfrow = c(1,3))

# contours of true class posterior probabilities:
plot(train$x1, train$x2, col = y, pch = 19, main = "true posteriors")
contour(x1, x2, matrix(posterior, length(x1)), add = TRUE)

# 0.3-contour line fit of logistic regression:
glob.fit <- glm(y ~ ., data = train, family = "binomial")
plot(train$x1, train$x2, col = y, pch = 19, main = "global fit")
contour(x1, x2, matrix(glob.fit$fitted.values, length(x1)), 
    levels = 0.3, add = TRUE)

# 0.3-contour line fit of local logistic regression:
loc.fit <- dalr(y ~ ., data = train, thr = 0.3, wf = "gaussian", bw = 0.2)
plot(train$x1, train$x2, col = y, pch = 19, main = "local fit")
contour(x1, x2, matrix(loc.fit$fitted.values, length(x1)), 
    levels = 0.3, add = TRUE)


# specify wf as a character string:
dalr(y ~ ., data = train , thr = 0.3, wf = "rectangular", k = 50)

# use window function generating function:
rect <- rectangular(100)
dalr(y ~ ., data = train, thr = 0.3, wf = rect)

# specify own window function:
dalr(y ~ ., data = train, thr = 0.3, wf = function(x) exp(-10*x^2)) 


# generate test data set:
x1 <- runif(200, min = 0, max = 1)              
x2 <- runif(200, min = 0, max = 1)              
test <- data.frame(x1 = x1, x2 = x2)

pred <- predict(loc.fit, test)

prob <- test$x2/(test$x1 + test$x2)
y <- as.factor(sapply(prob, function(x) sample(0:1, size = 1, 
    prob = c(1-x,x))))

mean(y != pred$class)

schiffner/locClass documentation built on May 29, 2019, 3:39 p.m.