stepCriterion.glm: Variable Selection in Generalized Linear Models

stepCriterion.glm    R Documentation

Description

Performs variable selection in generalized linear models using hybrid versions of the forward and backward stepwise procedures.

Usage

## S3 method for class 'glm'
stepCriterion(
  model,
  criterion = c("adjr2", "bic", "aic", "p-value", "qicu"),
  test = c("wald", "lr", "score", "gradient"),
  direction = c("forward", "backward"),
  levels = c(0.05, 0.05),
  trace = TRUE,
  scope,
  ...
)

Arguments

model

an object of the class glm.

criterion

an (optional) character string indicating the criterion which should be used to compare the candidate models. The available options are: AIC ("aic"), BIC ("bic"), adjusted deviance-based R-squared ("adjr2"), and the p-value of the test specified in the argument test ("p-value"). By default, criterion is set to "adjr2".

test

an (optional) character string indicating the statistical test which should be used to compare nested models. The available options are: Wald ("wald"), Rao's score ("score"), likelihood-ratio ("lr") and gradient ("gradient") tests. By default, test is set to be "wald".

direction

an (optional) character string indicating the type of procedure which should be used. The available options are: hybrid backward stepwise ("backward") and hybrid forward stepwise ("forward"). By default, direction is set to be "forward".

levels

an (optional) vector of length two with values in the interval (0,1) indicating the levels at which variables should enter and leave the model. This is only relevant if criterion="p-value". By default, levels is set to c(0.05,0.05).

trace

an (optional) logical switch indicating whether the stepwise reports should be printed. By default, trace is set to TRUE.

scope

an (optional) list, containing components lower and upper, both formula-type objects, indicating the range of models which should be examined in the stepwise search. By default, lower is a model with no predictors and upper is the linear predictor of the model in model.

...

further arguments passed to or from other methods. For example, k, the magnitude of the penalty in the AIC/QICu, which by default is set to 2.
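The role of a penalty such as k can be illustrated with base R alone. The sketch below does not call stepCriterion and says nothing about how the package handles k internally; it only shows, using stats::AIC on the built-in mtcars data (chosen here purely for convenience), that a penalty of 2 gives the usual AIC while a penalty of log(n) reproduces the BIC.

```r
# Illustration with base R only; mtcars is used purely as convenient example data.
fit <- glm(mpg ~ wt + hp, family = gaussian(), data = mtcars)
n <- nobs(fit)

aic_default <- AIC(fit)              # penalty k = 2 (the default)
bic_via_k   <- AIC(fit, k = log(n))  # penalty k = log(n)

# The log(n) penalty reproduces the BIC reported by stats::BIC.
all.equal(bic_via_k, BIC(fit))
```

A larger penalty makes each additional parameter more costly, so a search driven by it tends to stop at smaller models.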

Details

The "hybrid forward stepwise" algorithm starts with the simplest model, which may be specified through the argument scope and, by default, is the model in which all parameters of the linear predictor except the intercept (if any) are set to 0. Candidate models are then built by hierarchically adding effects to the linear predictor. The "relevance" and/or "importance" of each effect to the model fit is assessed by comparing nested models (that is, the models with and without the added effect) using the previously specified criterion. Whenever an effect is added, the procedure may also remove any effect which, according to the same criterion, no longer improves the model fit. The process continues until no more effects are included or excluded. The "hybrid backward stepwise" algorithm works similarly.
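The hybrid forward procedure described above can be sketched with base R alone. This is a minimal illustration under stated assumptions, not the glmtoolbox implementation: the function name hybrid_forward is hypothetical, AIC stands in for the criterion, only strict improvements are accepted, and the built-in mtcars data are used purely for convenience.

```r
# Hypothetical helper: hybrid forward stepwise search driven by AIC (base R only).
hybrid_forward <- function(fit, upper, max_steps = 50) {
  for (step in seq_len(max_steps)) {
    # Terms in `upper` that can still be added without violating marginality.
    candidates <- add.scope(formula(fit), update.formula(formula(fit), upper))
    if (length(candidates) == 0) break

    # Forward move: refit with each candidate term and keep the lowest AIC.
    added <- lapply(candidates, function(tm) {
      update(fit, as.formula(paste(". ~ . +", tm)))
    })
    aic_added <- vapply(added, AIC, numeric(1))
    if (min(aic_added) >= AIC(fit)) break  # no addition improves the fit
    fit <- added[[which.min(aic_added)]]

    # Backward check: drop terms for as long as doing so lowers the AIC.
    repeat {
      removable <- drop.scope(formula(fit))
      if (length(removable) == 0) break
      dropped <- lapply(removable, function(tm) {
        update(fit, as.formula(paste(". ~ . -", tm)))
      })
      aic_dropped <- vapply(dropped, AIC, numeric(1))
      if (min(aic_dropped) >= AIC(fit)) break
      fit <- dropped[[which.min(aic_dropped)]]
    }
  }
  fit
}

# Start from the intercept-only model and search within the upper scope.
start <- glm(mpg ~ 1, family = gaussian(), data = mtcars)
final <- hybrid_forward(start, upper = ~ wt + hp + qsec + drat)
formula(final)
```

Because every accepted move strictly lowers the AIC, the search cannot revisit a model and therefore terminates; max_steps is only a safety guard.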

Value

a list with components including

initial      a character string indicating the linear predictor of the "initial model",
direction    a character string indicating the type of procedure which was used,
criterion    a character string indicating the criterion used to compare the candidate models,
final        a character string indicating the linear predictor of the "final model".
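Since final is returned as a character string, one way to refit the selected model is to convert that string to a formula and pass it to update. The sketch below uses base R only and a hypothetical string, final_chr, written as a one-sided formula; the exact format of the string returned by stepCriterion is an assumption here and should be checked against the actual output.

```r
# Hypothetical value standing in for the `final` component (format assumed).
final_chr <- "~ wt + hp"

# A starting fit on the built-in mtcars data, used only for illustration.
start <- glm(mpg ~ wt + hp + qsec, family = gaussian(), data = mtcars)

# update() keeps the response of `start` and replaces its linear predictor.
refit <- update(start, as.formula(final_chr))
formula(refit)
```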

References

James G., Witten D., Hastie T., Tibshirani R. (2013, page 210) An Introduction to Statistical Learning with Applications in R, Springer, New York.

See Also

stepCriterion.lm, stepCriterion.overglm, stepCriterion.glmgee

Examples

library(glmtoolbox)  # provides stepCriterion()

###### Example 1: Fuel consumption of automobiles
Auto <- ISLR::Auto
Auto2 <- within(Auto, origin <- factor(origin))
mod <- mpg ~ cylinders + displacement + acceleration + origin + horsepower*weight
fit1 <- glm(mod, family=inverse.gaussian("log"), data=Auto2)
stepCriterion(fit1, direction="forward", criterion="p-value", test="lr")
stepCriterion(fit1, direction="backward", criterion="bic")

###### Example 2: Patients with burn injuries
burn1000 <- aplore3::burn1000
burn1000 <- within(burn1000, death <- factor(death, levels=c("Dead","Alive")))
upper <- ~ age + gender + race + tbsa + inh_inj + flame + age*inh_inj + tbsa*inh_inj
lower <- ~ 1
fit2 <- glm(death ~ age + gender + race + tbsa + inh_inj, family=binomial("logit"), data=burn1000)
stepCriterion(fit2, direction="backward", criterion="bic", scope=list(lower=lower,upper=upper))
stepCriterion(fit2, direction="forward", criterion="p-value", test="score")

###### Example 3: Skin cancer in women
data(skincancer)
upper <- cases ~ city + age + city*age
fit3 <- glm(upper, family=poisson("log"), offset=log(population), data=skincancer)
stepCriterion(fit3, direction="backward", criterion="aic", scope=list(lower=~ 1,upper=upper))
stepCriterion(fit3, direction="forward", criterion="p-value", test="lr")

glmtoolbox documentation built on Oct. 10, 2023, 9:06 a.m.