Implements a Bootstrap procedure to investigate the variability of model selection under the stepAIC() stepwise algorithm of package MASS.

```
boot.stepAIC(object, data, B = 100, alpha = 0.05, direction = "backward",
             k = 2, verbose = FALSE, ...)
```

`object` |
an object representing a model of an appropriate class; currently, |

`data` |
a |

`B` |
the number of Bootstrap samples. |

`alpha` |
the significance level. |

`direction` |
the |

`k` |
the |

`verbose` |
logical; if |

`...` |
extra arguments to |

The following procedure is replicated `B` times:

- Step 1:
Simulate a new data-set by taking a sample with replacement from the rows of `data`.

- Step 2:
Refit the model using the data-set from Step 1.

- Step 3:
For the refitted model of Step 2, run the `stepAIC()` algorithm.
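The three steps above can be sketched as a plain loop. This is only an illustration under stated assumptions, not the package's internal code: the toy data frame `dat`, the number of replicates, and the names `boot_dat`, `boot_fit`, `boot_step` and `selected` are all hypothetical. Only `stepAIC()` from MASS and base R functions are used.

```r
library(MASS)  # provides stepAIC()

set.seed(1)
# Hypothetical toy data: only x1 truly affects y
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

B <- 20                          # number of Bootstrap samples (kept small here)
selected <- vector("list", B)    # terms kept by stepAIC() in each replicate

for (b in seq_len(B)) {
  # Step 1: sample rows of the data with replacement
  boot_dat <- dat[sample(nrow(dat), replace = TRUE), ]
  # Step 2: refit the full model on the resampled data
  boot_fit <- lm(y ~ x1 + x2 + x3, data = boot_dat)
  # Step 3: run the stepwise algorithm on the refitted model
  boot_step <- stepAIC(boot_fit, direction = "backward", k = 2, trace = FALSE)
  selected[[b]] <- attr(terms(boot_step), "term.labels")
}

# Terms selected in the first replicate
selected[[1]]
```

The loop is written at top level on purpose, so that `stepAIC()` can find `boot_dat` when it re-evaluates the model call.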

Summarize the results by counting how many times (out of the `B` data-sets) each variable was selected, how many times (out of the times it was selected) the estimate of the regression coefficient of each variable was statistically significant at significance level `alpha`, and how many times (out of the times it was selected) the estimate of the regression coefficient of each variable changed sign (see also Austin and Tu, 2004).
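The three summaries described above can be tallied by hand as follows. This is a rough sketch assuming numeric covariates only (a factor contributes several coefficient rows and would need extra handling); the toy data and the counters `n_sel`, `n_sig` and `n_pos` are hypothetical names, and the code is not taken from the package.

```r
library(MASS)  # provides stepAIC()

set.seed(2)
# Hypothetical toy data: x1 has a real effect, x2 is noise
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

B <- 20
alpha <- 0.05
vars <- c("x1", "x2")
n_sel <- n_sig <- n_pos <- setNames(numeric(length(vars)), vars)

for (b in seq_len(B)) {
  boot_dat <- dat[sample(nrow(dat), replace = TRUE), ]
  boot_fit <- lm(y ~ x1 + x2, data = boot_dat)
  boot_step <- stepAIC(boot_fit, direction = "backward", trace = FALSE)

  ctab <- summary(boot_step)$coefficients   # estimates and p-values
  kept <- intersect(vars, rownames(ctab))   # variables that survived selection

  n_sel[kept] <- n_sel[kept] + 1
  n_sig[kept] <- n_sig[kept] + (ctab[kept, "Pr(>|t|)"] < alpha)
  n_pos[kept] <- n_pos[kept] + (ctab[kept, "Estimate"] > 0)
}

# Percentages: selection is out of B; the others are out of times selected
pct_selected    <- 100 * n_sel / B
pct_significant <- ifelse(n_sel > 0, 100 * n_sig / n_sel, NA)
pct_positive    <- ifelse(n_sel > 0, 100 * n_pos / n_sel, NA)
```

A variable whose `pct_positive` is far from both 0 and 100 has a coefficient that changes sign across Bootstrap samples.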

An object of class `BootStep` with components

`Covariates` |
a numeric matrix containing the percentage of times each variable was selected. |

`Sign` |
a numeric matrix containing the percentage of times the regression coefficient of each variable
had sign |

`Significance` |
a numeric matrix containing the percentage of times the regression coefficient of each
variable was significant under the |

`OrigModel` |
a copy of |

`OrigStepAIC` |
the result of applying |

`direction` |
a copy of the |

`k` |
a copy of the |

`BootStepAIC` |
a list of length |

Dimitris Rizopoulos d.rizopoulos@erasmusmc.nl

Austin, P. and Tu, J. (2004). Bootstrap methods for developing predictive models, *The American Statistician*,
**58**, 131–137.

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, 4th ed. Springer, New York.

`stepAIC` in package MASS

```
library(bootStepAIC)  # provides boot.stepAIC()

## lm() Example ##
# Simulate six numeric covariates and one factor; only x1-x4 affect y
n <- 350
x1 <- runif(n, -4, 4)
x2 <- runif(n, -4, 4)
x3 <- runif(n, -4, 4)
x4 <- runif(n, -4, 4)
x5 <- runif(n, -4, 4)
x6 <- runif(n, -4, 4)
x7 <- factor(sample(letters[1:3], n, replace = TRUE))
y <- 5 + 3 * x1 + 2 * x2 - 1.5 * x3 - 0.8 * x4 + rnorm(n, sd = 2.5)
data <- data.frame(y, x1, x2, x3, x4, x5, x6, x7)
rm(n, x1, x2, x3, x4, x5, x6, x7, y)
# Full model: main effects plus interactions of every covariate with x7
lmFit <- lm(y ~ (. - x7) * x7, data = data)
boot.stepAIC(lmFit, data)
#####################################################################
## glm() Example ##
n <- 200
x1 <- runif(n, -3, 3)
x2 <- runif(n, -3, 3)
x3 <- runif(n, -3, 3)
x4 <- runif(n, -3, 3)
x5 <- factor(sample(letters[1:2], n, replace = TRUE))
eta <- 0.1 + 1.6 * x1 - 2.5 * as.numeric(as.character(x5) == levels(x5)[1])
y1 <- rbinom(n, 1, plogis(eta))  # response that depends on x1 and x5
y2 <- rbinom(n, 1, 0.6)          # response unrelated to any covariate
data <- data.frame(y1, y2, x1, x2, x3, x4, x5)
rm(n, x1, x2, x3, x4, x5, eta, y1, y2)
glmFit1 <- glm(y1 ~ x1 + x2 + x3 + x4 + x5, family = binomial, data = data)
glmFit2 <- glm(y2 ~ x1 + x2 + x3 + x4 + x5, family = binomial, data = data)
boot.stepAIC(glmFit1, data, B = 50)
boot.stepAIC(glmFit2, data, B = 50)
```
