This function runs a stepwise regression, selecting and/or excluding variables based on the significance (p-value) of the statistical tests implemented in the `add1` and `drop1` functions of R.

stepwise(data, sp.col, var.cols, id.col = NULL, family = binomial(link="logit"), direction = "both", test.in = "Rao", test.out = "LRT", p.in = 0.05, p.out = 0.1, trace = 1, simplif = TRUE, preds = FALSE, Favourability = FALSE, Wald = FALSE)

`data` |
a data frame (or an object that can be coerced with 'as.data.frame') containing your target and predictor variables. |

`sp.col` |
name or index number of the column of 'data' that contains the response variable. |

`var.cols` |
names or index numbers of the columns of 'data' that contain the predictor variables. |

`id.col` |
(optional) name or index number of column containing the row identifiers (if defined, it will be included in the output 'predictions' data frame). |

`family` |
argument to be passed to |

`direction` |
the mode of stepwise search. Can be either "forward", "backward", or "both" (the default). |

`test.in` |
argument to pass to `add1`, specifying the test used for entering variables. Defaults to "Rao". |

`test.out` |
argument to pass to `drop1`, specifying the test used for excluding variables. Defaults to "LRT". |

`p.in` |
threshold p-value for a variable to enter the model. Defaults to 0.05. |

`p.out` |
threshold p-value for a variable to leave the model. Defaults to 0.1. |

`trace` |
if positive, information is printed to the console at each step. The default is 1, for naming each variable that was added or removed. With trace=2, the summary of the model at each step is also printed. |

`simplif` |
logical, whether to return a simpler output containing only the model object (the default), or a list with, additionally, a data frame with the variable included or excluded at each step. |

`preds` |
logical, whether to return also the predictions given by the model at each step. This argument is ignored if simplif=TRUE. |

`Favourability` |
logical, whether to convert the predictions (if preds=TRUE) with the |

`Wald` |
logical, whether to print the Wald test statistics using |

Stepwise variable selection is a way of selecting a subset of significant variables to get a simple and easily interpretable model. It is more computationally efficient than best subset selection. This function uses the R functions `add1` for selecting and `drop1` for excluding variables. The default parameters mimic the "Forward Selection (Conditional)" stepwise procedure implemented in the IBM SPSS software. This is a widely used (e.g. Munoz et al. 2005; Olivero et al. 2017, 2020; Garcia-Carrasco et al. 2021) but also widely criticized method for variable selection (e.g. Harrell 2001; Whittingham et al. 2006; Flom & Cassell 2007; Smith 2018), though its AIC-based counterpart (implemented in the `step` R function) is also not without flaws (e.g. Murtaugh 2014; Coelho et al. 2019).
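To illustrate the mechanism described above, the following is a minimal sketch of one selection iteration written directly with `add1` and `drop1`. It is not the code used inside `stepwise`; the simulated data set and the names `dat`, `x1`, `x2`, `x3`, `cand`, and `out` are assumptions made only for this illustration.

```r
# Simulated data (hypothetical): y depends on x1 only; x2 and x3 are noise
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * dat$x1))

p.in <- 0.05   # threshold to enter the model
p.out <- 0.1   # threshold to leave the model

# Start from the intercept-only model
mod <- glm(y ~ 1, data = dat, family = binomial(link = "logit"))

# Entry step: Rao score test for each candidate variable
cand <- add1(mod, scope = ~ x1 + x2 + x3, test = "Rao")
p.add <- cand[-1, "Pr(>Chi)"]        # first row is "<none>"
names(p.add) <- rownames(cand)[-1]
best <- names(which.min(p.add))
if (p.add[best] < p.in) {
  mod <- update(mod, as.formula(paste(". ~ . +", best)))
}

# Exit step: likelihood-ratio test for each variable now in the model
if (length(attr(terms(mod), "term.labels")) > 0) {
  out <- drop1(mod, test = "LRT")
  p.drop <- out[-1, "Pr(>Chi)"]      # first row is "<none>"
  names(p.drop) <- rownames(out)[-1]
  worst <- names(which.max(p.drop))
  if (p.drop[worst] > p.out) {
    mod <- update(mod, as.formula(paste(". ~ . -", worst)))
  }
}

formula(mod)
```

A full stepwise search would repeat these two steps until no variable can enter or leave the model. The sketch stops after one iteration to keep the entry and exit criteria visible.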

If simplif=TRUE (the default), this function returns the model object obtained after the variable selection procedure. If simplif=FALSE, it returns a list with the following components:

`model` |
the model object obtained after the variable selection procedure. |

`steps` |
a data frame where each row shows the variable included or excluded at each step. |

`predictions` |
(if preds=TRUE) a data frame where each column contains the predictions of the model obtained at each step. These predictions are probabilities by default, or favourabilities if Favourability=TRUE. |
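As a hedged usage sketch of the list output described above, the call below requests the full result with `simplif = FALSE` and `preds = TRUE`, using the `rotif.env` data from this page's example. The object name `res` is arbitrary, and `trace = 0` is used on the assumption that a non-positive value suppresses console messages, as the `trace` argument description implies.

```r
library(fuzzySim)
data(rotif.env)

# Request the full list output rather than just the final model
res <- stepwise(data = rotif.env, sp.col = 18, var.cols = 5:17,
                simplif = FALSE, preds = TRUE, trace = 0)

names(res)              # components of the returned list
res$model               # final model after variable selection
res$steps               # variable included or excluded at each step
head(res$predictions)   # predictions of the model obtained at each step
```

With `simplif = TRUE` (the default), the same call would return only the model object, and `preds` would be ignored.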

A. Marcia Barbosa

`step`, `stepByStep`, `modelTrim`

data(rotif.env)
stepwise(data = rotif.env, sp.col = 18, var.cols = 5:17)

