stepwise | R Documentation |

This function runs a stepwise regression, selecting and/or excluding variables based on the significance (p-value) of the statistical tests implemented in the `add1`

and `drop1`

functions of R.

stepwise(data, sp.col, var.cols, id.col = NULL, family = binomial(link="logit"), direction = "both", test.in = "Rao", test.out = "LRT", p.in = 0.05, p.out = 0.1, trace = 1, simplif = TRUE, preds = FALSE, Favourability = FALSE, Wald = FALSE)

`data` |
a data frame (or an object that can be coerced with 'as.data.frame') containing your target and predictor variables. |

`sp.col` |
name or index number of the column of 'data' that contains the response variable. |

`var.cols` |
names or index numbers of the columns of 'data' that contain the predictor variables. |

`id.col` |
(optional) name or index number of column containing the row identifiers (if defined, it will be included in the output 'predictions' data frame). |

`family` |
argument to be passed to |

`direction` |
the mode of stepwise search. Can be either "forward", "backward", or "both" (the default). |

`test.in` |
argument to pass to |

`test.out` |
argument to pass to |

`p.in` |
threshold p-value for a variable to enter the model. Defaults to 0.05. |

`p.out` |
threshold p-value for a variable to leave the model. Defaults to 0.1. |

`trace` |
if positive, information is printed to the console at each step. The default is 1, for naming each variable that was added or removed. With trace=2, the summary of the model at each step is also printed. |

`simplif` |
logical, whether to return a simpler output containing only the model object (the default), or a list with, additionally, a data frame with the variable included or excluded at each step. |

`preds` |
logical, whether to return also the predictions given by the model at each step. This argument is ignored if simplif=TRUE. |

`Favourability` |
logical, whether to convert the predictions (if preds=TRUE) with the |

`Wald` |
logical, whether to print the Wald test statistics using |

Stepwise variable selection is a way of selecting a subset of significant variables to get a simple and easily interpretable model. It is more computationally efficient than best subset selection. This function uses the R functions `add1`

for selecting and `drop1`

for excluding variables. The default parameters mimic the "Forward Selection (Conditional)" stepwise procedure implemented in the IBM SPSS software. This is a widely used (e.g. Munoz et al. 2005, Olivero et al. 2017, 2020, Garcia-Carrasco et al. 2021) but also widely criticized method for variable selection (e.g. Harrell 2001; Whittingham et al. 2006; Flom & Cassell, 2007; Smith 2018), though its AIC-based counterpart (implemented in the `step`

R function) is also not without flaws (e.g. Murtaugh 2014; Coelho et al. 2019).

If simplif=TRUE (the default), this function returns the model object obtained after the variable selection procedure. If simplif=FALSE, it returns a list with the following components:

`model` |
the model object obtained after the variable selection procedure. |

`steps` |
a data frame where each row shows the variable included or excluded at each step. |

`predictions` |
(if preds=TRUE) a data frame where each column contains the predictions of the model obtained at each step. These predictions are probabilities by default, or favourabilities if Favourability=TRUE. |

A. Marcia Barbosa

Coelho M.T.P., Diniz-Filho J.A. & Rangel T.F. (2019) A parsimonious view of the parsimony principle in ecology and evolution. Ecography, 42:968-976

Flom P.L. & Cassell D.L. (2007) Stopping stepwise: Why stepwise and similar selection methods are bad, and what you should use. NESUG 2007

Garcia-Carrasco J.M., Munoz A.R., Olivero J., Segura M. & Real R. (2021) Predicting the spatio-temporal spread of West Nile virus in Europe. PLoS Neglected Tropical Diseases 15(1):e0009022

Harrell F.E. (2001) Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. Springer-Verlag, New York

Munoz, A.R., Real R., Barbosa A.M. & Vargas J.M. (2005) Modelling the distribution of Bonelli's Eagle in Spain: Implications for conservation planning. Diversity and Distributions 11: 477-486

Murtaugh P.A. (2014) In defense of P values. Ecology, 95:611-617

Olivero J., Fa J.E., Real R., Marquez A.L., Farfan M.A., Vargas J.M, Gaveau D., Salim M.A., Park D., Suter J., King S., Leendertz S.A., Sheil D. & Nasi R. (2017) Recent loss of closed forests is associated with Ebola virus disease outbreaks. Scientific Reports 7: 14291

Olivero J., Fa J.E., Farfan M.A., Marquez A.L., Real R., Juste F.J., Leendertz S.A. & Nasi R. (2020) Human activities link fruit bat presence to Ebola virus disease outbreaks. Mammal Review 50:1-10

Smith G. (2018) Step away from stepwise. Journal of Big Data 32 (https://doi.org/10.1186/s40537-018-0143-6)

Whittingham M.J., Stephens P.A., Bradbury R.B. & Freckleton R.P. (2006) Why do we still use stepwise modelling in ecology and behaviour? Journal of Animal Ecology, 75:1182-1189

`step`

, `stepByStep`

, `modelTrim`

data(rotif.env) stepwise(data = rotif.env, sp.col = 18, var.cols = 5:17)

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.