choose_on_aic: Choose best stepwise regression model based on AIC


choose_on_aic    R Documentation

Choose best stepwise regression model based on AIC

Description

choose_on_aic() takes a tibble generated by stepwise_regression() as input.

For each given combination of dependent variable, independent variable of interest and covariates, stepwise_regression() produces five models (original full, minimal, forward selection, stepwise selection, backward elimination).

choose_on_aic() uses AIC to determine the best model within each set of five models.

Usage

choose_on_aic(tibble, all.stepwise = FALSE)

Arguments

tibble

a tibble produced by stepwise_regression().

all.stepwise

a logical indicating whether model selection based on AIC should be skipped. If TRUE, the stepwise selection model is systematically chosen as the best model as long as its AIC value could be computed. Default is FALSE.
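The effect of all.stepwise can be pictured with a minimal sketch in base R. This is not the package's implementation: choose_sketch() and the model labels "forward", "stepwise" and "backward" are hypothetical names, and falling back to the lowest AIC when the stepwise selection AIC is missing is an assumption, since the documentation does not describe that case.

```r
# Hypothetical helper, not part of banban.
# `aic` is a named numeric vector of AIC values (NA = could not be computed).
choose_sketch <- function(aic, all.stepwise = FALSE) {
  # With all.stepwise = TRUE, take the stepwise selection model
  # whenever its AIC could be computed.
  if (all.stepwise && !is.na(aic[["stepwise"]])) {
    return("stepwise")
  }
  # Otherwise (assumed fallback) take the model with the lowest AIC;
  # which.min() ignores NA values.
  names(which.min(aic))
}

aic <- c(forward = 95.1, stepwise = 98.7, backward = 101.4)
choose_sketch(aic)                       # default: AIC-based choice
choose_sketch(aic, all.stepwise = TRUE)  # stepwise selection model forced
```

With these illustrative values, the default call picks the forward selection model, while all.stepwise = TRUE picks the stepwise selection model despite its higher AIC.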

Details

When AIC could be computed for at least one of the stepwise regression models (forward selection, stepwise selection, backward elimination), the best model is defined as the one with the lowest AIC value. If the lowest AIC is shared by the stepwise selection model and one or more other stepwise regression models, the stepwise selection model is chosen.
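The lowest-AIC rule and its tie-break can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: pick_lowest_aic() and the model labels are hypothetical, only the three stepwise regression models are compared, and ties that do not involve the stepwise selection model are resolved by taking the first model, which the documentation does not specify.

```r
# Hypothetical helper, not part of banban.
# `aic` is a named numeric vector of AIC values (NA = could not be computed).
pick_lowest_aic <- function(aic) {
  aic <- aic[!is.na(aic)]                 # keep models whose AIC exists
  if (length(aic) == 0) {
    return(NA_character_)                 # no AIC available at all
  }
  best <- names(aic)[aic == min(aic)]     # all models sharing the lowest AIC
  # A tie that involves the stepwise selection model goes to that model.
  if ("stepwise" %in% best) "stepwise" else best[1]
}

# Tie between stepwise selection and backward elimination
pick_lowest_aic(c(forward = 102.3, stepwise = 98.7, backward = 98.7))

# AIC missing for stepwise selection
pick_lowest_aic(c(forward = 95.1, stepwise = NA, backward = 98.7))
```

The first call resolves the tie in favour of the stepwise selection model; the second skips the model without an AIC and picks the forward selection model.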

When df = 0 for the original full model and all three stepwise regression models (F and AIC could not be computed), the minimal model (dependent variable ~ independent variable of interest) is defined as the best model if and only if it has an F p value <= 0.05 AND a t p value (related to the independent variable of interest) <= 0.05 ("sign" label = TRUE).

The combinations of dependent variable, independent variable of interest and covariates for which a best model could not be defined (df = 0 for the original full model and all three stepwise regression models; F p value > 0.05 and/or a t p value > 0.05 for the minimal model) are discarded.
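The significance check applied to the minimal model can be illustrated with a plain lm() fit. This is a sketch only: the built-in mtcars data and the variables mpg (standing in for the dependent variable) and wt (standing in for the independent variable of interest) are illustrative choices, and how the package itself derives the "sign" label is not shown in this documentation.

```r
# Minimal model: dependent variable ~ independent variable of interest
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)

# F p value of the whole model, from the F statistic and its degrees of freedom
f <- s$fstatistic
f_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# t p value of the independent variable of interest
t_p <- s$coefficients["wt", "Pr(>|t|)"]

# TRUE only if both p values are <= 0.05, mirroring the "sign" label
is_sign <- f_p <= 0.05 && t_p <= 0.05
is_sign
```

Under the rule described above, a combination whose minimal model fails this check, and for which no stepwise AIC is available, would be dropped from the output.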

Note: for some combinations of dependent variable, independent variable of interest and covariates, the best model may i) be a stepwise regression model with "sign" label = FALSE (F p value > 0.05 and/or t p value > 0.05) even though the corresponding minimal model received a "sign" label = TRUE, or ii) not include the independent variable of interest.

Value

A tibble containing, for each combination of dependent variable, independent variable of interest and covariates, i) the corresponding best model, ii) its associated statistics (F p value, t p value) and iii) an indication of whether the independent variable of interest is included in that model.

Combinations of dependent variable, independent variable of interest and covariates for which a best model could not be defined are discarded, so the output may be shorter than the input tibble.


benvallin/banban documentation built on Sept. 29, 2023, 5:46 a.m.