LMSelect: Backward stepwise selection of a GLM

LMSelect R Documentation

Backward stepwise selection of a GLM

Description

Performs backward stepwise selection of terms in a generalized linear model. Interaction terms are tested first and are then dropped so that main effects can be tested. Main effects that are part of significant interaction terms are retained, regardless of their significance as main effects.

Usage

LMSelect(modelData, responseVar, fitFamily,
         factors = character(0), contEffects = list(),
         interactions = character(0),
         allInteractions = FALSE, alpha = 0.05,
         saveVars = character(0))
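
As a rough illustration of the usage above, the sketch below builds a small invented data frame and passes it to LMSelect. The data, the column names (habitat, season, rainfall, siteID, richness) and the choice to give fitFamily as the string "poisson" are all assumptions made for this example rather than details confirmed by this page. The call is only attempted if the StatisticalModels package is installed.

```r
# Hypothetical data frame; none of these column names come from the package
set.seed(42)
n <- 200
sites <- data.frame(
  habitat  = factor(sample(c("forest", "pasture", "urban"), n, replace = TRUE)),
  season   = factor(sample(c("wet", "dry"), n, replace = TRUE)),
  rainfall = runif(n, 0, 10),
  siteID   = seq_len(n)
)
sites$richness <- rpois(
  n,
  lambda = exp(1 + 0.1 * sites$rainfall + 0.3 * (sites$habitat == "forest"))
)

# Only attempt the call when the package is available
if (requireNamespace("StatisticalModels", quietly = TRUE)) {
  result <- StatisticalModels::LMSelect(
    modelData    = sites,
    responseVar  = "richness",
    fitFamily    = "poisson",             # assumed to be given as a string
    factors      = c("habitat", "season"),
    contEffects  = list(rainfall = 2),    # allow up to a quadratic term
    interactions = c("habitat:season"),
    saveVars     = c("siteID")
  )
  # Elements documented under Value
  print(result$stats)
  print(result$final.call)
}
```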

Arguments

modelData

A data frame containing the response variable and all terms to be considered

responseVar

The response variable to fit in the model

fitFamily

The family to use for the generalized linear model

factors

The factors to consider in the model, specified as a vector of strings that correspond to the column names in modelData

contEffects

The continuous variables to consider in the model, specified as a list in which the item names correspond to column names in modelData and the values are integers giving the maximum order of polynomial to fit for each variable

interactions

Specific interaction terms to consider in the model, specified as a vector of strings with interacting terms separated by a ':'

allInteractions

Whether to fit all two-way interactions between the main effects in the model. Defaults to FALSE

alpha

The threshold P value used to determine the statistical significance of terms. Defaults to 0.05

saveVars

Any variables in the original data frame to retain in the model data frame for later analysis
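
To make the argument formats above more concrete, the following sketch shows one way that factors, contEffects and interactions could be combined into a model formula using base R only. It is not taken from the LMSelect source. The use of poly() for polynomial terms, the variable names and the response name richness are assumptions for illustration.

```r
# Hypothetical inputs mirroring the argument formats described above
factors      <- c("habitat", "season")
contEffects  <- list(rainfall = 2)       # allow up to a quadratic term
interactions <- c("habitat:season")

# Assumption: polynomial terms are expressed with stats::poly()
cont_terms <- sprintf("poly(%s, %d)",
                      names(contEffects),
                      as.integer(unlist(contEffects)))

# Assemble the right-hand side and attach a (hypothetical) response name
rhs <- c(factors, cont_terms, interactions)
model_formula <- reformulate(rhs, response = "richness")
print(model_formula)

# With allInteractions = TRUE, every pair of main effects would be crossed;
# here the pairs are built from the bare variable names
all_two_way <- combn(c(factors, names(contEffects)), 2, paste, collapse = ":")
print(all_two_way)
```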

Details

The model-selection routine starts with the most complex structure possible given the specified combination of explanatory variables and their interactions, and performs backward stepwise selection to obtain the minimum adequate model. Comparison of the fit of different models is based on likelihood-ratio tests, against a specified threshold P value (alpha), which defaults to 0.05. Interaction terms are tested first, and then removed to test main effects. All main effects that are part of significant interaction terms are retained in the final model regardless of their significance as main effects.
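
The procedure described above can be sketched for a single round with base R functions. This is a simplified illustration on simulated data, not the package's own code: it uses anova() and drop1() with test = "Chisq" to obtain likelihood-ratio tests, and the data frame d and its columns are invented.

```r
# Simulated data: y depends on x and group, but not on noise
set.seed(7)
n <- 300
d <- data.frame(
  group = factor(sample(c("a", "b"), n, replace = TRUE)),
  x     = rnorm(n),
  noise = rnorm(n)
)
d$y <- rpois(n, lambda = exp(0.5 + 0.8 * d$x + 0.5 * (d$group == "b")))

alpha <- 0.05

# Start from the most complex model, including the interaction
full <- glm(y ~ group + x + noise + group:x, family = poisson, data = d)

# Test the interaction with a likelihood-ratio test against the model without it
no_int <- update(full, . ~ . - group:x)
p_int  <- anova(no_int, full, test = "Chisq")[["Pr(>Chi)"]][2]

# Remove the interaction, then test each main effect by single-term deletion
main_tests <- drop1(no_int, test = "Chisq")
p_main <- setNames(main_tests[["Pr(>Chi)"]][-1], rownames(main_tests)[-1])

# Main effects below the threshold are kept; a significant interaction also
# forces its constituent main effects to be kept
keep <- names(p_main)[p_main < alpha]
if (p_int < alpha) keep <- union(keep, c("group", "x", "group:x"))
print(keep)
```

A full backward selection would repeat the deletion step until no remaining term can be dropped without a significant loss of fit.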

Value

model: the final minimum adequate model

data: the dataset used in fitting the models, i.e. a subset of the original data frame, containing only the variables fit in the model, variables specified to be saved, and with any rows containing NA values removed

stats: a table of statistics relating to each term considered

final.call: the call used to generate the final model

family: the family of generalized linear model used, e.g. gaussian, poisson or binomial
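
The data element described above can be pictured with a small base R sketch. The data frame raw, its columns and the assumption that missing values in saved variables also cause rows to be dropped are illustrative only.

```r
# Hypothetical raw data with missing values in several columns
raw <- data.frame(
  richness = c(5, 3, NA, 8, 2),
  habitat  = factor(c("forest", "urban", "forest", NA, "urban")),
  rainfall = c(1.2, 3.4, 2.2, 0.5, 4.1),
  siteID   = 1:5,
  observer = c("A", "B", "A", "C", NA)   # neither modelled nor saved
)

model_vars <- c("richness", "habitat", "rainfall")  # variables fit in the model
save_vars  <- c("siteID")                           # analogous to saveVars

# Keep only the modelled and saved columns, then drop rows containing NA
model_data <- na.omit(raw[, c(model_vars, save_vars)])
print(model_data)
```

Rows 3 and 4 are dropped because they contain NA in a retained column, while row 5 is kept because its only missing value is in a column that was not retained.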

Author(s)

Tim Newbold <t.newbold@ucl.ac.uk>


timnewbold/StatisticalModels documentation built on Aug. 25, 2023, 4:58 p.m.