modelselect.glm: Variable selection for generalized linear models
In VariableSelection: Select Variables for Linear Models

View source: R/modelselection_glm.R

modelselect.glm

R Documentation

Variable selection for generalized linear models

Description

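Selects variables for generalized linear models using BIC. An exhaustive model search is used when the number of variables is smaller than `GA_var`; otherwise a genetic algorithm (GA) performs a stochastic model search.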

Usage

modelselect.glm(
  formula,
  data,
  family,
  GA_var = 16,
  maxiterations = 2000,
  runs_til_stop = 1000,
  monitor = TRUE,
  popSize = 100,
  verbose = TRUE
)

Arguments

`formula`	an object of class "formula": a symbolic description of the model to be fitted. A typical model has the form `response ~ terms`, where `response` is the (numeric) response vector and `terms` is a series of terms that specifies a linear predictor for `response`. A terms specification of the form `first + second` indicates all the terms in `first` together with all the terms in `second`, with duplicates removed. A specification of the form `first:second` indicates the set of terms obtained by taking the interactions of all terms in `first` with all terms in `second`. The specification `first*second` indicates the cross of `first` and `second`. This is the same as `first + second + first:second`.
`data`	a data frame containing the variables in the model.
`family`	a character string naming a family function describing the error distribution to be used in the model.
`GA_var`	if the number of variables is smaller than `GA_var`, an exhaustive model search is performed; otherwise a genetic algorithm (GA) is used for a stochastic model search.
`maxiterations`	the maximum number of iterations to run before the GA search is halted.
`runs_til_stop`	the number of consecutive generations without any improvement in the best fitness value before the GA is stopped.
`monitor`	a logical, defaulting to TRUE, that controls whether the evolution of the search is shown. If `monitor = FALSE`, any output is suppressed.
`popSize`	the population size used by the GA.
`verbose`	Logical; if TRUE, print a brief summary of results.
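
The formula operators described for `formula` can be checked with base R alone. The sketch below uses a small helper, `expand_terms` (a hypothetical name introduced here, not part of the package), to list the terms a formula expands to. The variable names `y`, `a`, and `b` are placeholders and no data are needed.

```r
# Base R only: show how formula operators expand into model terms.
# `expand_terms` is an illustrative helper, not a VariableSelection function.
expand_terms <- function(f) attr(terms(f), "term.labels")

plus_terms  <- expand_terms(y ~ a + b + a)      # `+` keeps each term once
inter_terms <- expand_terms(y ~ a:b)            # `:` gives the interaction only
cross_terms <- expand_terms(y ~ a * b)          # `*` gives main effects plus interaction
long_form   <- expand_terms(y ~ a + b + a:b)    # written-out equivalent of a * b

print(plus_terms)
print(inter_terms)
print(cross_terms)
print(identical(cross_terms, long_form))
```

The last line compares `a * b` with its written-out form, which is the equivalence stated in the `formula` argument description.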

Value

modelselect.glm returns a list containing the following components:

models: A data frame of candidate models' BIC and posterior probabilities, sorted by decreasing posterior probability
variables: A data frame of candidate variables' posterior inclusion probabilities
data: The data with variables in the formula.
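
To make the `models` and `variables` components more concrete, here is a minimal base R sketch of an exhaustive BIC search. It is not the package's implementation. It assumes that model posterior probabilities are approximated by `exp(-BIC / 2)` normalised over all candidate models with equal prior weights, and that a variable's posterior inclusion probability is the sum of the probabilities of the models containing it. The `mtcars` data, the chosen predictors, and the column names `BIC` and `postprob` are illustrative assumptions.

```r
# Sketch only: exhaustive BIC search with base R, not the package internals.
candidates <- c("wt", "hp", "qsec")   # illustrative predictors from mtcars

# One row per subset of the candidates (2^3 = 8 models); TRUE = variable included.
inclusion <- expand.grid(rep(list(c(FALSE, TRUE)), length(candidates)))
names(inclusion) <- candidates

# Fit every candidate model with glm() and record its BIC.
bic <- apply(inclusion, 1, function(keep) {
  rhs <- if (any(keep)) candidates[keep] else "1"   # "1" = intercept-only model
  fit <- glm(reformulate(rhs, response = "mpg"), data = mtcars, family = gaussian)
  BIC(fit)
})

# Assumed approximation: posterior probability proportional to exp(-BIC / 2).
# Subtracting min(bic) first avoids numerical underflow and cancels on normalising.
post <- exp(-(bic - min(bic)) / 2)
post <- post / sum(post)

# Candidate models sorted by decreasing posterior probability.
models <- data.frame(inclusion, BIC = bic, postprob = post)
models <- models[order(-models$postprob), ]

# Posterior inclusion probability: total probability of models containing each variable.
pip <- colSums(as.matrix(inclusion) * post)

print(models)
print(pip)
```

With three candidates the full model space has only eight members, which is why enumeration is practical for a small number of variables. The space doubles with each added variable, which motivates a stochastic search for larger problems.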

The function `glm.best` can be used to fit the best model, chosen either by posterior probability or by controlling the variables' posterior inclusion probabilities.

VariableSelection documentation built on Feb. 17, 2026, 5:07 p.m.