modelselect.glm: Variable selection for generalized linear models

View source: R/modelselection_glm.R

modelselect.glm    R Documentation

Variable selection for generalized linear models

Description

Performs variable selection for generalized linear models using the Bayesian information criterion (BIC).

Usage

modelselect.glm(
  formula,
  data,
  family,
  GA_var = 16,
  maxiterations = 2000,
  runs_til_stop = 1000,
  monitor = TRUE,
  popSize = 100,
  verbose = TRUE
)
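A minimal sketch of a call, assuming the package is installed under the name VariableSelection and that the arguments behave as documented on this page. The simulated data frame and its column names (x1, x2, x3, y) are hypothetical and exist only for illustration.

```r
# Simulated data (hypothetical): x1 drives the binary outcome, x2 and x3 are noise.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * dat$x1))

# Run only if the package is available. Three predictors is below the
# default GA_var = 16, so the documented behavior is an exhaustive search.
if (requireNamespace("VariableSelection", quietly = TRUE)) {
  fit <- VariableSelection::modelselect.glm(
    y ~ x1 + x2 + x3,
    data = dat,
    family = "binomial",
    monitor = FALSE,
    verbose = FALSE
  )
  print(head(fit$models))   # candidate models with BIC and posterior probability
  print(fit$variables)      # posterior inclusion probability per variable
}
```

The components models and variables are the ones listed under Value; their exact column names are not stated here, so the sketch prints them rather than indexing into them.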

Arguments

formula

an object of class "formula": a symbolic description of the model to be fitted. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
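The operator rules above can be checked directly with base R's terms(), which expands a formula into its term labels. This is a small illustration using the placeholder names y, a and b; it does not call modelselect.glm.

```r
# Expand a formula into the labels of the terms it specifies.
expand_terms <- function(f) attr(terms(f), "term.labels")

plus_terms  <- expand_terms(y ~ a + b + a)     # duplicates are removed
cross_terms <- expand_terms(y ~ a * b)         # main effects plus interaction
long_terms  <- expand_terms(y ~ a + b + a:b)   # the explicit equivalent of a * b

print(plus_terms)
print(cross_terms)
print(identical(cross_terms, long_terms))
```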

data

a data frame containing the variables in the model.

family

a character string naming a family function describing the error distribution to be used in the model.

GA_var

the threshold on the number of candidate variables. If the number of variables is smaller than GA_var, an exhaustive model search is performed; otherwise a genetic algorithm (GA) is used to perform a stochastic model search.
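To see why a threshold is useful, note that an exhaustive search over p candidate variables visits 2^p models. The sketch below enumerates every subset of three predictors with base R and scores each by BIC. It is an illustration of exhaustive search in general, not the package's internal code; the mtcars data and the chosen predictors are assumptions made for the example.

```r
# Candidate predictors from the built-in mtcars data (illustrative choice).
candidates <- c("wt", "hp", "qsec")
p <- length(candidates)

# Each row of `inclusion` is one subset: TRUE means the variable is in the model.
inclusion <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), p)))
colnames(inclusion) <- candidates

# Fit one GLM per subset and record its BIC.
bic <- apply(inclusion, 1, function(keep) {
  rhs <- if (any(keep)) candidates[keep] else "1"   # intercept-only model
  fit <- glm(reformulate(rhs, response = "mpg"),
             data = mtcars, family = gaussian())
  BIC(fit)
})

print(nrow(inclusion))                          # 2^p candidate models
best <- candidates[inclusion[which.min(bic), ]]
print(best)                                     # variables in the lowest-BIC model
```

With 15 variables this loop would already fit 32768 models, which is why a stochastic search becomes attractive as p grows.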

maxiterations

the maximum number of iterations to run before the GA search is halted.

runs_til_stop

the number of consecutive generations without any improvement in the best fitness value before the GA is stopped.
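The interaction between maxiterations and runs_til_stop can be sketched as a simple loop. This is a hedged illustration of the stopping rule as described above, not the package's GA implementation; the function run_search and the fitness sequence are hypothetical.

```r
# Hypothetical best-fitness value per generation (higher is better).
best_by_generation <- c(1, 2, 2, 3, 3, 3, 3, 3, 4)

run_search <- function(fitness, maxiterations, runs_til_stop) {
  best <- -Inf
  stalled <- 0                      # consecutive generations without improvement
  for (gen in seq_len(min(maxiterations, length(fitness)))) {
    if (fitness[gen] > best) {
      best <- fitness[gen]
      stalled <- 0                  # an improvement resets the counter
    } else {
      stalled <- stalled + 1
    }
    if (stalled >= runs_til_stop) break
  }
  list(generations = gen, best = best)
}

res <- run_search(best_by_generation, maxiterations = 2000, runs_til_stop = 3)
print(res)
```

In this toy sequence the search halts before reaching the later improvement to 4, which shows the trade-off: a small runs_til_stop saves time but can stop too early.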

monitor

a logical, defaulting to TRUE, that controls whether the evolution of the search is shown. If monitor = FALSE, any output is suppressed.

popSize

the population size of the GA.

verbose

Logical; if TRUE, print a brief summary of results.

Value

modelselect.glm returns a list containing the following components:

models

A data frame of the candidate models' BIC values and posterior probabilities, sorted by decreasing posterior probability.
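One common way to turn BIC values into posterior model probabilities is the approximation that weights each model by exp(-BIC/2) under equal prior probabilities. Whether the package uses exactly this formula is an assumption; the sketch below only shows how such a table could be built, with invented BIC values and model names.

```r
# Hypothetical BIC values for three candidate models.
bic <- c(m1 = 100, m2 = 102, m3 = 110)

# Subtract the minimum before exponentiating for numerical stability.
w <- exp(-(bic - min(bic)) / 2)
post_prob <- w / sum(w)

models <- data.frame(model = names(bic), BIC = bic, post_prob = post_prob,
                     row.names = NULL)

# Sort by decreasing posterior probability, as in the documented output.
models <- models[order(models$post_prob, decreasing = TRUE), ]
print(models)
```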

variables

A data frame of the candidate variables' posterior inclusion probabilities.
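A posterior inclusion probability is usually defined as the total posterior probability of all models that contain the variable. The sketch below applies that standard definition to a small invented model table; the variable names and probabilities are hypothetical, and the package's own computation may differ in detail.

```r
# Hypothetical model table: which variables each model includes.
inclusion <- rbind(
  c(x1 = TRUE,  x2 = FALSE, x3 = FALSE),
  c(x1 = TRUE,  x2 = TRUE,  x3 = FALSE),
  c(x1 = FALSE, x2 = TRUE,  x3 = TRUE)
)

# Hypothetical posterior probability of each model (one per row, summing to 1).
post_prob <- c(0.6, 0.3, 0.1)

# Multiply each row by its model probability, then sum over models per variable.
pip <- colSums(inclusion * post_prob)
print(pip)
```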

data

The data containing the variables that appear in the formula.

The function glm.best can be used to obtain the fit of the best model, selected either by posterior probability or by controlling the variables' posterior inclusion probabilities.


VariableSelection documentation built on Feb. 17, 2026, 5:07 p.m.