select() implements a genetic algorithm for variable selection in regression and returns the regression model it selects.
data | The dataset to perform regression on.
response | A character string giving the name of the response variable.
covariates | A character vector of names of the predictor variables (covariates).
criterion | The objective criterion used to score candidate models. AIC by default, but the user can provide their own.
family | A character string naming the family function to use in the model (passed to glm). Common families include "gaussian" (identity link), "binomial" (logit link), and "poisson" (log link).
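As a rough illustration of what the criterion and family arguments control, the sketch below scores two candidate models directly with glm, AIC and BIC from base R. It does not call select(); the two covariate subsets are arbitrary choices made for this example, and building the formula with reformulate() is an assumption about how names could be turned into a model, not a description of the package internals.

```r
# Candidate covariate subsets (hypothetical, chosen only for illustration).
response   <- "mpg"
subset_one <- c("wt")
subset_two <- c("wt", "hp", "qsec")

# Build a formula from character names, then fit with a family given as a string.
fit_one <- glm(reformulate(subset_one, response = response),
               data = mtcars, family = "gaussian")
fit_two <- glm(reformulate(subset_two, response = response),
               data = mtcars, family = "gaussian")

# Score each candidate; lower is better for both criteria.
aic_scores <- c(one = AIC(fit_one), two = AIC(fit_two))
bic_scores <- c(one = BIC(fit_one), two = BIC(fit_two))
print(aic_scores)
print(bic_scores)
```

BIC penalizes each additional parameter more heavily than AIC once the sample has more than about seven observations, so the two criteria can prefer different subsets.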
This implementation of the genetic algorithm uses a generation size of p = ceiling(1.5*c/2)*2, i.e. 1.5*c rounded up to the nearest even number, where c is the length of the chromosomes (the number of covariates to consider in the model).
Parent chromosomes are chosen by rank-based selection. Parent 1 is drawn with probability proportional to its rank, 2r/(p*(p+1)), where r is the chromosome's rank within the generation (higher is better). Parent 2 is drawn completely at random.
Each chromosome is mutated with probability 1/c, a rate that has been supported by theoretical work and empirical studies.
The algorithm stops when the objective criterion score (AIC by default) converges absolutely: once the absolute difference between the score from iteration i-1 and the score from iteration i is less than .000001, it returns the best model from iteration i.
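The quantities described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's code: the names c_len, population, scores, mutate and converged are hypothetical, the scores are random placeholders, and the mutation step assumes that a mutated chromosome has one randomly chosen position flipped (the text above does not say how a mutation alters a chromosome).

```r
set.seed(1)

c_len <- 10                          # chromosome length: number of covariates
p <- ceiling(1.5 * c_len / 2) * 2    # generation size: 1.5 * c rounded up to even

# Hypothetical generation: one row per chromosome, 1 = covariate included.
population <- matrix(rbinom(p * c_len, 1, 0.5), nrow = p, ncol = c_len)

# Placeholder criterion scores (lower is better, as with AIC).
scores <- rnorm(p)

# Rank the chromosomes so that the best (lowest score) gets the highest rank p.
r <- rank(-scores, ties.method = "first")

# Rank-based selection probabilities for parent 1: 2r / (p * (p + 1)).
prob_parent1 <- 2 * r / (p * (p + 1))

parent1 <- sample(p, 1, prob = prob_parent1)  # rank-weighted draw
parent2 <- sample(p, 1)                       # uniform draw

# Assumed mutation: with probability 1/c, flip one randomly chosen position.
mutate <- function(chromosome) {
  if (runif(1) < 1 / length(chromosome)) {
    pos <- sample(length(chromosome), 1)
    chromosome[pos] <- 1 - chromosome[pos]
  }
  chromosome
}

# Absolute convergence check between consecutive iterations.
converged <- function(prev_score, curr_score, tol = 1e-6) {
  abs(curr_score - prev_score) < tol
}

mutated <- mutate(population[parent1, ])
```

Because the ranks 1 through p sum to p*(p+1)/2, the selection probabilities sum to 1 without any further normalization.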
The regression model selected by the genetic algorithm. This is an object of classes "glm" and "lm".
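Since the returned object has classes "glm" and "lm", the usual model methods should apply to it. The sketch below uses a model fitted directly with glm as a stand-in for the value returned by select(); the choice of covariates is arbitrary and made only for this illustration.

```r
# Stand-in for a model returned by select(): a glm fitted directly.
fit <- glm(mpg ~ wt + hp, data = mtcars, family = "gaussian")

class(fit)                                 # class vector of a glm fit
coefs <- coef(fit)                         # estimated coefficients
preds <- predict(fit, newdata = head(mtcars, 3), type = "response")

print(coefs)
print(preds)
summary(fit)                               # standard glm summary
```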
data <- mtcars
response <- names(mtcars)[1]
covariates <- names(mtcars)[-1]
select(data, response, covariates)
# How to perform logistic regression with select()
response <- "am"
covariates <- c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "gear", "carb")
select(data, response, covariates, family = "binomial")
# You can also use another objective function instead of AIC (default)
response <- "mpg"
covariates <- c("cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am")
select(data, response, covariates, criterion = "BIC")