select implements a genetic algorithm for variable selection in GLMs by optimizing a package-provided or user-specified objective function such as AIC, BIC, or log-likelihood.
Uses functions: generate_founders, evaluate_fitness, and create_next_generation.
The functions search for the optimal set of variables using the evolutionary biology concepts of natural selection, fitness, genetic crossover, and mutation. A founding generation of chromosomes is randomly generated and evaluated using a criterion such as AIC, BIC, or log-likelihood. Parents are selected according to their fitness and generate child chromosomes. As each generation breeds and produces new generations, the algorithm moves towards the optimum.
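The loop described above can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: it does not call generate_founders, evaluate_fitness, or create_next_generation, and the population size, number of generations, rank-based selection weights, and single-point crossover are all assumptions chosen for brevity.

```r
# Illustrative GA for variable selection (assumed design, not the package code)
set.seed(1)
y <- mtcars$mpg
X <- mtcars[, 2:11]
C <- ncol(X)   # number of predictors = chromosome length
P <- 20        # population size (hypothetical choice)

# Fitness of one chromosome: AIC of the GLM fitted on the selected predictors
fitness <- function(chrom) {
  if (!any(chrom)) {
    return(stats::AIC(stats::glm(y ~ 1, family = "gaussian")))
  }
  dat <- data.frame(y = y, X[, chrom, drop = FALSE])
  stats::AIC(stats::glm(y ~ ., data = dat, family = "gaussian"))
}

# Founding generation: P random logical chromosomes (TRUE = predictor included)
pop <- matrix(runif(P * C) < 0.5, nrow = P, ncol = C)
mu <- 1 / (P * sqrt(C))   # mutation rate

best_aic <- Inf
for (gen in 1:30) {
  aic <- apply(pop, 1, fitness)
  if (min(aic) < best_aic) {
    best_aic <- min(aic)
    best <- pop[which.min(aic), ]
  }
  # Rank-based selection: the lowest AIC receives the largest weight
  w <- rank(-aic)
  w <- w / sum(w)
  children <- matrix(FALSE, nrow = P, ncol = C)
  for (i in seq_len(P)) {
    parents <- sample(P, 2, prob = w)
    cut_point <- sample(C - 1, 1)   # single-point crossover
    child <- c(pop[parents[1], 1:cut_point],
               pop[parents[2], (cut_point + 1):C])
    flip <- runif(C) < mu           # mutation: flip each gene with probability mu
    child[flip] <- !child[flip]
    children[i, ] <- child
  }
  pop <- children
}

names(X)[best]   # predictors in the best chromosome seen
best_aic
```

Tracking the best chromosome across generations, rather than reading it off the final population, guards against losing a good solution to crossover or mutation.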
select(Y, X, family = "gaussian", objective_function = stats::AIC,
  crossover_parents_function = crossover_parents,
  crossover_method = c("method1", "method2", "method3"), pCrossover = 0.8,
  start_chrom = NULL, mutation_rate = NULL, converge = TRUE,
  tol = 1e-04, iter = 100, minimize = TRUE, nCores = 1L)
Y |
vector of response variable |
X |
a matrix or data frame of predictor variables |
family |
a character string describing the error distribution and link function to be used in the model. Default is gaussian. |
objective_function |
function for computing objective. Default is |
crossover_parents_function |
a function for crossover between mate pairs. User can specify custom function. Default is |
crossover_method |
a character string describing crossover method. Default is multi-point crossover. See |
pCrossover |
a numeric value for the probability of crossover for each mate pair. |
start_chrom |
a numeric value for the size of the population of chromosomes. Default is |
mutation_rate |
a numeric value for rate of mutation. Default is 1 / (P √ C), where P is number of chromosomes, and C is number of predictors. |
converge |
a logical value indicating whether algorithm should attempt to converge or run for specified number of iterations. If |
tol |
a numeric value indicating convergence tolerance. Default is 1e-4. |
iter |
an integer specifying the maximum number of generations the algorithm will produce. Default is 100. |
minimize |
a logical value indicating whether the objective function should be minimized (TRUE) or maximized (FALSE). |
nCores |
an integer indicating the number of parallel processes to run when evaluating fitness. Default is 1 (no parallelization). See |
If the user wants to use a custom objective_function, they must use a function that is compatible with |
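As a hedged illustration of a custom objective_function: assuming the function is called on a fitted glm object and returns a single number, the way the default stats::AIC does, a small-sample corrected AIC could be written as below. The name aicc is hypothetical and is not part of the package.

```r
# Hypothetical custom objective: AIC with a small-sample correction (AICc).
# Assumption: select() calls the objective on a fitted glm object, as stats::AIC expects.
aicc <- function(fit) {
  k <- attr(stats::logLik(fit), "df")   # number of estimated parameters
  n <- stats::nobs(fit)                 # number of observations
  stats::AIC(fit) + (2 * k * (k + 1)) / (n - k - 1)
}

# Check the function on an ordinary glm fit before handing it to select()
fit <- stats::glm(mpg ~ wt + hp, data = mtcars, family = "gaussian")
aicc(fit)

# Possible use with the interface shown above (not run):
# select(y, x, family = "gaussian", objective_function = aicc, minimize = TRUE)
```

Because AICc is a smaller-is-better criterion like AIC, minimize would stay TRUE; a criterion such as log-likelihood would need minimize = FALSE.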
1. Geof H. Givens, Jennifer A. Hoeting (2013). Combinatorial Optimization. Chapter 3 of Computational Statistics.
# Simulated data
rm(list = ls())
set.seed(1111)
# simulate data for gaussian GLM
library(simrel)
library(GA)
n <- 100 # number obs
p <- 10 # number predictors
m <- 2 # number relevant latent components
q <- 5 # number relevant predictors
gamma <- 0.2 # speed of decline in eigenvalues
R2 <- 0.5 # theoretical R-squared according to the true linear model
relpos <- base::sample(1:p, m, replace = FALSE) # positions of m
dat <- simrel::simrel(n, p, m, q, relpos, gamma, R2) # generate data
x <- dat$X
y <- dat$Y
## Not run: 
sim_GA <- GA::select(y, x, family = "gaussian", objective_function = stats::AIC,
  crossover_method = "method1", pCrossover = 0.8, converge = TRUE, minimize = TRUE, nCores = 1)
## End(Not run)
# mtcars
data(mtcars)
y <- mtcars$mpg
x <- mtcars[, 2:11]
## Not run: 
GA_mtcars <- GA::select(y, x, family = "gaussian", objective_function = stats::AIC,
  crossover_method = "method1", pCrossover = 0.8, converge = TRUE, minimize = TRUE, nCores = 1)
## End(Not run)