select: Genetic Algorithm


View source: R/select.R

Description

Ranks each model by its fitness and chooses parents from the generation in proportion to their fitness. Crossover and mutation are then applied, and a proportion G of the worst old individuals is replaced by the best new individuals.

Usage

select(X, y, C = ncol(X), family = gaussian, selection = "tournament",
  K = 2, randomness = TRUE, P = 2 * ncol(X), G = 1/P, n_splits = 2,
  op = NULL, fit_func = AIC, max_iter = 100, parallel = TRUE, ...)

Arguments

X:

data frame containing the variables in the model

y:

vector containing the target (response) variable

C:

The length of chromosomes, i.e. the maximum number of possible predictors.

family:

a description of the error distribution and link function to be used in glm.

selection:

selection mechanism. Can be either "proportional" or "tournament".

K:

size of each round of selection when using tournament selection. Must be an integer smaller than generation size.

randomness:

if TRUE, one parent will be selected randomly

P:

population size

G:

proportion of worst-performing parents the user wishes to replace by best offspring

n_splits:

number of crossover points to use in breeding

op:

An optional, user-specified genetic operator function to carry out the breeding.

fit_func:

Function for fitness measurement. Default is AIC.

max_iter:

how many iterations to run before stopping
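
To make the roles of K and n_splits concrete, here is a minimal base R sketch of tournament selection and multi-point crossover. The helper names tournament_pick and crossover are hypothetical and are not taken from the package source; the sketch assumes that chromosomes are logical vectors of length C, that a lower fitness value (such as AIC) is better, and that n_splits is at most C - 1.

```r
# Hypothetical helpers for illustration; not the package's internals.

# Tournament selection: draw K individuals at random and return the index
# of the one with the lowest fitness value (lower AIC is better).
tournament_pick <- function(fitness, K) {
  contenders <- sample.int(length(fitness), K)
  contenders[which.min(fitness[contenders])]
}

# Crossover with n_splits cut points: the child starts as a copy of
# parent a and switches to the other parent after every cut point.
crossover <- function(a, b, n_splits) {
  C <- length(a)
  # Distinct cut positions; a cut at position k falls between genes k and k + 1.
  cuts <- sort(sample.int(C - 1, n_splits))
  # Segment index of every gene: 0 before the first cut, 1 after it, and so on.
  segment <- findInterval(seq_len(C), cuts + 1)
  ifelse(segment %% 2 == 0, a, b)
}

set.seed(1)
fitness <- c(10, 3, 7, 5)
winner <- tournament_pick(fitness, K = 2)
child <- crossover(rep(TRUE, 10), rep(FALSE, 10), n_splits = 2)
```

With K equal to the number of individuals the tournament always returns the overall best one, so smaller values of K keep more diversity among the selected parents.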

Details

First, the algorithm sets up the first generation of P models by randomly selecting features for each member of the generation. It then calculates the fitness of each model in the generation and ranks the models by their fitness. Parents are chosen according to the selection mechanism, offspring are produced by crossover and mutation, and a proportion G of the worst individuals is replaced by the best offspring. These steps are repeated until the maximum number of iterations, max_iter, is reached. The feature set corresponding to the lowest AIC is then returned.
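
The loop described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: toy_ga is a hypothetical name, parents are drawn with rank-based weights, breeding uses a single crossover point with a mutation rate of 1/C, all offspring replace the worst individuals, fitness is the AIC of a Gaussian glm, and X is assumed to have at least two columns.

```r
# Illustrative sketch only; toy_ga and its internals are hypothetical.
toy_ga <- function(X, y, P = 2 * ncol(X), G = 0.5, max_iter = 20) {
  C <- ncol(X)
  # Fitness of one chromosome (logical vector of length C): AIC of the
  # model that uses only the selected columns.
  fitness <- function(chrom) {
    if (!any(chrom)) return(AIC(glm(y ~ 1, family = gaussian)))
    dat <- data.frame(y = y, X[, chrom, drop = FALSE])
    AIC(glm(y ~ ., data = dat, family = gaussian))
  }
  # First generation: every gene is switched on with probability 1/2.
  pop <- matrix(runif(P * C) < 0.5, nrow = P)
  fit <- apply(pop, 1, fitness)
  best <- pop[which.min(fit), ]
  best_fit <- min(fit)
  n_new <- max(1, round(G * P))  # number of individuals replaced per iteration
  for (iter in seq_len(max_iter)) {
    # Rank the generation, best (lowest AIC) first.
    ord <- order(fit)
    pop <- pop[ord, , drop = FALSE]
    fit <- fit[ord]
    # Breed n_new children: rank-weighted parents, one cut point, mutation.
    children <- t(replicate(n_new, {
      parents <- sample.int(P, 2, prob = rev(seq_len(P)))
      cut <- sample.int(C - 1, 1)
      child <- c(pop[parents[1], seq_len(cut)], pop[parents[2], (cut + 1):C])
      xor(child, runif(C) < 1 / C)  # flip each gene with probability 1/C
    }))
    # Replace the worst n_new individuals with the children.
    worst <- (P - n_new + 1):P
    pop[worst, ] <- children
    fit[worst] <- apply(children, 1, fitness)
    # Remember the best individual seen over all iterations.
    if (min(fit) < best_fit) {
      best_fit <- min(fit)
      best <- pop[which.min(fit), ]
    }
  }
  list(features = colnames(X)[best], aic = best_fit)
}

set.seed(1)
res <- toy_ga(mtcars[-1], mtcars$mpg, P = 10, max_iter = 5)
res$features
res$aic
```

Tracking the best individual separately from the current population mirrors the Value section: the result is the best feature set seen over all iterations, not merely the best member of the final generation.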

Value

The best individual seen over all iterations. The best individual is characterized as the feature set that best explains the data.

Examples

# Example 1: select predictors of mpg in mtcars using tournament selection
x <- mtcars[-1]
y <- unlist(mtcars[1])
select(x, y, selection = "tournament", K = 5, randomness = TRUE, G = 0.8)

# Example 2: simulated data with 6 real predictors and 34 noise columns
set.seed(1)
n <- 500
C <- 40
X <- matrix(rnorm(n * C), nrow = n)
beta <- c(88, 0.1, 123, 4563, 1.23, 20)
y <- X[ ,1:6] %*% beta
colnames(X) <- c(paste("real", 1:6, sep = ""),
                 paste("noi", 1:34, sep = ""))
# The argument is named n_splits (see Usage)
o1 <- select(X, y, n_splits = 3, max_iter = 10)
o2 <- select(X, y, selection = "proportional", n_splits = 3)

kunaljaydesai/GA documentation built on May 28, 2019, 7:38 a.m.