select: Uses a genetic algorithm for variable selection in either lm...

Description

Uses a genetic algorithm to perform variable selection for either lm or glm models.

Usage

select(dat)

Arguments

dat

data frame containing the response variable and the candidate predictors. The first column should be the response variable.

P

number of chromosomes, i.e. the size of each generation. If not specified, defaults to 1.5 * C, where C is the chromosome length.

numGens

total number of generations; default is 100.

G

the proportion of the current generation to be replaced by offspring when constructing the next generation; should be a number in the range (0, 1]. Default is 0.2 (20 percent).

fitnessFunction

fitness function that takes an lm or glm model and returns a numeric fitness score for that model. Users can choose AIC or BIC, or define their own function, but must make sure that a lower fitness score indicates a better model.

method

the selection mechanism used to choose parents, an integer from 1 to 3: 1 selects both parents with probability proportional to rank; 2 selects one parent with probability proportional to rank and the other at random; 3 uses tournament selection. Default is 1.

model

the type of model to fit to the data, either lm or glm; default is lm.

K

number of groups to partition the population into; default is 2.

verbose

logical; if TRUE (default), prints the progress of the algorithm.

...

additional arguments to pass to regression model
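
As an illustration of the fitnessFunction argument, the sketch below shows two built-in criteria and one user-defined score. The name negAdjR2 is hypothetical and not part of the package; it negates the adjusted R-squared so that a lower score still means a better model, and it only makes sense for lm fits.

```r
# A reference model fitted on a built-in dataset.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# AIC and BIC from the stats package already follow the
# "lower is better" convention described for fitnessFunction.
AIC(fit)
BIC(fit)

# Hypothetical user-defined fitness (an assumption, not package code):
# negate the adjusted R-squared so that better models score lower.
# summary.glm does not report adj.r.squared, so this is for lm only.
negAdjR2 <- function(model) {
  -summary(model)$adj.r.squared
}
negAdjR2(fit)
```

Any function with this shape (one fitted model in, one number out) should be a candidate for fitnessFunction, provided smaller values are better.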

Details

The algorithm (1) initializes the population; then, for each generation, it (2) calculates the fitness of the models and selects parent pairs to breed, (3) breeds the parent pairs to obtain the children, and (4) replaces the least fit individuals in the current generation with the children to obtain the next generation.
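
The four steps can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the single-point crossover, the 1 percent mutation rate, the rounding of G * P, and all variable names other than the documented argument names are assumptions made for the example. Parents are drawn with probability proportional to rank, as in method 1.

```r
set.seed(1)
dat <- mtcars                       # first column (mpg) is the response
C <- ncol(dat) - 1                  # chromosome length: one gene per predictor
P <- ceiling(1.5 * C)               # population size (documented default)
numGens <- 20
G <- 0.2
nReplace <- max(1, ceiling(G * P))  # children per generation (assumed rounding)

# Fitness of one chromosome: AIC of the lm that uses the switched-on predictors.
fitnessOf <- function(chrom) {
  vars <- names(dat)[-1][chrom == 1]
  rhs <- if (length(vars) > 0) paste(vars, collapse = " + ") else "1"
  AIC(lm(as.formula(paste(names(dat)[1], "~", rhs)), data = dat))
}

# (1) Initialize a random population of 0/1 chromosomes, one per row.
pop <- matrix(rbinom(P * C, 1, 0.5), nrow = P, ncol = C)
history <- matrix(NA_real_, nrow = numGens, ncol = P)

for (g in seq_len(numGens)) {
  # (2) Score every model; the lowest AIC receives the largest rank weight.
  fit <- apply(pop, 1, fitnessOf)
  history[g, ] <- fit
  w <- rank(-fit)

  # (3) Breed: single-point crossover plus rare bit-flip mutation
  #     (both are assumptions for this sketch).
  children <- t(replicate(nReplace, {
    parents <- sample(P, 2, prob = w)
    cut <- sample(C - 1, 1)
    child <- c(pop[parents[1], 1:cut], pop[parents[2], (cut + 1):C])
    flip <- runif(C) < 0.01
    child[flip] <- 1 - child[flip]
    child
  }))

  # (4) Replace the least fit individuals (highest AIC) with the children.
  worst <- order(fit, decreasing = TRUE)[seq_len(nReplace)]
  pop[worst, ] <- children
}

finalFit <- apply(pop, 1, fitnessOf)
best <- pop[which.min(finalFit), ]
names(dat)[-1][best == 1]           # predictors kept by the fittest model
```

Because only the least fit rows are overwritten, the best individual always survives, so the best fitness per generation never gets worse in this sketch.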

Value

Returns a list containing the fittest model and its fitness score, together with a matrix of the population fitness across generations (useful for plotting).
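
One way such a fitness matrix might be plotted is sketched below. The layout of the returned matrix is not documented here, so the example assumes one row per generation and one column per chromosome, and it uses simulated values in a hypothetical fitnessHistory object as a stand-in for real output.

```r
# Simulated stand-in for the returned matrix (assumed layout:
# rows = generations, columns = chromosomes). Not real package output.
set.seed(42)
numGens <- 30
P <- 15
fitnessHistory <- t(sapply(seq_len(numGens), function(g) {
  160 + 20 * exp(-g / 10) + rnorm(P, sd = 2)
}))

# Best (lowest) fitness in each generation.
bestPerGen <- apply(fitnessHistory, 1, min)

# Grey points: every chromosome; black line: the best of each generation.
matplot(fitnessHistory, type = "p", pch = 20, col = "grey60",
        xlab = "Generation", ylab = "Fitness (lower is better)")
lines(seq_len(numGens), bestPerGen, lwd = 2)
```

If the real matrix is stored the other way round, transposing it with t() first should give the same picture.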

Examples

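A hedged usage sketch, assuming the package is installed and loaded. The data preparation runs as written; the calls to select are left commented out because they need the package, and the argument values shown are arbitrary choices based only on the argument names documented above.

```r
# In the built-in longley data the response (Employed) is the last column,
# so move it to the front, as the dat argument expects.
dat <- longley[, c("Employed", setdiff(names(longley), "Employed"))]
names(dat)

## Not run:
# Default settings.
# result <- select(dat)

# BIC as the fitness function, tournament selection, fewer generations.
# result <- select(dat, numGens = 50, G = 0.3, fitnessFunction = BIC,
#                  method = 3, verbose = FALSE)

# Inspect the returned list.
# str(result)
## End(Not run)
```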

mindyyang/GA-R-package- documentation built on May 12, 2019, 12:31 a.m.