Runs a genetic algorithm for variable selection in regression, using the specified arguments and fitness function, and returns the best solution found.
Y: The response vector.
X: The feature matrix.
regType: The model class to fit, either "lm" or "glm". This is used as the initial model in the genetic search. The default is 'lm'.
family: The family passed to 'glm'. One of 'binomial', 'gaussian', 'gamma', 'inverse.gaussian', 'poisson', 'quasi', 'quasibinomial', 'quasipoisson'. The default is 'gaussian'.
fitness: The fitness function used to score every chromosome in a generation. The default is negative 'AIC'. A user-supplied fitness function must return higher values for better candidates.
ranked: Logical; if TRUE, parents are selected based on the rank of their fitness values. The default is TRUE. See page 80 in Givens/Hoeting.
selectionType: The type of selection mechanism, either "oneprop" or "twoprop", which determines how parents are chosen to produce offspring. The default is 'twoprop'. See page 76 of Givens/Hoeting.
elitism: Logical; if TRUE, the fittest individual survives into the next generation. The default is TRUE.
crossoverType: The type of crossover operation, either "single" or "multiple", which determines how offspring are generated by combining parts of the genetic information from their parents. The default is 'single'.
numCrossover: The number of splits for type 'multiple'. The default is 2.
mutationRate: The probability that each gene mutates. Mutation is a genetic operation that changes an offspring chromosome by randomly introducing one or more alleles at loci. The default is 0.01.
maxIter: The maximum number of iterations to run before the GA search is halted. The default is 100.
P: The population size.
seed: The seed for reproducibility.
VERBOSE: Logical; if TRUE, prints which generation the algorithm is currently at.
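The fitness argument asks for a function that returns high values for good candidates. The sketch below shows one way such a function might look, using negative BIC in place of the default negative AIC. It assumes the fitness function is called with a fitted lm or glm object, as the default negative 'AIC' suggests; that calling convention is an assumption, not something confirmed by this page, and the name negBIC is hypothetical.

```r
# Assumption: the fitness function receives a fitted lm/glm object and
# returns a single number, where larger means a better candidate.
negBIC <- function(fit) {
  -BIC(fit)  # BIC is "smaller is better", so negate it
}

# Quick check on a built-in data set
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)
c(small = negBIC(fit_small), large = negBIC(fit_large))
```

Any criterion where smaller is better (AIC, BIC, deviance) needs the same sign flip before it can be used as a fitness value.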
Genetic algorithms (GAs) are stochastic search algorithms that mimic the process of Darwinian natural selection. GAs simulate biological evolution, in which breeding among highly fit organisms ensures that desirable attributes are passed to future generations, thereby providing a set of increasingly good candidate solutions to the optimization problem.
The select function applies genetic algorithms to problems where the decision variables are encoded as binary strings.
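To make the binary encoding concrete, here is a minimal sketch in which each gene switches one column of the feature matrix on or off, and the resulting model is scored with negative AIC. The helper fit_chromosome is hypothetical and written for illustration; it is not taken from the package.

```r
# Hypothetical helper: fit lm on the columns switched on in a binary chromosome.
fit_chromosome <- function(chromosome, Y, X) {
  keep <- chromosome == 1
  if (!any(keep)) {
    return(lm(Y ~ 1))  # empty chromosome: intercept-only model
  }
  dat <- data.frame(Y = Y, X[, keep, drop = FALSE])
  lm(Y ~ ., data = dat)
}

X <- as.matrix(mtcars[, c("wt", "hp", "qsec", "drat")])
Y <- mtcars$mpg

chromosome <- c(1, 1, 0, 0)   # keep wt and hp, drop qsec and drat
fit <- fit_chromosome(chromosome, Y, X)
names(coef(fit))              # "(Intercept)" "wt" "hp"
-AIC(fit)                     # negative AIC, so larger is better
```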
The selection mechanism mimics the process by which parents are chosen to produce offspring. Crossover and mutation operations then produce offspring chromosomes from the chosen parent chromosomes.
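The two operators described above can be sketched as follows for binary chromosomes: a single-point crossover that joins the head of one parent to the tail of the other, and a mutation step that flips each gene independently with probability mutationRate. The function names are hypothetical and the package's own implementation may differ in detail.

```r
# Hypothetical helpers; names and details are illustrative only.
single_crossover <- function(parent1, parent2) {
  n <- length(parent1)                  # assumes at least two genes
  cut <- sample(seq_len(n - 1), 1)      # split point between two genes
  c(parent1[seq_len(cut)], parent2[(cut + 1):n])
}

mutate <- function(chromosome, mutationRate = 0.01) {
  flip <- runif(length(chromosome)) < mutationRate  # genes to mutate
  chromosome[flip] <- 1 - chromosome[flip]          # flip 0 <-> 1
  chromosome
}

set.seed(1)
p1 <- rep(1, 8)
p2 <- rep(0, 8)
child <- mutate(single_crossover(p1, p2), mutationRate = 0.01)
child
```

With a low mutation rate the child is mostly a run of genes from the first parent followed by a run from the second; mutation occasionally flips a gene so that alleles missing from both parents can still appear.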
A rank-based method is applied here to keep the GA from converging to a poor local optimum: by default, parents are chosen based on the rank of their negative AIC values. Any R function that takes as input an individual string representing a potential solution and returns a numerical value describing its "fitness" can serve as the fitness function.
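As a rough illustration of rank-based selection, the sketch below turns fitness values into selection probabilities that depend only on their ordering, so a single chromosome with an extreme fitness value cannot dominate the parent pool. The fitness values are made up, the helper rank_probs is hypothetical, and drawing both parents from these probabilities is just one possible scheme, not necessarily what 'oneprop' or 'twoprop' do.

```r
# Hypothetical helper: selection probability proportional to fitness rank.
rank_probs <- function(fitness) {
  r <- rank(fitness)   # rank 1 = worst, rank P = best; ties share average ranks
  r / sum(r)           # same as 2 * r / (P * (P + 1))
}

fitness <- c(-210.4, -198.7, -250.1, -199.0)  # made-up negative AIC values
probs <- rank_probs(fitness)
probs                                         # 0.2 0.4 0.1 0.3

set.seed(42)
parents <- sample(seq_along(fitness), size = 2, replace = TRUE, prob = probs)
parents                                       # indices of the chosen parents
```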
The population size is typically set between the chromosome length and twice the chromosome length, though the user can override this. In this function, the default population size is twice the chromosome length, where the chromosome length is the number of columns of the feature matrix.
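The default population size described above can be sketched as follows. The helper init_population is hypothetical, and the choice of filling the starting population with independent fair coin flips is an assumption made for illustration.

```r
# Illustrative only: random binary population with P = 2 * C chromosomes,
# where C is the number of columns of the feature matrix.
init_population <- function(X, P = 2 * ncol(X)) {
  C <- ncol(X)
  matrix(rbinom(P * C, size = 1, prob = 0.5),
         nrow = P, ncol = C,
         dimnames = list(NULL, colnames(X)))
}

set.seed(123)
X <- as.matrix(mtcars[, c("wt", "hp", "qsec", "drat", "disp")])
pop <- init_population(X)
dim(pop)   # 10 rows (chromosomes) by 5 columns (genes)
```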
A list describing the final candidate, with the following elements:
variables: The variables that were selected in the final population.
fit: The lm or glm object of the fit with the above variables.
fitness: The value of the fitness.
fitnessType: The type of fitness.
lengthElitism: If elitism was selected, the length of elitism (gives an idea of convergence).