genAlgControl | R Documentation |
The population must be large enough to allow the algorithm to explore the whole solution space. If the initial population is not diverse enough, the chance of finding the global optimum is very small. Thus, the more variables there are to choose from, the larger the population has to be.
genAlgControl(
populationSize,
numGenerations,
minVariables,
maxVariables,
elitism = 10L,
mutationProbability = 0.01,
crossover = c("single", "random"),
maxDuplicateEliminationTries = 0L,
verbosity = 0L,
badSolutionThreshold = 2,
fitnessScaling = c("none", "exp")
)
populationSize |
The number of "chromosomes" in the population (between 1 and 2^16) |
numGenerations |
The number of generations to produce (between 1 and 2^16) |
minVariables |
The minimum number of variables in the variable subset (between 0 and p - 1 where p is the total number of variables) |
maxVariables |
The maximum number of variables in the variable subset (between 1 and p, and greater than |
elitism |
The number of absolute best chromosomes to keep across all generations (between 1 and min( |
mutationProbability |
The probability of mutation (between 0 and 1) |
crossover |
The crossover type to use during mating (see details). Partial matching is performed |
maxDuplicateEliminationTries |
The maximum number of tries to eliminate duplicates
(a value of |
verbosity |
The level of verbosity. 0 means no output at all, 2 is very verbose. |
badSolutionThreshold |
The worst child must not be more than |
fitnessScaling |
How the fitness values are internally scaled before the selection probabilities are assigned to the chromosomes. See the details for possible values and their meaning. |
The initial population is generated randomly. Every chromosome uses between minVariables and maxVariables variables (the number of variables is uniformly distributed).
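The random initialisation described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper name randomChromosome and the representation of a chromosome as a logical vector of length p are hypothetical.

```r
# Hypothetical helper (not part of the package): draw one random chromosome
# as a logical vector of length p, where TRUE marks a selected variable.
randomChromosome <- function(p, minVariables, maxVariables) {
  # Draw the subset size uniformly from minVariables..maxVariables
  sizes <- minVariables:maxVariables
  k <- sizes[sample.int(length(sizes), 1L)]
  # Select k distinct variables at random
  chromosome <- logical(p)
  chromosome[sample.int(p, k)] <- TRUE
  chromosome
}

set.seed(1)
# 100 chromosomes over 50 variables, one chromosome per row
population <- t(replicate(100, randomChromosome(p = 50, minVariables = 5,
                                                maxVariables = 12)))
subsetSizes <- rowSums(population)
table(subsetSizes)
```

The table of subset sizes should be roughly flat between 5 and 12, which is what "uniformly distributed" refers to here.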
If the mutation probability (mutationProbability) is greater than 0, a random number of variables is added to or removed from each offspring chromosome according to a truncated geometric distribution.
The resulting distribution of the total number of variables in the subset is then no longer exactly uniform, but close to it (the smaller the mutation probability, the more "uniform" the distribution). This should not be a problem for most
applications.
The user can choose between single and random crossover for the mating process. If single crossover
is used, a single position is chosen at random and both parent chromosomes are split at this position. The first child
is then the concatenation of the 1st part of the 1st parent and the 2nd part of the 2nd parent; the second child
combines the 1st part of the 2nd parent with the 2nd part of the 1st parent.
With random crossover, a random number of random positions is drawn and the values at these positions are exchanged
between the parents in order to generate the children.
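The two crossover types can be illustrated with a small base R sketch. The functions singleCrossover and randomCrossover are hypothetical names used only for this illustration. They operate on logical vectors and do not enforce the minVariables/maxVariables limits, so they should not be read as the package's actual implementation.

```r
# Hypothetical helper: single crossover at one random split position.
singleCrossover <- function(parent1, parent2) {
  p <- length(parent1)
  cutPoint <- sample.int(p - 1L, 1L)   # split after this position
  first <- seq_len(cutPoint)
  list(
    child1 = c(parent1[first], parent2[-first]),
    child2 = c(parent2[first], parent1[-first])
  )
}

# Hypothetical helper: exchange a random number of random positions.
randomCrossover <- function(parent1, parent2) {
  p <- length(parent1)
  n <- sample.int(p, 1L)               # how many positions to exchange
  swap <- sample.int(p, n)             # which positions to exchange
  child1 <- parent1
  child2 <- parent2
  child1[swap] <- parent2[swap]
  child2[swap] <- parent1[swap]
  list(child1 = child1, child2 = child2)
}

set.seed(42)
parent1 <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
parent2 <- c(FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
kidsSingle <- singleCrossover(parent1, parent2)
kidsRandom <- randomCrossover(parent1, parent2)
```

In both cases every position of the two children holds exactly the values that the two parents held at that position; only the assignment of values to children differs.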
Elitism is a method of enhancing the GA by keeping track of very good solutions. The parameter elitism
specifies how many "very good" solutions should be kept.
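As a rough illustration of the idea, the sketch below keeps the best solutions seen so far while new generations arrive. It assumes that a higher fitness is better and that each chromosome can be identified by a string id; the helper updateElite and the data layout are hypothetical and not taken from the package.

```r
# Hypothetical helper: merge the current elite with a new generation and
# keep only the `elitism` best distinct solutions (higher fitness = better).
updateElite <- function(elite, candidates, elitism) {
  pool <- rbind(elite, candidates)
  pool <- pool[!duplicated(pool$id), , drop = FALSE]
  pool <- pool[order(pool$fitness, decreasing = TRUE), , drop = FALSE]
  head(pool, elitism)
}

elite <- data.frame(id = character(0), fitness = numeric(0),
                    stringsAsFactors = FALSE)
generation1 <- data.frame(id = c("a", "b", "c"), fitness = c(0.3, 0.8, 0.5),
                          stringsAsFactors = FALSE)
generation2 <- data.frame(id = c("d", "e", "b"), fitness = c(0.9, 0.1, 0.8),
                          stringsAsFactors = FALSE)

elite <- updateElite(elite, generation1, elitism = 2L)
elite <- updateElite(elite, generation2, elitism = 2L)
elite$id
```

After the second generation the elite contains "d" and "b": the best solution of the first generation survives even though most of the second generation is worse.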
Before the selection probabilities are determined, the fitness values f of the chromosomes are
standardized to z-scores (z = (f - mu) / sd). Scaling the standardized fitness values afterwards with
the exponential function can help the algorithm find good solutions faster. When
fitnessScaling is set to "exp", the (standardized) fitness z is scaled by exp(z).
This gives good solutions an even higher selection probability, while bad solutions
get an even lower selection probability.
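The effect of the exponential scaling can be seen in a small numeric sketch. The fitness values are made up, and the last step assumes that selection probabilities are proportional to the scaled fitness; the text above does not spell out that step, so treat it as an assumption.

```r
# Made-up fitness values for four chromosomes (higher = better)
fitness <- c(0.2, 0.5, 0.6, 0.9)

# Standardize to z-scores: z = (f - mu) / sd
z <- (fitness - mean(fitness)) / sd(fitness)

# fitnessScaling = "exp": scale the standardized fitness by exp(z)
scaled <- exp(z)

# Assumption: selection probability proportional to the scaled fitness
selectionProbability <- scaled / sum(scaled)

round(cbind(fitness, z, scaled, selectionProbability), 3)
```

Because exp() is convex, the gap between the best and the second-best chromosome grows after scaling, while the differences among the worst chromosomes shrink.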
An object of type GenAlgControl
ctrl <- genAlgControl(populationSize = 100, numGenerations = 15, minVariables = 5,
maxVariables = 12, verbosity = 1)
evaluatorSRCV <- evaluatorPLS(numReplications = 2, innerSegments = 7, testSetSize = 0.4,
numThreads = 1)
evaluatorRDCV <- evaluatorPLS(numReplications = 2, innerSegments = 5, outerSegments = 3,
numThreads = 1)
# Generate demo-data
set.seed(12345)
X <- matrix(rnorm(10000, sd = 1:5), ncol = 50, byrow = TRUE)
y <- drop(-1.2 + rowSums(X[, seq(1, 43, length.out = 8)]) + rnorm(nrow(X), 1.5))
resultSRCV <- genAlg(y, X, control = ctrl, evaluator = evaluatorSRCV, seed = 123)
resultRDCV <- genAlg(y, X, control = ctrl, evaluator = evaluatorRDCV, seed = 123)
subsets(resultSRCV, 1:5)
subsets(resultRDCV, 1:5)