xegaSelectGene: Package xegaSelectGene.

xegaSelectGeneR Documentation

Package xegaSelectGene.

Description

Selection functions for genetic algorithms.

Details

The selectGene package provides selection and scaling functions for genetic algorithms. All functions of this package are independent of the gene representation.

  • Scaling functions and dispersion measures are in scaling.R

  • Selection functions are in selectGene.R. For selection functions, a transformation to index access functions is provided (a limited form of function continuation).

  • Benchmark functions for selection functions are in selectGeneBenchmark.R. Except for uniform selection, the continuation form of selection functions should be used.

  • Evaluation functions are in evalGene.R.

  • Counting and timing of function executions are provided by transformation functions in timer.R

  • Problem environments for examples and unit tests for

    • function optimization: DeJongF4.R (stochastic functions) and Parabola2D.R (delayed execution for benchmarking of parallelism, deterministic function, deterministic function with early termination check, function with random failures)

    • combinatorial optimization: newTSP.R (for the traveling salesman problem).

    • boolean function learning: newXOR.R (for the XOR problem).

Interface Scaling Functions

All scaling functions must implement the following abstract interface:

function name(fit, lF)

Parameters

  • fit A fitness vector.

  • lF Local configuration.

Return Value

Scaled fitness vector.

Interface Dispersion Measures

All dispersion measure functions must implement the following abstract interface:

function name(popstatvec)

Parameters

  • popstatvec Vector of population statistics.

    The internal state of the genetic algorithm is described by a matrix of the history of population statistics. Each row consists of 8 population statistics (mean, min, Q1, median, Q3, max, var, mad). A row is a vector of population statistics.

Return Value

Dispersion measure (real).

Interface Selection Functions

All selection functions must implement the following abstract interface:

function name(fit, lF, size)

Parameters

  • fit a vector of fitness values.

  • lF a local function list.

  • size the number of indices returned.

Return Value

A vector of indices of length size.

All selection functions are implemented WITHOUT a default assignment to lF.

A missing configuration should raise an error!

The default value of size is 1.

Constants

Some scaling and selection functions use constants which should be configured. We handle these constants by constant functions created by parm(constant). We store all of these functions in the list of local functions lF. The rationale is to reduce the number of parameters of selection functions and to provide a uniform interface for selection functions.

Table of Scaling Constants

Constant Default Used in
lF$Offset() 1 ScaleFitness()
lF$ScalingExp() 1 ScalingFitness(),
ThresholdScaleFitness()
lF$ScalingExp2() 1 ThresholdScaleFitness()
lF$ScalingThreshold() 1 ThresholdScaleFitness()
lF$RDMWeight() 1.0 ContinuousScaleFitness()
lF$DRMin() 0.5 DispersionRatio()
lF$DRMax() 2.0 DispersionRatio()
lF$ScalingDelay() 1 DispersionRatio()
State Variable Start Value Used in
lF$RDM() 1.0 ThresholdScaleFitness()
ContinuousScaleFitness()
xega::xegaRun()

Table of Selection Constants

Constant Default Used in
lF$SelectionContinuation() TRUE xegaPopulation::xegaNextPopulation()
lF$Offset() 1 SelectPropFitOnLn()
SelectPropFit()
SelectPropFitM()
SelectPropFitDiffOnLn()
SelectPropFitDiff()
SUS()
SelectLinearRankTSR()
lF$eps() 0.01 SelectPropFitDiffM()
lF$TournamentSize() 2 Tournament()
SelectTournament()
STournament()
SelectSTournament()
lF$SelectionBias() 1.5 SelectLRSelective()
lF$MaxTSR() 1.5 SelectLinearRankTSR()

Parallel/Distributed Execution

All selection functions in this package return

  1. the index of a selected gene. The configured selection function is executed each time a gene must be selected in the gene replication process. This allows a parallelization/distribution of the complete gene replication process and the fitness evaluation. However, the price to pay is a recomputation of the selection algorithms for each gene and each mate (which may be costly). The execution time of Baker's SUS function explodes when used in this way.

  2. a vector of indices of the selected genes. We compute a vector of indices for genes and their mates, and we replace the selection function with a quasi-continuation function with precomputed indices which when called, returns the next index. The selection computation is executed once for each generation without costly recomputation. The cost of selecting a gene and its mate is the cost of indexing an integer in a vector. This version is faster for almost all selection functions (Sequential computation).

    The parallelization of quasi-continuation function is not yet implemented.

Constant Functions for Configuration

The following constant functions are expected to be in the local function list lF.

  • Offset() in SelectPropFit(): Since all fitness values must be larger than 0, in case of negative fitness values, Offset() is the value of the minimum fitness value (default: 1).

  • Eps() in SelectPropFitDiff(): Eps() is a very small value to eliminate differences of 0.

  • TournamentSize() in SelectTournament(): Specifies the size of the tournament. Per default: 2.

  • SelectionBias() in SelectLinearRank(). This constant must be larger than 1.0 and usually should be set at most to 2.0. Increasing SelectionBias() increases selection pressure. Beyond 2.0, there is the danger of premature convergence.

Performance Measurement

The file Timer.R: Functions for timing and counting.

The file selectGeneBenchmark.R: A benchmark of selection functions.

Interface Function Evaluation and Methods

All evaluation functions must implement the following abstract interface:

function name(gene, lF)

Parameters

  • gene a gene.

  • lF a local function list.

Return Value

A gene.

The file evalGene.R contains different function evaluation methods.

  1. EvalGeneU() evaluates a gene unconditionally. (Default.)

  2. EvalGeneR() evaluates a gene unconditionally and allows the repair of the gene by the decoder.

  3. EvalGeneDet() memoizes the evaluation of a gene in the in the gene. Genes are evaluated only once. This leads to a performance improvement for deterministic functions.

  4. EvalGeneStoch() computes an incremental average of the value of a gene. The average converges to the true value as the number of repeated evaluations of a gene increases.

Gene Representation

A gene is a named list:

  • $gene1 the gene.

  • $fit the fitness value of the gene (for EvalGeneDet and EvalGeneU) or the mean fitness (for stochastic functions evaluated with EvalGeneStoch).

  • $evaluated has the gene been evaluated?

  • $evalFail has the evaluation of the gene failed?

  • $var the cumulative variance of the fitness of all evaluations of a gene. (For stochastic functions)

  • $sigma the standard deviation of the fitness of all evaluations of a gene. (For stochastic functions)

  • $obs the number of evaluations of a gene. (For stochastic functions)

The Architecture of the xegaX-Packages

The xegaX-packages are a family of R-packages which implement eXtended Evolutionary and Genetic Algorithms (xega). The architecture has 3 layers, namely the user interface layer, the population layer, and the gene layer:

  • The user interface layer (package xega <https://CRAN.R-project.org/package=xega> ) provides a function call interface and configuration support for several algorithms: genetic algorithms (sga), permutation-based genetic algorithms (sgPerm), derivation-free algorithms as e.g. differential evolution (sgde), grammar-based genetic programming (sgp) and grammatical evolution (sge).

  • The population layer (package xegaPopulation <https://CRAN.R-project.org/package=xegaPopulation> ) contains population-related functionality as well as support for population statistics dependent adaptive mechanisms and parallelization.

  • The gene layer is split into a representation-independent and a representation-dependent part:

    1. The representation-indendent part (package xegaSelectGene <https://CRAN.R-project.org/package=xegaSelectGene> ) is responsible for variants of selection operators, evaluation strategies for genes, and profiling and timing capabilities.

    2. The representation-dependent part consists of the following packages:

      • xegaGaGene <https://CRAN.R-project.org/package=xegaGaGene> for binary coded genetic algorithms.

      • xegaPermGene <https://CRAN.R-project.org/package=xegaPermGene> for permutation-based genetic algorithms.

      • xegaDfGene <https://CRAN.R-project.org/package=xegaDfGene> for derivation-free algorithms as e.g. differential evolution.

      • xegaGpGene <https://CRAN.R-project.org/package=xegaGpGene> for grammar-based genetic algorithms.

      • xegaGeGene <https://CRAN.R-project.org/package=xegaGaGene> for grammatical evolution algorithms.

      The packages xegaDerivationTrees and xegaBNF support the last two packages:

      • xegaBNF <https://CRAN.R-project.org/package=xegaBNF> essentially provides a grammar compiler and

      • xegaDerivationTrees <https://CRAN.R-project.org/package=xegaDerivationTrees> an abstract data type for derivation trees.

Copyright

(c) 2023 Andreas Geyer-Schulz

License

MIT

URL

<https://github.com/ageyerschulz/xegaSelectGene>

Installation

From CRAN by install.packages('xegaSelectGene')

Author(s)

Andreas Geyer-Schulz

See Also

Useful links:


xegaSelectGene documentation built on April 16, 2025, 5:12 p.m.