Recommends regression variables by maximizing a fitness criterion using a genetic algorithm (GA)
x: matrix of dimension n * p
y: vector of length n or a matrix with n rows
model: list (default "glm"): one of ("lm", "glm"), optionally followed by a character string of arguments passed to lm.fit() or glm.fit()
fitMetric: (default "AIC") one of ("AIC", "BIC") or a function that takes a regression object and returns a single number to be maximized
maxGen: integer (default 200) specifying the maximum number of GA generations to run
minGen: integer (default 10) specifying the number of generations without fitness improvement after which the GA stops
gaMethod: list (default 'LR'): one of ('TN', 'LR', 'ER', 'RW'), followed by an additional numerical argument as needed. See gaSelection for details.
pop: integer (default 100) specifying the size of the genotype pool
pMutate: real number between 0 and 1 (default 0.1) specifying the probability of an allele mutation
crossParams: numeric (default c(0.8, 1)): the crossover probability and the maximum number of crossover locations on a single gene
eliteRate: (default 0.1) proportion of the highest-fitness genotypes that pass into the next generation unchanged
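To make the roles of pop, pMutate, crossParams, and eliteRate concrete, the following is a minimal sketch of one GA generation on binary genotypes (1 = variable included). It is not the package's implementation: the function name nextGeneration, the roulette wheel weighting, and the single-point crossover are assumptions chosen for illustration.

```r
# Illustrative sketch only; names and details are assumptions, not package code.
# popMat: one genotype per row, one 0/1 allele per candidate variable.
nextGeneration <- function(popMat, fitness, pMutate = 0.1, pCross = 0.8,
                           eliteRate = 0.1) {
  nPop <- nrow(popMat)
  p <- ncol(popMat)

  # Elitism: the top eliteRate share of genotypes is copied unchanged
  nElite <- ceiling(eliteRate * nPop)
  best <- order(fitness, decreasing = TRUE)[seq_len(nElite)]
  elite <- popMat[best, , drop = FALSE]

  # Roulette wheel weights: shift so that all weights are positive
  w <- fitness - min(fitness) + 1e-8

  nChild <- nPop - nElite
  children <- matrix(0L, nChild, p)
  for (k in seq_len(nChild)) {
    parents <- sample(nPop, 2, replace = TRUE, prob = w)
    child <- popMat[parents[1], ]
    # Single-point crossover, applied with probability pCross
    if (p > 1 && runif(1) < pCross) {
      cut <- sample(p - 1, 1)
      child[(cut + 1):p] <- popMat[parents[2], (cut + 1):p]
    }
    # Mutation: each allele flips independently with probability pMutate
    flip <- runif(p) < pMutate
    child[flip] <- 1L - child[flip]
    children[k, ] <- child
  }
  rbind(elite, children)
}

# Toy run: 20 genotypes over 6 candidate variables, with a placeholder fitness
set.seed(1)
popMat <- matrix(rbinom(120, 1, 0.5), nrow = 20)
fitness <- rowSums(popMat)
newPop <- nextGeneration(popMat, fitness)
dim(newPop)
```

In a real run the placeholder fitness would be replaced by the chosen fitMetric evaluated on the model fitted with each genotype's variables.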
Returns a list of 4 components: optimum, fitPlot, fitStats, and GA
optimum: a list of properties of the genotype achieving maximum fitness
  variables: the recommended set of regression variables
  fitness: the achieved fitness metric
  fitModel: the regression object fitted with the recommended variables
fitPlot: a plot of the mean, median, and maximum fitness over the generations
fitStats: a tibble of the values used to generate the plot
GA: a list of data associated with each generation of the genetic algorithm
  fitness: the fitness measures of the current generation
  elites: the highest-fitness genotypes and their fitness values
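As a rough illustration of how the per-generation fitness values relate to the summary in fitStats, the sketch below builds a small mock of the GA component and computes the mean, median, and maximum fitness per generation. The mock object, its toy numbers, and the use of a base data.frame instead of a tibble are assumptions made so the example runs without the package.

```r
# Hypothetical mock of the documented GA component (toy values, not real output)
ga <- list(
  gen1 = list(fitness = c(-310, -305, -298, -301, -320)),
  gen2 = list(fitness = c(-300, -296, -298, -295, -304)),
  gen3 = list(fitness = c(-294, -295, -293, -295, -296))
)

# One row per generation with the three statistics shown in the fitness plot
fitStats <- data.frame(
  generation = seq_along(ga),
  mean = sapply(ga, function(g) mean(g$fitness)),
  median = sapply(ga, function(g) median(g$fitness)),
  max = sapply(ga, function(g) max(g$fitness)),
  row.names = NULL
)
fitStats
```

A plot comparable to fitPlot could then be drawn from these columns, for example with matplot().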
x <- as.matrix(read.table("data/baseball.dat", header = TRUE))[, -1]
y <- as.matrix(read.table("data/baseball.dat", header = TRUE))[, 1]
# linear regression using roulette wheel parent selection
GA <- select(x, y, model = list("lm"), gaMethod = list("RW"))
# to return just the selected regression variables
GA$optimum$variables
# to return the regression object using the selected variables
GA$optimum$fitModel
# generalized linear regression with Poisson family, using the default ('LR') parent selection
GA <- select(x, y, model = list("glm", "family = poisson()"))
# code for generated data linear regression example
x <- as.matrix(read.table("./data/LRdataTest", header = TRUE))[, -1]
y <- as.matrix(read.table("./data/LRdataTest", header = TRUE))[, 1]
n <- 50
# run the GA n times; each run contributes (variables, fitness, fitModel)
out <- sapply(1:n, FUN = function(i) select(x, y)$optimum)
# fitModel is every third entry of out
coeffs <- sapply(seq(3, 3*n, 3), FUN = function(i) out[[i]]$coefficients)
weights <- c(unlist(sapply(1:n, FUN = function(i) coeffs[[i]])))
# total absolute coefficient per variable across all runs
weights <- sapply(colnames(x), FUN = function(name) sum(abs(weights[names(weights)==name])))
barplot(weights, xlab = "Variables", ylab = "Weights")
# selected variables and coefficients from the first run
vars <- out[[1]]
varCoeffs <- out[[3]]$coefficients
# Code for the baseball dataset example
# (assumes x and y hold the baseball data loaded at the top of the examples)
# rows: selection methods; columns: lm/AIC, lm/BIC, glm/AIC, glm/BIC
maxFits <- matrix(0, 4, 4)
maxIters <- matrix(0, 4, 4)
method <- list(list('TN', 5), list('LR'), list('ER', 0.5), list('RW'))
fit <- c("AIC", "BIC")
for (i in 1:4) {
  for (j in 1:2) {
    trial <- GA::select(x, y, model = list("lm"), fitMetric = fit[j], maxGen = 500L, minGen = 50L,
                        gaMethod = method[[i]], pop = 500L, pMutate = 0.1, crossParams = c(0.8, 1L), eliteRate = 0.1)
    iters <- length(trial$GA)
    # best elite fitness in the final generation
    bestFit <- trial$GA[[paste0("gen", iters)]]$elites[1, 1]
    maxFits[i,j] <- bestFit
    maxIters[i,j] <- iters
  }
  for (j in 3:4) {
    trial <- GA::select(x, y, model = list("glm"), fitMetric = fit[j-2], maxGen = 500L, minGen = 50L,
                        gaMethod = method[[i]], pop = 500L, pMutate = 0.1, crossParams = c(0.8, 1L), eliteRate = 0.1)
    iters <- length(trial$GA)
    bestFit <- trial$GA[[paste0("gen", iters)]]$elites[1, 1]
    maxFits[i,j] <- bestFit
    maxIters[i,j] <- iters
  }
}
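The examples above use only the built-in "AIC" and "BIC" metrics. Since fitMetric is documented as also accepting a function of the regression object, a custom metric might look like the sketch below, which scores a fit by adjusted R-squared (larger is better). The function name adjR2 is hypothetical, and the sketch assumes the regression object exposes residuals, fitted.values, and rank, as the results of lm() and lm.fit() do, and that the model includes an intercept; whether select() passes such an object has not been verified here.

```r
# Hypothetical custom fitness function: adjusted R-squared.
# Assumes fit$residuals, fit$fitted.values and fit$rank exist (true for
# lm() and lm.fit() results) and that the model contains an intercept.
adjR2 <- function(fit) {
  res <- fit$residuals
  y <- fit$fitted.values + res          # recover the response
  n <- length(res)
  rss <- sum(res^2)                     # residual sum of squares
  tss <- sum((y - mean(y))^2)           # total sum of squares
  1 - (rss / (n - fit$rank)) / (tss / (n - 1))
}

# Check on a built-in data set against base R's own value
fitLm <- lm(mpg ~ wt + hp, data = mtcars)
all.equal(adjR2(fitLm), summary(fitLm)$adj.r.squared)

# The same function also works on a bare lm.fit() result
fitRaw <- lm.fit(cbind(1, mtcars$wt, mtcars$hp), mtcars$mpg)
adjR2(fitRaw)
```

If the assumptions hold, it would be supplied as fitMetric = adjR2 in a call such as select(x, y, model = list("lm"), fitMetric = adjR2).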