This is the main function of the GA package. The package consists of a main execution file (select.R) and other R files containing the utility functions it calls. To run the function, the user supplies the name of a dependent variable and a dataset.
select(y, dataset, reg_method = NULL, n_iter = 200, pop_size = 2 * n, objective = "AIC",
       interaction = F, most_sig = F, parent_selection = "prop", nb_groups = 4, generation_gap = 0.25,
       gene_selection = "crossover", nb_pts = 1, mu = 0.3, err = 1e-6)
y
(character) Column name of the dependent variable.
dataset
(data frame) The dataset in matrix form, with the dependent variable as the last column.
reg_method
(character) The method used to fit the data, "lm" or "glm" (default "lm").
n_iter
(int) The maximum number of iterations allowed when running the GA (default 200).
pop_size
(int) The number of individuals per generation (default 2 * number of covariates).
objective
(character) The objective criterion to use (default "AIC").
interaction
(logical) Whether to add interaction terms to the independent variables (default F).
most_sig
(logical) Whether to use the most significant variables inside the first_generation function (default F).
parent_selection
(character) The mechanism used to select parents. Options are "prop", "prop_random", "random" or "tournament" (default "prop").
nb_groups
(int) The number of groups used in tournament selection (default 4).
generation_gap
(numeric) The proportion of individuals to be replaced by offspring (default 0.25).
gene_selection
(function) An additional, user-supplied selection method for choosing genes in the GA. Refer to gene_selection for the required inputs and the expected form of the output. If left unspecified, the algorithm uses a default function controlled by the gene_operator parameter.
gene_operator
(character) The operator used when the user does not provide their own gene_selection method. Options are "crossover" and "random".
nb_pts
(int) The number of points used in crossover (default 1).
mu
(numeric) The mutation rate (default 0.3).
err
(numeric) The convergence threshold: the algorithm stops when the difference between the last iteration and the current one is less than err (default 1e-6).
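To make the nb_pts and mu arguments more concrete, here is a minimal sketch of n-point crossover and bit-flip mutation on binary chromosomes (1 = variable included, 0 = excluded). It is not the package's own code: the helper names crossover_sketch and mutate_sketch are hypothetical, and the sketch assumes mu is applied independently to each gene, which the package may or may not do.

```r
# Hypothetical helpers for illustration only; not part of the GA package.

# n-point crossover: the two children alternate between the parents
# at nb_pts randomly chosen cut points.
crossover_sketch <- function(parent1, parent2, nb_pts = 1) {
  n <- length(parent1)
  stopifnot(length(parent2) == n, nb_pts >= 1, nb_pts < n)
  # Cut points fall between gene i and gene i + 1, for i in 1..(n - 1)
  cuts <- sort(sample.int(n - 1, nb_pts))
  # Segment id of each gene: 0 up to the first cut, 1 after it, and so on
  segment <- findInterval(seq_len(n), cuts + 1)
  swap <- segment %% 2 == 1
  list(child1 = ifelse(swap, parent2, parent1),
       child2 = ifelse(swap, parent1, parent2))
}

# Bit-flip mutation: each gene is flipped with probability mu
# (assumption: the rate is applied per gene).
mutate_sketch <- function(chromosome, mu = 0.3) {
  flip <- runif(length(chromosome)) < mu
  chromosome[flip] <- 1 - chromosome[flip]
  chromosome
}

# Usage: cross two extreme parents, then mutate one child
set.seed(1)
p1 <- rep(1, 6)
p2 <- rep(0, 6)
kids <- crossover_sketch(p1, p2, nb_pts = 1)
mutated <- mutate_sketch(kids$child1, mu = 0.3)
```

With complementary parents as above, the two children are also complementary, which makes it easy to see where the cut fell.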
select returns a list with the following elements:
variables: the names of the selected variables
indices: the indices of the selected variables
linear_model: an lm or glm object
iterations: the number of iterations run before the selection was reached
objective: the value of the objective function for the returned model
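How these elements relate to one another can be sketched in base R without running the genetic algorithm. The example below picks a set of columns by hand and builds a list shaped like the documented return value. It does not call select; the index vector is an arbitrary choice, and it assumes indices count positions among the predictors only, which may differ from the package's convention. The iterations element is omitted because no search is run.

```r
# Illustration only: mimics the shape of the documented return value
# for a hand-picked subset of predictors in mtcars.
y <- "mpg"
predictors <- setdiff(names(mtcars), y)   # all columns except the response
indices <- c(3, 5)                        # assumed selection, chosen by hand
variables <- predictors[indices]          # the corresponding column names

# Fit the model implied by the selection and score it with AIC
linear_model <- lm(reformulate(variables, response = y), data = mtcars)
result <- list(variables = variables,
               indices = indices,
               linear_model = linear_model,
               objective = AIC(linear_model))

result$variables
result$objective
```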
# Basic usage with built-in datasets
select("mpg", mtcars)
library(MASS)  # provides the Boston dataset
select("crim", Boston)

# Simulate a dataset with c covariates and the response Y as the last column
simulation <- function(c, n, beta_0, beta, sigma){
  # c: number of variables (e.g. c = 10)
  # n: total number of observations
  X <- matrix(rep(round(runif(c, min = 1, max = 10)), n) + rnorm(c*n, mean = 0, sd = 0.2),
              nrow = n, byrow = TRUE)
  X_names <- paste0("X", 1:c)
  X_data <- as.data.frame(X)
  colnames(X_data) <- X_names
  # Linear response: intercept + X %*% beta + Gaussian noise
  Y <- rowSums(t(beta*t(X))) + beta_0 + rnorm(n, mean = 0, sd = sigma)
  return(cbind(X_data, Y))
}
# Five non-zero coefficients and five zeros, in random order
test_data <- simulation(10, 100, 1, sample(c(round(runif(10/2, min = 2, max = 10)), rep(0, 5)), replace = FALSE), 1)
select(names(test_data)[length(names(test_data))], test_data, reg_method = "lm", n_iter = 200, pop_size = 20, objective = "AIC",
       interaction = FALSE, most_sig = FALSE, parent_selection = "prop", nb_groups = 4, generation_gap = 0.25,
       gene_selection = NULL, gene_operator = "crossover", nb_pts = 1, mu = 0.3, err = 1e-6)