This is the main function of the GA package. The package consists of a main execution file (`select.R`) and other R files containing the utility functions it calls. To run this function, the user supplies a dependent variable and a dataset.

```
select(y, dataset, reg_method = NULL, n_iter = 200, pop_size = 2 * n, objective = "AIC",
       interaction = F, most_sig = F, parent_selection = "prop", nb_groups = 4, generation_gap = 0.25,
       gene_selection = "crossover", nb_pts = 1, mu = 0.3, err = 1e-6)
```

| Argument | Description |
| --- | --- |
| `y` | (character) Column name of the dependent variable. |
| `dataset` | (data frame) The dataset in matrix form, with the dependent variable as the last column. |
| `reg_method` | (character) The method used to fit the data: "lm" or "glm" (default "lm"). |
| `n_iter` | (int) The maximum number of iterations allowed when running the GA (default 200). |
| `pop_size` | (int) The number of individuals per generation (default 2 * number of covariates). |
| `objective` | (character) The objective criterion to use (default "AIC"). |
| `interaction` | (logical) Whether to add interaction terms to the independent variables (default F). |
| `most_sig` | (logical) Whether to use the most significant variables inside the first_generation function (default F). |
| `parent_selection` | (character) The mechanism used to select parents. Options are "prop", "prop_random", "random" or "tournament" (default "prop"). |
| `nb_groups` | (int) The number of groups used in tournament selection (default 4). |
| `generation_gap` | (numeric) The proportion of individuals to be replaced by offspring (default 0.25). |
| `gene_selection` | (function) An additional, user-supplied selection method for choosing genes in the GA. Refer to gene_selection for the required inputs and the expected form of the output. If left unspecified, the algorithm uses a default function controlled by the gene_operator parameter. |
| `gene_operator` | (character) The operator used when the user does not provide a gene_selection method. Options are "crossover" and "random". |
| `nb_pts` | (int) The number of points used in crossover (default 1). |
| `mu` | (numeric) The mutation rate (default 0.3). |
| `err` | (numeric) The convergence threshold: the algorithm stops when the difference between the last iteration and the current one is less than err (default 1e-6). |
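To make the roles of `objective`, `parent_selection`, `nb_groups`, `nb_pts` and `mu` more concrete, the following is a minimal sketch of one GA generation on `mtcars`. It is not the package's implementation: the helper names (`fitness`, `tournament`, `crossover`, `mutate`) are hypothetical, and it assumes that a candidate is a logical vector with one gene per covariate, that lower AIC is better, and that `mu` is applied per gene.

```r
# Illustrative helpers only; the names are hypothetical and not taken
# from the package source.
set.seed(1)
predictors <- setdiff(names(mtcars), "mpg")

# Fitness of a chromosome: AIC of the linear model on the selected columns.
fitness <- function(chromosome, y = "mpg", data = mtcars) {
  vars <- predictors[chromosome]
  rhs <- if (length(vars) > 0) paste(vars, collapse = " + ") else "1"
  AIC(lm(as.formula(paste(y, "~", rhs)), data = data))
}

# Tournament selection: split the population into nb_groups random groups
# and keep the lowest-AIC individual of each group.
tournament <- function(population, scores, nb_groups = 4) {
  shuffled <- sample(seq_along(population))
  groups <- split(shuffled, rep_len(seq_len(nb_groups), length(population)))
  winners <- vapply(groups, function(idx) idx[which.min(scores[idx])], integer(1))
  population[unname(winners)]
}

# nb_pts-point crossover: the child alternates between the two parents
# at randomly chosen cut points.
crossover <- function(p1, p2, nb_pts = 1) {
  cuts <- sort(sample(seq_len(length(p1) - 1), nb_pts))
  segment <- findInterval(seq_along(p1) - 1, cuts)  # 0 before the first cut
  take_p2 <- segment %% 2 == 1
  child <- p1
  child[take_p2] <- p2[take_p2]
  child
}

# Mutation: flip each gene independently with probability mu (assumption).
mutate <- function(chromosome, mu = 0.3) {
  flip <- runif(length(chromosome)) < mu
  xor(chromosome, flip)
}

# One generation: random population, parent selection, one offspring.
n <- length(predictors)
population <- replicate(2 * n, runif(n) < 0.5, simplify = FALSE)
scores <- vapply(population, fitness, numeric(1))
parents <- tournament(population, scores, nb_groups = 4)
child <- mutate(crossover(parents[[1]], parents[[2]], nb_pts = 1), mu = 0.3)

predictors[child]  # covariates kept by the offspring
fitness(child)     # its AIC
```

In a full run, a share of the population given by `generation_gap` would be replaced by such offspring each iteration, until `n_iter` is reached or the change in the objective falls below `err`.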

`select` returns a list with the following elements:

- `variables`: the names of the selected variables
- `indices`: the indices of the selected variables
- `linear_model`: an `lm` or `glm` object
- `iterations`: the number of iterations run before the selection was reached
- `objective`: the value of the objective function for the returned model
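As a rough illustration of how the returned list might be used, the sketch below builds a stand-in by hand with the same element names and then works with it. The stand-in is not produced by `select`: the chosen variables are arbitrary, `iterations` is a placeholder, and treating `indices` as column positions in the data frame is an assumption.

```r
# Hand-built stand-in with the documented element names; a real call to
# select() would return such a list itself.
vars <- c("wt", "hp")
fit <- lm(mpg ~ wt + hp, data = mtcars)
result <- list(
  variables    = vars,
  indices      = match(vars, names(mtcars)),  # assumption: column positions
  linear_model = fit,
  iterations   = NA_integer_,                 # placeholder, no GA was run
  objective    = AIC(fit)
)

# The fitted model can be used like any other lm or glm object.
coef(result$linear_model)
predict(result$linear_model, newdata = mtcars[1:3, ])

# With objective = "AIC", the stored value should match the model's AIC.
all.equal(result$objective, AIC(result$linear_model))
```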

```
# Real data sets
select("mpg", mtcars)
data("Boston", package = "MASS")  # Boston ships with MASS; load it without attaching the package
select("crim", Boston)

# Simulated data
simulation <- function(c, n, beta_0, beta, sigma){
  # c: number of variables (e.g. c = 10)
  # n: total number of observations
  # beta_0: intercept; beta: coefficient vector; sigma: noise standard deviation
  X <- matrix(rep(round(runif(c, min = 1, max = 10)), n) + rnorm(c * n, mean = 0, sd = 0.2),
              nrow = n, byrow = TRUE)
  X_names <- paste0("X", 1:c)
  X_data <- as.data.frame(X)
  colnames(X_data) <- X_names
  Y <- rowSums(t(beta * t(X))) + beta_0 + rnorm(n, mean = 0, sd = sigma)
  return(cbind(X_data, Y))
}
test_data <- simulation(10, 100, 1, sample(c(round(runif(10/2, min = 2, max = 10)), rep(0, 5)), replace = FALSE), 1)
select(names(test_data)[length(names(test_data))], test_data, reg_method = "lm", n_iter = 200, pop_size = 20, objective = "AIC",
       interaction = FALSE, most_sig = FALSE, parent_selection = "prop", nb_groups = 4, generation_gap = 0.25,
       gene_selection = NULL, gene_operator = "crossover", nb_pts = 1, mu = 0.3, err = 1e-6)
```

