estimate_program: Estimate a program (R function) from data.


View source: R/estimate_program.R

Description

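estimate_program evolves a program (an R function) that predicts the outcome variable from the predictor variables, using rgp::typedGeneticProgramming() from the rgp package.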

Usage

estimate_program(formula, data, subset = NULL, loss = c("log_lik", "rmse",
  "identity", "identity_multi_class"), identity_outcome_type = c("integer",
  "character", "factor"), link = c("logit", "probit", "cauchit", "identity"),
  func_list = list("+", "-", "*", "divide", "exp", ">", "<", "logn", "sqrtn",
  "&", "|", "!", "ifelse2", "one_rnorm"), mins = 10, steps = 2000,
  repeats = 2, parallel = FALSE, cores = NULL,
  enable_complexity = FALSE, lambda = 50, crossover_probability = 0.5)

Arguments

formula

formula used to create the data.frame needed, ensuring that the outcome variable, the variable to the left of "~", is the first column in the data.frame and the following columns are the predictor variables.

data

a data.frame with named columns containing the variables in formula. Neither a matrix nor an array will be accepted. We use the formula to turn this into a data.frame where (i.) the first column is named "outcome" and has the outcome variable we are evolving programs to predict; (ii.) all other columns (there must be at least one other column) that follow the "outcome" column are named columns containing the predictor variables, which can be of any type. The order matters: when using the evolved program for predictions the (named) arguments to the function will be the predictor variables in the order they are supplied to this estimate_program() function.
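The column layout described above can be sketched with base R's model.frame(). This is only an illustration of the resulting shape, not the package's internal code; the toy data frame df and its column names are made up for the example.

```r
# Toy data: the column names are hypothetical, chosen only for illustration.
df <- data.frame(
  x2 = c(10, 20, 30, 40),
  y  = c(1, 0, 1, 0),
  x1 = c(0.5, 1.2, 3.1, 0.7)
)

# model.frame() puts the response (left of "~") first, followed by the
# predictors in the order they appear in the formula.
mf <- model.frame(y ~ x1 + x2, data = df)

# Rename the first column to "outcome", matching the layout described above.
names(mf)[1] <- "outcome"

print(names(mf))
```

Note that the column order follows the formula, not the original order of the columns in df.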

subset

a specification of the rows to be used: defaults to all rows. This can be any valid indexing vector (see [.data.frame) for the rows of data or if that is not supplied, a data frame made up of the variables used in formula.

loss

Optional character vector of length one naming the loss function: one of "log_lik", "rmse", "identity", or "identity_multi_class".
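For orientation, the names "rmse" and "log_lik" usually refer to the textbook quantities sketched below. This is an assumption about what the names mean: the package's own loss implementations are not shown on this page and may differ in sign, scaling, or handling of edge cases. The helper names rmse_sketch and log_lik_sketch are hypothetical.

```r
# Root mean squared error between observed values y and predictions y_hat.
rmse_sketch <- function(y, y_hat) sqrt(mean((y - y_hat)^2))

# Bernoulli log-likelihood for a 0/1 outcome y and predicted probabilities p
# (p must lie strictly between 0 and 1 for the logs to be finite).
log_lik_sketch <- function(y, p) sum(y * log(p) + (1 - y) * log(1 - p))

y <- c(1, 0, 1, 0)
print(rmse_sketch(y, c(0.9, 0.1, 0.8, 0.2)))
print(log_lik_sketch(y, c(0.9, 0.1, 0.8, 0.2)))
```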

identity_outcome_type

Optional character vector of length one, only needed if loss == "identity": one of "integer", "character", or "factor".

link

Optional character vector of length one naming the link function: one of "logit", "probit", "cauchit", or "identity".
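The four link names match links that base R's stats::make.link() also provides. The sketch below shows how each inverse link maps a raw program output (here eta = 0) to the response scale. Whether estimate_program applies its links in exactly this way is an assumption, since the page does not say.

```r
link_names <- c("logit", "probit", "cauchit", "identity")

# Raw value on the linear (program output) scale.
eta <- 0

# Apply each inverse link to eta; sapply() names the result by link name.
inv <- sapply(link_names, function(nm) make.link(nm)$linkinv(eta))

# The three probability links map 0 to 0.5; the identity link leaves it at 0.
print(inv)
```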

func_list

Optional List where each element is a length one character vector.

mins

Optional Integer vector length one

steps

Optional Integer vector length one

repeats

Optional integer vector of length one specifying how many times to repeat the model fitting.

parallel

Optional logical vector of length one. Default is parallel = FALSE; parallel = TRUE can be slower if the data set is small relative to the number of population evolutions desired.

cores

Optional Integer vector length one

enable_complexity

Optional logical vector length one

lambda

Optional integer vector length one for the number of children rgp::typedGeneticProgramming() creates in each generation

crossover_probability

Optional numeric vector of length one (default 0.5) passed to rgp::typedGeneticProgramming().

Value

The function returns an S4 object. See estimate_program for the details of the slots (objects) that this type of object will have.


JohnNay/agp documentation built on May 7, 2019, noon