estimate_abm: Estimate an ABM
In JohnNay/eat: Empirical Agent Training Software for Data-Driven Modeling

`estimate_abm` takes your data and the same ABM simulation function you use with `cv_abm`, and estimates an ABM either by optimizing its global ABM parameters or by using the values you specify. The result can then be used for analysis.

estimate_abm(data, features, Formula, agg_patterns, abm_simulate, abm_vars,
  iters, tseries_len, verbose = TRUE, tp = rep(tseries_len,
  nrow(agg_patterns)), package = c("caretglm", "caretglmnet", "glm",
  "caretnnet", "caretdnn"), sampling = FALSE, sampling_size = 1000,
  STAT = c("mean", "median"), abm_optim = c("GA", "DE"),
  optimize_abm_par = FALSE, parallel_training = FALSE)

`data`	`data.frame` with each row (observational unit) being an individual decision. It must have a column named "group" specifying which group of `agg_patterns` each observation is in, and a column named "period" specifying the time period in which each behavior was taken.
`features`	`list` of the variables (columns in `data`) to be used in the prediction `Formula`. The `list` has one element for each discrete model used for different times. Each element of the `list` is a `character vector`, with each element of the `character vector` being a feature to use for training an individual-level model.
`Formula`	`list` where each element is a length one character vector that specifies a formula, e.g. `"y ~ x"`. The character vector makes sense in the context of the `features` and `data`. There are as many elements in the list as there are discrete models for different times.
`agg_patterns`	`data.frame` with each row (observational unit) being a group and with these columns: (a.) the aggregate-level variables needed for prediction with the specified `Formula` (with the same names as the variables in the formula); (b.) a column named "action" with the proportion of the relevant outcome action taken in that group; (c.) columns named `paste(seq(tseries_len))` with the mean/median levels (`STAT`) of the action for each time period.
`abm_simulate`	function with these arguments: `model, features, parameters, tuning_parameters, iterations, time_len, STAT = c("mean", "median")`, where `model` is the output of `training`. The function returns a list with three named elements: `dynamics, action_avg, simdata`. `dynamics` is a numeric vector of length `tseries_len`, `action_avg` is a numeric vector of length one, and `simdata` is a `data.frame` with the numeric results of the simulation.
`abm_vars`	a list with either (1.) a numeric vector named "lower" AND a numeric vector named "upper", each the length of the number of tuning_params of the ABM (the names of the elements of these vectors should be the names of the variables, and they should be in the same order that the `abm_simulate` function uses them); or (2.) a numeric vector named "value" the length of the number of tuning_params of the ABM (variables should be in the same order that the `abm_simulate` function uses them). Provide either the "lower" and "upper" elements or the "value" element, not both.
`iters`	numeric vector length one specifying number of iterations to simulate ABM for.
`tseries_len`	numeric vector length one specifying the maximum number of time periods to use for model training and testing. If some groups have fewer periods than the maximum, you need to provide a vector to the `tp` argument.
`verbose`	optional logical vector length one, default is `TRUE`.
`tp`	optional numeric vector length number of rows of `agg_patterns` specifying how long the time series for each group should be. Default is `rep(tseries_len, nrow(agg_patterns))`.
`package`	optional character vector length one, one of `"caretglm"`, `"caretglmnet"`, `"glm"`, `"caretnnet"`, or `"caretdnn"`. The default in the signature is the full vector of these choices.
`sampling`	optional logical vector length one, default is `FALSE`. If `sampling == TRUE`, we sample equal numbers of observations from each 'group' to reduce potential problems with the final estimated model being too affected by groups with more observations.
`sampling_size`	optional numeric vector length one specifying how many observations from each group that `training` should sample to train the model, default is 1000. Only applicable when `sampling` argument is set to `TRUE`.
`STAT`	optional character vector length one, either `"mean"` or `"median"`. The default in the signature is `c("mean", "median")`.
`abm_optim`	optional character vector length one, either `"GA"` or `"DE"`. The default in the signature is `c("GA", "DE")`.
`optimize_abm_par`	optional logical vector length one, default is `FALSE`. This is passed to the optimization algorithm.
`parallel_training`	optional logical vector length one, default is `FALSE`. This is passed to `training`.
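
As a rough illustration of the shapes these arguments take, the sketch below builds toy inputs in base R. The column names `my.decision` and `my.decision1`, the group labels, and the parameter names `alpha` and `beta` are hypothetical; only the `group`, `period`, and `action` columns and the `paste(seq(tseries_len))` column names come from the descriptions above.

```r
set.seed(1)
tseries_len <- 3
groups <- c("g1", "g2")

# One row per individual decision, with the required "group" and "period" columns.
data <- expand.grid(id = 1:20, period = seq(tseries_len), group = groups,
                    stringsAsFactors = FALSE)
data$my.decision  <- rbinom(nrow(data), 1, 0.5)  # hypothetical outcome
data$my.decision1 <- rbinom(nrow(data), 1, 0.5)  # hypothetical predictor

# A single model for all periods, so each list has exactly one element.
features <- list(c("my.decision1"))
Formula  <- list("my.decision ~ my.decision1")

# Mean action per group and period: a matrix with columns named "1", "2", "3".
per_period <- tapply(data$my.decision, list(data$group, data$period), mean)

# One row per group: aggregate predictor, overall "action", per-period columns.
agg_patterns <- data.frame(
  group        = rownames(per_period),
  action       = as.numeric(tapply(data$my.decision, data$group, mean)),
  my.decision1 = as.numeric(tapply(data$my.decision1, data$group, mean)),
  per_period,
  check.names  = FALSE
)

# The two accepted forms of abm_vars: a search range, or fixed values.
abm_vars_range <- list(lower = c(alpha = 0, beta = 0),
                       upper = c(alpha = 1, beta = 5))
abm_vars_fixed <- list(value = c(alpha = 0.5, beta = 2))

# Every group has the full series here, so tp matches the documented default.
tp <- rep(tseries_len, nrow(agg_patterns))
```

`check.names = FALSE` keeps the per-period columns named `"1"`, `"2"`, `"3"` rather than letting `data.frame` rewrite them as `X1`, `X2`, `X3`.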

Returns a function with three arguments: `parameters`, `out`, and `iterations`. This function is a wrapper around the supplied ABM simulation function, intended for use in analysis. If `out == "action_avg"`, it returns the average of all the actions; otherwise, it returns the vector of averages for each time period.
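
The contract described for `abm_simulate` and for the returned wrapper can be sketched in base R as follows. This is not the package's implementation: `toy_abm_simulate` and `make_wrapper` are hypothetical names, the Bernoulli dynamics are made up, and passing the wrapper's `parameters` through as `tuning_parameters` is an assumption made only for illustration.

```r
# Hypothetical stand-in with the argument list described for `abm_simulate`.
toy_abm_simulate <- function(model, features, parameters, tuning_parameters,
                             iterations, time_len, STAT = c("mean", "median")) {
  STAT <- match.arg(STAT)
  summarise <- if (STAT == "mean") mean else median
  # Toy dynamics: the first tuning parameter, clamped to [0, 1], is the
  # probability of taking the action in every period.
  p <- min(max(tuning_parameters[1], 0), 1)
  # Rows are iterations, columns are time periods.
  sims <- matrix(rbinom(iterations * time_len, 1, p), nrow = iterations)
  list(dynamics   = apply(sims, 2, summarise),  # one value per time period
       action_avg = mean(sims),                 # single overall average
       simdata    = as.data.frame(sims))
}

# Hypothetical closure with the three arguments described for the return value.
make_wrapper <- function(model, features, time_len, STAT = "mean") {
  function(parameters, out, iterations) {
    res <- toy_abm_simulate(model, features, parameters = NULL,
                            tuning_parameters = parameters,  # assumption
                            iterations = iterations, time_len = time_len,
                            STAT = STAT)
    if (out == "action_avg") res$action_avg else res$dynamics
  }
}

set.seed(42)
f <- make_wrapper(model = NULL, features = list("x"), time_len = 5)
avg <- f(c(alpha = 0.3), out = "action_avg", iterations = 200)  # length one
dyn <- f(c(alpha = 0.3), out = "dynamics", iterations = 200)    # length five
```

Returning a closure keeps the trained model and feature lists fixed, so later analysis code only needs to vary `parameters`.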

JohnNay/eat documentation built on May 7, 2019, noon