estimate_abm: Estimate an ABM


View source: R/estimate_abm.R

Description

estimate_abm takes your data and the same ABM simulation function you use with cv_abm and estimates an ABM, either by optimizing its global ABM parameters or by using the values you specify. The estimated ABM can then be used for analysis.

Usage

estimate_abm(data, features, Formula, agg_patterns, abm_simulate, abm_vars,
  iters, tseries_len, verbose = TRUE, tp = rep(tseries_len,
  nrow(agg_patterns)), package = c("caretglm", "caretglmnet", "glm",
  "caretnnet", "caretdnn"), sampling = FALSE, sampling_size = 1000,
  STAT = c("mean", "median"), abm_optim = c("GA", "DE"),
  optimize_abm_par = FALSE, parallel_training = FALSE)

Arguments

data

data.frame in which each row (observational unit) is an individual decision. It must contain a column named "group" specifying which group of agg_patterns each observation is in, and a column named "period" specifying the time period in which each behavior was taken.

features

list of the variables (columns in data) to be used in the prediction Formula. The list has one element for each discrete model wanted for different times. Each element of the list is a character vector, with each element of the character vector being a feature to use for training an individual-level model.

Formula

list where each element is a length one character vector that specifies a formula, e.g. "y ~ x". Each formula must be consistent with the corresponding features and with the columns of data. There are as many elements in the list as there are discrete models for different times.
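The following is a minimal sketch of how data, features, and Formula might line up, assuming two time-specific models. The column names my_decision, other_decision, and action are hypothetical and are not taken from the package.

```r
# Hypothetical toy data: one row per individual decision.
set.seed(1)
data <- data.frame(
  group          = rep(1:2, each = 6),   # group of agg_patterns the row belongs to
  period         = rep(1:3, times = 4),  # time period of the decision
  my_decision    = rbinom(12, 1, 0.5),   # hypothetical predictor
  other_decision = rbinom(12, 1, 0.5),   # hypothetical predictor
  action         = rbinom(12, 1, 0.5)    # hypothetical outcome
)

# One list element per time-specific model (two models assumed here).
features <- list(
  c("my_decision"),
  c("my_decision", "other_decision")
)

# One length-one character vector per model, matching features element by element.
Formula <- list(
  "action ~ my_decision",
  "action ~ my_decision + other_decision"
)

# Basic consistency checks before passing anything on.
stopifnot(length(features) == length(Formula))
stopifnot(all(unlist(features) %in% names(data)))
```

Checking the two lists against each other up front is cheap and catches a misspelled column name before any model is trained.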

agg_patterns

data.frame with rows (observational unit) being the group and columns: (a.) those aggregate level variables needed for the prediction with the specified formula (with same names as the variables in the formula); (b.) a column named "action" with the proportion of the relevant outcome action taken in that group; (c.) columns named paste(seq(tseries_len)) with the mean/median levels (STAT) of the action for each time period.
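As an illustration of the layout described for agg_patterns, here is a small sketch with two groups and three time periods. The aggregate variable group_size and all numeric values are made up for the example.

```r
tseries_len <- 3

# (a.) aggregate-level variable(s) used by the formula, (b.) the "action" column.
agg_patterns <- data.frame(
  group_size = c(10, 20),     # hypothetical aggregate-level variable
  action     = c(0.40, 0.55)  # proportion of the outcome action in each group
)

# (c.) one column per time period, named "1", "2", ..., paste(seq(tseries_len)).
period_levels <- matrix(
  c(0.50, 0.40, 0.30,
    0.60, 0.55, 0.50),
  nrow = 2, byrow = TRUE
)
for (t in seq(tseries_len)) {
  # [[<- keeps the non-syntactic column name "1", "2", ... unchanged.
  agg_patterns[[paste(t)]] <- period_levels[, t]
}

names(agg_patterns)
```

Adding the period columns with `[[<-` avoids the name mangling that `data.frame()` applies by default to names such as "1".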

abm_simulate

function with these arguments: model, features, parameters, tuning_parameters, iterations, time_len, STAT = c("mean", "median"). Here model is the output of training. The function returns a list with three named elements: dynamics, action_avg, simdata. dynamics is a numeric vector of length tseries_len, action_avg is a numeric vector of length one, and simdata is a data.frame with the numeric results of the simulation.
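A skeleton that satisfies this interface could look like the sketch below. The simulation logic is a placeholder (independent binary actions with a probability taken from the first tuning parameter) and is purely an assumption for illustration; it ignores model, features, and parameters, which a real ABM would use.

```r
abm_simulate <- function(model, features, parameters, tuning_parameters,
                         iterations, time_len, STAT = c("mean", "median")) {
  STAT <- match.arg(STAT)
  summarise <- if (STAT == "mean") mean else median

  # Placeholder dynamics: each action is 1 with probability tuning_parameters[[1]].
  p <- tuning_parameters[[1]]
  sims <- matrix(rbinom(iterations * time_len, size = 1, prob = p),
                 nrow = iterations, ncol = time_len)

  # Long-format record of every simulated action (matrix is unrolled column by column).
  simdata <- data.frame(
    iteration = rep(seq_len(iterations), times = time_len),
    period    = rep(seq_len(time_len), each = iterations),
    action    = as.vector(sims)
  )

  list(
    dynamics   = apply(sims, 2, summarise),  # one value per time period
    action_avg = mean(sims),                 # single overall proportion
    simdata    = simdata
  )
}

set.seed(42)
out <- abm_simulate(model = NULL, features = list(), parameters = NULL,
                    tuning_parameters = c(p = 0.3),
                    iterations = 50, time_len = 4)
str(out, max.level = 1)
```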

abm_vars

a list with either (1.) a numeric vector named "lower" AND a numeric vector named "upper", each the length of the number of tuning_params of the ABM (the names of the elements of these vectors should be the names of the variables, and they should be in the same order that the abm_simulate function uses them); or (2.) a numeric vector named "value" the length of the number of tuning_params of the ABM (variables should be in the same order that the abm_simulate function uses them). Provide either the lower and upper elements or the value element, not both.
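The two accepted shapes of abm_vars can be sketched as follows. The parameter names noise and memory are hypothetical; real names should follow the order in which your abm_simulate function reads its tuning parameters.

```r
# Form (1.): bounds for each tuning parameter.
abm_vars_bounds <- list(
  lower = c(noise = 0, memory = 1),
  upper = c(noise = 1, memory = 10)
)

# Form (2.): fixed values for each tuning parameter.
abm_vars_fixed <- list(
  value = c(noise = 0.2, memory = 3)
)

# Sanity checks: same names in the same order, and lower <= upper.
stopifnot(identical(names(abm_vars_bounds$lower), names(abm_vars_bounds$upper)))
stopifnot(all(abm_vars_bounds$lower <= abm_vars_bounds$upper))
stopifnot(identical(names(abm_vars_fixed$value), names(abm_vars_bounds$lower)))
```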

iters

numeric vector length one specifying number of iterations to simulate ABM for.

tseries_len

numeric vector length one specifying the maximum number of time periods to use for model training and testing. If some groups have fewer periods than the maximum, you need to provide a vector to the tp argument.

verbose

optional logical vector length one, default is TRUE.

tp

optional numeric vector length number of rows of agg_patterns specifying how long the time series for each group should be. Default is rep(tseries_len, nrow(agg_patterns)).
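For instance, with three groups whose series have different lengths, tp could be built as in this sketch. The group lengths are invented for the example.

```r
tseries_len <- 5                 # maximum number of time periods
n_groups <- 3                    # stands in for nrow(agg_patterns)

# Default behaviour: every group uses the full series length.
tp_default <- rep(tseries_len, n_groups)

# Hypothetical case: groups 2 and 3 have shorter series.
tp <- c(5, 3, 4)

# One entry per group, none longer than tseries_len.
stopifnot(length(tp) == n_groups, all(tp <= tseries_len))
```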

package

optional character vector length one, one of "caretglm", "caretglmnet", "glm", "caretnnet", "caretdnn". The default is the full vector of choices, as shown in Usage.

sampling

optional logical vector length one, default is FALSE. If sampling == TRUE, we sample equal numbers of observations from each 'group' to reduce potential problems with the final estimated model being too affected by groups with more observations.

sampling_size

optional numeric vector length one specifying how many observations to sample from each group when training the model, default is 1000. Only applicable when the sampling argument is set to TRUE.

STAT

optional character vector length one, default is c("mean", "median").

abm_optim

optional character vector length one, default is c("GA", "DE").

optimize_abm_par

optional logical vector length one, default is FALSE. This is passed to the optimization algorithm.

parallel_training

optional logical vector length one, default is FALSE. This is passed to training.

Value

Returns a function that has three arguments: parameters, out, iterations. If out == "action_avg", the returned function gives the average of all the actions; otherwise, it gives the vector of averages for each time period. The returned function is a wrapper around your ABM simulation function, to be used for analysis.
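To make the calling convention concrete, the sketch below defines a stand-in with the same three arguments. It is not the function that estimate_abm builds; the decay dynamics, the four periods, and the name mock_estimated_abm are all assumptions made for illustration.

```r
# Stand-in with the documented signature (NOT the package's actual output).
mock_estimated_abm <- function(parameters, out, iterations) {
  n_periods <- 4
  # Hypothetical dynamics: probability of the action decays as parameters[[1]]^t.
  probs <- parameters[[1]]^seq_len(n_periods)
  sims <- matrix(rbinom(iterations * n_periods, size = 1,
                        prob = rep(probs, each = iterations)),
                 nrow = iterations, ncol = n_periods)
  per_period <- colMeans(sims)
  # "action_avg" gives one number; anything else gives the per-period vector.
  if (out == "action_avg") mean(per_period) else per_period
}

set.seed(7)
avg <- mock_estimated_abm(parameters = c(0.8), out = "action_avg", iterations = 200)
dyn <- mock_estimated_abm(parameters = c(0.8), out = "dynamics", iterations = 200)
length(avg)  # one value
length(dyn)  # one value per time period
```

A function with this shape can be called repeatedly with different parameters, for example to compare how the average action responds to each one.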


JohnNay/eat documentation built on May 7, 2019, noon