fit_models: Fit a stan model to multiple datasets
In Ewan-Keith/rstansim: Simulation Studies with Stan

Description Usage Arguments Value Examples

fit_models() fits a stan model across multiple datasets, collates, and returns summary information and data for all fitted models as a stansim_simulation object. All fitted models have basic reproducibility information recorded; such as parameter inits and seeds, along with parameter estimates, and simulation information such as time and date ran.

Raw stan posterior samples are not returned, rather the user specifies the estimates they wish to record (e.g. posterior percentiles, Rhat, etc.) and the parameters for which they wish to record these estimates. All data is collated into a single, tidy dataframe for further analysis.

By default the function caches completed runs as it progresses, so that progress is not lost in the case of function failure. By simply running the function with the same calls in the same working directory it will pick up where it left off. When the function terminates as expected this cache is removed.

fit_models(sim_name = paste0("Stansim_", Sys.time()), sim_data = NULL,
  stan_args = list(), calc_loo = FALSE, use_cores = 1L,
  parameters = "all", probs = c(0.025, 0.25, 0.5, 0.75, 0.975),
  estimates = c("mean", "se_mean", "sd", "n_eff", "Rhat"),
  stan_warnings = "catch", cache = TRUE, seed = floor(stats::runif(1, 1,
  1e+05)))

`sim_name`	A name attached to the `stansim_simulation` object to help identify it. It is strongly recommended that an informative name is assigned, especially if `stansim_simulation` objects are to be combined in to a `stansim_collection` object for management of results.
`sim_data`	Either an object of class `stansim_data` or a vector of strings pointing to the location of .rds files containing the simulation data. See the vignette on producing simulation data for details on the formatting of these datasets.
`stan_args`	A list of function arguments to be used by the internal `rstan::sampling()` function when fitting the models. If not specified then the `rstan::sampling()` function defaults are used.
`calc_loo`	If `TRUE` then model fit statistics will be calculated using the `loo` package. If `TRUE` there must be a valid log_lik quantity specified in the generated quantities section of the provided stan model.
`use_cores`	Number of cores to use when running in parallel. Each stan model is fitted serially regardless of the number of chains ran, as parallelisation across models is more flexible than within.
`parameters`	A character vector indicating which parameters should have estimates returned and stored from the fitted models. By default all parameters are returned, for non-scalar parameters you cannot select subsets of the parameter (e.g. must request `theta` rather than `theta[1]`).
`probs`	A numeric vector of values between 0 and 1. Corresponding quantiles will be estimated and returned for all fitted models.
`estimates`	A character vector of non-quantile estimates to be returned for each model parameter. Argument must be some subset of the default character vector.
`stan_warnings`	How warnings returned by individual `stan` instances should be handled. `"catch"` records all warnings in the returned object alongside other instance level data, `"print"` simply prints warnings to the console as the models are fit (default `stan` behaviour), and `"suppress"` suppresses all warnings without recording them.
`cache`	If `TRUE` then the results for each instance are written to a local, temporary file so that data is not lost should the function not terminate properly. This temporary data is removed upon the model terminating as expected. If `FALSE` no data is written and results are only returned upon the correct termination of the whole function. The default value of `TRUE` is recommended unless there are relevant write-permission restrictions.
`seed`	Set a seed for the function.

An S3 object of class stansim_simulation recording relevant simulation data.

## Not run: 
# specify arguments for stan
StanArgs <- list(file = '8schools.stan',
                 iter = 1000, chains = 4)

# get number of cores
core_num <- parallel::detectCores()

# get the list of data file locations
datasets <- dir("data/repo", full.names = TRUE)

# fit the model to all datasets using specified stan arguments
# store the specified estimates for all parameters
simulation <- fit_models(
  sim_name = "stansim simulation",
  sim_data = datasets,
  stan_args = StanArgs,
  calc_loo = T,
  use_cores = core_num,
  probs =  c(.025, .5, .975),
  estimates = c("mean", "n_eff", "Rhat")
)

## End(Not run)