Simulator: Simulator
In Nth-iteration-labs/contextual: Simulation and Analysis of Contextual Multi-Armed Bandit Policies


The entry point of any contextual simulation.

A Simulator takes, at a minimum, one or more Agent instances, a horizon (the length of an individual simulation, t = {1, ..., T}) and the number of simulations (how many times to repeat each simulation over t = {1, ..., T}, with a new seed on each repeat*).

It then runs all simulations (in parallel by default), keeping a log of all Policy and Bandit interactions in a History instance.

* Note: to evaluate and compare agents fairly, and to keep simulations replicable, seeds are set deterministically and identically for every agent across all horizon x simulations time steps.
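The effect of the seeding note can be illustrated with a minimal base R sketch. The helper names sim_seed and draw_for_sim are hypothetical, and the scheme of adding the simulation index to the base seed is an assumption made for illustration only; it is not necessarily how Simulator derives its seeds internally.

```r
# Hypothetical helper: derive one seed per simulation from a base seed.
# The "base seed + simulation index" scheme is an assumption, not the
# package's documented behaviour.
sim_seed <- function(set_seed, sim_index) {
  as.integer(set_seed + sim_index)
}

# Draw the random numbers that a single simulation would consume.
draw_for_sim <- function(set_seed, sim_index, horizon) {
  set.seed(sim_seed(set_seed, sim_index))
  runif(horizon)
}

# Two agents running the same simulation index see the same random stream,
# so differences in their results come from the policies, not from luck.
agent_a <- draw_for_sim(set_seed = 0, sim_index = 3, horizon = 5)
agent_b <- draw_for_sim(set_seed = 0, sim_index = 3, horizon = 5)
identical(agent_a, agent_b)  # TRUE

# A different simulation index is seeded differently.
other_sim <- draw_for_sim(set_seed = 0, sim_index = 4, horizon = 5)
identical(agent_a, other_sim)
```

Rerunning the script reproduces the same draws, which is the replicability property the note describes.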

contextual diagram: simulator

simulator <- Simulator$new(agents,
                           horizon = 100L,
                           simulations = 100L,
                           save_context = FALSE,
                           save_theta = FALSE,
                           do_parallel = TRUE,
                           worker_max = NULL,
                           set_seed = 0,
                           save_interval = 1,
                           progress_file = FALSE,
                           log_interval = 1000,
                           include_packages = NULL,
                           t_over_sims = FALSE,
                           chunk_multiplier = 1,
                           policy_time_loop = FALSE)

agents: An Agent instance or a list of Agent instances.
horizon: integer. The number of pulls or time steps to run each agent, where t = {1, ..., T}.
simulations: integer. How many times to repeat each agent's simulation over t = {1, ..., T}, with a new seed on each repeat (itself deterministically derived from set_seed).
save_interval: integer. Write data to History only every save_interval time steps. Default is 1.
save_context: logical. Save the context matrices X to the History log during a simulation?
save_theta: logical. Save the parameter list theta to the History log during a simulation?
do_parallel: logical. Run Simulator processes in parallel?
worker_max: integer. Specifies how many parallel workers are to be used. If unspecified, the number of workers defaults to max(workers_available)-1.
t_over_sims: logical. Of use to, among others, offline Bandits. If t_over_sims is set to TRUE, the current Simulator iterates over all rows in a data set for each repeated simulation. If FALSE, it splits the data into simulations parts and uses a different subset of the data for each repeat of an agent's simulation.
set_seed: integer. Sets the seed of R's random number generator for the current Simulator.
progress_file: logical. If TRUE, Simulator writes workers_progress.log, agents_progress.log and parallel.log files to the current working directory, allowing you to keep track of respectively workers, agents, and potential errors when running a Simulator in parallel mode.
log_interval: integer. Sets the log write interval. Default every 1000 time steps.
include_packages: List. List of packages that (one of) the policies depend on. If a Policy requires an R package to be loaded, this option can be used to load that package on each of the workers. Ignored if do_parallel is FALSE.
chunk_multiplier: integer. By default, simulations are equally divided over available workers, and every worker saves its simulation results to a local history file which is then aggregated. Depending on workload, network bandwidth, memory size and other variables, it can sometimes be useful to break these workloads into smaller chunks. This can be done by setting the chunk_multiplier to some integer value, where the number of chunks will total chunk_multiplier x number_of_workers.
policy_time_loop: logical. In the case of replay style bandits, a Simulator's horizon equals the number of accepted plus the number of rejected data points or samples. If policy_time_loop is TRUE, the horizon equals the number of accepted data points or samples. That is, when policy_time_loop is TRUE, a Simulator will keep running until the number of data points saved to History is equal to the Simulator's horizon.
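The partitioning described for chunk_multiplier and t_over_sims can be sketched in base R. This is only an illustration of the arithmetic under assumed values (4 workers, 100 simulations, a 1000-row data set); the variable names are hypothetical and the package's internal splitting may differ in detail.

```r
# Assumed example values; none of these are read from a Simulator.
workers          <- 4L    # hypothetical number of parallel workers
simulations      <- 100L
chunk_multiplier <- 2L

# The number of chunks totals chunk_multiplier x number_of_workers.
n_chunks <- chunk_multiplier * workers

# Assign each simulation index to one of n_chunks nearly equal chunks.
sim_ids  <- seq_len(simulations)
chunk_of <- ceiling(sim_ids * n_chunks / simulations)
chunks   <- split(sim_ids, chunk_of)

length(chunks)   # 8
lengths(chunks)  # 12 or 13 simulations per chunk

# t_over_sims = FALSE: split the data rows into `simulations` parts, so
# each repeat of a simulation works on its own subset of the data.
n_rows       <- 1000L   # hypothetical offline data set size
row_ids      <- seq_len(n_rows)
rows_per_sim <- split(row_ids, ceiling(row_ids * simulations / n_rows))

length(rows_per_sim)  # 100
rows_per_sim[[1]]     # rows 1 to 10 go to the first simulation
```

With t_over_sims = TRUE, by contrast, every simulation would iterate over all n_rows rows.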

reset(): Resets a Simulator instance to its original initialisation values.
run(): Runs a Simulator instance.
history: Active binding, read access to Simulator's History instance.
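As a rough sketch of how these members fit together, the following compares two agents in one Simulator and reads the results through the history active binding. It uses only classes and arguments named on this page; the agent names, epsilon values and seed are arbitrary choices, and the requireNamespace() guard is added here only so the snippet does nothing when the package is not installed.

```r
# Guard: skip everything if the contextual package is unavailable.
if (requireNamespace("contextual", quietly = TRUE)) {
  library(contextual)

  # One bandit, two epsilon-greedy agents with different exploration rates.
  bandit <- BasicBernoulliBandit$new(weights = c(0.6, 0.1, 0.1))
  agents <- list(
    Agent$new(EpsilonGreedyPolicy$new(epsilon = 0.1), bandit, name = "EG_low"),
    Agent$new(EpsilonGreedyPolicy$new(epsilon = 0.5), bandit, name = "EG_high")
  )

  # Run sequentially to keep the example small and easy to debug.
  simulator <- Simulator$new(agents      = agents,
                             horizon     = 20L,
                             simulations = 5L,
                             do_parallel = FALSE,
                             set_seed    = 42)
  simulator$run()

  # The History instance is reachable through the active binding.
  dt <- simulator$history$get_data_table()
  print(head(dt))
}
```

Because both agents are seeded identically, any difference between them in the resulting log reflects the policies rather than the random draws.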

Core contextual classes: Bandit, Policy, Simulator, Agent, History, Plot

Bandit subclass examples: BasicBernoulliBandit, ContextualLogitBandit, OfflineReplayEvaluatorBandit

Policy subclass examples: EpsilonGreedyPolicy, ContextualLinTSPolicy

## Not run: 

  policy    <- EpsilonGreedyPolicy$new(epsilon = 0.1)
  bandit    <- BasicBernoulliBandit$new(weights = c(0.6, 0.1, 0.1))
  agent     <- Agent$new(policy, bandit, name = "E.G.", sparse = 0.5)

  history   <- Simulator$new(agents = agent,
                             horizon = 10,
                             simulations = 10)$run()

  summary(history)

  plot(history)

  dt <- history$get_data_table()

  df <- history$get_data_frame()

  print(history$cumulative$E.G.$cum_regret_sd)

  print(history$cumulative$E.G.$cum_regret)


## End(Not run)

Nth-iteration-labs/contextual documentation built on July 28, 2020, 1:13 p.m.
