Agent: Agent


Description

Keeps track of one Bandit and Policy pair.

Details

Controls the running of one Bandit and Policy pair over t = {1, ..., T}. At each time step t, it calls, in this order, bandit$get_context(), policy$get_action(), bandit$get_reward() and policy$set_reward().
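The call order described above can be sketched in base R with toy stand-ins. This is a minimal illustration, not the package's implementation: `toy_bandit`, `toy_policy`, `log_call` and `run_pair` are hypothetical names, and the argument lists of the four functions are assumptions, since this page does not specify them.

```r
# Hypothetical stand-ins for a Bandit and a Policy (not the package's classes).
# Each function records its name in `calls` so the call order can be inspected.
calls <- character(0)
log_call <- function(name) calls <<- c(calls, name)

toy_bandit <- list(
  get_context = function(t) {
    log_call("bandit$get_context")
    list(k = 2)                                    # two arms, no features
  },
  get_reward = function(t, context, action) {
    log_call("bandit$get_reward")
    list(reward = as.numeric(action$choice == 1))  # only arm 1 pays out
  }
)

toy_policy <- list(
  get_action = function(t, context) {
    log_call("policy$get_action")
    list(choice = (t - 1) %% context$k + 1)        # cycle through the arms
  },
  set_reward = function(t, context, action, reward) {
    log_call("policy$set_reward")
    invisible(NULL)                                # a real policy would learn here
  }
)

# Run one bandit/policy pair for t = 1, ..., horizon.
run_pair <- function(bandit, policy, horizon) {
  rewards <- numeric(horizon)
  for (t in seq_len(horizon)) {
    context <- bandit$get_context(t)
    action  <- policy$get_action(t, context)
    reward  <- bandit$get_reward(t, context, action)
    policy$set_reward(t, context, action, reward)
    rewards[t] <- reward$reward
  }
  rewards
}

rewards <- run_pair(toy_bandit, toy_policy, horizon = 4)
calls[1:4]  # the four calls made during the first time step, in order
```

Recording the calls in a vector is only a device for making the ordering visible; it plays no role in an actual simulation.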

Schematic

[Figure: contextual diagram: simulator]

Usage

agent <- Agent$new(policy, bandit, name = NULL, sparse = 0.0)

Arguments

policy

Policy instance.

bandit

Bandit instance.

name

character; sets the name of the Agent. If NULL (default), Agent generates a name based on its Policy instance's name.

sparse

numeric; artificially reduces the data size by setting a sparsity level for the current Bandit and Policy pair. With the default of 0.0, no sparsity is applied. When set to a value between 0.0 and 1.0, only a randomly chosen fraction of the Bandit's data, determined by sparse, is made available to improve the Agent's Policy through policy$set_reward().
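As a rough illustration of how such a sparsity level could work, the base R sketch below decides per time step whether feedback reaches the policy. It assumes, consistent with 0.0 being the default, that sparse acts as the probability that a step's reward is withheld from the policy; `feedback_mask` is a hypothetical helper, not a function of the package, and the package source should be consulted for the exact rule.

```r
# Hypothetical helper (not part of the package): for each time step, decide
# whether the reward is passed on to the policy.
# Assumption: `sparse` is the probability that a step's feedback is withheld,
# so sparse = 0 keeps every step.
feedback_mask <- function(horizon, sparse = 0.0) {
  stopifnot(sparse >= 0, sparse <= 1)
  if (sparse == 0) return(rep(TRUE, horizon))
  runif(horizon) > sparse   # TRUE means the policy gets to see this reward
}

set.seed(1)
mask <- feedback_mask(10000, sparse = 0.5)
mean(mask)   # share of steps whose reward reaches the policy, roughly one half
```

Under this assumption a larger sparse value means fewer policy updates, while the bandit still produces a context and a reward at every step.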

Methods

new()

generates and instantiates a new Agent instance.

do_step()

advances a simulation by one time step by consecutively calling bandit$get_context(), policy$get_action(), bandit$get_reward() and policy$set_reward(). Returns a list of lists containing context, action, reward and theta.

set_t(t)

integer; sets the current time step to t.

get_t()

returns current time step t.
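To make the shape of these methods concrete, here is a toy agent built from closures in base R. It is a sketch under stated assumptions, not the package's Agent class: `make_toy_agent` is a hypothetical name, the bandit and policy calls are replaced by fixed stand-in values, and incrementing t at the start of each step is an assumption.

```r
# Hypothetical toy agent (not the package's Agent class). It mirrors the
# method names above; the internals are assumptions for illustration only.
make_toy_agent <- function() {
  t     <- 0L
  theta <- list(n = 0L, total = 0)   # running count and sum of rewards

  do_step <- function() {
    t <<- t + 1L                                  # assumed: advance time first
    context <- list(k = 2L)                       # stand-in for bandit$get_context()
    action  <- list(choice = 1L)                  # stand-in for policy$get_action()
    reward  <- list(reward = 1)                   # stand-in for bandit$get_reward()
    theta$n     <<- theta$n + 1L                  # stand-in for policy$set_reward()
    theta$total <<- theta$total + reward$reward
    list(context = context, action = action, reward = reward, theta = theta)
  }

  set_t <- function(new_t) {
    t <<- as.integer(new_t)
    invisible(t)
  }

  get_t <- function() t

  list(do_step = do_step, set_t = set_t, get_t = get_t)
}

toy_agent <- make_toy_agent()
step      <- toy_agent$do_step()
names(step)                        # "context" "action" "reward" "theta"
first_t   <- toy_agent$get_t()     # 1 after a single step

toy_agent$set_t(10)                # jump the clock to t = 10
step2     <- toy_agent$do_step()
later_t   <- toy_agent$get_t()     # 11: one step beyond the value that was set
```

Each element of the returned list is itself a list, which matches the "list of lists" description of do_step() above.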

See Also

Core contextual classes: Bandit, Policy, Simulator, Agent, History, Plot

Bandit subclass examples: BasicBernoulliBandit, ContextualLogitBandit, OfflineReplayEvaluatorBandit

Policy subclass examples: EpsilonGreedyPolicy, ContextualLinTSPolicy

Examples

## Not run: 

  library(contextual)

  policy    <- EpsilonGreedyPolicy$new(epsilon = 0.1)
  bandit    <- BasicBernoulliBandit$new(weights = c(0.6, 0.1, 0.1))

  agent     <- Agent$new(policy, bandit, name = "E.G.", sparse = 0.5)

  history   <- Simulator$new(agents = agent,
                             horizon = 10,
                             simulations = 10)$run()

## End(Not run)

Nth-iteration-labs/contextual documentation built on March 10, 2020, 6:50 a.m.