This function simulates data under different scenarios in order to
compare strategies for the Multi-Armed Bandit problem.
Users can specify the distribution of the number of arms,
the distribution of the mean reward, the distribution of the number of pulls
per period, and whether the bandit is stationary.
The relative regret is returned, along with a plot of the average relative
regret if requested.
See SimulateMultiplePeriods
for more details.
SimulateMultipleMethods(method = "Thompson-Sampling",
  method.par = list(ndraws.TS = 1000), iter, nburnin, nperiod,
  reward.mean.family, reward.family, narms.family, npulls.family,
  stationary = TRUE, nonstationary.type = NULL, data.par,
  regret.plot = FALSE)
method |
A vector of character strings chosen from "Epsilon-Greedy",
"Epsilon-Decreasing", "Thompson-Sampling",
"EXP3", "UCB", "Bayes-Poisson-TS", "Greedy-Thompson-Sampling",
"EXP3-Thompson-Sampling",
"Greedy-Bayes-Poisson-TS", "EXP3-Bayes-Poisson-TS" and "HyperTS".
See |
method.par |
A list of parameters needed for different methods:
|
iter |
A positive integer specifying the number of iterations. |
nburnin |
A positive integer specifying the number of periods during which each arm receives equal traffic before any strategy is applied. |
nperiod |
A positive integer specifying the number of periods over which the strategies are applied. |
reward.mean.family |
A character string specifying the distribution family used to generate the mean reward of each arm. Available distributions are "Uniform", "Beta" and "Gaussian". |
reward.family |
A character string specifying the distribution family
of the reward. Available distributions are
"Bernoulli", "Poisson" and "Gaussian".
If "Gaussian" is chosen as the reward distribution,
a vector of standard deviations should be provided in
|
narms.family |
A character string specifying the distribution family of the number of arms. Available distributions are "Poisson" and "Binomial". |
npulls.family |
A character string specifying the distribution family of the number of pulls per period. For a continuous distribution, the number of pulls is rounded up. Available distributions are "Log-Normal" and "Poisson". |
stationary |
A logical value indicating whether a stationary Multi-Armed Bandit is considered (that is, the mean reward of each arm does not change over time). Defaults to TRUE. |
nonstationary.type |
A character string indicating how the mean reward varies. Available types are "Random Walk" and "Geometric Random Walk" (the mean reward follows a random walk on the log scale). Defaults to NULL. |
data.par |
A list of data generating parameters:
|
regret.plot |
A logical value indicating whether an average regret plot is returned. Defaults to FALSE. |
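The distribution choices described in the arguments can be pictured with base R random draws. The following is a minimal sketch under stated assumptions, not the package's internal code: the variable names are illustrative, the parameter values are borrowed from the examples on this page, and the guard against a zero arm count is an assumption of the sketch.

```r
set.seed(1)

# Number of arms: "Poisson" family with lambda = 5 (illustrative value).
# max(..., 1) guards against a zero draw; this guard is an assumption of the sketch.
narms <- max(rpois(1, lambda = 5), 1)

# Mean reward of each arm: "Uniform" family on [0, 0.1].
reward.mean <- runif(narms, min = 0, max = 0.1)

# Pulls per period: "Log-Normal" family, rounded up to an integer.
nperiod <- 10
npulls <- ceiling(rlnorm(nperiod, meanlog = 3, sdlog = 1.5))

# "Geometric Random Walk": the log of each arm's mean reward follows a random walk.
# Rows are periods, columns are arms.
sdlog <- 0.05
steps <- matrix(rnorm(nperiod * narms, mean = 0, sd = sdlog),
                nrow = nperiod, ncol = narms)
log.path <- sweep(apply(steps, 2, cumsum), 2, log(reward.mean), "+")
mean.path <- exp(log.path)
```

Working on the log scale keeps every simulated mean reward positive, which is one plausible reason to prefer the geometric variant over a plain random walk for small reward rates.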
A list consisting of:
regret.matrix |
A three-dimensional array with each dimension corresponding to the period, iteration and method. |
regret.plot.object |
If regret.plot = TRUE, a ggplot object is returned. |
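As a rough illustration of how the returned array might be summarized, the sketch below averages over iterations while keeping the period and method dimensions. It uses a placeholder array filled with random numbers, laid out as (period, iteration, method) as described above; the dimension names are an assumption, and the values are filler rather than simulation results.

```r
# Placeholder standing in for res$regret.matrix; values are random filler.
nperiod <- 5
iter <- 4
methods <- c("Epsilon-Greedy", "Thompson-Sampling")
regret.matrix <- array(runif(nperiod * iter * length(methods)),
                       dim = c(nperiod, iter, length(methods)),
                       dimnames = list(NULL, NULL, methods))

# Average over iterations (dimension 2), keeping period (1) and method (3).
avg.regret <- apply(regret.matrix, c(1, 3), mean)
dim(avg.regret)  # nperiod rows, one column per method
```

Each column of avg.regret then gives one method's average regret by period, which is the kind of curve the optional plot is described as showing.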
### Compare Epsilon-Greedy and Thompson Sampling in the stationary case.
set.seed(100)
res <- SimulateMultipleMethods(
  method = c("Epsilon-Greedy", "Thompson-Sampling"),
  method.par = list(epsilon = 0.1, ndraws.TS = 1000),
  iter = 100,
  nburnin = 30,
  nperiod = 180,
  reward.mean.family = "Uniform",
  reward.family = "Bernoulli",
  narms.family = "Poisson",
  npulls.family = "Log-Normal",
  data.par = list(reward.mean = list(min = 0, max = 0.1),
                  npulls.family = list(meanlog = 3, sdlog = 1.5),
                  narms.family = list(lambda = 5)),
  regret.plot = TRUE)
res$regret.plot.object

### Compare Epsilon-Greedy, Thompson Sampling and EXP3 in the non-stationary case.
set.seed(100)
res <- SimulateMultipleMethods(
  method = c("Epsilon-Greedy", "Thompson-Sampling", "EXP3"),
  method.par = list(epsilon = 0.1,
                    ndraws.TS = 1000,
                    EXP3 = list(gamma = 0, eta = 0.1)),
  iter = 100,
  nburnin = 30,
  nperiod = 90,
  reward.mean.family = "Beta",
  reward.family = "Bernoulli",
  narms.family = "Binomial",
  npulls.family = "Log-Normal",
  stationary = FALSE,
  nonstationary.type = "Geometric Random Walk",
  data.par = list(reward.mean = list(shape1 = 2, shape2 = 5),
                  npulls.family = list(meanlog = 3, sdlog = 1),
                  narms.family = list(size = 10, prob = 0.5),
                  nonstationary.family = list(sdlog = 0.05)),
  regret.plot = TRUE)
res$regret.plot.object