bandit_thompson-class: A Thompson sampling bandit reference class (RC) object.
In rferrali/banditr: Estimation of Multi-Armed Bandit Algorithms

bandit_thompson-class

R Documentation

Description

A Thompson sampling bandit reference class (RC) object.

Usage

bandit_stan_lm(formula, data, gamma = 1,
               contrasts = NULL, newLevels = FALSE,
               db = NULL, path = NULL)
bandit_stan_glm(formula, data, family = c("gaussian", "binomial"),
                gamma = 1, contrasts = NULL, newLevels = FALSE,
                db = NULL, path = NULL)
bandit_stan_glmer(formula, data, family = c("gaussian", "binomial"),
                  gamma = 1, contrasts = NULL, newLevels = FALSE,
                  db = NULL, path = NULL)

Arguments

`formula`	an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The response must be named `y`.
`data`	a data frame (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. `data` must contain a column named `id` that uniquely identifies each observation, and a column named `y` that contains the model response.
`family`	a character string describing the error distribution and link function to be used in the model. Can be either `"binomial"` or `"gaussian"` (the default).
`gamma`	the Thompson sampling tuning parameter. A positive scalar. Higher values of `gamma` favor exploitation over exploration.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`newLevels`	a logical value indicating whether to allow for new factor levels when adding samples. Default is FALSE.
`db`	an optional named list of arguments passed to `odbcDriverConnect`.
`path`	an optional character string naming a folder open for writing.
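
The requirements on `formula` and `data` can be checked before a bandit is created. The following is a minimal base R sketch with made-up data; the predictor names `x1` and `arm` are hypothetical and are not part of banditr.

```r
# Illustrative data only: the predictors x1 and arm are hypothetical names.
start <- data.frame(
  id  = 1:6,                                  # unique key for each observation
  x1  = c(0.4, -1.2, 0.3, 1.8, -0.5, 0.9),
  arm = factor(c("a", "b", "a", "b", "a", "b")),
  y   = c(1, 0, 1, 1, 0, 1)                   # binary response, named y
)

# The response on the left-hand side of the formula must be y.
f <- y ~ x1 + arm

# Checks that mirror the documented requirements.
ids_unique  <- anyDuplicated(start$id) == 0
response_ok <- identical(deparse(f[[2]]), "y")
vars_ok     <- all(all.vars(f) %in% names(start))
stopifnot(ids_unique, response_ok, vars_ok)
```

A data frame that passes these checks has the shape the `data` argument describes; whether banditr performs the same checks internally is not stated here.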

Details

The RC class "bandit_thompson" inherits from class "bandit". Three classes inherit from "bandit_thompson": "bandit_stan_lm", "bandit_stan_glm", and "bandit_stan_glmer", for linear, generalized linear, and mixed effect models respectively.

The introductory vignette provides a detailed explanation of Thompson sampling algorithms and their implementation with banditr. See the Examples section.

Fields

gamma: the Thompson sampling tuning parameter.
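
To give an intuition for how a tuning parameter can shift the balance between exploration and exploitation, the sketch below uses a generic Beta-Bernoulli Thompson sampler in base R. It is not banditr's implementation: the function `thompson_pick` is hypothetical, and the way `gamma` enters (as a multiplier on the observed counts, so that larger values concentrate the posterior draws) is an assumption chosen only to match the direction described for `gamma`.

```r
# Generic Beta-Bernoulli Thompson sampling; NOT banditr's implementation.
# Assumption: gamma multiplies the observed counts, so a larger gamma gives
# more concentrated posterior draws and therefore more exploitation.
thompson_pick <- function(successes, failures, gamma = 1) {
  draws <- rbeta(length(successes),
                 1 + gamma * successes,
                 1 + gamma * failures)
  which.max(draws)  # index of the arm with the largest sampled success rate
}

set.seed(42)
successes <- c(8, 5)  # arm 1 looks better than arm 2
failures  <- c(2, 5)

picks_low  <- replicate(2000, thompson_pick(successes, failures, gamma = 0.1))
picks_high <- replicate(2000, thompson_pick(successes, failures, gamma = 10))

# Share of draws that select the currently best-looking arm.
share_low  <- mean(picks_low == 1)
share_high <- mean(picks_high == 1)
```

Under this assumed scheme, `share_high` exceeds `share_low`: the sampler with the larger `gamma` picks the leading arm more consistently, while the smaller `gamma` keeps trying the other arm.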

Methods

train(..., seed = NULL) train the model using the relevant function from rstanarm and all completed experiments. `...` are additional arguments passed on to that rstanarm function. `seed` is an optional seed for the random number generator.

tune() currently not supported.

addSamples(df) add samples to the bandit. df must be coercible to a data.frame that can be appended to the data.frame used at creation. In particular, it must contain an id column that acts as a primary key.

addOutcomes(y) add outcomes to the bandit. y is a named vector whose names are the ids of the corresponding samples.

undo() cancel the last job.
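
The methods above suggest a cycle of creating a bandit, training it, adding samples, recording their outcomes, and retraining. The sketch below has not been run against banditr. It assumes that the constructor returns the RC object, that methods are called with `$` as is usual for reference classes, and that new samples do not need a `y` column. The wrapper `run_bandit_cycle`, the predictor `x1`, and all data values are hypothetical.

```r
# Made-up data: four completed experiments, then two new samples.
start        <- data.frame(id = 1:4, x1 = c(0.2, -1.1, 0.7, 1.5),
                           y = c(1, 0, 1, 1))
new_samples  <- data.frame(id = 5:6, x1 = c(-0.3, 0.9))
new_outcomes <- c("5" = 0, "6" = 1)  # names are the ids of the new samples

# New ids must not collide with existing ones, and outcomes must match them.
stopifnot(!any(new_samples$id %in% start$id))
stopifnot(setequal(names(new_outcomes), as.character(new_samples$id)))

# Hypothetical wrapper; requires banditr and rstanarm, so it is only
# defined here and not called.
run_bandit_cycle <- function(start, new_samples, new_outcomes) {
  b <- banditr::bandit_stan_glm(y ~ x1, data = start,
                                family = "binomial", gamma = 1)
  b$train(seed = 123)          # fit on the completed experiments
  b$addSamples(new_samples)    # register the new samples
  b$addOutcomes(new_outcomes)  # record their outcomes
  b$train(seed = 123)          # refit with the new outcomes included
  b                            # b$undo() would cancel the last job
}
```

With banditr installed, `run_bandit_cycle(start, new_samples, new_outcomes)` would run the full cycle; the two `stopifnot` checks can be used on their own to validate inputs before calling `addSamples` and `addOutcomes`.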