This package provides functions for investigating the performance of a particular non-parametric approach to modelling expected-reward functions in the contextual multi-armed bandit (MAB) setting. Actions are explored and chosen by Thompson sampling, and the context space is partitioned so that the true expected-reward functions can be approximated more closely.
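
To make the idea concrete, here is a minimal sketch of Thompson sampling over a partitioned context space. It is written in Python and does not use this package's API. The class name, the fixed equal-width partition of a one-dimensional context in [0, 1], the Beta-Bernoulli reward model, and the toy environment are all assumptions chosen for illustration; the package's own partitioning scheme and reward model may differ.

```python
import random


class PartitionedThompsonSampler:
    """Beta-Bernoulli Thompson sampling on a fixed, equal-width partition of [0, 1].

    Hypothetical illustration only; not the package's implementation.
    """

    def __init__(self, n_arms, n_cells, seed=None):
        self.n_arms = n_arms
        self.n_cells = n_cells
        self.rng = random.Random(seed)
        # One Beta(1, 1) prior per (cell, arm), stored as [alpha, beta].
        self.params = [[[1, 1] for _ in range(n_arms)] for _ in range(n_cells)]

    def cell(self, context):
        # Map a context in [0, 1] to a cell index; 1.0 falls in the last cell.
        return min(int(context * self.n_cells), self.n_cells - 1)

    def choose(self, context):
        # Draw one sample per arm from the posterior of the context's cell,
        # then play the arm whose sampled mean reward is largest.
        c = self.cell(context)
        draws = [self.rng.betavariate(a, b) for a, b in self.params[c]]
        return max(range(self.n_arms), key=draws.__getitem__)

    def update(self, context, arm, reward):
        # A success increments alpha, a failure increments beta.
        c = self.cell(context)
        self.params[c][arm][0 if reward else 1] += 1


def true_reward_prob(context, arm):
    # Assumed toy environment: arm 0 is better on the left half of the
    # context space, arm 1 on the right half.
    if arm == 0:
        return 0.8 if context < 0.5 else 0.2
    return 0.2 if context < 0.5 else 0.8


def simulate(n_rounds=5000, seed=0):
    env = random.Random(seed)
    agent = PartitionedThompsonSampler(n_arms=2, n_cells=4, seed=seed + 1)
    total = 0
    for _ in range(n_rounds):
        x = env.random()                      # observe a context
        arm = agent.choose(x)                 # Thompson-sample an action
        reward = env.random() < true_reward_prob(x, arm)
        agent.update(x, arm, reward)          # update that cell's posterior
        total += reward
    return agent, total / n_rounds
```

Calling `simulate()` returns the fitted agent and its average reward per round. In this toy environment the best achievable average is 0.8, so an average close to that suggests each cell has learned which arm to prefer. A finer partition can track a more complicated reward function, at the cost of fewer observations per cell.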
Package details | |
---|---|
Author | Douglas Corbin |
Maintainer | Douglas Corbin <doug.corbin@bristol.ac.uk> |
License | GPL (>= 2) |
Version | 0.1.0 |
Package repository | View on GitHub |
Installation | Install the latest version of this package by entering the following in R: |