View source: R/UCB_rejection_sampling.R
UCB algorithm with the rejection sampling method. Excludes any choice that does not correspond to a real experiment in the dataset. Stops if something is wrong. Generates a matrix (S) to save the results.
At each iteration:
Calculates the arm probabilities
Chooses the arm with the maximum upper bound (using the alpha parameter)
Receives a reward in visitor_reward for the arm and associated iteration
Updates the results matrix S.
Returns the calculation time.
Reviews the estimated and actual averages and the number of choices for each arm.
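The loop described above can be sketched in base R. This is a minimal illustration and not the package source: the function name `ucb_rejection_sketch`, the layout of `S` (row 1 holds the estimated means, row 2 the pull counts), the rule of trying each arm once first, and the bound `mean + alpha * sqrt(2 * log(t) / n)` are all assumptions made for this example.

```r
# Hypothetical sketch of UCB with rejection sampling (not the package code).
ucb_rejection_sketch <- function(visitor_reward, K = ncol(visitor_reward), alpha = 1) {
  rewards <- as.matrix(visitor_reward)
  # S: row "mean" = estimated mean per arm, row "n" = accepted pulls per arm
  S <- matrix(0, nrow = 2, ncol = K, dimnames = list(c("mean", "n"), NULL))
  choice <- integer(0)
  for (i in seq_len(nrow(rewards))) {
    untried <- which(S["n", ] == 0)
    if (length(untried) > 0) {
      # Assumed initialisation: try every arm once before using the bound
      arm <- untried[1]
    } else {
      t <- sum(S["n", ])
      # Assumed form of the upper bound, with alpha scaling the exploration term
      bound <- S["mean", ] + alpha * sqrt(2 * log(t) / S["n", ])
      arm <- which.max(bound)
    }
    r <- rewards[i, arm]
    # Rejection step: skip rows where the chosen arm was never really shown
    if (is.na(r)) next
    S["n", arm] <- S["n", arm] + 1
    # Incremental update of the running mean for the chosen arm
    S["mean", arm] <- S["mean", arm] + (r - S["mean", arm]) / S["n", arm]
    choice <- c(choice, arm)
  }
  list(S = S, choice = choice, theta_hat = S["mean", ])
}
```

In this sketch a rejected row leaves `S` untouched, so only rows whose observed arm matches the chosen arm contribute to the estimates.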
See also ConditionForUCB, GenerateMatrixS, ProbaMaxForUCB and PlayArm.
Requires tic and toc from the tictoc library.
UCB_rejection_sampling(visitorReward, K = ncol(visitorReward),
  alpha = 1)
K: Integer value (optional)
alpha: Numeric value (optional)
visitor_reward: Dataframe of integer or numeric values
List of elements:
S: numerical matrix of results,
choice: choices of UCB,
proba: probability of the chosen arms,
time: time of computation,
theta_hat: estimated mean of each arm,
theta: real mean of each arm
## Generates 1000 draws from each of 2 binomial distributions
set.seed(4434)
K1 <- rbinom(1000, 1, 0.6)
K2 <- rbinom(1000, 1, 0.7)
## Define a dataframe of rewards
visitor_reward <- as.data.frame(cbind(K1, K2))
## Remove data so that each row has a reward for only one arm
temp_list <- sample(1:nrow(visitor_reward), 500, replace = FALSE, prob = NULL)
visitor_reward$K1[temp_list] <- NA
visitor_reward$K2[-temp_list] <- NA
## Run UCB on the data with missing rewards
ucb_alloc <- UCB_rejection_sampling(visitor_reward, alpha = 10)
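To see why rejection sampling matters for data shaped like this, it can help to count how many rows carry an observed reward for each arm. The sketch below uses base R only and rebuilds the same data; `observed_share` and `observed_mean` are hypothetical names, and whether the package computes `theta` from the observed rewards in this way is an assumption.

```r
set.seed(4434)
visitor_reward <- data.frame(K1 = rbinom(1000, 1, 0.6),
                             K2 = rbinom(1000, 1, 0.7))
temp_list <- sample(1:nrow(visitor_reward), 500, replace = FALSE)
visitor_reward$K1[temp_list] <- NA
visitor_reward$K2[-temp_list] <- NA

# Share of rows with an observed (non-NA) reward for each arm
observed_share <- colMeans(!is.na(visitor_reward))
# Mean of the observed rewards per arm, ignoring the missing entries
observed_mean <- colMeans(visitor_reward, na.rm = TRUE)

print(observed_share)
print(observed_mean)
```

Each row has a reward for exactly one arm, so any iteration in which the algorithm picks the other arm cannot be scored and has to be excluded.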