View source: R/experiment_utils.R
ridge_muhat_lfo_pai | R Documentation |
Computes leave-future-out, ridge-based estimates of the arms' expected rewards from the provided data.
ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes, alpha = 1)
xs | Matrix. Covariates of shape |
ws | Integer vector. Indicates which arm was chosen for observations at each time |
yobs | Numeric vector. Observed outcomes, length |
K | Integer. Number of arms. Must be a positive integer. |
batch_sizes | Integer vector. Sizes of batches in which data is processed. Must be positive integers. |
alpha | Numeric. Ridge regression regularization parameter. Default is 1. |
A 3D array containing the expected reward estimates for each arm and each time t, of shape [A, A, K].
set.seed(123)
p <- 3
K <- 5
A <- 100
xs <- matrix(runif(A * p), nrow = A, ncol = p)
ws <- sample(1:K, A, replace = TRUE)
yobs <- runif(A)
batch_sizes <- c(25, 25, 25, 25)
muhat <- ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes)
print(muhat)
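To make the idea of a leave-future-out ridge estimate more concrete, the following is a minimal sketch, not the package's implementation. The helper name `ridge_lfo_sketch`, the unpenalized intercept, the zero rows for the first batch, and the indexing convention (first dimension = time at which the estimate is made, second = context being predicted, third = arm) are all assumptions chosen for illustration. Each batch is predicted using only observations from earlier batches.

```r
# Hypothetical helper for illustration only; not part of the documented package.
ridge_lfo_sketch <- function(xs, ws, yobs, K, batch_sizes, alpha = 1) {
  A <- nrow(xs)
  p <- ncol(xs)
  stopifnot(sum(batch_sizes) == A, length(ws) == A, length(yobs) == A)

  X <- cbind(1, xs)                          # design matrix with an intercept column
  penalty <- alpha * diag(c(0, rep(1, p)))   # assumption: intercept is not penalized
  muhat <- array(0, dim = c(A, A, K))        # rows for the first batch stay at 0

  batch_ends <- cumsum(batch_sizes)
  batch_starts <- c(1, head(batch_ends, -1) + 1)

  for (b in seq_along(batch_sizes)[-1]) {    # the first batch has no past data
    past <- seq_len(batch_ends[b - 1])       # observations from earlier batches only
    rows <- batch_starts[b]:batch_ends[b]    # time steps belonging to batch b
    for (k in seq_len(K)) {
      idx <- past[ws[past] == k]             # past observations where arm k was chosen
      if (length(idx) == 0) next             # no data for this arm yet
      Xk <- X[idx, , drop = FALSE]
      # Closed-form ridge solution: (X'X + penalty)^{-1} X'y
      beta <- solve(crossprod(Xk) + penalty, crossprod(Xk, yobs[idx]))
      pred <- as.vector(X %*% beta)          # predictions for all A contexts
      # Every time step in batch b shares the same fitted model
      muhat[rows, , k] <- matrix(pred, nrow = length(rows), ncol = A, byrow = TRUE)
    }
  }
  muhat
}

set.seed(123)
A <- 100; p <- 3; K <- 5
xs <- matrix(runif(A * p), nrow = A, ncol = p)
ws <- sample(1:K, A, replace = TRUE)
yobs <- runif(A)
sketch <- ridge_lfo_sketch(xs, ws, yobs, K, batch_sizes = c(25, 25, 25, 25))
dim(sketch)  # 100 100 5
```

The sketch requires `batch_sizes` to sum to the number of observations. Whether `ridge_muhat_lfo_pai` enforces the same constraint, or handles the first batch and the intercept in the same way, is not stated in this page.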