ridge_muhat_lfo_pai: Leave-future-out ridge-based estimates for arm expected...

View source: R/experiment_utils.R

ridge_muhat_lfo_pai R Documentation

Leave-future-out ridge-based estimates for arm expected rewards.

Description

Computes leave-future-out ridge-based estimates of arm expected rewards from the provided data.

Usage

ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes, alpha = 1)

Arguments

xs

Matrix. Covariates of shape [A, p], where A is the number of observations and p is the number of features. Must not contain NA values.

ws

Integer vector. The arm chosen for the observation at each time t. Length A. Must not contain NA values.

yobs

Numeric vector. Observed outcomes, length A. Must not contain NA values.

K

Integer. Number of arms. Must be a positive integer.

batch_sizes

Integer vector. Sizes of batches in which data is processed. Must be positive integers.

alpha

Numeric. Ridge regression regularization parameter. Default is 1.
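To illustrate what a ridge regularization parameter such as alpha controls, here is a minimal sketch of a closed-form ridge fit. The helper ridge_coef_sketch is hypothetical and is not part of banditsCI. It assumes an intercept column is added and that alpha penalizes every coefficient, including the intercept. The package may handle these details differently.

```r
# Hypothetical helper, not part of banditsCI: closed-form ridge fit.
# Assumption: an intercept column is prepended and alpha penalises all
# coefficients, i.e. theta = (X'X + alpha * I)^{-1} X'y.
ridge_coef_sketch <- function(x, y, alpha = 1) {
  x1 <- cbind(1, x)            # prepend intercept column
  gram <- crossprod(x1)        # t(x1) %*% x1
  rhs <- crossprod(x1, y)      # t(x1) %*% y
  drop(solve(gram + alpha * diag(ncol(x1)), rhs))
}

set.seed(1)
x <- matrix(runif(60), nrow = 20, ncol = 3)
y <- drop(x %*% c(1, -2, 0.5)) + rnorm(20, sd = 0.1)

small <- ridge_coef_sketch(x, y, alpha = 1e-8)  # nearly ordinary least squares
large <- ridge_coef_sketch(x, y, alpha = 100)   # coefficients shrunk toward zero
```

Larger values of alpha shrink the coefficient vector toward zero, which stabilizes the fit when an arm has few observations, at the cost of some bias.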

Value

A 3D array containing the expected reward estimates for each arm and each time t, of shape [A, A, K].
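The following sketch shows one way a leave-future-out estimate with this output shape could be built. It is an illustration under stated assumptions, not the package's implementation. The function lfo_ridge_sketch is hypothetical, and it assumes that arms are labelled 1 to K, that batch_sizes sums to A, and that entry [t, i, k] holds the estimate for arm k at the covariates of observation i, fitted only on observations from batches that ended before the batch containing t.

```r
# Hypothetical sketch, not the banditsCI implementation.
lfo_ridge_sketch <- function(xs, ws, yobs, K, batch_sizes, alpha = 1) {
  A <- nrow(xs)
  x1 <- cbind(1, xs)                        # covariates plus intercept column
  q <- ncol(x1)
  muhat <- array(0, dim = c(A, A, K))
  ends <- cumsum(batch_sizes)
  starts <- c(0, head(ends, -1)) + 1
  for (b in seq_along(batch_sizes)) {
    rows <- starts[b]:ends[b]               # time steps in the current batch
    past <- seq_len(starts[b] - 1)          # observations before this batch
    for (k in seq_len(K)) {
      idx <- past[ws[past] == k]            # past observations of arm k
      xk <- x1[idx, , drop = FALSE]
      yk <- matrix(yobs[idx], ncol = 1)
      theta <- solve(crossprod(xk) + alpha * diag(q), crossprod(xk, yk))
      pred <- drop(x1 %*% theta)            # predictions at all A covariate rows
      muhat[rows, , k] <- matrix(pred, nrow = length(rows), ncol = A, byrow = TRUE)
    }
  }
  muhat
}

set.seed(123)
xs_demo <- matrix(runif(40 * 3), nrow = 40)
ws_demo <- sample(1:2, 40, replace = TRUE)
yobs_demo <- runif(40)
sketch <- lfo_ridge_sketch(xs_demo, ws_demo, yobs_demo, K = 2,
                           batch_sizes = c(10, 10, 20))
dim(sketch)
```

In this sketch the rows for the first batch are all zero because no earlier data exist, and outcomes observed in later batches never influence the estimates stored for earlier time steps.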

Examples

set.seed(123)
p <- 3
K <- 5
A <- 100
xs <- matrix(runif(A * p), nrow = A, ncol = p)
ws <- sample(1:K, A, replace = TRUE)
yobs <- runif(A)
batch_sizes <- c(25, 25, 25, 25)
muhat <- ridge_muhat_lfo_pai(xs, ws, yobs, K, batch_sizes)
print(muhat)
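
The inputs in the example above satisfy the constraints listed under Arguments. A small sketch for checking similar inputs before a call follows. The helper check_inputs_sketch is hypothetical and not part of banditsCI. Two of its checks are assumptions that this page does not state: that arms are labelled 1 to K, and that batch_sizes sums to the number of observations A.

```r
# Hypothetical helper, not part of banditsCI: basic consistency checks.
check_inputs_sketch <- function(xs, ws, yobs, K, batch_sizes) {
  A <- nrow(xs)
  stopifnot(
    is.matrix(xs), !anyNA(xs),
    length(ws) == A, !anyNA(ws),
    all(ws %in% seq_len(K)),     # assumption: arms are labelled 1..K
    length(yobs) == A, !anyNA(yobs),
    all(batch_sizes > 0),
    sum(batch_sizes) == A        # assumption: batches cover all observations
  )
  invisible(TRUE)
}

set.seed(123)
xs_chk <- matrix(runif(100 * 3), nrow = 100, ncol = 3)
ws_chk <- sample(1:5, 100, replace = TRUE)
yobs_chk <- runif(100)
ok <- check_inputs_sketch(xs_chk, ws_chk, yobs_chk, K = 5,
                          batch_sizes = c(25, 25, 25, 25))
```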


banditsCI documentation built on April 12, 2025, 1:42 a.m.