In banditsCI: Bandit-Based Experiments and Policy Evaluation

aw_var

R Documentation

Variance of policy value estimator via non-contextual adaptive weighting.

Description

Computes the variance of a policy value estimate from augmented inverse probability weighting (AIPW) scores, a policy matrix, and non-contextual adaptive weights.
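
One way to read this description is as the formula below. This page does not state the formula, so this is a sketch that assumes the package follows the usual adaptively weighted variance form. The symbols `\hat{\Gamma}_t(w)` (AIPW score for row `t` and arm `w`), `\hat{Q}` (the `estimate` argument), and `\hat{V}` (the returned variance) are introduced here for illustration only; `h_t` and `\pi(X_t, w)` are the weights and policy described under Arguments.

```latex
\hat{V} \;=\;
\frac{\displaystyle \sum_{t=1}^{A} h_t^{2}
      \left( \sum_{w=1}^{K} \pi(X_t, w)\, \hat{\Gamma}_t(w) \;-\; \hat{Q} \right)^{2}}
     {\displaystyle \left( \sum_{t=1}^{A} h_t \right)^{2}}
```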

Usage

aw_var(scores, estimate, policy, evalwts = NULL)

Arguments

`scores`	Numeric matrix. AIPW scores, shape `[A, K]`, with one row per observation and one column per arm. Must not contain NA values.
`estimate`	Numeric scalar. Policy value estimate, for example as returned by `aw_estimate`.
`policy`	Numeric matrix. Policy matrix `\pi(X_t, w)`, shape `[A, K]`. Must have the same shape as `scores` and must not contain NA values.
`evalwts`	Optional numeric vector. Non-contextual adaptive weights `h_t`, length `A`, or `NULL`.
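
The constraints listed above can be checked before calling `aw_var`. The helper below is a minimal sketch; `check_aw_inputs` is a hypothetical name and is not part of banditsCI. It only encodes what the argument descriptions say: matching `[A, K]` shapes, no NA values, and an optional weight vector of length `A`.

```r
# Hypothetical helper (not part of banditsCI): checks the documented
# constraints on the arguments of aw_var() before calling it.
check_aw_inputs <- function(scores, policy, evalwts = NULL) {
  stopifnot(
    is.matrix(scores), is.numeric(scores),
    is.matrix(policy), is.numeric(policy),
    identical(dim(scores), dim(policy)),  # same [A, K] shape
    !anyNA(scores), !anyNA(policy)        # no missing values
  )
  if (!is.null(evalwts)) {
    stopifnot(
      is.numeric(evalwts),
      length(evalwts) == nrow(scores)     # one weight h_t per row
    )
  }
  invisible(TRUE)
}

# Small 2 x 2 illustration (made-up numbers)
scores <- matrix(c(0.5, 0.8, 0.3, 0.9), ncol = 2, byrow = TRUE)
policy <- matrix(c(0.4, 0.6, 0.7, 0.3), ncol = 2, byrow = TRUE)
check_aw_inputs(scores, policy, evalwts = c(1, 2))  # passes silently
```

A mismatch, such as a weight vector of the wrong length, stops with an error from `stopifnot` instead of propagating into the variance calculation.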

Value

Numeric scalar. Variance of policy value estimate.

Examples

scores <- matrix(c(0.5, 0.8, 0.6,
                   0.3, 0.9, 0.2,
                   0.5, 0.7, 0.4,
                   0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
policy <- matrix(c(0.2, 0.3, 0.5,
                   0.6, 0.1, 0.3,
                   0.4, 0.5, 0.1,
                   0.2, 0.7, 0.1), ncol = 3, byrow = TRUE)
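# Evaluation weights h_t, one per row of `scores`
evalwts <- c(0.5, 1, 0.5, 1.5)
# Policy value estimate, then its variance
estimate <- aw_estimate(scores = scores, policy = policy, evalwts = evalwts)
aw_var(scores = scores, estimate = estimate, policy = policy, evalwts = evalwts)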
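
The same computation can be sketched in base R without the package. This is an illustration under assumptions, not the package source: it assumes the estimate is the weight-normalized sum of policy-weighted scores, that the variance is the sum of squared weighted deviations divided by the squared sum of weights, and that `NULL` weights mean uniform weights. The names `aw_estimate_sketch` and `aw_var_sketch` are hypothetical.

```r
# Hypothetical re-implementations for illustration; not the banditsCI source.
aw_estimate_sketch <- function(scores, policy, evalwts = NULL) {
  # Assumption: NULL weights are treated as uniform weights h_t = 1.
  if (is.null(evalwts)) evalwts <- rep(1, nrow(scores))
  gamma <- rowSums(policy * scores)  # policy-weighted score per row
  sum(evalwts * gamma) / sum(evalwts)
}

aw_var_sketch <- function(scores, estimate, policy, evalwts = NULL) {
  if (is.null(evalwts)) evalwts <- rep(1, nrow(scores))
  gamma <- rowSums(policy * scores)
  # Squared weighted deviations, normalized by the squared sum of weights
  sum(evalwts^2 * (gamma - estimate)^2) / sum(evalwts)^2
}

# Same data as the example above
scores <- matrix(c(0.5, 0.8, 0.6,
                   0.3, 0.9, 0.2,
                   0.5, 0.7, 0.4,
                   0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
policy <- matrix(c(0.2, 0.3, 0.5,
                   0.6, 0.1, 0.3,
                   0.4, 0.5, 0.1,
                   0.2, 0.7, 0.1), ncol = 3, byrow = TRUE)
h <- c(0.5, 1, 0.5, 1.5)

est <- aw_estimate_sketch(scores, policy, evalwts = h)
v <- aw_var_sketch(scores, estimate = est, policy = policy, evalwts = h)
print(c(estimate = est, variance = v))
```

If the package follows the same form, `v` should agree with the value returned by `aw_var` up to floating-point error; a discrepancy would indicate that one of the assumptions above does not hold.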

banditsCI documentation built on April 12, 2025, 1:42 a.m.