BatchLinUCBDisjointPolicyEpsilon: Batch Disjoint LinUCB Policy with Epsilon-Greedy
In cramR: Cram Method for Efficient Simultaneous Learning and Evaluation

BatchLinUCBDisjointPolicyEpsilon

R Documentation

Batch Disjoint LinUCB Policy with Epsilon-Greedy

Description

Batch Disjoint LinUCB Policy with Epsilon-Greedy

Details

Implements the disjoint LinUCB algorithm with upper confidence bounds and epsilon-greedy exploration, using batched updates.
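For orientation, the textbook disjoint LinUCB rule can be written as below. This is an assumption about what the class computes, not a transcription of the package source; the symbols A_a, b_a, \hat{\theta}_a and p_a are introduced here for illustration, while \alpha and \epsilon correspond to the constructor arguments alpha and epsilon.

```latex
% Per-arm ridge estimate and upper confidence bound for context x
\hat{\theta}_a = A_a^{-1} b_a,
\qquad
p_a(x) = \hat{\theta}_a^{\top} x + \alpha \sqrt{x^{\top} A_a^{-1} x}

% Epsilon-greedy selection over the UCB scores
a_t =
\begin{cases}
\text{uniformly random arm} & \text{with probability } \epsilon \\
\arg\max_a \; p_a(x_t)      & \text{with probability } 1 - \epsilon
\end{cases}
```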

Methods

- 'initialize(alpha = 1.0, epsilon = 0.1, batch_size = 1)': Constructor.
- 'set_parameters(context_params)': Initializes sufficient statistics for each arm.
- 'get_action(t, context)': Selects an arm using UCB scores and the epsilon-greedy rule.
- 'set_reward(t, context, action, reward)': Updates statistics and refreshes the model at batch intervals.

Super class

cramR::NA -> BatchLinUCBDisjointPolicyEpsilon

Public fields

alpha: Numeric, UCB exploration strength parameter.
epsilon: Numeric, probability of taking a random exploratory action.
batch_size: Integer, number of rounds per batch update.
A_cc: List of Gram matrices per arm, accumulated across batch.
b_cc: List of reward-weighted context vectors per arm.
class_name: Internal class name identifier.

Methods

Public methods

BatchLinUCBDisjointPolicyEpsilon$new()
BatchLinUCBDisjointPolicyEpsilon$set_parameters()
BatchLinUCBDisjointPolicyEpsilon$get_action()
BatchLinUCBDisjointPolicyEpsilon$set_reward()
BatchLinUCBDisjointPolicyEpsilon$clone()

Method `new()`

Constructor for batched LinUCB with epsilon-greedy exploration.

Usage

BatchLinUCBDisjointPolicyEpsilon$new(alpha = 1, epsilon = 0.1, batch_size = 1)

Arguments

alpha: Numeric. UCB width parameter (exploration strength).
epsilon: Numeric. Probability of selecting a random arm.
batch_size: Integer. Number of rounds before updating parameters.

Method `set_parameters()`

Initialize arm-specific parameter containers.

Usage

BatchLinUCBDisjointPolicyEpsilon$set_parameters(context_params)

Arguments

context_params: List containing at least 'unique' (feature size) and 'k' (number of arms).
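As a rough illustration of what "parameter containers" could look like, the sketch below builds one matrix and one vector per arm in base R. The helper name `init_arm_stats`, the element names `A` and `b`, and the identity/zero starting values are assumptions for illustration; the class's own fields (`A_cc`, `b_cc`) may be initialized differently.

```r
# Hypothetical helper, not part of cramR: per-arm containers for a
# d-dimensional context and k arms.
init_arm_stats <- function(d, k) {
  lapply(seq_len(k), function(a) {
    list(
      A = diag(1, d),   # d x d identity (assumed ridge-style starting value)
      b = numeric(d)    # zero vector of length d
    )
  })
}

stats <- init_arm_stats(d = 3, k = 2)
length(stats)       # one entry per arm
dim(stats[[1]]$A)   # d x d matrix for arm 1
```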

Method `get_action()`

Chooses an arm based on UCB and epsilon-greedy sampling.

Usage

BatchLinUCBDisjointPolicyEpsilon$get_action(t, context)

Arguments

t: Integer timestep.
context: List containing the context for the decision.

Returns

A list with the selected action.
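The selection step described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: `choose_arm` is a hypothetical helper, the context is taken to be a plain numeric vector, and the returned element name `choice` is assumed.

```r
# Hypothetical helper, not part of cramR: epsilon-greedy choice over
# standard disjoint LinUCB scores.
# A_list / b_list: per-arm matrices and vectors; x: numeric context vector.
choose_arm <- function(A_list, b_list, x, alpha = 1, epsilon = 0.1) {
  k <- length(A_list)
  # Explore: with probability epsilon, pick an arm uniformly at random.
  if (runif(1) < epsilon) {
    return(list(choice = sample.int(k, 1)))
  }
  # Exploit: score each arm by estimated reward plus confidence width.
  scores <- vapply(seq_len(k), function(a) {
    A_inv <- solve(A_list[[a]])
    theta <- A_inv %*% b_list[[a]]
    mean_part <- sum(theta * x)
    width <- alpha * sqrt(drop(t(x) %*% A_inv %*% x))
    mean_part + width
  }, numeric(1))
  list(choice = which.max(scores))
}

# Two arms, two features; arm 2 has seen reward along the first feature.
A_list <- list(diag(2), diag(2))
b_list <- list(c(0, 0), c(1, 0))
action <- choose_arm(A_list, b_list, x = c(1, 0), alpha = 1, epsilon = 0)
action$choice
```

With `epsilon = 0` the random branch is never taken, so the call is deterministic and returns the arm with the highest score.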

Method `set_reward()`

Updates arm-specific sufficient statistics based on observed reward. Parameter updates occur only at the end of a batch.

Usage

BatchLinUCBDisjointPolicyEpsilon$set_reward(t, context, action, reward)

Arguments

t: Integer timestep.
context: Context object used for decision-making.
action: List containing the chosen action.
reward: List containing the observed reward.

Returns

Updated internal model parameters.
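One way to picture the batching behaviour is the base R sketch below: every round adds to per-arm accumulators, and the accumulators are folded into the model used for scoring only when `t` is a multiple of `batch_size`. The helpers `make_batch_state` and `record_reward`, the `model`/`pending` split, and the use of `t %% batch_size` as the trigger are all assumptions for illustration; the package may organize its `A_cc` and `b_cc` fields differently.

```r
# Hypothetical sketch, not cramR's code.
make_batch_state <- function(d, k, batch_size) {
  list(
    # Statistics used for scoring (assumed identity / zero start).
    model = lapply(seq_len(k), function(a) list(A = diag(1, d), b = numeric(d))),
    # Statistics accumulated since the last batch update.
    pending = lapply(seq_len(k), function(a) list(A = matrix(0, d, d), b = numeric(d))),
    batch_size = batch_size
  )
}

record_reward <- function(state, t, x, arm, reward) {
  # Accumulate the outer product x x^T and the reward-weighted context.
  state$pending[[arm]]$A <- state$pending[[arm]]$A + tcrossprod(x)
  state$pending[[arm]]$b <- state$pending[[arm]]$b + reward * x
  # At the end of a batch, fold the accumulators into the model and reset them.
  if (t %% state$batch_size == 0) {
    for (a in seq_along(state$model)) {
      state$model[[a]]$A <- state$model[[a]]$A + state$pending[[a]]$A
      state$model[[a]]$b <- state$model[[a]]$b + state$pending[[a]]$b
      state$pending[[a]]$A[] <- 0
      state$pending[[a]]$b[] <- 0
    }
  }
  state
}

state <- make_batch_state(d = 2, k = 2, batch_size = 2)
state <- record_reward(state, t = 1, x = c(1, 0), arm = 1, reward = 1)
b_after_round_1 <- state$model[[1]]$b   # model not yet refreshed
state <- record_reward(state, t = 2, x = c(1, 0), arm = 1, reward = 1)
b_after_round_2 <- state$model[[1]]$b   # batch boundary: model refreshed
```

With `batch_size = 1` this reduces to the usual per-round LinUCB update.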

Method `clone()`

The objects of this class are cloneable with this method.

Usage

BatchLinUCBDisjointPolicyEpsilon$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

cramR documentation built on Aug. 25, 2025, 1:12 a.m.
