BatchGradientQ: Batch Gradient Q-Learning
In XiaoqiLu/PhD-Thesis: Regularized Q-Learning


Batch Gradient Q-Learning

BatchGradientQ(
  phis,
  discount,
  method = "FQI",
  loss = NULL,
  lambda = 0,
  alpha = 1,
  theta = NULL,
  learning_rate = 1,
  max_iter = 1000,
  tol = 0.001,
  accelerate = TRUE
)

`phis`	a list of processed outcomes returned by `SARS2Phis()`.
`discount`	a number between 0 and 1.
`method`	Q-learning method, one of "FQI", "GGQ", and "BEM".
`loss`	loss function for evaluation, one of "MSPBE" and "MSBE".
`lambda`	regularization coefficient
`alpha`	elastic net mixing parameter between 0 (ridge) and 1 (lasso)
`theta`	a numeric vector as model parameter.
`learning_rate`	learning rate for gradient descent
`max_iter`	maximum number of iterations
`tol`	tolerance level for convergence
`accelerate`	if `TRUE`, use accelerated proximal gradient method; otherwise use proximal gradient method.
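
To make the roles of `lambda`, `alpha`, `learning_rate`, `max_iter`, `tol`, and `accelerate` concrete, here is a minimal sketch of a proximal gradient loop with an elastic net penalty. It is not the package's implementation. The helper names (`soft_threshold`, `prox_elastic_net`, `prox_gradient_sketch`) are hypothetical, the smooth loss is an ordinary mean squared error standing in for the Q-learning losses, the penalty is assumed to follow the common `lambda * (alpha * |theta|_1 + (1 - alpha) / 2 * |theta|_2^2)` convention, and the acceleration is assumed to be FISTA-style momentum.

```r
# Soft-thresholding: shrink each entry of v toward zero by t.
soft_threshold <- function(v, t) sign(v) * pmax(abs(v) - t, 0)

# Proximal operator of
#   step * lambda * (alpha * |theta|_1 + (1 - alpha) / 2 * |theta|_2^2)
# (assumed penalty convention, not confirmed by the package documentation).
prox_elastic_net <- function(v, step, lambda, alpha) {
  soft_threshold(v, step * lambda * alpha) / (1 + step * lambda * (1 - alpha))
}

# Hypothetical helper: minimizes mean((y - X %*% theta)^2) / 2 + penalty.
# The squared-error loss is a stand-in for the package's Q-learning losses.
prox_gradient_sketch <- function(X, y, lambda = 0, alpha = 1, theta = NULL,
                                 learning_rate = 1, max_iter = 1000,
                                 tol = 0.001, accelerate = TRUE) {
  n <- nrow(X)
  if (is.null(theta)) theta <- rep(0, ncol(X))
  z <- theta     # point where the gradient is evaluated
  t_prev <- 1    # momentum sequence for the accelerated variant
  for (iter in seq_len(max_iter)) {
    # Gradient of the smooth loss at z.
    grad <- -drop(crossprod(X, y - X %*% z)) / n
    # Gradient step followed by the proximal (shrinkage) step.
    theta_new <- prox_elastic_net(z - learning_rate * grad,
                                  learning_rate, lambda, alpha)
    change <- sqrt(sum((theta_new - theta)^2))
    if (accelerate) {
      # FISTA-style extrapolation beyond the new iterate.
      t_new <- (1 + sqrt(1 + 4 * t_prev^2)) / 2
      z <- theta_new + ((t_prev - 1) / t_new) * (theta_new - theta)
      t_prev <- t_new
    } else {
      z <- theta_new
    }
    theta <- theta_new
    # Stop once the iterate moves by less than tol.
    if (change < tol) break
  }
  list(theta = theta, iterations = iter)
}

# Toy design with crossprod(X) / n equal to the identity, so a
# learning rate of 1 is a valid step size for this loss.
X <- cbind(c(1, 1, 1, 1), c(1, -1, 1, -1))
y <- drop(X %*% c(2, 0.05))
fit <- prox_gradient_sketch(X, y, lambda = 0.1, alpha = 1)
print(fit$theta)
```

With `alpha = 1` the small second coefficient is thresholded to exactly zero, while `alpha = 0` only shrinks both coefficients proportionally. A `learning_rate` of 1 is safe in this toy problem only because of the orthogonal design; in general the step size has to respect the curvature of the smooth loss.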