Description

Simulate learning under the MDP policy.
Usage

mdp_learning(transition, reward, discount, model_prior = NULL, x0,
  Tmax = 20, true_transition, observation = NULL, a0 = 1,
  model_names = NA, ...)
Arguments

transition
    list of transition matrices, one per model (see the input sketch after this list)

reward
    the utility matrix U(x, a) of being in state x and taking action a

discount
    the discount factor (1 means no discounting)

model_prior
    the prior belief over models, a numeric of length(transition); uniform by default

x0
    initial state

Tmax
    termination time for the finite-time calculation; ignored otherwise

true_transition
    the actual transition matrix used to drive the simulation

observation
    NULL by default, which simulates perfect observations

a0
    the action taken before the simulation starts; irrelevant unless actions influence observations and observation is not NULL

model_names
    optional vector of names for the columns of the model posterior distribution; taken from the names of the transition list if none are provided here

...
    additional arguments passed on to the underlying policy computation
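To make the expected shapes concrete, the following is a minimal illustrative sketch of inputs for two candidate models of a 2-state, 2-action problem. The n_s x n_s x n_a array layout (rows = current state, columns = next state, slices = action) is an assumption for illustration, not a statement of the package's required format; see the Examples below for the package's own input construction.

## Illustrative input construction (assumed layout: each model's
## transition is an n_s x n_s x n_a array; rows = current state,
## columns = next state, slices = action)
n_s <- 2  # number of states
n_a <- 2  # number of actions

## Model 1: action 1 tends to keep the current state, action 2 randomizes
m1 <- array(0, dim = c(n_s, n_s, n_a))
m1[, , 1] <- matrix(c(0.9, 0.1,
                      0.2, 0.8), n_s, n_s, byrow = TRUE)
m1[, , 2] <- matrix(c(0.5, 0.5,
                      0.5, 0.5), n_s, n_s, byrow = TRUE)

## Model 2: same actions, but faster mixing under action 1
m2 <- m1
m2[, , 1] <- matrix(c(0.7, 0.3,
                      0.4, 0.6), n_s, n_s, byrow = TRUE)

transition <- list(model_1 = m1, model_2 = m2)

## Reward U(x, a): rows = states, columns = actions
reward <- matrix(c(1, 0,
                   0, 2), n_s, n_a, byrow = TRUE)

## Sanity check: every row of every action slice is a probability vector
stopifnot(all(vapply(transition, function(P)
  all(abs(apply(P, 3, rowSums) - 1) < 1e-12), logical(1))))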
Value

a list containing: a data frame "df" with the state, action, and value at each time step of the simulation, and a data frame "posterior", in which the t'th row gives the belief over models at time t (one column per model; see the sketch after the examples)
Examples

source(system.file("examples/K_models.R", package = "mdplearning"))
transition <- lapply(models, `[[`, "transition")
reward <- models[[1]]$reward

## example where true model is model 1
out <- mdp_learning(transition, reward, discount, x0 = 10,
                    Tmax = 20, true_transition = transition[[1]])

## Did we learn which one was the true model?
out$posterior[20, ]

## Simulate the MDP strategy under observation uncertainty
out <- mdp_learning(transition = transition, reward, discount, x0 = 10,
                    true_transition = transition[[1]],
                    Tmax = 20, observation = models[[1]]$observation)
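Continuing from the examples above, the returned pieces can be inspected directly. The following sketch assumes the "posterior" data frame has one column per model, as described under Value:

## Inspect the simulated states, actions, and values
head(out$df)

## Plot the belief in each model over time
matplot(as.matrix(out$posterior), type = "l", lty = 1,
        xlab = "time step", ylab = "posterior probability")
legend("topright", legend = colnames(out$posterior),
       lty = 1, col = seq_len(ncol(out$posterior)))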