Solves discounted MDP with linear programming
mdp_LP(P, R, discount)
P | transition probability array. P is a 3-dimensional array [S,S,A]. Sparse matrices are not supported.
R | reward array. R can be a 3-dimensional array [S,S,A], a list [[A]] in which each element is a sparse matrix [S,S], or a 2-dimensional matrix [S,A], possibly sparse.
discount | discount factor. discount is a real number in the open interval ]0; 1[, that is, 0 < discount < 1.
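To make the expected shapes concrete, here is a minimal base-R sketch that builds a small P and R and checks them against the descriptions above. The numeric values, the choice S = 2 and A = 2, the reading of P[s, s', a] as the probability of moving from s to s' under action a, and the helper name check_inputs are all illustrative assumptions, not part of the package.

```r
# Hypothetical 2-state, 2-action problem (values chosen for illustration only).
S <- 2
A <- 2
P <- array(0, c(S, S, A))
P[, , 1] <- matrix(c(0.5, 0.5, 0.8, 0.2), S, S, byrow = TRUE)
P[, , 2] <- matrix(c(0.0, 1.0, 0.1, 0.9), S, S, byrow = TRUE)
R <- matrix(c(5, 10, -1, 2), S, A, byrow = TRUE)  # R in its [S,A] matrix form
discount <- 0.9

# Hypothetical helper, not a package function.
# It only handles a dense P [S,S,A] and the [S,A] matrix form of R.
check_inputs <- function(P, R, discount) {
  d <- dim(P)
  # P must be [S,S,A], non-negative, with each P[s, , a] summing to 1
  ok_P <- length(d) == 3 && d[1] == d[2] &&
    all(P >= 0) && all(abs(apply(P, c(1, 3), sum) - 1) < 1e-8)
  # R must be an [S,A] matrix matching the dimensions of P
  ok_R <- ok_P && is.matrix(R) && all(dim(R) == c(d[1], d[3]))
  # discount must lie strictly between 0 and 1
  ok_discount <- discount > 0 && discount < 1
  ok_P && ok_R && ok_discount
}

check_inputs(P, R, discount)
```

A check of this kind catches the most common input mistakes (rows of P that do not sum to 1, or a discount of exactly 1) before the solver is called.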
mdp_LP applies linear programming to solve a discounted MDP. It works with non-sparse matrices only.
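For orientation, the linear program usually associated with a discounted, reward-maximizing MDP is shown below. This is the standard textbook formulation; whether mdp_LP builds exactly this program internally is an assumption. Here \gamma stands for discount, P(s,s',a) for an entry of P, and R(s,a) for the expected immediate reward of action a in state s.

```latex
\begin{aligned}
\min_{V}\quad & \sum_{s=1}^{S} V(s) \\
\text{subject to}\quad & V(s) \;\ge\; R(s,a) + \gamma \sum_{s'=1}^{S} P(s,s',a)\, V(s')
  \qquad \text{for all } s \in \{1,\dots,S\},\ a \in \{1,\dots,A\}
\end{aligned}
```

At the optimum, V is the optimal value function, and an optimal action in state s is one for which the corresponding constraint holds with equality.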
V | optimal value function. V is a vector of length S.
policy | optimal policy. policy is a vector of length S. Each element is an integer identifying the action that maximizes the value function in the corresponding state.
cpu_time | CPU time used to run the program (shown as time in the example output).
Loading required package: Matrix
Loading required package: linprog
Loading required package: lpSolve
$V
[1] 42.44186 36.04651
$policy
[1] 2 1
$time
Time difference of 0.1772285 secs
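A returned policy can be sanity-checked in base R without a solver. The sketch below assumes a hypothetical 2-state, 2-action input (the inputs that produced the output above are not shown on this page) and takes P[s, s', a] to be the probability of moving from s to s' under action a. It evaluates the policy c(2, 1) by solving (I - discount * P_pi) V = R_pi, then checks that the policy is greedy with respect to that V. For these assumed inputs, the evaluation reproduces the V printed above to the displayed precision.

```r
# Hypothetical inputs (assumed for illustration; not taken from this page).
P <- array(0, c(2, 2, 2))
P[, , 1] <- matrix(c(0.5, 0.5, 0.8, 0.2), 2, 2, byrow = TRUE)
P[, , 2] <- matrix(c(0.0, 1.0, 0.1, 0.9), 2, 2, byrow = TRUE)
R <- matrix(c(5, 10, -1, 2), 2, 2, byrow = TRUE)  # R[s, a]
discount <- 0.9
policy <- c(2, 1)  # policy printed in the example output

S <- dim(P)[1]
A <- dim(P)[3]

# Transition matrix and reward vector induced by the policy:
# row s of P_pi is P[s, , policy[s]], and R_pi[s] is R[s, policy[s]].
P_pi <- t(sapply(seq_len(S), function(s) P[s, , policy[s]]))
R_pi <- R[cbind(seq_len(S), policy)]

# Policy evaluation: solve (I - discount * P_pi) V = R_pi
V <- solve(diag(S) - discount * P_pi, R_pi)

# One-step lookahead values Q[s, a]; the policy is greedy if it picks the row maxima
Q <- sapply(seq_len(A), function(a) R[, a] + discount * as.vector(P[, , a] %*% V))
greedy <- apply(Q, 1, which.max)

print(V)
print(greedy)
```

If greedy equals policy, then V satisfies the Bellman optimality equation for these inputs, which is the condition the linear program enforces.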