Solves discounted MDP with linear programming

```
mdp_LP(P, R, discount)
```

`P`
transition probability array. P is a 3-dimensional array [S,S,A]. Sparse matrices are not supported.

`R`
reward array. R can be a 3-dimensional array [S,S,A], a list of A elements each containing a sparse [S,S] matrix, or a 2-dimensional [S,A] matrix, possibly sparse.

`discount`
discount factor. discount is a real number in ]0; 1[, i.e. strictly between 0 and 1.

mdp_LP applies linear programming to solve a discounted MDP; only non-sparse matrices are supported.
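For reference, the standard LP formulation of a discounted MDP (a textbook formulation, not necessarily the exact one used internally) minimizes the sum of state values subject to one constraint per state-action pair:

```latex
\min_{V} \sum_{s} V(s)
\quad \text{s.t.} \quad
V(s) \;\ge\; R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s')
\qquad \forall\, s, a
```

At the optimum, V is the optimal value function, and the policy that acts greedily with respect to V is optimal.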

`V`
optimal value function. V is a vector of length S.

`policy`
optimal policy. policy is a vector of length S; each element is an integer corresponding to an action that maximizes the value function.

`cpu_time` |
CPU time used to run the program |

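As an illustration of the LP approach (a Python sketch using `scipy.optimize.linprog`, not MDPtoolbox code), the following solves a hypothetical 2-state, 2-action problem. The function name `mdp_lp` and the problem data are assumptions for the example:

```python
import numpy as np
from scipy.optimize import linprog

def mdp_lp(P, R, discount):
    """Solve a discounted MDP by linear programming.

    P: (S, S, A) transition array; P[s, s2, a] = Pr(s2 | s, a)
    R: (S, A) reward matrix
    Returns (V, policy), with policy as 0-indexed greedy actions.
    """
    S, _, A = P.shape
    # LP: minimise sum_s V(s) subject to, for every (s, a),
    #   V(s) >= R(s, a) + discount * sum_s2 P(s2 | s, a) * V(s2)
    # rewritten in A_ub @ V <= b_ub form as
    #   (discount * P[:, :, a] - I) @ V <= -R[:, a]
    A_ub = np.vstack([discount * P[:, :, a] - np.eye(S) for a in range(A)])
    b_ub = np.concatenate([-R[:, a] for a in range(A)])
    res = linprog(c=np.ones(S), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * S)  # V is unbounded in sign
    V = res.x
    # Greedy policy with respect to the optimal value function
    Q = np.stack([R[:, a] + discount * P[:, :, a] @ V for a in range(A)],
                 axis=1)
    return V, Q.argmax(axis=1)

# Hypothetical 2-state, 2-action problem
P = np.zeros((2, 2, 2))
P[:, :, 0] = [[0.5, 0.5], [0.8, 0.2]]
P[:, :, 1] = [[0.0, 1.0], [0.1, 0.9]]
R = np.array([[5.0, 10.0], [-1.0, 2.0]])
V, policy = mdp_lp(P, R, 0.9)
# Optimal: take action 1 in state 0, action 0 in state 1
```

Note that `mdp_LP` in the package returns a 1-indexed policy, whereas this sketch returns 0-indexed actions.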
