The policy can afterwards be retrieved using the functions getPolicy and getPolicyW.
policyIteAve(mdp, w, dur, maxIte = 100)
mdp | The MDP loaded using loadMDP.
w | The label of the weight we optimize.
dur | The label of the duration/time such that discount rates can be calculated.
maxIte | Maximum number of iterations. If the model does not satisfy the unichain assumption, the algorithm may loop.
The optimal gain (g) calculated.
Lars Relund lars@relund.dk
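A minimal sketch of a typical call sequence, following the usage above. It assumes that the package is attached under the name MDP and that binary model files already exist. The file prefix "machine_" and the weight labels "Net reward" and "Duration" are hypothetical placeholders, not part of this documentation; substitute the prefix and labels of your own model.

```r
# Sketch only: the package name, file prefix and weight labels below are
# assumptions made for illustration.
library(MDP)

# Load a model from binary files with the (hypothetical) prefix "machine_".
mdp <- loadMDP("machine_")

# Run policy iteration under the average reward criterion.
# w   : label of the weight to optimize (hypothetical label)
# dur : label of the duration/time weight (hypothetical label)
g <- policyIteAve(mdp, w = "Net reward", dur = "Duration", maxIte = 100)

# The return value is the optimal gain (g).
print(g)

# Retrieve the policy that was found.
policy <- getPolicy(mdp)
print(policy)
```

Keeping maxIte finite guards against the case noted for that argument, where a model that violates the unichain assumption may cause the algorithm to loop.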