Computes reinforcement learning policy from a given state-action table Q. The policy is the decision-making function of the agent and defines the learning agent's behavior at a given time.
Variable which encodes the behavior of the agent. This can be
Returns the learned policy.
1 2 3 4 5
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.