Function defines an environment for a 2x2 gridworld example. Here an agent is intended to navigate from an arbitrary starting position to a goal position. The grid is surrounded by a wall, which makes it impossible for the agent to move off the grid. In addition, the agent faces a wall between s1 and s4. If the agent reaches the goal position, it earns a reward of 10. Crossing each square of the grid results in a negative reward of -1.
The current state.
Action to be executed.
List containing the next state and the reward.
1 2 3 4 5 6 7 8 9
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.