# MdpEnvironment: MDP Environment in reinforcelearn: Reinforcement Learning

## Description

Markov Decision Process environment.

## Arguments

• `transitions` [`array (n.states x n.states x n.actions)`]
State transition array.

• `rewards` [`matrix (n.states x n.actions)`]
Reward matrix.

• `initial.state` [`integer`]
Optional starting state. If a vector is given, a starting state will be randomly sampled from this vector whenever `reset` is called. Note that states are numbered starting with 0. If `initial.state = NULL`, all non-terminal states are possible starting states.

• `...` [`any`]
Arguments passed on to `makeEnvironment`.
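The shape requirements above can be checked before the environment is created. A minimal sketch in base R (no package needed; the values reuse the transition and reward matrices from the Examples section): each row of every action slice of `transitions` must be a probability distribution over next states.

```r
# transitions[s, s', a] is P(s' | s, a); each row of each action slice sums to 1.
n.states = 2
n.actions = 2
P = array(0, dim = c(n.states, n.states, n.actions))
P[, , 1] = matrix(c(0.5, 0.5,
                    0,   1),  n.states, n.states, byrow = TRUE)
P[, , 2] = matrix(c(0, 1,
                    0, 1),    n.states, n.states, byrow = TRUE)

# Sanity checks: non-negative entries, rows summing to 1.
stopifnot(all(P >= 0))
stopifnot(all(abs(apply(P, 3, rowSums) - 1) < 1e-12))

# rewards[state, action]: one scalar reward per state-action pair.
R = matrix(c(5, 10, -1, 2), n.states, n.actions, byrow = TRUE)
stopifnot(identical(dim(R), c(n.states, n.actions)))
```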

## Usage

`makeEnvironment("MDP", transitions, rewards, initial.state, ...)`

## Methods

• `$step(action)`
Takes an action in the environment. Returns a list with `state`, `reward`, `done`.

• `$reset()`
Resets the `done` flag of the environment and returns an initial state. Useful when starting a new episode.

• `$visualize()`
Visualizes the environment (if there is a visualization function).
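The three methods form a simple episode loop: `$reset()` starts an episode, and `$step()` advances it until `done` is `TRUE`. To illustrate that contract without loading the package, here is a hypothetical stand-in in base R (not the package's implementation; the real environment is created with `makeEnvironment("MDP", ...)`):

```r
# Hypothetical stand-in mimicking the step/reset contract described above.
make_mock_env = function() {
  env = new.env()
  env$reset = function() {
    env$done = FALSE
    env$state = 0L   # states are numbered starting with 0
    env$state
  }
  env$step = function(action) {
    # Toy dynamics: any action moves to terminal state 1 with reward 10.
    env$state = 1L
    env$done = TRUE
    list(state = env$state, reward = 10, done = env$done)
  }
  env
}

env = make_mock_env()
s0 = env$reset()    # returns the initial state (0)
res = env$step(1L)  # list with state, reward, done
```

With the real package, the loop is the same: call `env$reset()`, then repeatedly call `env$step(action)` until the returned `done` is `TRUE`.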

## Examples

```r
# Create a Markov Decision Process.
P = array(0, c(2, 2, 2))
P[, , 1] = matrix(c(0.5, 0.5, 0, 1), 2, 2, byrow = TRUE)
P[, , 2] = matrix(c(0, 1, 0, 1), 2, 2, byrow = TRUE)
R = matrix(c(5, 10, -1, 2), 2, 2, byrow = TRUE)
env = makeEnvironment("mdp", transitions = P, rewards = R)
env$reset()
env$step(1L)
```

### Example output

```
[1] 0
$state
[1] 1

$reward
[1] 10

$done
[1] TRUE
```

reinforcelearn documentation built on May 2, 2019, 9:20 a.m.