makeEnvironment: Create reinforcement learning environment.
In markdumke/reinforceR: Reinforcement Learning

Description

This function creates an environment for reinforcement learning.

Usage

makeEnvironment(class = "custom", discount = 1, ...)

Arguments

`class`	[`character(1)`] Class of environment. One of `c("custom", "mdp", "gym", "gridworld")`.
`discount`	[`numeric(1)` in (0, 1]] Discount factor. The default of 1 means no discounting.
`...`	[`any`] Arguments passed on to the specific environment.
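
To illustrate what a discount factor does, the following base R sketch computes the discounted return of a reward sequence. It does not call the package: `discountedReturn` is a hypothetical helper introduced here, and this page does not show how the package applies `discount` internally.

```r
# Hypothetical helper (not part of the package): discounted return
# G = r_1 + discount * r_2 + discount^2 * r_3 + ...
discountedReturn = function(rewards, discount = 1) {
  sum(discount^(seq_along(rewards) - 1) * rewards)
}

rewards = c(1, 1, 1)
discountedReturn(rewards, discount = 1)    # undiscounted sum: 3
discountedReturn(rewards, discount = 0.9)  # 1 + 0.9 + 0.81 = 2.71
```

With `discount = 1` all rewards count equally; smaller values weight early rewards more heavily.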

Details

Use the step method to interact with the environment.

Note that all states and actions are numbered starting from 0!
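
Because R vectors and matrices are indexed from 1, zero-based state and action numbers usually need an offset when they are used as indices. The sketch below shows one way to handle this; the table `Q` and the variable names are illustrative assumptions, not objects provided by the package.

```r
# Hypothetical action-value table for 3 states and 2 actions
# (not a package object).
Q = matrix(0, nrow = 3, ncol = 2)

state = 0L   # first state, as numbered by the environment
action = 1L  # second action, as numbered by the environment

# Add 1 when indexing R objects, which are 1-based.
Q[state + 1L, action + 1L] = 5

# Subtract 1 when converting an R index back to an action number.
greedy.action = which.max(Q[state + 1L, ]) - 1L
```

Here `greedy.action` is 1, the zero-based number of the action with the highest value in state 0.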

For a detailed explanation and more examples, see the vignette "How to create an environment?".

Value

R6 object of class Environment.

Methods

$step(action)
Takes an action in the environment. Returns a list with the elements state, reward and done.
$reset()
Resets the done flag of the environment and returns an initial state. Call it at the start of each new episode.
$visualize()
Visualizes the environment, if a visualization function is available.
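
The methods above are typically combined into an episode loop: reset, then step until done is TRUE. The sketch below shows that pattern with a stand-in environment built from a plain list of closures, so it runs without the package. `makeToyEnv` and the element names of the returned list are assumptions made for illustration, not the package's R6 class.

```r
# Stand-in environment (plain list of closures, NOT the package's R6 class).
# It mimics the interface described above: reset() returns an initial state,
# step(action) returns a list with state, reward and done.
makeToyEnv = function(n.states = 5L) {
  state = 0L
  list(
    reset = function() {
      state <<- 0L
      state
    },
    step = function(action) {
      # Action 1 moves one state to the right, action 0 stays put.
      # The last state is terminal.
      state <<- min(state + action, n.states - 1L)
      done = state == n.states - 1L
      list(state = state, reward = -1, done = done)
    }
  )
}

set.seed(1)
env = makeToyEnv()
state = env$reset()
episode.return = 0
n.steps = 0L
repeat {
  action = sample(0:1, 1)  # random policy over actions 0 and 1
  res = env$step(action)
  episode.return = episode.return + res$reward
  n.steps = n.steps + 1L
  if (res$done) break      # stop once the terminal state is reached
}
```

Calling `env$reset()` again before the next episode puts the stand-in environment back into state 0.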

Environments

Environment
GymEnvironment
MdpEnvironment
Gridworld
MountainCar

Examples

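# Create a custom environment from user-defined step and reset functions.
step = function(self, action) {
  # The next state is a list with a noisy mean and a random standard deviation.
  state = list(mean = action + rnorm(1), sd = runif(1))
  # The reward is drawn from a normal distribution defined by the new state.
  reward = rnorm(1, state[[1]], state[[2]])
  done = FALSE  # this environment never terminates
  list(state, reward, done)
}

reset = function(self) {
  # Every episode starts from the same initial state.
  state = list(mean = 0, sd = 1)
  state
}

env = makeEnvironment(step = step, reset = reset, discount = 0.9)
env$reset()
env$step(100)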

# Create a Markov Decision Process.
P = array(0, c(2, 2, 2))
P[, , 1] = matrix(c(0.5, 0.5, 0, 1), 2, 2, byrow = TRUE)
P[, , 2] = matrix(c(0, 1, 0, 1), 2, 2, byrow = TRUE)
R = matrix(c(5, 10, -1, 2), 2, 2, byrow = TRUE)
env = makeEnvironment("mdp", transitions = P, rewards = R)

env$reset()
env$step(1L)

# Create a Gridworld.
grid = makeEnvironment("gridworld", shape = c(4, 4),
  goal.states = 15, initial.state = 0)
grid$visualize()

## Not run: 
# Create an OpenAI Gym environment.
# Make sure you have Python, gym and reticulate installed.
env = makeEnvironment("gym", gym.name = "MountainCar-v0")

# Take random actions for 200 steps.
env$reset()
for (i in 1:200) {
  action = sample(env$actions, 1)
  env$step(action)
  env$visualize()
}
env$close()

## End(Not run)
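
As a complement to the Markov Decision Process example above, the following base R sketch checks that the transition array is a valid set of probability distributions and samples a next state from it. It assumes the layout `P[state, next state, action]`, which is consistent with the rows of each slice summing to 1 but is not stated on this page. `sampleNextState` is a hypothetical helper, not a package function.

```r
# Same transition array as in the MDP example above.
P = array(0, c(2, 2, 2))
P[, , 1] = matrix(c(0.5, 0.5, 0, 1), 2, 2, byrow = TRUE)
P[, , 2] = matrix(c(0, 1, 0, 1), 2, 2, byrow = TRUE)

# Assumed layout: P[state, next state, action].
# Each row of every action slice should then sum to 1.
row.sums = apply(P, c(1, 3), sum)
stopifnot(all(abs(row.sums - 1) < 1e-8))

# Hypothetical helper: sample the next state (0-based) given a
# 0-based state and a 0-based action.
sampleNextState = function(P, state, action) {
  probs = P[state + 1L, , action + 1L]
  sample(seq_along(probs), 1, prob = probs) - 1L
}

sampleNextState(P, state = 0L, action = 1L)  # always 1: this row is c(0, 1)
```

Under the assumed layout, state 1 keeps all its probability mass on itself for both actions, so it acts as an absorbing state.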

markdumke/reinforceR documentation built on Nov. 17, 2022, 12:53 a.m.