nchain_function: Nchain func


Description

A simple function used as an example for Reinforcement learning.

Usage

nchain_function(state, action)

Arguments

state

An integer (1:5) representing the state you are currently in along the chain.

action

An integer (1:2) representing the action to take. 1 moves backwards along the chain, and 2 moves further down the chain.

Details

The environment consists of five states and two actions. Moving forward (action 2) from state 1 through to state 5 offers no reward, while moving back (action 1) at any point returns you to the beginning of the chain with a small reward. Trying to move past state 5 gives a large reward. The purpose is to teach an algorithm to wait for longer-term benefits. This function is intended to be used in conjunction with build_gym_env.
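The dynamics described above can be sketched roughly as follows. This is an illustration only, not the gymr implementation: `nchain_sketch` and `run_policy` are hypothetical helpers, the reward sizes (2 and 10) are assumed values borrowed from the classic NChain setup, and the return format (a list with `state` and `reward`) is an assumption, as is the choice to keep the agent in state 5 when it tries to go past it.

```r
# Hypothetical sketch of the chain dynamics; NOT the gymr implementation.
# Reward sizes (2 and 10) and the list return format are assumptions.
nchain_sketch <- function(state, action, n_states = 5L,
                          small_reward = 2, large_reward = 10) {
  stopifnot(state %in% seq_len(n_states), action %in% 1:2)
  if (action == 1) {
    # Action 1: move backwards, which returns to state 1 for a small reward.
    return(list(state = 1L, reward = small_reward))
  }
  if (state == n_states) {
    # Action 2 at the end of the chain: trying to exceed the last state
    # pays the large reward; we assume the agent stays where it is.
    return(list(state = as.integer(n_states), reward = large_reward))
  }
  # Action 2 anywhere else: advance one state with no reward.
  list(state = as.integer(state + 1L), reward = 0)
}

# Total reward from repeating one fixed action, starting in state 1.
run_policy <- function(action, steps = 10L) {
  state <- 1L
  total <- 0
  for (i in seq_len(steps)) {
    out <- nchain_sketch(state, action)
    state <- out$state
    total <- total + out$reward
  }
  total
}

run_policy(1)  # always retreat
run_policy(2)  # always advance
```

Under these assumed rewards, always retreating collects 20 over ten steps, while always advancing collects nothing for the first four steps and 60 in total. That gap is the delayed payoff the environment is meant to make an algorithm discover.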

Examples

# no reward
nchain_function(1, 2)

# small reward
nchain_function(2, 1)

# big reward
nchain_function(5, 2)

# incorporating into a new environment
nchain <- build_gym_env(func = nchain_function,
  action_space = c(1, 2),
  observation_space = 1:5)

liamgilbey/gymr documentation built on June 1, 2019, 3:56 a.m.