tilecoding: Tile Coding


Description

Implementation of Sutton's tile coding software version 3.

Usage

tiles(iht, n.tilings, state, action = integer(0))

iht(max.size)

Arguments

iht

[IHT]
A hash table created with iht.

n.tilings

[integer(1)]
Number of tilings.

state

[vector(2)]
A two-dimensional state observation. Scale the observation to unit variance before passing it in.

action

[integer(1)]
Optional. If supplied, the action space is also tiled, so distinct actions result in different tile numbers.

max.size

[integer(1)]
Maximum size of the hash table.
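
The scaling advice for the state argument can be sketched in base R as follows. This is only an illustration: the observation matrix `obs` and its column names are made up, and in practice the standard deviations would be estimated from data collected in the environment at hand.

```r
# Made-up observations: rows are time steps, columns are the two state variables
obs <- cbind(position = c(-1.2, -0.5, 0.1, 0.6),
             velocity = c(-0.07, 0.00, 0.03, 0.07))

# Standard deviation of each state variable
sds <- apply(obs, 2, sd)

# Divide each column by its standard deviation so both have unit variance
scaled <- sweep(obs, 2, sds, "/")

# A single scaled observation, in the shape expected by the state argument
state <- as.numeric(scaled[1, ])
```

The same `sds` vector should be reused for every later observation, so that all states are tiled on the same scale.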

Details

Tile coding is a way of representing the values of a vector of continuous variables as a large binary vector with few 1s and many 0s. The binary vector is not represented explicitly, but as a list of the components that are 1s. The main step is to partition, or tile, the continuous space multiple times and select one tile from each tiling, namely the one corresponding to the vector's value. Each tile is converted to an element in the big binary vector, and the list of the tile (element) numbers is returned as the representation of the vector's value. Tile coding is recommended as a way of applying online learning methods to domains with continuous state or action variables. [copied from manual]
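
The idea described above can be sketched with a simplified grid tiling in base R. This is not the algorithm used by tiles (it does no hashing and assumes a bounded state space); the function `simple_tiles` and all of its parameters are hypothetical and serve only to show how several shifted tilings each contribute one active tile.

```r
# Hypothetical helper, not part of the package: maps a 2-D state to one
# active tile index per tiling on a fixed, bounded grid (no hashing).
simple_tiles <- function(state, n.tilings = 4, tiles.per.dim = 10,
                         lower = c(0, 0), upper = c(10, 10)) {
  width <- (upper - lower) / tiles.per.dim   # tile width in each dimension
  n.per.tiling <- (tiles.per.dim + 1)^2      # one extra row/column for the shift
  active <- integer(n.tilings)
  for (k in seq_len(n.tilings)) {
    # Shift each tiling by a different fraction of a tile width
    offset <- (k - 1) / n.tilings * width
    # 0-based grid coordinates of the tile containing the state
    cell <- floor((state - lower + offset) / width)
    # Clamp states that fall outside the assumed bounds
    cell <- pmin(pmax(cell, 0), tiles.per.dim)
    # 1-based tile number, unique across tilings
    active[k] <- (k - 1) * n.per.tiling +
      cell[2] * (tiles.per.dim + 1) + cell[1] + 1
  }
  as.integer(active)
}

a <- simple_tiles(c(3.6, 7.21))
b <- simple_tiles(c(3.7, 7.21))   # nearby state
far <- simple_tiles(c(9, 1))      # distant state
sum(a == b)     # number of tiles shared with the nearby state
sum(a == far)   # number of tiles shared with the distant state
```

Nearby states share active tiles while distant states do not, which is what lets a learner generalize locally.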

See the detailed manual on the web. Unlike the Python implementation, indices start at 1 instead of 0. The hash table is implemented as an environment, which is an attribute of an R6 class.

Make sure the hash table is large enough; otherwise an error is raised when a value is assigned to a full hash table.
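
The behaviour of a size-limited index table can be illustrated with a minimal sketch built on an R environment. This is an assumption-laden illustration, not the package's internals: it uses a closure instead of an R6 class, and `make_index_table` and `get_index` are hypothetical names.

```r
# Hypothetical sketch of a bounded index table backed by an environment
make_index_table <- function(max.size) {
  table <- new.env(hash = TRUE)
  count <- 0L
  get_index <- function(key) {
    # Turn the coordinate vector into a single string key
    key <- paste(key, collapse = "_")
    if (!exists(key, envir = table, inherits = FALSE)) {
      # Refuse to grow beyond the requested capacity
      if (count >= max.size) stop("hash table is full")
      count <<- count + 1L
      assign(key, count, envir = table)   # indices start at 1
    }
    get(key, envir = table, inherits = FALSE)
  }
  list(get_index = get_index, size = function() count)
}

tab <- make_index_table(2)
first <- tab$get_index(c(1, 3, 7))    # new key, receives the next free index
again <- tab$get_index(c(1, 3, 7))    # known key, same index as before
second <- tab$get_index(c(2, 3, 7))   # another new key
# A third distinct key exceeds the capacity and raises an error
overflow <- try(tab$get_index(c(9, 9, 9)), silent = TRUE)
```

Choosing max.size generously avoids the error at the cost of a larger weight vector downstream.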

Value

iht creates a hash table, which can then be passed to tiles. tiles returns an integer vector of length n.tilings containing the active tile numbers.
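
A common way to use the returned tile numbers is as indices into a weight vector for linear function approximation. The sketch below assumes this use case; the vector `active` is a made-up placeholder standing in for the output of tiles, and the step size and target are arbitrary.

```r
max.size <- 1024
n.tilings <- 8

# Placeholder for the output of tiles(); these are not real tile numbers
active <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L)

# The implicit binary feature vector: 1 at the active tiles, 0 elsewhere
x <- numeric(max.size)
x[active] <- 1

# One weight per possible tile number; the estimate is the sum of the
# weights of the active tiles (equivalent to sum(weights * x))
weights <- numeric(max.size)
value <- sum(weights[active])

# One update toward a target, with the step size split across the tilings
alpha <- 0.1 / n.tilings
target <- 1
weights[active] <- weights[active] + alpha * (target - value)
new.value <- sum(weights[active])
```

Because only n.tilings weights are touched per step, the cost of an update does not depend on max.size.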

References

Sutton and Barto (Book draft 2017): Reinforcement Learning: An Introduction

Examples

# Create a hash table with room for up to 1024 entries
hash <- iht(1024)

# Partition the state space using 8 tilings
tiles(hash, n.tilings = 8, state = c(3.6, 7.21))
# Nearby state
tiles(hash, n.tilings = 8, state = c(3.7, 7.21))
tiles(hash, n.tilings = 8, state = c(4, 7))
# Distant state with a negative coordinate
tiles(hash, n.tilings = 8, state = c(-37.2, 7))

markusdumke/reinforcelearn documentation built on May 31, 2019, 8:48 p.m.