Generates a random MDP problem

Description

Generates a random Markov decision process (MDP) problem, that is, a transition probability matrix and a reward matrix.

Usage

mdp_example_rand(S, A, is_sparse, mask)

Arguments

S

number of states. S is an integer greater than 0

A

number of actions. A is an integer greater than 0

is_sparse

(optional) boolean controlling whether sparse matrices are generated. If TRUE, P and R are returned as sparse matrices. Defaults to FALSE.

mask

(optional) indicates the possible transitions between states. mask is an [S,S] matrix of 0 and 1 elements, where 0 marks a transition whose probability is always zero. By default, mask contains only 1s, so every transition is allowed.

Details

mdp_example_rand generates a transition probability matrix (P) and a reward matrix (R). The optional arguments make it possible to request sparse matrices and to specify pairs of states between which no transition is possible.
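
As an illustration of the mask argument, the following base-R sketch builds a mask that forbids a single transition. It assumes that rows index the origin state and columns the destination state, which the text above does not state explicitly; the variable names are illustrative only. It also keeps at least one 1 in every row, on the reasoning that a state with no allowed transition could not have a valid probability distribution.

```r
S <- 3

# Start from the default mask: every transition is allowed
mask <- matrix(1, nrow = S, ncol = S)

# Forbid one transition (assumed convention: row = origin, column = destination)
mask[1, 3] <- 0

# Sanity check: every state keeps at least one allowed transition
stopifnot(all(rowSums(mask) >= 1))
```

A mask built this way could then be passed as the fourth argument of mdp_example_rand.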

Value

P

transition probability array. P is either a 3-dimensional array [S,S,A] or a list of length A in which each element is a sparse [S,S] matrix.

R

reward array. R is either a 3-dimensional array [S,S,A] or a list of length A in which each element is a sparse [S,S] matrix. Elements of R lie strictly between -1 and 1.
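
A rough way to check the returned value is sketched below. It assumes the function is provided by the MDPtoolbox package and that the result is a list whose elements are named P and R, as the entries above suggest. The row-sum check relies on P holding transition probabilities; it is not stated on this page.

```r
library(MDPtoolbox)  # assumed package providing mdp_example_rand

set.seed(1)  # only to make the random draw repeatable
res <- mdp_example_rand(3, 2)  # 3 states, 2 actions, dense output

dim(res$P)    # expected shape: S x S x A
dim(res$R)
range(res$R)  # should fall strictly inside -1 and 1

# For each action, each row of P should sum to 1 (assumption, see above)
row_sums <- apply(res$P, 3, rowSums)
```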

Examples
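
The example code for this section did not survive extraction. The calls below are an illustrative sketch of the four argument combinations described under Usage, not necessarily the package's original examples. They assume the function is provided by the MDPtoolbox package; the variable names are hypothetical.

```r
library(MDPtoolbox)  # assumed package providing mdp_example_rand

# 2 states, 2 actions, defaults: dense arrays, all transitions allowed
dense_default <- mdp_example_rand(2, 2)

# Same problem size with is_sparse given explicitly
dense_explicit <- mdp_example_rand(2, 2, FALSE)

# Sparse output: P and R are lists with one sparse matrix per action
sparse_res <- mdp_example_rand(2, 2, TRUE)

# Dense output with a mask; matrix() fills column by column, so this
# mask has a 0 in row 2, column 1 and 1s everywhere else
mask <- matrix(c(1, 0, 1, 1), 2, 2)
masked_res <- mdp_example_rand(2, 2, FALSE, mask)
```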

