mdp_eval_policy_optimality: Computes sets of 'near optimal' actions for each state


Description

Determines sets of 'near optimal' actions for all states

Usage

mdp_eval_policy_optimality(P, R, discount, Vpolicy)

Arguments

P

transition probability array. P can be a 3-dimensional array [S,S,A] or a list [[A]] in which each element contains a sparse matrix [S,S].

R

reward array. R can be a 3-dimensional array [S,S,A], a list [[A]] in which each element contains a sparse matrix [S,S], or a 2-dimensional matrix [S,A], possibly sparse.

discount

discount factor. discount is a real number in the half-open interval [0, 1), i.e. 0 <= discount < 1.

Vpolicy

value function of the optimal policy. Vpolicy is a vector of length S.

Details

For some states, evaluating the value function may give very similar results for different actions. It is useful to identify the states for which several actions yield a value very close to the optimal one (i.e. differing from it by less than 0.01). This is called the search for near optimal actions in each state.
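The search described above can be sketched in base R as follows. This is an illustrative reconstruction, not the package's source: the helper name near_optimal_actions and its epsilon argument are assumptions, the 0.01 default is taken from the text above, and the sketch handles only a dense P of shape [S,S,A] with R of shape [S,A].

```r
# Hypothetical helper (not part of MDPtoolbox): flags actions whose
# one-step lookahead value is within epsilon of the best value per state.
near_optimal_actions <- function(P, R, discount, Vpolicy, epsilon = 0.01) {
  S <- dim(P)[1]
  A <- dim(P)[3]
  Q <- matrix(0, S, A)
  for (a in seq_len(A)) {
    # Q(s, a) = R(s, a) + discount * sum_s' P(s, s', a) * V(s')
    Q[, a] <- R[, a] + discount * as.vector(P[, , a] %*% Vpolicy)
  }
  Vmax <- apply(Q, 1, max)                    # best value in each state
  optimal_actions <- abs(Q - Vmax) < epsilon  # Vmax is recycled down each column
  list(multiple = any(rowSums(optimal_actions) > 1),
       optimal_actions = optimal_actions)
}

# Same inputs as the non-sparse example on this page
P <- array(0, c(2, 2, 2))
P[, , 1] <- matrix(c(0.5, 0.5, 0.8, 0.2), 2, 2, byrow = TRUE)
P[, , 2] <- matrix(c(0, 1, 0.1, 0.9), 2, 2, byrow = TRUE)
R <- matrix(c(5, 10, -1, 2), 2, 2, byrow = TRUE)
Vpolicy <- c(42.4419, 36.0465)

res <- near_optimal_actions(P, R, 0.9, Vpolicy)
print(res)
```

With these inputs the sketch flags a single action per state (action 2 in state 1, action 1 in state 2), so multiple is FALSE.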

Value

multiple

existence of at least two 'nearly' optimal actions for a state. multiple is true when at least one state has several epsilon-optimal actions, and false otherwise.

optimal_actions

actions 'nearly' optimal for each state. optimal_actions is a [S,A] boolean matrix whose element optimal_actions[s, a] is true if action a is 'nearly' optimal in state s, and false otherwise.
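As a small illustration of how a [S,A] boolean matrix of this shape might be read, the base R sketch below lists the flagged actions for each state. The matrix values are made up for the illustration and are not output from the package; the names actions_by_state and has_multiple are likewise hypothetical.

```r
# Illustrative [S,A] boolean matrix (made-up values, S = 2 states, A = 2 actions)
optimal_actions <- matrix(c(FALSE, TRUE,
                            TRUE,  TRUE), 2, 2, byrow = TRUE)

# For each state (row), collect the indices of the flagged actions
actions_by_state <- lapply(seq_len(nrow(optimal_actions)),
                           function(s) which(optimal_actions[s, ]))

# TRUE when at least one state has more than one flagged action
has_multiple <- any(rowSums(optimal_actions) > 1)

print(actions_by_state)
print(has_multiple)
```

Here state 2 has two flagged actions, so has_multiple is TRUE, which mirrors the meaning of the multiple value described above.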

Examples

library(MDPtoolbox)

# With a non-sparse matrix
P <- array(0, c(2,2,2))
P[,,1] <- matrix(c(0.5, 0.5, 0.8, 0.2), 2, 2, byrow=TRUE)
P[,,2] <- matrix(c(0, 1, 0.1, 0.9), 2, 2, byrow=TRUE)
R <- matrix(c(5, 10, -1, 2), 2, 2, byrow=TRUE)
Vpolicy <- c(42.4419, 36.0465)
mdp_eval_policy_optimality(P, R, 0.9, Vpolicy)

# With a sparse matrix (Matrix() comes from the Matrix package; R and Vpolicy are reused from above)
library(Matrix)
P <- list()
P[[1]] <- Matrix(c(0.5, 0.5, 0.8, 0.2), 2, 2, byrow=TRUE, sparse=TRUE)
P[[2]] <- Matrix(c(0, 1, 0.1, 0.9), 2, 2, byrow=TRUE, sparse=TRUE)
mdp_eval_policy_optimality(P, R, 0.9, Vpolicy)
