policy: Extract the Policy from a POMDP/MDP

View source: R/policy.R

Description

Extracts the policy from a solved POMDP/MDP.

Usage

policy(x, drop = TRUE)

Arguments

x

A solved POMDP or MDP object.

drop

logical; drop the list for converged, epoch-independent policies.

Details

A list (one entry per epoch) with the optimal policy. For converged, infinite-horizon problems, the list contains only the converged solution. For a POMDP, the policy is a data.frame consisting of two parts:

  • Part 1: The alpha vectors for the belief states (these also define the utility of the belief). The columns are named after the states.

  • Part 2: The last column, named action, contains the prescribed action (see the sketch after this list).
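
The utility of an arbitrary belief is the maximum over the alpha vectors of their dot product with the belief, and the prescribed action comes from the maximizing vector. A minimal sketch (assuming sol is the solved infinite-horizon Tiger POMDP from the Examples below):

pol <- policy(sol)                     # converged policy: one data.frame
b <- c(0.5, 0.5)                       # a belief over the two Tiger states
alpha <- as.matrix(pol[, -ncol(pol)])  # state columns hold the alpha vectors
vals <- alpha %*% b                    # utility of b under each alpha vector
max(vals)                              # value of the belief
pol$action[which.max(vals)]            # prescribed action for this belief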

For an MDP, the policy is a data.frame with the following columns (a usage sketch follows the list):

  • state: The state.

  • U: The state's value (discounted expected utility U) if the policy is followed.

  • action: The prescribed action.
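
The value and prescribed action for a given state can be looked up by row. A minimal sketch (assuming sol is the solved Maze MDP from the Examples below):

pol <- policy(sol)                      # data.frame: state, U, action
s <- pol$state[1]                       # first state (names are model-specific)
pol[pol$state == s, c("U", "action")]   # value and action for state s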

Value

A list with the policy for each epoch. Converged policies have only one element. If drop = TRUE (the default), the list is dropped for converged policies and the policy data.frame is returned directly.
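
For example, a sketch assuming sol is a solved model with a converged policy (as in the Examples below):

policy(sol)                # converged policy: a single data.frame
policy(sol, drop = FALSE)  # the same policy kept in a list of length 1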

Author(s)

Michael Hahsler

See Also

Other policy: estimate_belief_for_nodes(), optimal_action(), plot_belief_space(), plot_policy_graph(), policy_graph(), projection(), reward(), solve_POMDP(), solve_SARSOP(), value_function()

Examples

data("Tiger")

# Infinite horizon
sol <- solve_POMDP(model = Tiger)
sol

# policy with the value function (alpha vectors) and the optimal actions
policy(sol)
plot_value_function(sol)

# Finite horizon (we use incremental pruning because grid does not converge)
sol <- solve_POMDP(model = Tiger, method = "incprune", 
  horizon = 3, discount = 1)
sol

policy(sol)
# Note: We see that it is initially better to listen until we make
#       a decision in the final epoch.
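
# The finite-horizon solution contains one policy per epoch; individual
# epochs can be indexed from the returned list:
policy(sol)[[1]]  # policy for the first epoch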

# MDP policy
data(Maze)

sol <- solve_MDP(Maze)

policy(sol)
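
# A sketch: turn the MDP policy into a named vector mapping states to actions
pol <- policy(sol)
setNames(as.character(pol$action), as.character(pol$state))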
