policy: Extract the Policy from a POMDP/MDP

View source: R/policy.R

policy {pomdp}    R Documentation

Extract the Policy from a POMDP/MDP

Description

Extracts the policy from a solved POMDP/MDP.

Usage

policy(x, alpha = TRUE, action = TRUE)

Arguments

x

A solved POMDP or MDP object.

alpha

logical; include the parameters of the alpha vector defining the segment (POMDP only).

action

logical; include the action for that segment (POMDP only).
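These two arguments only select which columns of a POMDP policy are returned. A minimal sketch of the three combinations (using the Tiger model from the Examples below):

library("pomdp")
data("Tiger")
sol <- solve_POMDP(model = Tiger)

policy(sol)                   # alpha vectors and prescribed actions (default)
policy(sol, alpha = FALSE)    # only the prescribed actions
policy(sol, action = FALSE)   # only the alpha vectors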

Details

A list (one entry per epoch) with the optimal policy. For converged, infinite-horizon problems, the list contains only the converged solution. For a POMDP, the policy is a data.frame consisting of:

  • Part 1: The value function with one column per state (alpha vectors).

  • Part 2: The last column contains the prescribed action.

For an MDP, the policy is a data.frame consisting of:

  • The state

  • The state's discounted expected utility U if the policy is followed

  • The prescribed action
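For a POMDP, the segments of the returned policy can be used directly to evaluate a belief: the value of a segment is the dot product of the belief with its alpha vector, and the maximizing segment prescribes the action (optimal_action() provides this lookup directly). A minimal sketch, assuming the Tiger model with states tiger-left and tiger-right:

library("pomdp")
data("Tiger")
sol <- solve_POMDP(model = Tiger)
pol <- policy(sol)[[1]]                  # converged solution: a single data.frame

b <- c("tiger-left" = 0.4, "tiger-right" = 0.6)   # an example belief

alpha <- as.matrix(pol[, names(b)])      # alpha vectors, one row per segment
vals  <- alpha %*% b                     # value of each segment at belief b
pol$action[which.max(vals)]              # action prescribed for this belief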

Value

A list with the policy for each epoch.

Author(s)

Michael Hahsler

See Also

Other policy: estimate_belief_for_nodes(), optimal_action(), plot_belief_space(), plot_policy_graph(), policy_graph(), projection(), reward(), solve_POMDP(), solve_SARSOP(), value_function()

Examples

data("Tiger")

# Infinite horizon
sol <- solve_POMDP(model = Tiger)
sol

# policy with the value function (alpha vectors) and the optimal action for each segment
policy(sol)
plot_value_function(sol)

# Finite horizon (we use incremental pruning because grid does not converge)
sol <- solve_POMDP(model = Tiger, method = "incprune", horizon = 3, discount = 1)
sol

policy(sol)
# Note: We see that it is initially better to listen and to open a door only in the final epoch.
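# The finite-horizon policy is a list with one data.frame per epoch,
# so individual epochs can be inspected directly:
pol <- policy(sol)
length(pol)    # number of epochs (here 3)
pol[[1]]       # policy used in the first epoch
pol[[3]]       # policy used in the final epoch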

# MDP policy
data(Maze)

sol <- solve_MDP(Maze)

policy(sol)
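# The MDP policy is a data.frame with columns state, U and action (see Details),
# so the prescribed action and utility for a state can be looked up directly:
pol <- policy(sol)[[1]]
head(pol)

pol$action[1]    # action prescribed for the first state in the table
pol$U[1]         # its discounted expected utility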
