View source: R/estimate_belief_for_nodes.R
estimate_belief_for_nodes — R Documentation
Estimate a belief for each alpha vector (segment of the value function), which represents a node in the policy graph.
estimate_belief_for_nodes(
  x,
  method = "auto",
  belief = NULL,
  verbose = FALSE,
  ...
)
x: object of class POMDP containing a solved and converged POMDP problem.
method: character string specifying the estimation method. Available methods include "auto", "trajectories", and "random" (see Details).
belief: start belief used for method "trajectories".
verbose: logical; show which method is used.
...: further parameters are passed on to sample_belief_space().
estimate_belief_for_nodes() can estimate the belief in several ways:
Use belief points explored by the solver. Some solvers return the explored belief points. These points are assigned to the policy graph nodes, and each node's belief is the average of its assigned points.
Follow trajectories (breadth first) until all policy graph nodes have been visited and return the encountered beliefs. The first (i.e., shallowest) belief point encountered for each node is used; no averaging is performed. The parameter n can be used to limit the number of nodes searched.
Sample a large set of possible belief points, assign them to the nodes, and then average the belief over the points assigned to each node. This returns a central belief for each node. Additional parameters like method and the sample size n are passed on to sample_belief_space().
If no belief point is generated for a segment, then a
warning is produced. In this case, the number of sampled points can be increased.
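For example, a larger sample can be requested via the parameter n, which is passed on to sample_belief_space() (a minimal sketch, assuming a solved model sol as in the Examples below):

# request a larger random sample so every segment receives belief points
estimate_belief_for_nodes(sol, method = "random", n = 100000)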
Notes:
Each method may return a different answer. The only thing that is guaranteed is that the returned belief falls in the range where the value function segment is maximal.
If no belief points are sampled for some nodes, or a node is not reachable from the initial belief, then a vector of all NaNs is returned for that node with a warning.
Returns a list of matrices with one belief (row) per policy graph node. The list elements correspond to the epochs; converged solutions have only a single element.
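A converged solution can therefore be inspected via the first (and only) list element; all-NaN rows mark nodes without an estimated belief (a sketch, assuming a solved model sol as in the Examples below):

bel <- estimate_belief_for_nodes(sol)
bel[[1]]                                        # one row per policy graph node
apply(bel[[1]], 1, function(b) all(is.nan(b)))  # TRUE marks nodes without a belief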
Other policy: optimal_action(), plot_belief_space(), plot_policy_graph(), policy(), policy_graph(), projection(), reward(), solve_POMDP(), solve_SARSOP(), value_function()
data("Tiger")
# Infinite horizon case with converged solution
sol <- solve_POMDP(model = Tiger, method = "grid")
sol
# the default method "auto" uses the belief points explored by the solver (if available).
estimate_belief_for_nodes(sol, verbose = TRUE)
# use belief points obtained from trajectories
estimate_belief_for_nodes(sol, method = "trajectories", verbose = TRUE)
# use a random uniform sample
estimate_belief_for_nodes(sol, method = "random", verbose = TRUE)
# Finite horizon example with three epochs.
sol <- solve_POMDP(model = Tiger, horizon = 3)
sol
estimate_belief_for_nodes(sol)
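# for the finite-horizon solution, the returned list has one matrix per epoch
# (a sketch based on the Value description above)
bel <- estimate_belief_for_nodes(sol)
length(bel)   # number of epochs
bel[[1]]      # beliefs for the policy graph nodes of epoch 1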