interp_policy: Computes the best policy

View source: R/interp_policy.R

Description

Given a policy (a set of alpha vectors with their associated actions) and an initial belief state, returns the action that maximizes the value function at that belief, together with the corresponding value.

Usage

  interp_policy(state_prior, alpha, alpha_action)

Arguments

state_prior

Initial belief state: a numeric vector of 2 values between 0 and 1, giving the belief in the extant state and the belief in the extinct state.

alpha

Alpha vectors of the policy, for example the vectors element returned by sarsop::read_policyx.

alpha_action

Actions corresponding to the alpha vectors, for example the action element returned by sarsop::read_policyx.

Details

The alpha vectors and their corresponding actions can be computed with the 'sarsop' package (functions pomdpsol and read_policyx).
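
As an illustration of how an alpha-vector policy is typically evaluated at a belief state, the sketch below takes the inner product of the belief with each alpha vector and keeps the largest one. It is not the package's implementation: the helper name interp_policy_sketch, the names of the returned list elements, the toy numbers, and the assumed matrix layout (one row per state, one column per alpha vector) are all assumptions made for this example.

```r
# Hypothetical helper illustrating the standard alpha-vector lookup;
# this is NOT the code of smsPOMDP::interp_policy.
# Assumed layout: alpha has one row per state and one column per alpha vector,
# and alpha_action[i] is the action attached to column i.
interp_policy_sketch <- function(state_prior, alpha, alpha_action) {
  # Expected value of each alpha vector under the belief (belief %*% alpha)
  values <- as.vector(state_prior %*% alpha)
  # Index of the alpha vector with the largest expected value
  best <- which.max(values)
  list(value = values[best], action = alpha_action[best])
}

# Toy alpha vectors and actions, chosen for illustration only
alpha <- cbind(c(10, 0), c(6, 2), c(1, 4))
alpha_action <- c(1, 2, 3)

res_even <- interp_policy_sketch(c(0.5, 0.5), alpha, alpha_action)
res_skew <- interp_policy_sketch(c(0.2, 0.8), alpha, alpha_action)
```

With these toy numbers, the even belief selects the first alpha vector and the skewed belief selects the third, which shows how the recommended action can change as the belief shifts between the two states.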

Value

List of 2 elements: the optimal value and the corresponding action.

Author(s)

Milad Memarzadeh

Examples

  ## Not run: 
    #values for Sumatran tigers
    pen <- 0.1
    p0 <- 1-pen
    pem <- 0.05816
    pm <- 1 - pem
    V <- 175.133
    Cm <- 18.784
    Cs <- 10.840
    d0 <- 0.01
    dm <- 0.01
    ds <- 0.78193
    
    # building the matrices of the problem
    t <- smsPOMDP::tr(p0, pm, d0, dm, ds, V, Cm, Cs)  # transition matrix
    o <- smsPOMDP::obs(p0, pm, d0, dm, ds, V, Cm, Cs) # observation matrix
    r <- smsPOMDP::rew(p0, pm, d0, dm, ds, V, Cm, Cs) # reward matrix
    
    state_prior <- c(1,0) #initial belief state
    log_dir <- tempdir()
    # match.call() only works inside a function, so hash the inputs instead
    id <- digest::digest(list(t, o, r, state_prior))
    infile <- paste0(log_dir, "/", id, ".pomdpx")
    outfile <- paste0(log_dir, "/", id, ".policyx")
    stdout <- paste0(log_dir, "/", id, ".log")
    
    # disc is not defined in the original example; 0.95 is an assumed discount factor
    disc <- 0.95
    sarsop::write_pomdpx(t, o, r, disc, state_prior, file = infile)
    status <- sarsop::pomdpsol(infile, outfile, stdout = stdout)
    policy <- sarsop::read_policyx(file = outfile)
    output <- smsPOMDP::interp_policy(state_prior, policy$vectors, policy$action)
    
  
## End(Not run)

conservation-decisions/smsPOMDP documentation built on Oct. 27, 2020, 10:44 p.m.