This builds on dirichletSampler.Rmd, which was starting to get a bit unwieldy. Here I'm going to simulate a posterior using actual priors and data. For data, I'll start with the data = 0 test that Michail suggests. Then we'll use actual data. Right now I have this function:

library(eliciteg)
calcC <- function(data, prior){

  dd <- as.vector(data)
  pp <- as.vector(prior)
  if(length(dd) != length(pp)) stop('Data and prior lengths do not match')

  logc <- lgamma(sum(pp)) - sum(lgamma(pp)) - (lgamma(sum(pp + data)) - sum(lgamma(pp + data)))

  return(exp(logc))
}

And I have a list (allqs) with all the individual priors. We'll make some data with 1s, and show an example of how to use this with two different experts.

dat <- matrix(1, nrow = 9, ncol = 9)
calcC(dat, priorList[[1]][[1]])
calcC(dat, priorList[[1]][[6]])

Ok, well, we've learned one thing that with data = 0, than we just get 1 returned from the function, which means that we'll not choose any of the individuals with any probability. Is that right?

Anyway, to get all the c values, I have to run lapply() to get the values for each expert-prior / data combination.

cvec <- rapply(priorList, function(x) calcC(dat, x), how = 'list')

Ok, that's the c values, now we need to assemble K. For example, here's the algorithm that would work in commented code:

# Get the individual numerators for each of the 8 experts
numexp <- length(cvec[[1]])
up1 <- (1 / numexp) * unlist(cvec[[1]][1])
up2 <- (1 / numexp) * unlist(cvec[[1]][2])
up3 <- (1 / numexp) * unlist(cvec[[1]][3])
up4 <- (1 / numexp) * unlist(cvec[[1]][4])
up5 <- (1 / numexp) * unlist(cvec[[1]][5])
up6 <- (1 / numexp) * unlist(cvec[[1]][6])
up7 <- (1 / numexp) * unlist(cvec[[1]][7])
up8 <- (1 / numexp) * unlist(cvec[[1]][8])

# Get the Standard denominator, which is the same for all 8 experts
down <- (1 / numexp) * sum(unlist(cvec[[1]]))

# Calculate k_i for each individual expert
k1 <- up1 / down
k2 <- up2 / down
k3 <- up3 / down
k4 <- up4 / down
k5 <- up5 / down
k6 <- up6 / down
k7 <- up7 / down
k8 <- up8 / down

# Assemble the K vector that will be used in the multinomial sampling
K <- c(k1, k2, k3, k4, k5, k6, k7, k8)

Here's the slightly more R-ish way, in the sense that it is vectorised:

numexp <- length(cvec[[1]])
upq1 <- (1 / numexp) * unlist(cvec[[1]])
down <- (1 / numexp) * sum(unlist(cvec[[1]]))
kq1 <- upq1 / down

Now I should be able to capture this with lapply(), i.e. do it in a really R-ish way. First we right the function. What this function will do is loop over each list element, which is itself a list. Take that list, calculate and then return the values for K as a list. klist won't be recursive; instead it's a 6 element list (one for each question). Each element in the list is a length 8 vector that contains the 8 $k_i$'s for each expert. We'll use each list element when sampling the prior.

calcK <- function(cdat){
  numexp <- length(unlist(cdat))
  up <- (1 / numexp) * unlist(cdat)
  down <- (1 / numexp) * sum(unlist(cdat))
  up / down
}

klist <- lapply(cvec, function(x) calcK(x))

Next we sample $k_i$, and then to use that index idx to choose the prior, which is a list element out of priorList. Here we apply this one time for one question using Jim's algorithm.

nreg <- length(regID)

idx <- which(rmultinom(1, 1, klist[[1]]) == 1)

prior <- matrix(data = unlist(priorList[[1]][idx]), nrow = length(regID), ncol = length(regID))
di <- matrix(rgamma(nreg * nreg, 
                    shape = dat + prior, 
                    scale = 1), nreg, nreg) 

post <- di / matrix(colSums(di), nreg, nreg, byrow = T)
colnames(post) <- row.names(post) <- regID
round(post, 3)

That looks good, and now we need to sample within a Gibbs framework. What I'll do here, and elsewhere is to:

  1. Initialise an empty matrix sumh
  2. set up the number of Gibbs steps ng
  3. Within the loop, choose an index
  4. Use that index to choose a prior
  5. Sample from the Dirichlet using Jim's algorithm
  6. Add these values to sumh
  7. After the loop, summarise the posterior by dividing by ng
sumh <- post * 0
colnames(sumh) <- row.names(sumh) <- regID
question <- 1
ng <- 100
for(i in 1:ng){
  idx <- which(rmultinom(1, 1, klist[[question]]) == 1)
  # print(idx)
  prior <- matrix(data = unlist(priorList[['females']][idx]), nrow = length(regID), ncol = length(regID))
  di <- matrix(rgamma(nreg * nreg, 
                    shape = dat + prior, 
                    scale = 1), nreg, nreg) 

  post <- di / matrix(colSums(di), nreg, nreg, byrow = T)  
  sumh <- sumh + post
}

movePostprob <- sumh / ng
round(movePostprob, 3)

Right. Now I need to do it for both sexes In the Gibbs loop, we loop over months and regions. And we have both males and females to update. So let's have different male and female data:

maleDat <- matrix(data = floor(runif(n = nreg * nreg, min = 0, max = 10)), nrow = 9, ncol = 9)
femaleDat <- matrix(data = floor(runif(n = nreg * nreg, min = 0, max = 10)), nrow = 9, ncol = 9)

Now we'll try this in a more realistic Gibbs loop for both sexes; first females:

sumh <- post * 0
colnames(sumh) <- row.names(sumh) <- regID
summh <- sumh # males
question <- 1
ng <- 100
for(i in 1:ng){
  idx <- which(rmultinom(1, 1, klist[[question]]) == 1)

  prior <- matrix(data = unlist(priorList[['females']][idx]), nrow = length(regID), ncol = length(regID))
  di <- matrix(rgamma(nreg * nreg, 
                    shape = femaleDat + prior, 
                    scale = 1), nreg, nreg) 

  post <- di / matrix(colSums(di), nreg, nreg, byrow = T)  
  sumh <- sumh + post
}

femalemovePostprob <- sumh / ng
round(femalemovePostprob, 3)
question <- 2
ng <- 100
for(i in 1:ng){
  idx <- which(rmultinom(1, 1, klist[[question]]) == 1)

  prior <- matrix(data = unlist(priorList[['males']][idx]), nrow = length(regID), ncol = length(regID))
  di <- matrix(rgamma(nreg * nreg, 
                    shape = maleDat + prior, 
                    scale = 1), nreg, nreg) 

  post <- di / matrix(colSums(di), nreg, nreg, byrow = T)  
  summh <- summh + post
}

malemovePostprob <- summh / ng
round(malemovePostprob, 3)


robschick/eliciteg documentation built on May 27, 2019, 11:58 a.m.