This builds on dirichletSampler.Rmd
, which was starting to get a bit unwieldy. Here I'm going to simulate a posterior using actual priors and data. For data, I'll start with the data = 0 test that Michail suggests. Then we'll use actual data. Right now I have this function:
library(eliciteg) calcC <- function(data, prior){ dd <- as.vector(data) pp <- as.vector(prior) if(length(dd) != length(pp)) stop('Data and prior lengths do not match') logc <- lgamma(sum(pp)) - sum(lgamma(pp)) - (lgamma(sum(pp + data)) - sum(lgamma(pp + data))) return(exp(logc)) }
And I have a list (allqs
) with all the individual priors. We'll make some data with 1s, and show an example of how to use this with two different experts.
dat <- matrix(1, nrow = 9, ncol = 9) calcC(dat, priorList[[1]][[1]]) calcC(dat, priorList[[1]][[6]])
Ok, well, we've learned one thing that with data = 0, than we just get 1 returned from the function, which means that we'll not choose any of the individuals with any probability. Is that right?
Anyway, to get all the c values, I have to run lapply()
to get the values for each expert-prior / data combination.
cvec <- rapply(priorList, function(x) calcC(dat, x), how = 'list')
Ok, that's the c values, now we need to assemble K. For example, here's the algorithm that would work in commented code:
# Get the individual numerators for each of the 8 experts numexp <- length(cvec[[1]]) up1 <- (1 / numexp) * unlist(cvec[[1]][1]) up2 <- (1 / numexp) * unlist(cvec[[1]][2]) up3 <- (1 / numexp) * unlist(cvec[[1]][3]) up4 <- (1 / numexp) * unlist(cvec[[1]][4]) up5 <- (1 / numexp) * unlist(cvec[[1]][5]) up6 <- (1 / numexp) * unlist(cvec[[1]][6]) up7 <- (1 / numexp) * unlist(cvec[[1]][7]) up8 <- (1 / numexp) * unlist(cvec[[1]][8]) # Get the Standard denominator, which is the same for all 8 experts down <- (1 / numexp) * sum(unlist(cvec[[1]])) # Calculate k_i for each individual expert k1 <- up1 / down k2 <- up2 / down k3 <- up3 / down k4 <- up4 / down k5 <- up5 / down k6 <- up6 / down k7 <- up7 / down k8 <- up8 / down # Assemble the K vector that will be used in the multinomial sampling K <- c(k1, k2, k3, k4, k5, k6, k7, k8)
Here's the slightly more R-ish way, in the sense that it is vectorised:
numexp <- length(cvec[[1]]) upq1 <- (1 / numexp) * unlist(cvec[[1]]) down <- (1 / numexp) * sum(unlist(cvec[[1]])) kq1 <- upq1 / down
Now I should be able to capture this with lapply()
, i.e. do it in a really R-ish way. First we right the function. What this function will do is loop over each list element, which is itself a list. Take that list, calculate and then return the values for K as a list. klist
won't be recursive; instead it's a 6 element list (one for each question). Each element in the list is a length 8 vector that contains the 8 $k_i$'s for each expert. We'll use each list element when sampling the prior.
calcK <- function(cdat){ numexp <- length(unlist(cdat)) up <- (1 / numexp) * unlist(cdat) down <- (1 / numexp) * sum(unlist(cdat)) up / down } klist <- lapply(cvec, function(x) calcK(x))
Next we sample $k_i$, and then to use that index idx
to choose the prior, which is a list element out of priorList
. Here we apply this one time for one question using Jim's algorithm.
nreg <- length(regID) idx <- which(rmultinom(1, 1, klist[[1]]) == 1) prior <- matrix(data = unlist(priorList[[1]][idx]), nrow = length(regID), ncol = length(regID)) di <- matrix(rgamma(nreg * nreg, shape = dat + prior, scale = 1), nreg, nreg) post <- di / matrix(colSums(di), nreg, nreg, byrow = T) colnames(post) <- row.names(post) <- regID round(post, 3)
That looks good, and now we need to sample within a Gibbs framework. What I'll do here, and elsewhere is to:
sumh
ng
sumh
ng
sumh <- post * 0 colnames(sumh) <- row.names(sumh) <- regID question <- 1 ng <- 100 for(i in 1:ng){ idx <- which(rmultinom(1, 1, klist[[question]]) == 1) # print(idx) prior <- matrix(data = unlist(priorList[['females']][idx]), nrow = length(regID), ncol = length(regID)) di <- matrix(rgamma(nreg * nreg, shape = dat + prior, scale = 1), nreg, nreg) post <- di / matrix(colSums(di), nreg, nreg, byrow = T) sumh <- sumh + post } movePostprob <- sumh / ng round(movePostprob, 3)
Right. Now I need to do it for both sexes In the Gibbs loop, we loop over months and regions. And we have both males and females to update. So let's have different male and female data:
maleDat <- matrix(data = floor(runif(n = nreg * nreg, min = 0, max = 10)), nrow = 9, ncol = 9) femaleDat <- matrix(data = floor(runif(n = nreg * nreg, min = 0, max = 10)), nrow = 9, ncol = 9)
Now we'll try this in a more realistic Gibbs loop for both sexes; first females:
sumh <- post * 0 colnames(sumh) <- row.names(sumh) <- regID summh <- sumh # males question <- 1 ng <- 100 for(i in 1:ng){ idx <- which(rmultinom(1, 1, klist[[question]]) == 1) prior <- matrix(data = unlist(priorList[['females']][idx]), nrow = length(regID), ncol = length(regID)) di <- matrix(rgamma(nreg * nreg, shape = femaleDat + prior, scale = 1), nreg, nreg) post <- di / matrix(colSums(di), nreg, nreg, byrow = T) sumh <- sumh + post } femalemovePostprob <- sumh / ng round(femalemovePostprob, 3)
question <- 2 ng <- 100 for(i in 1:ng){ idx <- which(rmultinom(1, 1, klist[[question]]) == 1) prior <- matrix(data = unlist(priorList[['males']][idx]), nrow = length(regID), ncol = length(regID)) di <- matrix(rgamma(nreg * nreg, shape = maleDat + prior, scale = 1), nreg, nreg) post <- di / matrix(colSums(di), nreg, nreg, byrow = T) summh <- summh + post } malemovePostprob <- summh / ng round(malemovePostprob, 3)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.