In robschick/eliciteg:

Here I show how to take answers from one expert that came in the form of a Beta distribution via the shiny app (example interface here). There they highlighted how many whales make a transition, as well as their certainty around that answer. In the link provided above, the experts were asked to provide answers for three transitions:

Animals remaining in the MIDA
Animals moving south from MIDA to SEUS
Animals moving from MIDA to "Northern" regions

In this last question we lump 7 different regions north of the MIDA into one. This was done to reduce the congitive load on the experts. However, because the model from Schick et al. (2013) operates with 9 geographic regions, we need to convert from the single Beta back to 7 Beta distributions, which we will then use in the Dirichlet distribution that serves as the prior for the movement transitions. This vignette describes this process.

Necessary Background

Let's say we have three possible habitat regions and we ask for all possible transitions between and among these regions. If we do this with a set of linked sliders in the shiny app, then we can easily generate a full Dirichlet distribution. For example, from area 1 to area 2, 20 out of 100 whales move. From area 1 to area 1, 25 out of 100 whales move. Because these are linked and sum to 100, this means that 55 whales move from area 1 to area 3.

The weight a user assigns to these parameters controls the variance in the individual Dirichlet parameters. To illustrate this, let's imagine two Dirichlet distributions:

$Dir(0.2, 0.25, 0.55)$
$Dir(20, 25, 55)$

While the means of the two distributions are the same (red line), the variance is very different (as can be seen in this figure):

library(gtools)
library(reshape2)
library(ggplot2)
d1 <- rdirichlet(1000, c(2, 2.5, 5.5))
d2 <- rdirichlet(1000, c(20, 25, 55))
d1long <- melt(d1)
d2long <- melt(d2)

meanDat <- data.frame(value = c(0.2, 0.25, 0.55), Var2 = c(1, 2, 3))

p <- ggplot(d1long, aes(value, group = Var2))+
  geom_histogram()+
  facet_grid(Var2 ~ .)+
  xlim(0, 1)+
  geom_vline(data=meanDat, aes(xintercept = value), colour="red", lty=3) +
  ggtitle('Dir(0.2, 0.25, 0.55)')

p2 <- ggplot(d2long, aes(value, group = Var2))+
  geom_histogram()+
  facet_grid(Var2 ~ .)+
  xlim(0, 1)+
  geom_vline(data=meanDat, aes(xintercept = value), colour="red", lty=3) +
  ggtitle('Dir(20, 25, 55)')

p
p2

When we set up the shiny interface to ask the questions about the transitions, we combined several possible transitions into one. Now that we've done that, we need to reverse engineer it, and make the values from the one distribution match up to the values from many Dirichlet's.

We'll start with a toy example of a Beta distribution with parameters Beta(10, 30) which we want to turn into 5 Beta distributions with parameters Be(a, b).

We start with factoring them down into the 5 distributions as follows:

$$ \mu = \frac{a}{a+b} = \frac{10}{10+30} \cdot \frac{1}{5} = 0.05 $$ and

$$ a+b = \frac{40}{5} $$ which yields

$$ b = 8 - a $$

Finally, $$ a = 2, b = 6. $$

In our toy example, then, we went from 3 transitions to 7 transitions - the last 5 of which are identical. Say we have three distributions: Be(4, 20), Be(15, 10), and Be(10, 30); after dividing the third transition, those will translate to:

Be(4, 20)
Be(15, 10)
Be(2, 6)
Be(2, 6)
Be(2, 6)
Be(2, 6)
Be(2, 6)

Real Worked Example

The goal of this section is to take the thinking behind the toy example, and extend it to work with data from the actual elicitation.

Partitioning up the Beta Distributions

Let's take real answers for one of the questions above, and turn it into priors for the movement analysis. Here are one person's answers:

df <- data.frame(question = 13, gender = 'MALE', rawConf = 4, scaleConf = 19.8181, mida2mida = 40, mida2seus = 10, mida2northern = 50)
knitr::kable(df, caption = 'Table 1. Example answers from one expert for one question in the expert elicitation conducted in June 2015.')

Let's explain these variables and values just a bit for clarification:

rawConf refers to the value the person put in using the slider (range 1-100).
scaleConf refers to the number from rawConf being rescaled to a new range (1-622), because this was the range of weights we used in the original analysis (we won't use it in this analysis)
mida2mida is the number of whales (out of 100) that will remain in the MIDA during the next time period
mida2seus is the number of whales (out of 100) that will move to the SEUS during the next time period
mida2northern is the number of whales (out of 100) that will move to the northern regions during the next time period; this is the variable that needs to be split into 7 new beta distributions

How these values then translate into parameters for beta distribution is as follows. We used the following parameterisation: $\alpha = n \times m_1$ and $\beta = n - n \times m_1$, where $n$ = rawConf, and $m_1$ = mida2mida / 100.

Accordingly, for the first transtion this yields, $\alpha = 4 \times 40 / 100$, or r 4 * 40 / 100, and $\beta = 4 - 4 \times 40 / 100$, or r 4 - 4 * (40/100). So for this Beta distribution we would have Be(1.6, 2.4).

For the second transition, we'll repeat with the appropriate substitution for the new answers: $\alpha = 4 \times 10 / 100$, or r 4 * 10 / 100, and $\beta = 4 - 4 \times 10 / 100$, or r 4 - 4 * (10/100). So for this Beta distribution we would have Be(0.4, 3.6).

For the last transition, we need to calculate the Beta distribution parameters, and then divide by 7. The distribution for the transition is: $\alpha = 4 \times 50 / 100$, or r 4 * 50 / 100, and $\beta = 4 - 4 \times 50 / 100$, or r 4 - 4 * (50/100). So for this Beta distribution we would have Be(2, 2). However, we need to then divide this by 7 to get the final distributions. This would yield Be(0.2857, 0.2857) for each of these 7 transitions.

The summary table of all these transitions for the Beta parameters is now:

df <- data.frame(transition = c('mida2mida', 'mida2seus', 'mida2bof', 'mida2gom', 'mida2gsc', 'mida2jl', 'mida2ne', 'mida2nrth', 'mida2rb'), 
                 alpha = c(1.6, 0.4, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857),
                 beta = c(2.4, 3.6, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857))
knitr::kable(df, caption = 'Table 2. Marginal Beta distributions for 9 possible transitions from MIDA to other regions from January to February.')

This table shows the marginal distributions for each of the $\pi_i$ parameters of the Dirichlet distribution. In the last section, we go from the marginals to a final prior distribution.

Assembling the Dirichlet Prior

In the final analysis of the movement data, we are going to use a mixture of Dirichlet distributions that is comprised of each experts' unique Dirichlet prior distribution for a particular transition. This section describes how to take the marginal distributions given above, and turn them into the final expert-specific prior.

We need to assemble the Dirichlet comprised of 9 possible transitions from the 9 Beta distributions outlined above, e.g. $Dir(d_1, d_2, \ldots, d_9)$. In this instance $n = d_1 + d_2 + \ldots + d_9$. Each Beta distribution, $Be(a_{11}, a_{12})$ can be expressed as $Be(d_1, n - d_1)$; therefore $d_1 = a_{11}, d_2 = a_{21}$, etc. We use these definitions to construct the distribution as follows.

$$ n = 1.6 + 0.4 + 0.2857 + 0.2857 + 0.2857 + 0.2857 + 0.2857 + 0.2857 + 0.2857 = 3.999 $$

This allows us to express this experts' Dirichlet prior for these transitions as:

$$Dir(1.6, 0.4, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857, 0.2857).$$

robschick/eliciteg documentation built on May 27, 2019, 11:58 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com