Memory of Lifestresses

In this vignette, we will explain how to compute a Bayes factor for inequality-constrained hypotheses for multinomial models.

Model and Data

As example for an inequality-constrained hypothesis in multinomial models, we will use the data set, lifestresses, which is included in the package multibridge. The dataset is based on a survey study by @uhlenhuth1974symptom which among other things has dealt with experienced negative life events and lifestresses. The subset of the data which we use here summarizes when respondents have experienced a negative event in their lives in the 18 months prior to the interview. In total, the dataset contains the responses of 147 people. This subset was analyzed by @haberman1978 who wanted to show that the participants listed mainly negative events of the recent past. Furthermore, in the context of inequality-constrained hypotheses the dataset was discussed in @sarafoglou2020evaluatingPreprint.

To get an overview, we will load and visualize the data before we start the data analysis.

library(multibridge)
data(lifestresses)
lifestresses

# visualize the data
op <- par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5 , font.lab = 2, cex.axis = 1.3, bty = "n", las = 1)
  plot(lifestresses$month, lifestresses$stress.freq, col = "black", 
       pch = 1, cex = 2, type="b",
       xlim = c(0, 18), ylim = c(0, 20), 
       ylab = "", xlab = "", axes = FALSE)
  axis(1, at = seq(0, 18, length.out = 10), label = seq(0, 18, length.out = 10))
  axis(2, at = seq(0, 20, length.out = 5), label = seq(0, 20, length.out = 5), las = 1) 
  par(las = 0)
  mtext("Months Before Interview", side = 1, line = 2.5, cex = 1.5)
  mtext("Participants Reporting a\nNegative Life Event", side = 2, line = 3.0, cex = 1.5)
par(op)

The model that we will use assumes that the vector of observations $x_1, \cdots, x_K$ in the $K$ categories follow a multinomial distribution. The parameter vector of the multinomial model, $\theta_1, \cdots, \theta_K$, contains the probabilities of observing a value in a particular category; here, it reflects the probabilities of reporting a negative life event or life stress during one of the 18 months. The parameter vector $\theta_1, \cdots, \theta_K$ is drawn from a Dirichlet distribution with concentration parameters $\alpha_1, \cdots, \alpha_K$. The model can be described as follows:

\begin{align} x_1, \cdots, x_K &\sim \text{Multinomial}(\sum_{k = 1}^K x_k, \theta_1, \cdots, \theta_K) \ \theta_1, \cdots, \theta_K &\sim \text{Dirichlet}(\alpha_1, \cdots, \alpha_K) \ \end{align}

Here, we test the inequality-constrained hypothesis $\mathcal{H}_r$ that the number of reported negative life events decreases over time against the encompassing hypothesis $\mathcal{H}_e$ without constraints:

\begin{align} \mathcal{H}r &: \theta{1} > \theta_{2} > \cdots > \theta_{18} \ \mathcal{H}e &: \theta_1, \theta_2, \cdots, \theta{18}. \end{align}

To compute the Bayes factor in favor of the inequality-constrained hypothesis, $\text{BF}_{re}$, we need to specify (1) a vector containing the number of observations, (2) the inequality-constrained hypothesis, (3) a vector with concentration parameters, and (4) the labels of categories of interest (i.e., months prior to the interview). .

x <- lifestresses$stress.freq
# Test the following restricted Hypothesis:
# Hr: month1 > month2 > ... > month18 
Hr  <- paste0(1:18, collapse=">")
# Prior specification 
# We assign a uniform Dirichlet distribution, that is, we set all concentration parameters to 1
a <- rep(1, 18)
categories <- lifestresses$month

With this information, we can now conduct the analysis with the function mult_bf_informed(). Since we are interested in quantifying evidence in favor of the restricted hypothesis, we set the Bayes factor type to BFre. For reproducibility, we are also setting a seed with the argument seed:

ineq_results <- multibridge::mult_bf_informed(x=x, Hr=Hr, a=a, factor_levels=categories,
                                              bf_type = 'BFre', seed = 2020)

Summarize the Results

We can get a quick overview of the results by using the implemented summary() method:

m1 <- summary(ineq_results)
m1

Since we are dealing with many categories, one might want to assess the accuracy of the Bayes factor. This can be done, by inspecting the corresponding error of the estimate. The relative mean-square error of the Bayes factor estimate is about $r signif(m1$re2)$, which can be considered low.

More information on the accuracy can be extracted from the output directly. Information about the Bayes factor are stored in bf_list under error_measures.

ineq_results$bf_list$error_measures

This dataframe features the approximate relative mean-squared error for the Bayes factor, the approximate coefficient of variation, and the approximate percentage error.

References



Try the multibridge package in your browser

Any scripts or data that you put into this service are public.

multibridge documentation built on Nov. 1, 2022, 5:05 p.m.