test_hyp: Testing Informed Hypotheses
In Jaeoc/lmhyp: Informed Hypothesis Testing for Regression

Test competing hypotheses about coefficients in an lm-object.

test_hyp(object, hyp, priorprob = 1, mcrep = 1e+06)

`object`	A regression model object fit using the `lm` function.
`hyp`	A string specifying the hypotheses to be tested, using the variable names of the `lm` object, or the string “exploratory” (see Details).
`priorprob`	Vector of prior probabilities for the input hypotheses, by default equal.
`mcrep`	Integer specifying the number of iterations used if no analytical solution is possible. This is rare and only the case if the rank of the constraint matrix is less than its number of rows.
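
To make the rank condition concrete, the sketch below builds a small constraint matrix by hand and compares its rank with its number of rows. The matrix `R_mat` and the four pairwise constraints are illustrative assumptions; they are not taken from the package and do not show its internal representation.

```r
# Hypothetical constraints on four coefficients (columns b1..b4):
# b1 > b3, b1 > b4, b2 > b3, b2 > b4
R_mat <- rbind(
  c(1, 0, -1,  0),
  c(1, 0,  0, -1),
  c(0, 1, -1,  0),
  c(0, 1,  0, -1)
)

# Rank of the constraint matrix via a QR decomposition
constraint_rank <- qr(R_mat)$rank

# The fourth row equals row 2 + row 3 - row 1, so the rows are linearly
# dependent and the rank is smaller than the number of rows
needs_numerical <- constraint_rank < nrow(R_mat)
needs_numerical
```

Under the condition described for `mcrep`, a constraint set like this one would be the rare kind that is handled numerically rather than analytically.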

This function is based on a method by Mulder (2014), a modification of the fractional Bayes factor approach. In essence, it uses a number of observations equal to the number of predictors in the lm model to construct a minimally informative prior, and the remainder of the observations are then used to test the hypotheses.

Hypotheses are specified using the variable names from the lm object. If a hypothesis involves multiple variables, it is usually preferable to standardize the relevant variables before fitting the model with lm, so that their coefficients are on a comparable scale and easier to interpret. Standardizing means subtracting the mean of a variable from each observation and dividing by the standard deviation. A simple way to do this is the scale function.
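
As a quick check of the standardization described above, the following sketch standardizes one variable by hand and compares the result with `scale`. It uses the built-in `mtcars` data; the object names are chosen for illustration only.

```r
# Standardize by hand: subtract the mean, divide by the standard deviation
wt_manual <- (mtcars$wt - mean(mtcars$wt)) / sd(mtcars$wt)

# scale() centers and scales by default; it returns a one-column matrix,
# so convert it back to a plain numeric vector
wt_scaled <- as.numeric(scale(mtcars$wt))

# Both approaches agree up to floating point error
same <- isTRUE(all.equal(wt_manual, wt_scaled))
same
```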

Multiple hypotheses can be specified at the same time by separating them with a semicolon. It is advisable to only specify competing hypotheses in this way, that is, hypotheses regarding the same variables, e.g., “X1 > 0; X1 < 0; X1 = 0”. When several hypotheses compare against a value, they must currently all compare against the same value, i.e., “X1 = 0; X1 = 2” is not valid input. This is because the prior is centered around the input value (or zero if no value is given), which is not possible with several comparison values.

Parentheses can be used to compare multiple variables with the same variable or value. For example, “(X1, X2) > X3” is read as “X1 > X3 and X2 > X3”. Each variable should only be specified once in a single hypothesis.
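
The reading of the parenthesis notation can be sketched with a small helper. `expand_group` is a hypothetical function written for this illustration; it is not part of lmhyp and says nothing about how the package parses its input.

```r
# Hypothetical helper: expand "(a, b) <op> rhs" into one constraint per variable
expand_group <- function(h) {
  pattern <- "^\\s*\\(([^)]*)\\)\\s*([<>=])\\s*(\\S+)\\s*$"
  m <- regmatches(h, regexec(pattern, h))[[1]]
  # No parenthesized group found: return the hypothesis unchanged
  if (length(m) == 0) return(trimws(h))
  vars <- trimws(strsplit(m[2], ",")[[1]])
  # Pair every variable in the group with the same operator and right-hand side
  paste(vars, m[3], m[4])
}

expanded <- expand_group("(X1, X2) > X3")
expanded
```

The two resulting constraints, "X1 > X3" and "X2 > X3", are meant jointly, that is, both must hold for the hypothesis to be true.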

An alternative to specifying hypotheses is to input the string “exploratory”. This will compare the hypotheses “X < 0; X = 0; X > 0” for all independent variables in the regression object, including the intercept.

For each specified hypothesis the posterior probability is output. If the hypotheses are not exhaustive (i.e., do not cover the entire parameter space), the output also includes the posterior probability of the complement to the input hypotheses. The complement is the hypothesis that none of the input hypotheses is true. For example, inputting “X1 > 0; X1 < 0” gives posterior probabilities only for these two hypotheses, whereas inputting “(X1, X2) > 0” gives posterior probabilities for “(X1, X2) > 0” and “not (X1, X2) > 0”.

If not using the “exploratory” option, it is possible to specify prior probabilities for the input hypotheses. By default these are equal (priorprob = 1). Prior probabilities can be input either as probabilities, e.g., c(0.2, 0.3, 0.5), or as relative weights of each hypothesis, e.g., c(2, 3, 5). If the input probabilities do not sum to 1, they are normalized. Prior probabilities must be specified for all hypotheses, including the complement if one exists.
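
The equivalence of the two input styles can be seen in a short sketch. It assumes that normalizing means dividing each weight by the sum of the weights, which is the usual meaning but is not spelled out in the text above.

```r
# Relative weights for three hypotheses (illustrative values)
weights <- c(2, 3, 5)

# Assumed normalization: divide by the total so the values sum to 1
prior <- weights / sum(weights)

# The normalized weights match the probabilities c(0.2, 0.3, 0.5)
matches <- isTRUE(all.equal(prior, c(0.2, 0.3, 0.5)))
matches
```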

By saving the test as an object it is also possible to access the BF_matrix which compares the hypotheses directly against each other (see examples). This matrix divides the row hypothesis by each column hypothesis and, assuming equal prior probabilities, can be interpreted as “given the data, [row hypothesis] is [value] times as likely as [column hypothesis]”.
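
How such a matrix relates to the individual Bayes factors can be sketched as follows. The values in `BFu` are made up for illustration, and building the matrix with `outer` is an assumption that is consistent with the row-divided-by-column description, not the package's own code.

```r
# Hypothetical Bayes factors of each hypothesis against the unconstrained one
BFu <- c(H1 = 6, H2 = 1.5, Hc = 0.5)

# Entry [i, j] is the Bayes factor of hypothesis i (row) against j (column)
BF_mat <- outer(BFu, BFu, "/")
BF_mat

# With equal prior probabilities: given the data, H1 is 4 times as likely as H2
BF_mat["H1", "H2"]
```

The diagonal is 1 by construction, and each entry below the diagonal is the reciprocal of its mirror image above it.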

Supplementary output, intended to provide a deeper understanding of the underlying method and the primary output, can be printed when the test has been saved as an object. Calling BF_computation illustrates the computation of the Bayes factor of each hypothesis against the unconstrained hypothesis. In the output, c(E) is the prior density, c(I|E) the prior probability, c the product of these two, and columns prefixed with "f" the equivalent quantities for the posterior. B(t,u) is the Bayes factor of hypothesis t against the unconstrained hypothesis (u), and PP(t) is the posterior probability of t. Finally, calling BFu_CI provides 90% credibility intervals for those few cases where the Bayes factor was calculated numerically.
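
The relationship between these columns can be sketched with made-up numbers for the hypotheses “X1 > 0; X1 < 0; X1 = 0”. Two steps here are assumptions rather than statements from the text above: that B(t,u) is the ratio f / c, and that PP(t) is proportional to the prior probability times B(t,u). All values are illustrative and not output of the package.

```r
# Illustrative values; a column is set to 1 where a hypothesis has no
# equality part (c_E, f_E) or no inequality part (c_IE, f_IE)
tab <- data.frame(
  c_E  = c(1.0, 1.0, 0.30),
  c_IE = c(0.5, 0.5, 1.00),
  f_E  = c(1.0, 1.0, 0.06),
  f_IE = c(0.9, 0.1, 1.00),
  row.names = c("X1 > 0", "X1 < 0", "X1 = 0")
)

# c and f are the products described in the text
tab$c <- tab$c_E * tab$c_IE
tab$f <- tab$f_E * tab$f_IE

# Assumption: Bayes factor against the unconstrained hypothesis is f / c
tab$B_tu <- tab$f / tab$c

# Assumption: posterior probability is proportional to prior times B(t,u)
prior <- rep(1 / 3, 3)
tab$PP <- prior * tab$B_tu / sum(prior * tab$B_tu)
tab
```

With equal prior probabilities the prior cancels, so the posterior probabilities are simply the Bayes factors rescaled to sum to 1.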

Mulder, J. (2014). Prior adjusted default Bayes factors for testing (in)equality constrained hypotheses. Computational Statistics & Data Analysis, 71, 448-463.

###Standardize variables and fit the linear model
dt <- as.data.frame(scale(mtcars[, c(1, 3:4, 6)]))
fit <- lm(mpg ~ disp + hp + wt, data = dt)

###Exploratory analysis
test_hyp(fit, "exploratory")

###Define hypotheses based on theory and test them
hyp <- "(wt, hp) > disp > 0; (wt, hp) > disp = 0"
res <- test_hyp(fit, hyp)
res

###Bayes factor comparison of hypotheses
res$BF_matrix