MCHT is a package implementing an interface for creating and using Monte Carlo tests. The primary function of the package is `MCHTest()`, which creates functions with S3 class `MCHTest` that perform a Monte Carlo test.

MCHT is not presently available on CRAN. You can download and install MCHT from GitHub using devtools via the R command `devtools::install_github("ntguardian/MCHT")`.
Monte Carlo testing is a form of hypothesis testing in which $p$-values are computed from the empirical distribution of the test statistic, estimated from data simulated under the null hypothesis. These tests are used when the distribution of the test statistic under the null hypothesis is intractable or difficult to compute, or as an exact test (that is, a test where the distribution used to compute $p$-values is appropriate for any sample size, not just large sample sizes).
Suppose that $s_n$ is the observed value of the test statistic and that large values of $s_n$ are evidence against the null hypothesis. Normally, $p$-values would be computed as $1 - F(s_n)$, where $F(s_n) = P(S_n \leq s_n)$ is the cumulative distribution function and $S_n$ is the random variable of which $s_n$ is a realization. Suppose, though, that we cannot use $F$: it is intractable, or it is only appropriate for large sample sizes.
Instead of using $F$ we will use $\hat{F}_N$, the empirical CDF of the same test statistic computed from simulated data following the distribution prescribed by the null hypothesis of the test. For the sake of simplicity in this presentation, assume that $S_n$ is a continuous random variable. Now our $p$-value is $1 - \hat{F}_N(s_n)$, where $\hat{F}_N(s_n) = \frac{1}{N}\sum_{j = 1}^{N} I_{\{\tilde{S}_{n,j} \leq s_n\}}$, $I$ is the indicator function, and $\tilde{S}_{n,j}$ is an independent random copy of $S_n$ computed from simulated data with a sample size of $n$.
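To make this concrete, here is a minimal sketch in plain R (not using MCHT) of a Monte Carlo $p$-value for a one-sided test of $H_0: \mu = 0$ on normal data; the data, sample size, and $N$ are all illustrative assumptions, not anything prescribed by the package:

```r
set.seed(20181001)

n <- 10; N <- 1000
x <- rnorm(n, mean = 0.5)          # observed data (illustrative)
s_n <- sqrt(n) * mean(x) / sd(x)   # observed t-like statistic

# Simulate N independent copies of the statistic under the null
# (standard normal data)
S_tilde <- replicate(N, {
  x_sim <- rnorm(n)
  sqrt(n) * mean(x_sim) / sd(x_sim)
})

# p-value: 1 - F_hat_N(s_n), i.e. the proportion of simulated
# statistics at least as large as the observed one
mean(S_tilde >= s_n)
```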
The power of these tests increases with $N$ (see [1]), but modern computers are
able to simulate large $N$ quickly, so this is rarely an issue. The procedure
above also assumes that there are no nuisance parameters and the distribution of
$S_n$ can effectively be known precisely when the null hypothesis is true (and
all other conditions of the test are met, such as distributional assumptions). A
different procedure needs to be applied when nuisance parameters are not
explicitly stated under the null hypothesis. [2] suggests a procedure using
optimization techniques (recommending simulated annealing specifically) to
adversarially select values for nuisance parameters valid under the null
hypothesis that maximize the $p$-value computed from the simulated data. This
procedure is often called maximized Monte Carlo (MMC) testing. That is the
procedure employed here. (In fact, the tests created by MCHTest()
are the
tests described in [2].) Unfortunately, MMC, while conservative and exact, has
much less power than if the unknown parameters were known, perhaps due to the
behavior of samples under distributions with parameter values distant from the
true parameter values (see [3]).
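The sketch below illustrates the MMC idea only; it is not MCHT's implementation, and it uses `optimize()` over an assumed interval in place of the simulated annealing recommended in [2]. We test $H_0: \mu = 0$ for normal data whose standard deviation $\sigma$ is a nuisance parameter, and we report the largest Monte Carlo $p$-value over candidate values of $\sigma$:

```r
set.seed(20181001)

n <- 10; N <- 1000
x <- rnorm(n, mean = 0.5, sd = 2)  # observed data (illustrative)
s_n <- sqrt(n) * mean(x)           # statistic whose null distribution depends on sigma

# Monte Carlo p-value for a candidate value of the nuisance parameter;
# the seed is reset so every sigma is judged against the same draws
mc_pval <- function(sigma) {
  set.seed(20181002)
  S_tilde <- replicate(N, sqrt(n) * mean(rnorm(n, sd = sigma)))
  mean(S_tilde >= s_n)
}

# Adversarially select sigma to maximize the p-value over an assumed
# plausible interval (MMC proper would use simulated annealing)
optimize(mc_pval, interval = c(0.1, 5), maximum = TRUE)$objective
```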
Bootstrap statistical testing is very similar to Monte Carlo testing; the key difference is that bootstrap testing uses information from the sample. For example, a parametric bootstrap test would estimate the parameters of the distribution the data are assumed to follow, then generate datasets from that distribution using those estimates as the actual parameter values. A permutation test (like Fisher's permutation test; see [4]) would use the original dataset's values but randomly shuffle the labels (stating which sample an observation belongs to) to generate new datasets and thus new simulated test statistics. $p$-values are computed in essentially the same way.

Unlike Monte Carlo tests and MMC, these tests are not exact tests. That said, they often have good finite-sample properties. (See [3].)
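For comparison, here is a bare-bones sketch of Fisher's permutation test for the difference of two sample means (the data are made up for illustration); note that the $p$-value is computed the same way as in the Monte Carlo sketch above:

```r
set.seed(20181001)

x <- c(2.3, 1.4, 3.1, 0.9, 2.8)   # sample 1 (illustrative)
y <- c(1.1, 0.4, 1.9, 0.7, 1.2)   # sample 2 (illustrative)
s_obs <- mean(x) - mean(y)

pooled <- c(x, y)
N <- 1000

# Randomly shuffle the group labels to generate simulated test statistics
S_perm <- replicate(N, {
  idx <- sample(length(pooled), length(x))
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Two-sided p-value
mean(abs(S_perm) >= abs(s_obs))
```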
See the documentation for more details and references.
`MCHTest()` is the main function of the package and can create functions with S3 class `MCHTest` that perform Monte Carlo hypothesis tests.
For example, the code below creates the Monte Carlo equivalent of a $t$-test.
```r
library(MCHT)
library(doParallel)

registerDoParallel(detectCores())  # Necessary for parallelization; if this is
                                   # not done, the resulting function will
                                   # complain on its first use

# Test statistic: the usual one-sample t statistic
ts <- function(x, mu = 0) {
  sqrt(length(x)) * (mean(x) - mu) / sd(x)
}

# Statistic generator: computes the statistic on data simulated under the null
sg <- function(x, mu = 0) {
  x <- x + mu
  ts(x)
}

# Random data generator for the null distribution
rg <- rnorm

mc.t.test <- MCHTest(ts, sg, rg, seed = 20181001, test_params = "mu",
                     lock_alternative = FALSE,
                     method = "Monte Carlo One Sample t-Test")
```
The object `mc.t.test` has S3 class `MCHTest`, and it is a callable function.

```r
class(mc.t.test)
```
`print()` will print relevant information about the construction of the test.

```r
mc.t.test
```
Once this object is created, we can use it for performing hypothesis tests.
```r
dat <- c(2.3, -0.13, 1.42, 1.51, 3.43, -0.96, 0.59, 0.62, 1.28, 4.07)

t.test(dat, mu = 1, alternative = "two.sided")  # For reference

mc.t.test(dat, mu = 1)
mc.t.test(dat, mu = 1, alternative = "two.sided")
```
This is the simplest example; `MCHTest()` can create more involved Monte Carlo tests. See other documentation for details.
There is also a function that takes an `MCHTest`-class object and returns a function that, rather than returning an `htest`-class object, returns the test statistic, the simulated test statistics, and a $p$-value in a list; this could be useful for diagnostic work.