Description Usage Arguments Details Value References Examples
This function creates an MCHTest
-class object, an S3 object that
defines a bootstrap or Monte Carlo test.
1 2 3 4 5 6 7 | MCHTest(test_stat, stat_gen, rand_gen = function(n) { stats::runif(n)
}, N = 10000, seed = NULL, memoise_sample = TRUE,
pval_func = MCHT::pval, method = "Monte Carlo Test",
test_params = NULL, fixed_params = NULL, nuisance_params = NULL,
optim_control = NULL, tiebreaking = FALSE, lock_alternative = TRUE,
threshold_pval = 1, suppress_threshold_warning = FALSE,
localize_functions = FALSE, imported_objects = NULL)
|
test_stat |
A function that computes the test statistic from input data;
|
stat_gen |
A function that generates values of the test statistic when
given data; |
rand_gen |
A function generating random data, accepting a parameter
|
N |
Integer representing the number of replications of |
seed |
The random seed used to generate simulated statistic values; if
|
memoise_sample |
If |
pval_func |
A function that computes p-values from the test
statistic computed by |
method |
A string labelling the test |
test_params |
A character vector of the names of parameters with values
specified under the null hypothesis; both |
fixed_params |
A character vector of the names of parameters treated as
fixed values; this isn't needed but if these parameters
are being used then test output is more informative and
errors will be raised if |
nuisance_params |
A character vector of the names of parameters to be
treated as nuisance parameters which must be chosen
via optimization (see \insertCitedufour06MCHT);
must be parameters of |
optim_control |
A list of arguments to be passed to
|
tiebreaking |
Break ties using the method as described in
\insertCitedufour06;textualMCHT; won't work if
|
lock_alternative |
If |
threshold_pval |
A numeric value that represents a threshold
p-value that, if surpassed by the optimization
algorithm, will cause the algorithm to terminate; will
override the |
suppress_threshold_warning |
If |
localize_functions |
If |
imported_objects |
A named list of objects that will be "exposed" and
localized to the environment of the returned
function; this would be useful if
|
MCHTest
-class objects are effectively functions that accept data and
maybe some parameters and return an htest
-class object containing the
results of a Monte Carlo or bootstrap statistical test. These object will
accept datasets and perhaps some parameters and will return the results of a
test.
Bootstrap tests can be implemented when the dataset is passed as an argument
to rand_gen
(which occurs when x
is one of rand_gen
's
parameters). The only difference between a Monte Carlo test and a bootstrap
test in the context of this function is that bootstrap tests use information
from the original dataset when generating simulated test statistics, while a
Monte Carlo test does not. When the default function for computing
p-values is used, this function will perform a test similar to that
described by \insertCitemackinnon09;textualMCHT.
For Monte Carlo tests, when the default function for computing p-values
is used (see pval
), this is effectively the test described in
\insertCitedufour06;textualMCHT. This includes using simulated annealing
to find values of nuisance parameters that maximize the p-value if the
null hypothesis is true. Simulated annealing is implemented using
GenSA
from the GenSA package, and the
optim_control
parameter is used for controlling GenSA
's
behavior. We highly recommend reading GenSA
's documentation.
The threshold_pval
argument can be used for stopping the optimization
procedure when a specified p-value is reached or surpassed.
\insertCitedufour06;textualMCHT showed that p-values found using
the procedure implemented here are conservative (in the sense that they are
larger than they necessarily need to be). If the algorithm terminates early
due to surpassing a prespecified p-value, then the estimated
p-value is known to at least be the value returned, but because the
p-value is a conservative estimate of the "true" p-value, this
latter number could be smaller. Thus we cannot say much about the location of
the true p-value if the algorithm terminates early. For this reason, a
MCHTest
-class function will, by default, issue a warning if the
algorithm terminated early. However, by setting
suppress_threshold_warning
to TRUE
, this behavior can be
disabled. This recognizes the fact that even though an early termination
leads to us not being able to say much about the location of the true
p-value, we know that whatever the more accurate estimate is, we would
not reject the null hypothesis based on that result.
This function uses foreach
,
%dorng%
, and %dopar%
to
perform simulations. If the R session is not set up at the start for
parallelization, there will be an initial complaint (after which there are no
more complaints), then these functions will default to using a single core.
The example shows how to set up R to use all available cores.
Due to the way environments work in R, if functions passed to
test_stat
, stat_gen
, rand_gen
, or pval_func
depend on objects in the global namespace, and then those objects change, two
otherwise identical runs of the resulting test could produce different
results. Thus changing other objects causes side effects that may not be
desired; the resulting MCHTest
-class objects are no longer
self-contained entities. This is why the localize_functions
and
imported_objects
parameters exist. If localize_functions
is
TRUE
, all of these parameters will have their environments changed to
the environment in which the returned function belongs, and objects included
in the list passed to imported_objects
are added to this environment
as well. Doing this can allow for making the MCHTest
-class object
self-contained and immune to side effects from changes in the global
namespace. See the examples for a demonstration on how this is done.
Localizing functions is safer (even though it could cause errors to be thrown
when not done properly), so we highly recommend doing so.
(Beware of localizing functions from packages, like
runif
; if they depends on objects from the package
namespace then setting localize_functions
to TRUE
will strip
them of their package namespace and thus cause errors if they depend on other
objects in that namespace. A safer approach would be to pass these objects
in a wrapper function, like function(n) {runif(n)}
, than passing the
functions directly.)
A MCHTest
-class object, a function with parameters x
,
alternative
, and ...
, with other parameters being
passed to functions such as those passed to test_stat
and
stat_gen
, controlling what's tested and how; depending on
lock_alternative
, the alternative
argument may be
ignored
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 | # Uncomment and run the following to set up parallelization
# library(doParallel)
# registerDoParallel(detectCores())
dat <- c(0.16, 1.00, 0.67, 1.28, 0.31, 1.16, 1.25, 0.93, 0.66, 0.54)
# Monte Carlo t-test for exponentially distributed data
mc.t.test <- MCHTest(test_stat = function(x, mu = 1) {
sqrt(length(x)) * (mean(x) - mu)/sd(x)
}, stat_gen = function(x, mu = 1) {
x <- x * mu
sqrt(length(x)) * (mean(x) - mu)/sd(x)
}, rand_gen = rexp, seed = 123,
method = "Monte Carlo t-Test", test_params = "mu",
lock_alternative = FALSE)
mc.t.test(dat)
mc.t.test(dat, mu = 0.1, alternative = "two.sided")
# Testing for the scale parameter of a Weibull distribution
# Two-sided test for location of scale parameter
library(MASS)
library(fitdistrplus)
# For these examples we need to be sensitive about namespaces, or we may
# discover unwanted side effects
ts <- function(x, scale = 1) {
fit_null <- coef(fitdist(x, "weibull", fix.arg = list("scale" = scale)))
kt <- fit_null[["shape"]]
l0 <- scale
fit_all <- coef(fitdist(x, "weibull"))
kh <- fit_all[["shape"]]
lh <- fit_all[["scale"]]
n <- length(x)
# Test statistic, based on the negative-log-likelihood ratio
suppressWarnings(n * ((kt - 1) * log(l0) - (kh - 1) * log(lh) -
log(kt/kh) - log(lh/l0)) - (kt - kh) * sum(log(x)) + l0^(-kt) *
sum(x^kt) - lh^(-kh) * sum(x^kh))
}
sg <- function(x, scale = 1, shape = 1) {
x <- qweibull(x, shape = shape, scale = scale)
# There is a reason why we're copying the original test statistic rather
# than just calling ts() again; it has to do with environments and making
# sure that the resulting function MCHTest() creates is independent of the
# global namespace
fit_null <- coef(fitdist(x, "weibull", fix.arg = list("scale" = scale)))
kt <- fit_null[["shape"]]
l0 <- scale
fit_all <- coef(fitdist(x, "weibull"))
kh <- fit_all[["shape"]]
lh <- fit_all[["scale"]]
n <- length(x)
# Test statistic, based on the negative-log-likelihood ratio
suppressWarnings(n * ((kt - 1) * log(l0) - (kh - 1) * log(lh) -
log(kt/kh) - log(lh/l0)) - (kt - kh) * sum(log(x)) + l0^(-kt) *
sum(x^kt) - lh^(-kh) * sum(x^kh))
# The following would have bad side effects if ts() is redefined in the
# global namespace
# ts(x, scale = scale)
}
mc.wei.scale.test.1 <- MCHTest(ts, sg, seed = 123, test_params = "scale",
nuisance_params = "shape",
optim_control = list(
lower = c("shape" = 0),
upper = c("shape" = 100),
control = list("max.time" = 10)
), threshold_pval = .2, N = 1000)
mc.wei.scale.test.1(rweibull(100, scale = 4, shape = 2), scale = 2)
# First alternative approach
sg <- function(x, scale = 1, shape = 1) {
x <- qweibull(x, shape = shape, scale = scale)
# The following works because test_stat will be a function in the namespace
# of the function MCHTest() creates
test_stat(x, scale = scale)
}
mc.wei.scale.test.2 <- MCHTest(ts, sg, seed = 123, test_params = "scale",
nuisance_params = "shape",
optim_control = list(
lower = c("shape" = 0),
upper = c("shape" = 100),
control = list("max.time" = 10)
), threshold_pval = .2, N = 1000,
localize_functions = TRUE)
mc.wei.scale.test.2(rweibull(100, scale = 4, shape = 2), scale = 2)
# Second alternative approach
sg <- function(x, scale = 1, shape = 1) {
x <- qweibull(x, shape = shape, scale = scale)
# We will add ts() to the list of imported objects under its own name, so
# this is now okay
ts(x, scale = scale)
}
mc.wei.scale.test.3 <- MCHTest(ts, sg, seed = 123, test_params = "scale",
nuisance_params = "shape",
optim_control = list(
lower = c("shape" = 0),
upper = c("shape" = 100),
control = list("max.time" = 10)
), threshold_pval = .2, N = 1000,
localize_functions = TRUE,
imported_objects = list("ts" = ts))
mc.wei.scale.test.3(rweibull(100, scale = 4, shape = 2), scale = 2)
# Bootstrap hypothesis test
# Kolmogorov-Smirnov test for Weibull distribution via parametric botstrap
# hypothesis test
ts <- function(x) {
param <- coef(fitdist(x, "weibull"))
shape <- param[['shape']]; scale <- param[['scale']]
ks.test(x, pweibull, shape = shape, scale = scale,
alternative = "two.sided")$statistic[[1]]
}
rg <- function(x) {
n <- length(x)
param <- coef(fitdist(x, "weibull"))
shape <- param[['shape']]; scale <- param[['scale']]
rweibull(n, shape = shape, scale = scale)
}
b.ks.test <- MCHTest(test_stat = ts, stat_gen = ts, rand_gen = rg,
seed = 123, N = 1000)
b.ks.test(rbeta(100, 2, 2))
# Permutation test
df <- data.frame(
val = c(rnorm(5, mean = 2, sd = 3), rnorm(10, mean = 1, sd = 2)),
group = rep(c("x", "y"), times = c(5, 10))
)
ts <- function(x) {
means <- aggregate(val ~ group, data = x, mean)
vars <- aggregate(val ~ group, data = x, var)
counts <- aggregate(val ~ group, data = x, length)
(means$val[1] - means$val[2])/sum(vars$val / sqrt(counts$val))
}
rg <- function(x) {
x$group <- sample(x$group)
x
}
permute.test <- MCHTest(ts, ts, rg, seed = 123, N = 1000,
lock_alternative = FALSE)
permute.test(df, alternative = "two.sided")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.