View source: R/adjusted_envelopes.r
GET.composite | R Documentation |
Adjusted global envelope tests for composite null hypothesis.
GET.composite(
X,
X.ls = NULL,
nsim = 499,
nsimsub = nsim,
simfun = NULL,
fitfun = NULL,
calcfun = function(X) {
X
},
testfuns = NULL,
...,
type = "erl",
alpha = 0.05,
alternative = c("two.sided", "less", "greater"),
probs = c(0.025, 0.975),
r_min = NULL,
r_max = NULL,
take_residual = FALSE,
save.cons.envelope = savefuns,
savefuns = FALSE,
verbose = TRUE,
MrkvickaEtal2017 = FALSE,
mc.cores = 1L
)
X |
An object containing the data in some form.
A |
X.ls |
A list of objects as |
nsim |
The number of simulations to be generated in the primary test.
Ignored if |
nsimsub |
Number of simulations in each basic test. There will be |
simfun |
A function for generating simulations from the null model. If given, this function
is called by |
fitfun |
A function for estimating the parameters of the null model.
The function should return the fitted model in the form that it can be directly
passed to |
calcfun |
A function for calculating a summary function from a simulation of the model.
The default is the identity function, i.e. the simulations from the model are functions themselves.
The use of |
testfuns |
A list of lists of parameters to be passed to the |
... |
Additional parameters to the |
type |
The type of the global envelope with current options for 'rank', 'erl', 'cont', 'area', 'qdir', 'st' and 'unscaled'. See details. |
alpha |
The significance level. The 100(1-alpha)% global envelope will be calculated under the 'fwer' or 'fdr' control. If a vector of values is provided, the global envelopes are calculated for each value. |
alternative |
A character string specifying the alternative hypothesis.
Must be one of the following: "two.sided" (default), "less" or "greater".
The last two options only available for types |
probs |
A two-element vector containing the lower and upper quantiles for the measure 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllymäki et al. (2015, 2017). |
r_min |
The minimum argument value to include in the test. |
r_max |
The maximum argument value to include in the test.
in calculating functions by the |
take_residual |
Logical. If TRUE (needed for visual reasons only) the mean of the simulated functions is reduced from the functions in each first and second stage test. |
save.cons.envelope |
Logical flag indicating whether to save the unadjusted envelope test results. |
savefuns |
Logical flag indicating whether to save all the simulated function values.
Similar to the same argument of the |
verbose |
Logical flag indicating whether to print progress reports during the simulations.
Similar to the same argument of |
MrkvickaEtal2017 |
Logical. If TRUE, type is "st" or "qdir" and several test functions are used,
then the combined scaled MAD envelope presented in Mrkvička et al. (2017) is calculated. Otherwise,
the two-step procedure described in |
mc.cores |
The number of cores to use, i.e. at most how many child processes will be run simultaneously.
Must be at least one, and parallelization requires at least two cores. On a Windows computer mc.cores must be 1
(no parallelization). For details, see |
The specification of X, X.ls, fitfun, simfun is important:
If X.ls
is provided, then the global envelope test is calculated based on
functions in these objects. X
should be a curve_set
object,
or an envelope
object of spatstat, including the observed function and simulations
from the tested model. X.ls
should be a list of curve_set
or
envelope (of R package spatstat) objects, where each component contains an "observed"
function f that has been simulated from the model fitted to the data and the simulations
that have been obtained from the same model that has been fitted to the "observed" f.
The user has the responsibility that the functions have been generated correctly,
the test is done based on these provided simulations. See the examples.
Otherwise, if simfun
and fitfun
are provided, X
can be general.
The function fitfun
is used for fitting the desired model M and the function simfun
for simulating from a fitted model M. These functions should be coupled with each other such
that the object returned by fitfun
is directly accepted as the (single) argument in
simfun
.
In the case, that the global envelope should not be calculated directly for X
(X
is
not a function), calcfun
can be used to specify how to calculate the function from
X
and from simulations generated by simfun
.
Special attention is needed in the correct specification of the functions, see examples.
Otherwise, X
should be either a fitted (point process) model object or a ppp
object of the R package spatstat.
If X
is a fitted (point process) model object of the R package spatstat,
then the simulations from this model are generated and summary functions for testing calculated
by the envelope
function of spatstat. Which summary function to use and how to calculate it,
can be passed to envelope
either in ...
or testfuns
.
Unless otherwise specified the default function of envelope
,
i.g. the K-function, is used. The argument testfuns
should be used to specify the
test functions in the case where one wants to base the test on several test functions.
If X
is a ppp
object of spatstat, then the envelope
function
is used for simulations and model fitting and the complete spatial randomness (CSR) is tested
(with intensity parameter).
For the rank envelope test, the global envelope test is the test described in Myllymäki et al. (2017) with the adjustment of Baddeley et al. (2017). For other test types, the test (also) uses the two-stage procedure of Dao and Genton (2014) with the adjustment of Baddeley et al. (2017) as descripbed in Myllymäki and Mrkvička (2020).
See examples also in saplings
.
An object of class global_envelope
or combined_global_envelope
, which can be
printed and plotted directly. See global_envelope_test
.
Baddeley, A., Hardegen, A., Lawrence, T., Milne, R. K., Nair, G. and Rakshit, S. (2017). On two-stage Monte Carlo tests of composite hypotheses. Computational Statistics and Data Analysis 114: 75-87. doi: http://dx.doi.org/10.1016/j.csda.2017.04.003
Dao, N.A. and Genton, M. (2014). A Monte Carlo adjusted goodness-of-fit test for parametric models describing spatial point patterns. Journal of Graphical and Computational Statistics 23, 497-517.
Mrkvička, T., Myllymäki, M. and Hahn, U. (2017) Multiple Monte Carlo testing, with applications in spatial point processes. Statistics & Computing 27(5): 1239-1255. doi: 10.1007/s11222-016-9683-9
Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79: 381-404. doi: 10.1111/rssb.12172
Myllymäki, M. and Mrkvička, T. (2020). GET: Global envelopes in R. arXiv:1911.06583 [stat.ME]. https://doi.org/10.48550/arXiv.1911.06583
global_envelope_test
, plot.global_envelope
, saplings
# Graphical normality test (Myllymaki and Mrkvicka, 2020, Section 3.3.)
#=========================
if(require("fda.usc", quietly=TRUE)) {
data("poblenou")
dat <- poblenou[['nox']][['data']][,'H10']
n <- length(dat)
# The number of simulations
nsim <- nsimsub <- 199
set.seed(200127)
# General setup
#==============
# 1. Fit the model
mu <- mean(dat)
sigma <- sd(dat)
# 2. Simulate a sample from the fitted null model and
# compute the test vectors for data (obs) and each simulation (sim)
r <- seq(min(dat), max(dat), length=100)
obs <- stats::ecdf(dat)(r)
sim <- sapply(1:nsimsub, function(i) {
x <- rnorm(n, mean = mu, sd = sigma)
stats::ecdf(x)(r)
})
cset <- create_curve_set(list(r = r, obs = obs, sim_m = sim))
# 3. Simulate another sample from the fitted null model.
# 4. Fit the null model to each of the patterns,
# simulate a sample from the null model,
# and compute the test vectors for all.
cset.ls <- vector("list", nsim)
for(rep in 1:nsim) {
x <- rnorm(n, mean = mu, sd = sigma)
mu2 <- mean(x)
sigma2 <- sd(x)
obs2 <- stats::ecdf(x)(r)
sim2 <- sapply(1:nsimsub, function(i) {
x2 <- rnorm(n, mean = mu2, sd = sigma2)
stats::ecdf(x2)(r)
})
cset.ls[[rep]] <- create_curve_set(list(r = r, obs = obs2, sim_m = sim2))
}
# Perform the adjusted test
res <- GET.composite(X = cset, X.ls = cset.ls, type = 'erl')
plot(res) + ggplot2::labs(x = "NOx", y = "Ecdf")
}
# Example of a point pattern data
#================================
# Test the fit of a Matern cluster process.
if(require("spatstat.model", quietly=TRUE)) {
data("saplings")
saplings <- as.ppp(saplings, W = square(75))
# First choose the r-distances
rmin <- 0.3; rmax <- 10; rstep <- (rmax-rmin)/500
r <- seq(0, rmax, by = rstep)
# Number of simulations
nsim <- 19 # Increase nsim for serious analysis!
# Option 1: Give the fitted model object to GET.composite
#--------------------------------------------------------
# This can be done and is preferable when the model is
# a point process model of spatstat.
# 1. Fit the Matern cluster process to the pattern
# (using minimum contrast estimation with the K-function)
M1 <- kppm(saplings~1, clusters = "MatClust", statistic = "K")
summary(M1)
# 2. Make the adjusted global area rank envelope test using the L(r)-r function
adjenvL <- GET.composite(X = M1, nsim = nsim,
testfuns = list(L =list(fun="Lest", correction="translate",
transform=expression(.-r), r=r)), # passed to envelope
type = "area", r_min = rmin, r_max = rmax)
# Plot the test result
plot(adjenvL)
# Option 2: Generate the simulations "by yourself"
#-------------------------------------------------
# and provide them as curve_set or envelope objects
# Preferable when you want to have a control
# on the simulations yourself.
# 1. Fit the model
M1 <- kppm(saplings~1, clusters = "MatClust", statistic = "K")
# 2. Generate nsim simulations by the given function using the fitted model
X <- envelope(M1, nsim = nsim, savefuns = TRUE,
fun = "Lest", correction = "translate",
transform = expression(.-r), r = r)
plot(X)
# 3. Create another set of simulations to be used to estimate
# the second-state p-value (as proposed by Baddeley et al., 2017).
simpatterns2 <- simulate(M1, nsim = nsim)
# 4. Calculate the functions for each pattern
simf <- function(rep) {
# Fit the model to the simulated pattern Xsims[[rep]]
sim_fit <- kppm(simpatterns2[[rep]], clusters = "MatClust", statistic = "K")
# Generate nsim simulations from the fitted model
envelope(sim_fit, nsim = nsim, savefuns = TRUE,
fun = "Lest", correction = "translate",
transform = expression(.-r), r = r)
}
X.ls <- parallel::mclapply(X = 1:nsim, FUN = simf, mc.cores = 1) # list of envelope objects
# 5. Perform the adjusted test
res <- GET.composite(X = X, X.ls = X.ls, type = "area",
r_min = rmin, r_max = rmax)
plot(res)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.