NullDistribution: Specification of the Reference Distribution

NullDistribution    R Documentation

Specification of the Reference Distribution

Description

Specification of the asymptotic, approximative (Monte Carlo) and exact reference distribution.

Usage

asymptotic(maxpts = 25000, abseps = 0.001, releps = 0)
approximate(nresample = 10000L, parallel = c("no", "multicore", "snow"),
            ncpus = 1L, cl = NULL, B)
exact(algorithm = c("auto", "shift", "split-up"), fact = NULL)

Arguments

maxpts

an integer, the maximum number of function values. Defaults to 25000.

abseps

a numeric, the absolute error tolerance. Defaults to 0.001.

releps

a numeric, the relative error tolerance. Defaults to 0.

nresample

a positive integer, the number of Monte Carlo replicates used for the computation of the approximative reference distribution. Defaults to 10000L.

parallel

a character, the type of parallel operation: either "no" (default), "multicore" or "snow".

ncpus

an integer, the number of processes to be used in parallel operation. Defaults to 1L.

cl

an object inheriting from class "cluster", specifying an optional parallel or snow cluster if parallel = "snow". Defaults to NULL.

B

deprecated, use nresample instead.

algorithm

a character, the algorithm used for the computation of the exact reference distribution: either "auto" (default), "shift" or "split-up".

fact

an integer to multiply the response values with. Defaults to NULL.

Details

asymptotic(), approximate() and exact() can be supplied to the distribution argument of, e.g., independence_test() to provide control of the specification of the asymptotic, approximative (Monte Carlo) and exact reference distribution, respectively.
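As a minimal sketch of how the three constructors are passed to the distribution argument, the following uses wilcox_test() on a small toy data set. The data frame dat and its columns y and g are hypothetical and serve only as illustration; any test function in the package that accepts a distribution argument could be used in the same way.

```r
library("coin")

## Hypothetical toy data: two groups of five observations each
dat <- data.frame(
    y = c(1.2, 2.4, 3.1, 4.8, 5.0, 6.3, 7.7, 8.1, 9.6, 10.2),
    g = factor(rep(c("a", "b"), each = 5))
)

## Asymptotic reference distribution
it_asym <- wilcox_test(y ~ g, data = dat, distribution = asymptotic())

## Approximative (Monte Carlo) reference distribution
set.seed(1)
it_mc <- wilcox_test(y ~ g, data = dat,
                     distribution = approximate(nresample = 10000))

## Exact reference distribution (univariate two-sample problem)
it_exact <- wilcox_test(y ~ g, data = dat, distribution = exact())

## Compare the resulting p-values
pvalue(it_asym)
pvalue(it_mc)
pvalue(it_exact)
```

Because the two toy groups are completely separated, only 2 of the choose(10, 5) = 252 possible group assignments are at least as extreme as the observed one, so the exact two-sided p-value is 2/252.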

The asymptotic reference distribution is computed using a randomised quasi-Monte Carlo method (Genz and Bretz, 2009) and is applicable to arbitrary covariance structures with dimensions up to 1000. See GenzBretz() in package mvtnorm for details on maxpts, abseps and releps.
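To illustrate what maxpts, abseps and releps control, the following sketch calls pmvnorm() from package mvtnorm directly with a GenzBretz() algorithm object. The correlation matrix R and the statistic value t_max are hypothetical, and the call is an assumption about the kind of computation involved, not a reproduction of the package's internal code.

```r
library("mvtnorm")

## Hypothetical 3 x 3 equicorrelation matrix (illustration only)
R <- matrix(0.5, nrow = 3, ncol = 3)
diag(R) <- 1

## Hypothetical observed value of a maximum-type statistic
t_max <- 2.3

## P(|Z_1| <= t, |Z_2| <= t, |Z_3| <= t) by randomised quasi-Monte Carlo
prob <- pmvnorm(lower = rep(-t_max, 3), upper = rep(t_max, 3),
                corr = R,
                algorithm = GenzBretz(maxpts = 25000, abseps = 0.001,
                                      releps = 0))

p_value <- 1 - as.numeric(prob) # upper tail of the maximum
attr(prob, "error")             # estimated absolute error of the integral
p_value
```

A smaller abseps or a larger maxpts trades computing time for a more accurate integral, which matters mostly for small p-values in high dimensions.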

The approximative (Monte Carlo) reference distribution is obtained by a conditional Monte Carlo procedure, i.e., by computing the test statistic for nresample random samples from all admissible permutations of the response Y within each block (Hothorn et al., 2008). By default, the distribution is computed using serial operation (parallel = "no"). Parallel operation is requested by setting parallel to either "multicore" (not available for MS Windows) or "snow". In the latter case, if cl = NULL (default), a cluster with ncpus processes is created on the local machine, unless a default cluster has been registered (see setDefaultCluster() in package parallel), in which case the registered cluster is used instead. Alternatively, an existing parallel or snow cluster can be supplied via cl. See ‘Examples’ and package parallel for details on parallel operation.
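The conditional Monte Carlo idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the data, the linear statistic stat() and the helper permute_within() are all hypothetical, and the p-value is taken as the plain proportion of resampled statistics at least as extreme as the observed one.

```r
## Hypothetical toy data: response y, binary group x, two blocks
set.seed(42)
y <- c(5.1, 6.3, 4.8, 7.2, 6.9, 8.4, 3.3, 4.1, 2.9, 5.6, 5.0, 6.2)
x <- rep(c(0, 1), each = 3, times = 2)
block <- factor(rep(c("b1", "b2"), each = 6))

## Hypothetical linear statistic: sum of the responses in group x == 1
stat <- function(y, x) sum(y * x)

## Permute the response within each block only
permute_within <- function(y, block) {
    ave(y, block, FUN = function(v) v[sample.int(length(v))])
}

nresample <- 10000L
t_obs <- stat(y, x)
t_perm <- replicate(nresample, stat(permute_within(y, block), x))

## Conditional expectation of the statistic given the blocks
mu <- sum(ave(y, block) * x)

## Two-sided Monte Carlo p-value (small tolerance guards against rounding)
p_mc <- mean(abs(t_perm - mu) >= abs(t_obs - mu) - 1e-12)
p_mc
```

Permuting within blocks keeps every block total of the response fixed, which is what makes the procedure conditional on the block structure.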

The exact reference distribution, currently available for univariate two-sample problems only, is computed using either the shift algorithm (Streitberg and Röhmel, 1984, 1986, 1987) or the split-up algorithm (van de Wiel, 2001). The shift algorithm handles blocks pertaining to, e.g., pre- and post-stratification, but can only be used with positive integer-valued scores h(Y). The split-up algorithm can be used with non-integer scores, but does not handle blocks. By default, an automatic choice is made (algorithm = "auto") but the shift and split-up algorithms can be selected by setting algorithm to "shift" or "split-up", respectively.
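The restriction of the shift algorithm to positive integer-valued scores becomes clear from a small base R sketch of the underlying counting recursion. The function shift_counts() below is hypothetical and written for illustration only; it handles a single block and is not the package's implementation. It counts, for every possible score sum, the number of ways to choose m of the scores attaining that sum, using the sums themselves as array indices.

```r
## Hypothetical helper: number of size-m subsets of 'scores' per score sum
shift_counts <- function(scores, m) {
    stopifnot(all(scores == round(scores)), all(scores > 0),
              m >= 1, m <= length(scores))
    total <- sum(scores)
    ## counts[k + 1, s + 1]: number of subsets of size k with score sum s
    counts <- matrix(0, nrow = m + 1, ncol = total + 1)
    counts[1, 1] <- 1
    for (a in scores) {
        ## run k downwards so that each score is used at most once
        for (k in seq(m, 1)) {
            idx <- seq_len(total + 1 - a) # columns for sums 0, ..., total - a
            counts[k + 1, idx + a] <- counts[k + 1, idx + a] + counts[k, idx]
        }
    }
    counts[m + 1, ]
}

## Ranks 1, ..., 10 with m = 5 observations in the second group
scores <- 1:10
m <- 5
counts <- shift_counts(scores, m)
sums <- 0:sum(scores)
prob <- counts / sum(counts)

## Observed sum when the second group holds the five largest ranks
s_obs <- sum(6:10)
mu <- m * mean(scores)

## Exact two-sided p-value
p_exact <- sum(prob[abs(sums - mu) >= abs(s_obs - mu)])
p_exact
```

Each pass over a score shifts the existing counts by that score, which is only possible when the scores are positive integers. Here sum(counts) equals choose(10, 5) = 252 and p_exact equals 2/252.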

Note

Starting with version 1.1-0, the default for algorithm is "auto", having identical behaviour to "shift" in previous versions. In earlier versions of the package, algorithm = "shift" silently switched to the split-up algorithm if non-integer scores were detected, whereas the current version exits with a warning.

In versions prior to 1.3-0, the number of Monte Carlo replicates in approximate() was specified using the B argument. B is now deprecated in favour of nresample (for consistency with the libcoin, party and partykit packages) and will be made defunct and removed in a future release.

Examples

library("coin") # provides cmh_test(), approximate() and the alzheimer data

## Approximative (Monte Carlo) Cochran-Mantel-Haenszel test

## Serial operation
set.seed(123)
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000))

## Not run: 
## Multicore with 8 processes (not for MS Windows)
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "multicore", ncpus = 8))

## Automatic PSOCK cluster with 4 processes
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow", ncpus = 4))

## Registered FORK cluster with 12 processes (not for MS Windows)
fork12 <- parallel::makeCluster(12, "FORK") # set-up cluster
parallel::setDefaultCluster(fork12) # register default cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow"))
parallel::stopCluster(fork12) # clean-up

## User-specified PSOCK cluster with 8 processes
psock8 <- parallel::makeCluster(8, "PSOCK") # set-up cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(nresample = 100000,
                                    parallel = "snow", cl = psock8))
parallel::stopCluster(psock8) # clean-up
## End(Not run)

coin documentation built on June 30, 2026, 9:06 a.m.