uncondExact2x2: Unconditional exact tests for 2x2 tables

View source: R/uncondExact2x2.R

uncondExact2x2R Documentation

Unconditional exact tests for 2x2 tables

Description

The uncondExact2x2 function tests 2x2 tables assuming two independent binomial responses. Unlike the conditional exact tests which condition on both margins of the 2x2 table (see exact2x2), these unconditional tests only condition on one margin of the 2x2 table (i.e., condition on the sample sizes of the binomial responses). This makes the calculations difficult because now there is a nuisance parameter and calculations must be done over nearly the entire nuisance parameter space.

Usage

uncondExact2x2(x1, n1, x2, n2, 
    parmtype = c("difference", "ratio", "oddsratio"), nullparm = NULL, 
    alternative = c("two.sided","less", "greater"),  
    conf.int = FALSE, conf.level = 0.95, 
    method = c("FisherAdj", "simple", "score","wald-pooled", "wald-unpooled",  "user", 
      "user-fixed"), 
    tsmethod = c("central","square"), midp = FALSE, 
    gamma = 0, EplusM=FALSE, tiebreak=FALSE,
    plotprobs = FALSE, control=ucControl(), Tfunc=NULL,...)

uncondExact2x2Pvals(n1, n2, ...)

Arguments

x1

number of events in group 1

n1

sample size in group 1

x2

number of events in group 2

n2

sample size in group 2

parmtype

type of parameter of interest, one of "difference", "ratio" or "oddsratio" (see details)

nullparm

value of the parameter of interest at null hypothesis, NULL defaults to 0 for parmtype='difference' and 1 for parmtype='ratio' or 'oddsratio'

alternative

alternative hypothesis, one of "two.sided", "less", or "greater", default is "two.sided" (see details)

conf.int

logical, calculate confidence interval?

conf.level

confidence level

method

method type, one of "FisherAdj" (default), "simple", "simpleTB", "wald-pooled", "wald-unpooled", "score", "user", or "user-fixed" (see details)

tsmethod

two-sided method, either "central" or "square" (see details)

midp

logical. Use mid-p-value method?

gamma

Beger-Boos adjustment parameter. 0 means no adjustment. (see details).

EplusM

logical, do the E+M adjustment? (see details)

tiebreak

logical, do tiebreak adjustment? (see details)

plotprobs

logical, plot probabilities?

control

list of algorithm parameters, see ucControl

Tfunc

test statistic function for ordering the sample space when method='user', ignored otherwise (see details)

...

extra arguments passed to Tfunc (for uncondExact2x2), or passed to uncondExact2x2 (for uncondExact2x2Pvals)

Details

The uncondExact2x2 function gives unconditional exact tests and confidence intervals for two independent binomial observations. The uncondExact2x2Pvals function repeatedly calls uncondExact2x2 to get the p-values for the entire sample space.

Let X1 be binomial(n1,theta1) and X2 be binomial(n2,theta2). The parmtype determines the parameter of interest: ‘difference’ is theta2 - theta1, 'ratio' is theta2/theta1, and ‘oddsratio’ is (theta2*(1-theta1))/(theta1*(1-theta2)).

The options method, parmtype, tsmethod, alternative, EplusM, and tiebreak define some built-in test statistic function, Tstat, that is used to order the sample space, using pickTstat and calcTall. The first 5 arguments of Tstat must be Tstat(X1,N1,X2,N2, delta0), where X1 and X2 must allow vectors, and delta0 is the null parameter value (but delta0 does not need to be used in the ordering). Ordering when parmtype="ratio" or parmtype="oddsratio" is only used when there is information about the parameter. So the ordering function value is not used for ordering when x1=0 and x2=0 for parmtype="ratio", and it is not used when (x1=0 and x2=0) or (x1=n1 and x2=n2) for parmtype="oddsratio".

We describe the ordering functions first for the basic case, the case when tsmethod="central" or alternative!="two.sided", EplusM=FALSE, and tiebreak=FALSE. In this basic case the ordering function, Tstat, is determined by method and parmtype:

  • method='simple' - Tstat essentially replaces theta1 with x1/n1 and theta2 with x2/n2 in the parameter definition. If parmtype=‘difference’ then Tstat(X1,N1,X2,N2,delta0) returns X2/N2-X1/N1-delta0. If parmtype='ratio' then the Tstat function returns log(X2/N2) - log(X1/N1) - log(delta0). If parmtype='oddsratio' we get log( X2*(N1-X1)/(delta0*X1*(N2-X2))).

  • method='wald-pooled' - Tstat is a Z statistic on the difference using the pooled variance (not allowed if parmtype!="difference")

  • method='wald-unpooled' - Tstat is a Z statistics on the difference using unpooled variance (not allowed if parmtype!="difference")

  • method='score' - Tstat is a Z statistic formed using score statistics, where the parameter is defined by parmtype, and the constrained maximum likelihood estimates of the parameter are calculated by constrMLE.difference, constrMLE.ratio, or constrMLE.oddsratio.

  • method='FisherAdj' - Tstat is a one-sided Fisher's 'exact' mid p-value. The mid p-value is an adjustment for ties that technically removes the 'exactness' of the Fisher's p-value...BUT, here we are only using it to order the sample space, so the results of the resulting unconditional test will still be exact.

  • method='user' - Tstat is a user supplied statistic given by Tfunc, it must be a function with the first 5 elements of its call being (X1, N1, X2, N2, delta0). The function must returns a vector of length the same as X1 and X2, where higher values suggest larger theta2 compared to theta1 (when tsmethod!="square") or higher values suggest more extreme (when tsmethod=="square" and alternative=="two.sided"). A slower algorithm that does not require monotonicity of one-sided p-values with respect to delta0 is used.

  • method='user-fixed' - For advanced users. Tstat is a user supplied statistic given by Tfunc. It should have first 5 elements as described above but its result should not change with delta0 and it must meet Barnard's convexity conditions. If these conditions are met (the conditions are not checked, since checking them will slow the algorithm), then the p-values will be monotonic in delta0 (the null parameter for a two-sided test) and we can use a faster algorithm.

In the basic case, if alternative="two.sided", the argument tsmethod="central" gives the two-sided central method. The p-value is just twice the minimum of the one-sided p-values (or 1 if the doubling is greater than 1).

Now consider cases other than the basic case. The tsmethod="square" option gives the square of the test statistic (when method="simple", "score", "wald-pooled", or "wald-unpooled") and larger values suggest rejection in either direction (unless method='user', then the user supplies any test statistic for which larger values suggest rejection).

The tiebreak=TRUE option breaks ties in a reasonable way when method="simple" (see 'details' section of calcTall). The EplusM=TRUE option performs Lloyd's (2008) E+M ordering on Tstat (see 'details' section of calcTall).

If tiebreak=TRUE and EplusM=TRUE, the tiebreak calculations are always done first.

Berger and Boos (1994) developed a very general method for calculating p-values when a nuisance parameter is present. First, calculate a (1-gamma) confidence interval for the nuisance parameter, check for the supremum over the union of the null hypothesis parameter space and that confidence interval, then add back gamma to the p-value. This adjustment is valid (in other words, applied to exact tests it still gives an adjustment that is exact). The Berger-Boos adjustment is applied when gamma>0.

When method='simple' or method='user-fixed' does a simple grid search algorithm using unirootGrid. No checks are done on the Tstat function when method='user-fixed' to make sure the simple grid search will converge to the proper answer. So method='user-fixed' should be used by advanced users only.

When midp=TRUE the mid p-value is calculated (and the associated confidence interval if conf.int=TRUE) instead of the standard p-value. Loosely speaking, the standard p-value calculates the probability of observing equal or more extreme responses, while the mid p-value calculates the probability of more extreme responses plus 1/2 the probability of equally extreme responses. The tests and confidence intervals when midp=TRUE are not exact, but give type I error rates and coverage of confidence intervals closer to the nominal values. The mid p-value was studied by Lancaster (1961), see vignette on mid p-values for details.

See Fay and Hunsberger (2021) for a review paper giving the details for these kinds of unconditional exact tests.

Value

The function uncondExact2x2Pvals returns a (n1+1) by (n2+1) matrix of p-values for all possible x1 and x2 values, while uncondExact2x2 returns a list of class 'htest' with elements:

statistic

proportion in sample 1

parameter

proportion in sample 2

p.value

p-value from test

conf.int

confidence interval on parameter given by parmtype

estimate

MLE estimate of parameter given by parmtype

null.value

null hypothesis value of parameter given by parmtype

alternative

alternative hypothesis

method

description of test

data.name

description of data

Warning

The algorithm for calculating the p-values and confidence intervals is based on a series of grid searches. Because the grid searches are often trying to optimize non-monotonic functions, the algorithm is not guaranteed to give the correct answer. At the cost of increasing computation time, better accuracy can be obtained by increasing control$nPgrid, and less often by increasing control$nCIgrid.

Author(s)

Michael P. Fay, Sally A. Hunsberger

References

Berger, R. L. and Boos, D. D. (1994). P values maximized over a confidence set for the nuisance parameter. Journal of the American Statistical Association 89 1012-1016.

Fay, M.P. and Hunsberger, S.A. (2021). Practical valid inferences for the two-sample binomial problem. Statistics Surveys 15:72-110.

Lancaster, H.O. (1961). Significance tests in discrete distributions. JASA 56: 223-234.

Lloyd, C. J. (2008). Exact p-values for discrete models obtained by estimation and maximization. Australian & New Zealand Journal of Statistics 50 329-345.

See Also

See boschloo for unconditional exact tests with ordering function based on Fisher's exact p-values.

Examples

# default uses method="FisherAdj"
uncondExact2x2(1,10,9,10, 
               parmtype="ratio")
uncondExact2x2(1,10,9,10, 
               method="score",parmtype="ratio")






exact2x2 documentation built on Feb. 16, 2023, 10:11 p.m.