props_neat: Difference of Two Proportions

View source: R/props_neat.R


Difference of Two Proportions

Description

Comparison of paired and unpaired proportions. For unpaired proportions: Pearson's chi-squared test or unconditional exact test, including a confidence interval (CI) for the proportion difference, and the corresponding independent multinomial contingency table Bayes factor (BF). (Cohen's h and its CI are also calculated.) For paired proportions: the classical (asymptotic) McNemar test (optionally with a mid-P value as well), including a CI for the proportion difference.

Usage

props_neat(
  var1 = NULL,
  var2 = NULL,
  case1 = NULL,
  case2 = NULL,
  control1 = NULL,
  control2 = NULL,
  prop1 = NULL,
  prop2 = NULL,
  n1 = NULL,
  n2 = NULL,
  pair = FALSE,
  greater = NULL,
  ci = NULL,
  bf_added = FALSE,
  round_to = 3,
  exact = FALSE,
  inverse = FALSE,
  yates = FALSE,
  midp = FALSE,
  h_added = FALSE,
  for_table = FALSE,
  hush = FALSE
)

Arguments

var1

First variable containing the classifications, in 'group 1', for the first proportion (see Examples). If given (strictly necessary for paired proportions), the proportions will be defined using var1 and var2 (see Details). To distinguish the classifications ('cases' vs. 'controls'; e.g., positive vs. negative outcomes), any two specific characters (or numbers) can be used. However, more than two different elements (apart from NAs) will cause an error.

var2

Second variable containing the classifications, in 'group 2', for the second proportion, analogously to var1.

case1

Number of 'cases' (as opposed to 'controls'; e.g., positive vs. negative outcomes) in 'group 1'. As counterpart, either the control numbers or the sample sizes need to be given (see Details).

case2

Number of 'cases' in 'group 2'.

control1

Number of 'controls' in 'group 1'. As counterpart, case numbers need to be given (see Details).

control2

Number of 'controls' in 'group 2'.

prop1

Proportion in 'group 1'. As counterpart, sample sizes need to be given (see Details).

prop2

Proportion in 'group 2'.

n1

Number; sample size of 'group 1'.

n2

Number; sample size of 'group 2'.

pair

Logical. Set TRUE for paired proportions (McNemar test, optionally mid-P), or FALSE (default) for unpaired proportions (chi-squared test, or unconditional exact test). Note: paired data must be given in var1 and var2.

greater

NULL or string (or number); optionally specifies one-sided exact test: either "1" (case1/n1 proportion expected to be greater than case2/n2 proportion) or "2" (case2/n2 proportion expected to be greater than case1/n1 proportion). If NULL (default), the test is two-sided.

ci

Numeric; confidence level for the returned CIs (proportion difference and Cohen's h).

bf_added

Logical. If TRUE, Bayes factor is calculated and displayed. (Always two-sided!)

round_to

Number of digits (after the decimal point) to which the proportion statistics (difference and CIs) are rounded.

exact

Logical, FALSE by default. If TRUE, an unconditional exact test is calculated and displayed; otherwise, the default Pearson's chi-squared test is used.

inverse

Logical, FALSE by default. When the proportions are calculated from var1 and var2, by default the elements' frequencies determine which are 'cases' and which are 'controls' (so that the latter are the more frequent). If inverse is TRUE, this default proportion direction is reversed.

yates

Logical, FALSE by default. If TRUE, Yates' continuity correction is applied to the chi-squared (unpaired) or the McNemar (paired) test. Some authors advise this correction in certain specific cases (e.g., small samples), but the evidence does not seem to support it (Pembury Smith & Ruxton, 2020).

midp

Logical, FALSE by default. If TRUE, an additional 'mid-P' p value (using the formula by Pembury Smith & Ruxton, 2020) is displayed for McNemar's test (Fagerland et al., 2013). This provides better control of Type I errors (fewer false positive findings) than the classical McNemar test, while it is probably not much less robust (Pembury Smith & Ruxton, 2020); a brief illustration is given after this argument list.

h_added

Logical. If TRUE, Cohen's h and its CI are calculated and displayed. (FALSE by default.)

for_table

Logical. If TRUE, omits the confidence level display from the printed text.

hush

Logical. If TRUE, prevents printing any details to console.
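
To illustrate the mid-P variant mentioned at the midp argument, here is a minimal base R sketch with hypothetical discordant pair counts; this merely illustrates the Fagerland et al. (2013) approach and is not the package's internal code:

# discordant cell counts (n12, n21) of a paired 2 x 2 table (hypothetical)
n12 <- 15
n21 <- 25
n_disc <- n12 + n21
k <- min(n12, n21)
# exact conditional (binomial) two-sided p value, capped at 1
p_exact <- min(1, 2 * pbinom(k, n_disc, 0.5))
# mid-P: subtract the point probability of the observed count
p_mid <- 2 * pbinom(k, n_disc, 0.5) - dbinom(k, n_disc, 0.5)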

Details

The proportions for the two groups can be given using any of the following combinations: (a) two vectors (var1 and var2); (b) case and control numbers; (c) case numbers and sample sizes; or (d) proportions and sample sizes. Whenever multiple combinations are specified, only the first given parameters (in the order listed in the function and in the previous sentence) are taken into account.
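
For instance, with hypothetical counts, the same unpaired comparison can be specified equivalently as:

# (b) cases and controls
props_neat(case1 = 40, case2 = 70, control1 = 110, control2 = 100)
# (c) cases and sample sizes
props_neat(case1 = 40, case2 = 70, n1 = 150, n2 = 170)
# (d) proportions and sample sizes
props_neat(prop1 = 40 / 150, prop2 = 70 / 170, n1 = 150, n2 = 170)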

The Bayes factor (BF), in the case of unpaired samples, is always calculated with the default r-scale of 0.707. A BF supporting the null hypothesis is denoted as BF01, while one supporting the alternative hypothesis is denoted as BF10. When the BF is smaller than 1 (i.e., supports the null hypothesis), the reciprocal is calculated (hence, BF10 = BF, but BF01 = 1/BF). When the BF is greater than or equal to 10000, scientific (exponential) notation is reported for readability. (The original full BF number is available in the returned named vector as bf.)
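
To illustrate this convention in plain R (the BF value is hypothetical):

bf <- 0.25  # hypothetical BF from the returned vector
if (bf < 1) {
    cat('BF01 =', 1 / bf, '\n')  # evidence for the null: BF01 = 4
} else {
    cat('BF10 =', bf, '\n')
}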

Value

Prints the test statistics (including the proportion difference with its CI, and the BF if requested) in APA style. Furthermore, when assigned, returns a named vector with the following elements: z (Z value), p (p value), prop_diff (raw proportion difference), h (Cohen's h), bf (Bayes factor).
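
For example, individual statistics can then be accessed by name (using the hypothetical counts from the Examples):

results <- props_neat(case1 = 40, case2 = 70, n1 = 150, n2 = 170)
results['p']  # p value alone
results['prop_diff']  # raw proportion difference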

Note

Barnard's unconditional exact test is calculated via Exact::exact.test ("z-pooled").
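
The underlying call is roughly equivalent to the following sketch (using the hypothetical counts from the Examples; the arguments props_neat passes internally may differ):

# 2 x 2 table: rows are groups, columns are cases vs. controls
counts <- matrix(c(40, 110,
                   70, 100), nrow = 2, byrow = TRUE)
Exact::exact.test(counts, method = 'z-pooled', to.plot = FALSE)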

The CI for the proportion difference in case of the exact test is calculated based on the p value, as described by Altman and Bland (2011). In case of extremely large or extremely small p values, this can be biased and misleading.
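
As an illustration, the Altman and Bland (2011) recipe for recovering a 95% CI from a p value can be sketched in plain R (est and p are hypothetical values):

est <- 0.145  # hypothetical proportion difference
p <- 0.0065   # hypothetical two-sided p value
z <- -0.862 + sqrt(0.743 - 2.404 * log(p))  # test statistic recovered from p
se <- est / z  # standard error recovered from the estimate
c(lower = est - 1.96 * se, upper = est + 1.96 * se)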

The Bayes factor is calculated via BayesFactor::contingencyTableBF, with sampleType = "indepMulti", as appropriate when both sample sizes (n1 and n2) are known in advance (as is normally the case). (For details, see contingencyTableBF, or e.g. 'Chapter 17 Bayesian statistics' in Navarro, 2019.)
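
A corresponding direct call might look as follows (a sketch; that the groups form the rows, and hence fixedMargin = 'rows', is an assumption here, not necessarily what props_neat does internally):

library(BayesFactor)
# 2 x 2 table: rows are groups (sample sizes fixed in advance)
counts <- matrix(c(40, 110,
                   70, 100), nrow = 2, byrow = TRUE)
contingencyTableBF(counts, sampleType = 'indepMulti', fixedMargin = 'rows')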

References

Altman, D. G., & Bland, J. M. (2011). How to obtain the confidence interval from a P value. BMJ, 343, d2090. doi: 10.1136/bmj.d2090

Barnard, G. A. (1947). Significance tests for 2 × 2 tables. Biometrika, 34(1/2), 123-138. doi: 10.1093/biomet/34.1-2.123

Fagerland, M. W., Lydersen, S., & Laake, P. (2013). The McNemar test for binary matched-pairs data: Mid-p and asymptotic are better than exact conditional. BMC Medical Research Methodology, 13(1), 91. doi: 10.1186/1471-2288-13-91

Lydersen, S., Fagerland, M. W., & Laake, P. (2009). Recommended tests for association in 2 × 2 tables. Statistics in Medicine, 28(7), 1159-1175. doi: 10.1002/sim.3531

Navarro, D. (2019). Learning statistics with R. https://learningstatisticswithr.com/

Pembury Smith, M. Q. R., & Ruxton, G. D. (2020). Effective use of the McNemar test. Behavioral Ecology and Sociobiology, 74(11), 133. doi: 10.1007/s00265-020-02916-y

Suissa, S., & Shuster, J. J. (1985). Exact unconditional sample sizes for the 2 × 2 binomial trial. Journal of the Royal Statistical Society: Series A (General), 148(4), 317-327. doi: 10.2307/2981892

Examples

# example data
set.seed(1)
outcomes_A = sample(c(rep('x', 490), rep('y', 10)))
outcomes_B = sample(c(rep('x', 400), rep('y', 100)))

# paired proportion test (McNemar)
props_neat(var1 = outcomes_A,
           var2 = outcomes_B,
           pair = TRUE)

# unpaired chi-squared test for the same data (two independent samples assumed)
# Yates correction applied
# cf. http://www.sthda.com/english/wiki/two-proportions-z-test-in-r
props_neat(
    var1 = outcomes_A,
    var2 = outcomes_B,
    pair = FALSE,
    yates = TRUE
)

# above data given differently for unpaired test
# (no Yates correction)
props_neat(
    case1 = 490,
    case2 = 400,
    control1 = 10,
    control2 = 100
)

# again differently
props_neat(
    case1 = 490,
    case2 = 400,
    n1 = 500,
    n2 = 500
)

# other example data
outcomes_A2 = c(rep(1, 707), rep(0, 212),  rep(1, 256), rep(0, 144))
outcomes_B2 = c(rep(1, 707), rep(0, 212),  rep(0, 256), rep(1, 144))

# paired test
# cf. https://www.medcalc.org/manual/mcnemartest2.php
props_neat(var1 = outcomes_A2,
           var2 = outcomes_B2,
           pair = TRUE)

# show reverse proportions (otherwise the same)
props_neat(
    var1 = outcomes_A2,
    var2 = outcomes_B2,
    pair = TRUE,
    inverse = TRUE
)


# two different sample sizes
out_chi = props_neat(
    case1 = 40,
    case2 = 70,
    n1 = 150,
    n2 = 170
)

# exact test
out_exact = props_neat(
    case1 = 40,
    case2 = 70,
    n1 = 150,
    n2 = 170,
    exact = TRUE
)

# the two p values are just a tiny bit different
print(out_chi) # p 0.00638942
print(out_exact) # p 0.006481884

# one-sided test
props_neat(
    case1 = 40,
    case2 = 70,
    n1 = 150,
    n2 = 170,
    greater = '2'
)

