epi.sscompb: Sample size and power when comparing binary outcomes
In epiR: Tools for the Analysis of Epidemiological Data

epi.sscompb

R Documentation

Sample size and power when comparing binary outcomes

Description

Sample size and power when comparing binary outcomes.

Usage

epi.sscompb(N = NA, treat, control, n, power, r = 1, design = 1,
   sided.test = 2, nfractional = FALSE, conf.level = 0.95)

Arguments

`N`	scalar integer, the total number of individuals eligible for inclusion in the study. If `N = NA` the number of individuals eligible for inclusion is assumed to be infinite.
`treat`	the expected proportion for the treatment group (see below).
`control`	the expected proportion for the control group (see below).
`n`	scalar, defining the total number of subjects in the study (i.e., the number in the treatment plus the number in the control group).
`power`	scalar, the required study power.
`r`	scalar, the number in the treatment group divided by the number in the control group.
`design`	scalar, the estimated design effect.
`sided.test`	use a one- or two-sided test? Use a two-sided test if you wish to evaluate whether or not the outcome proportion in the exposed (treatment) group is greater than or less than the outcome proportion in the unexposed (control) group. Use a one-sided test to evaluate whether or not the outcome proportion in the exposed (treatment) group is greater than the outcome proportion in the unexposed (control) group.
`nfractional`	logical, return fractional sample size.
`conf.level`	scalar, defining the level of confidence in the computed result.

Details

The methodology in this function follows the approach described in Chapter 8 of Woodward (2014), pp. 295 - 329.

With this function it is assumed that one of the two proportions is known and we want to test the null hypothesis that the second proportion is equal to the first. Users are referred to the epi.sscohortc function which relates to the two-sample problem where neither proportion is known (or assumed, at least).

Because there is much more uncertainty in the two sample problem where neither proportion is known, epi.sscohortc returns much larger sample size estimates. This function (epi.sscompb) should be used in particular situations such as when a politician claims that at least 90% of the population use seatbelts and we want to see if the data supports this claim.

If a value is entered for N.psu a finite population correction factor (Cochran 1977) is applied to the calculated sample size estimate. That is:

n_{adj} = \frac{n}{1 + \left(\frac{n}{N}\right)}

Value

A list containing the following:

`n.total`	the total number of subjects required for the specified level of confidence and power, respecting the requirement for `r` times as many individuals in the treatment group compared with the control group.
`n.treat`	the total number of subjects in the treatment group for the specified level of confidence and power, respecting the requirement for `r` times as many individuals in the treatment group compared with the control group.
`n.control`	the total number of subjects in the control group for the specified level of confidence and power, respecting the requirement for `r` times as many individuals in the treatment group compared with the control group.
`power`	the power of the study given the number of study subjects, the expected effect size and level of confidence.
`lambda`	the proportion in the treatment group divided by the proportion in the control group (a risk ratio).

Note

The power of a study is its ability to demonstrate the presence of an association, given that an association actually exists.

Values need to be entered for control, n, and power to return a value for lambda. In this situation, the lower value of lambda represents the maximum detectable risk ratio that is less than 1; the upper value of lambda represents the minimum detectable risk ratio greater than 1.

See the documentation for epi.sscohortc for an example using the design facility implemented in this function.

References

Cochran WG (1977). Sampling Techniques. John Wiley and Sons, London.

Fleiss JL (1981). Statistical Methods for Rates and Proportions. Wiley, New York.

Kelsey JL, Thompson WD, Evans AS (1986). Methods in Observational Epidemiology. Oxford University Press, London, pp. 254 - 284.

Woodward M (2014). Epidemiology Study Design and Data Analysis. Chapman & Hall/CRC, New York, pp. 295 - 329.

Examples

## EXAMPLE 1 (from Woodward 2014 Example 8.12 p. 312):
## A government initiative has decided to reduce the prevalence of male  
## smoking to, at most, 30%. A sample survey is planned to test, at the 
## 0.05 level, the hypothesis that the percentage of smokers in the male 
## population is 30% against the one-sided alternative that it is greater.
## The survey should be able to find a prevalence of 32%, when it is true,
## with 0.90 power. How many men need to be sampled?

epi.sscompb(N = NA, treat = 0.30, control = 0.32, n = NA, power = 0.90, 
   r = 1, design = 1, sided.test = 1, nfractional = FALSE, conf.level = 0.95)
   
## A total of 4568 men should be sampled: 2284 in the treatment group and
## 2284 in the control group. The risk ratio (that is, the prevalence of 
## smoking in males post government initiative divided by the prevalence of 
## smoking in males pre inititative) is 0.94.


## EXAMPLE 2:
## If we sample only 2000 men (1000 in the treatment group and 1000 in the 
## control group) what is the maximum detectable risk ratio that is less 
## than 1?

epi.sscompb(N = NA, treat = NA, control = 0.32, n = 2000, power = 0.90, 
   r = 1, design = 1, sided.test = 1, nfractional = FALSE, conf.level = 0.95)

## If we sample only 2,000 men the maximum detectable risk ratio will be 0.91.

epiR documentation built on June 26, 2026, 9:07 a.m.