sampleSize_binary: Sample size for a binary exposure

View source: R/sampleSize_common.R

sampleSize_binary    R Documentation

Sample size for a binary exposure

Description

Calculates the required sample size of a case-control study with a binary exposure variable.

Usage

sampleSize_binary(prev, logOR, probXeq1=NULL, distF=NULL, data=NULL, 
      size.2sided=0.05, power=0.9, cc.ratio=0.5, interval=c(-100, 100), tol=0.0001,
      n.samples=10000) 

Arguments

prev

Number between 0 and 1 giving the prevalence of disease. No default.

logOR

Vector of ordered log-odds ratios for the confounders and exposure. The last log-odds ratio in the vector is for the exposure. If the option data (below) is specified, then the order must match the order of data. No default.
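
Effect sizes are often reported as odds ratios rather than log-odds ratios. The sketch below shows one way to build logOR in the required order. The odds ratios are made-up illustrative values, and the names or_confounders and or_exposure are hypothetical.

```r
# Hypothetical odds ratios (illustrative values only):
# two confounders first, then the exposure.
or_confounders <- c(1.10, 1.25)
or_exposure    <- 1.35

# logOR is on the natural-log scale, with the exposure in the last position.
logOR <- log(c(or_confounders, or_exposure))
logOR
```

With this ordering, any data matrix or distF function must return the two confounders in columns 1 and 2 and the exposure in column 3.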

probXeq1

NULL or a number between 0 and 1 giving the probability that the exposure variable is 1. If set to NULL, then the data option must be specified so that probXeq1 can be estimated. The default is NULL.

distF

NULL, a function, or a character string naming the function that generates random vectors from the distribution of the confounders and exposure. The order of the returned vector must match the order of logOR. User-defined functions are allowed, provided the function has exactly one integer-valued argument giving the number of random vectors to generate. For instance, the header of a user-defined function called "userF" would be userF <- function(n). The default depends on other options (see Details).
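
A minimal sketch of a user-defined generator that meets the one-argument requirement, with a few sanity checks that can be run before passing it as distF. The particular distributions are illustrative assumptions, not package defaults.

```r
# Hypothetical generator: one N(0,1) confounder, one Bernoulli(0.5) confounder,
# and a Bernoulli(0.3) binary exposure in the last column.
userF <- function(n) {
  cbind(rnorm(n), rbinom(n, 1, 0.5), rbinom(n, 1, 0.3))
}

logOR <- c(0.2, 0.3, 0.25)  # same order as the columns returned by userF(n)

# Sanity checks: one row per requested vector, one column per log-odds ratio,
# and a 0/1 exposure in the last column.
draws <- userF(5)
stopifnot(nrow(draws) == 5, ncol(draws) == length(logOR))
stopifnot(all(draws[, ncol(draws)] %in% c(0, 1)))
```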

data

NULL, matrix, data frame or a list of type file.list that gives a sample from the distribution of the confounders and exposure. If a matrix or data frame, then the last column consists of random values for the exposure, while the other columns are for the confounders. The order of the columns must match the order of the vector logOR. The default is NULL.
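
The layout described above can be sketched as follows, using simulated values. The sample proportion of the last column is one natural estimate of probXeq1. Whether the package computes it in exactly this way is an assumption, and the names dat and probXeq1_hat are hypothetical.

```r
set.seed(1)
n <- 1000

# Confounder(s) first, binary exposure in the last column.
dat <- data.frame(confounder = rnorm(n),
                  exposure   = rbinom(n, 1, 0.4))

# The exposure column must be coded 0/1.
x <- dat[[ncol(dat)]]
stopifnot(all(x %in% c(0, 1)))

# Assumed estimate of Prob(X = 1): the sample proportion of ones.
probXeq1_hat <- mean(x)
probXeq1_hat
```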

size.2sided

Number between 0 and 1 giving the size of the 2-sided hypothesis test. The default is 0.05.

power

Number between 0 and 1 for the desired power of the test. The default is 0.9.

cc.ratio

Number between 0 and 1 for the proportion of cases in the case-control sample. The default is 0.5.

interval

Two element vector giving the interval to search for the estimated intercept parameter. The default is c(-100, 100).

tol

Positive value giving the stopping tolerance for the root finding method to estimate the intercept parameter. The default is 0.0001.
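
To give a feel for how interval and tol might be used, the sketch below finds an intercept with base R's uniroot(). The estimating equation (choosing the intercept so that the average modelled disease probability equals prev) is an assumption made for illustration. This page does not state the equation the package actually solves.

```r
set.seed(1)
prev  <- 0.01
logOR <- c(0.1, 0.2)

# Simulated draws: one N(0,1) confounder, then the binary exposure.
X <- cbind(rnorm(10000), rbinom(10000, 1, 0.4))

# Assumed estimating equation: mean of expit(a + X %*% logOR) minus prev.
f <- function(a) mean(plogis(a + drop(X %*% logOR))) - prev

# interval and tol play the same roles as the arguments documented above.
fit <- uniroot(f, interval = c(-100, 100), tol = 0.0001)
fit$root
```

A wide interval such as c(-100, 100) works here because the function is negative at the lower end and positive at the upper end, so a root is bracketed.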

n.samples

Integer giving the number of random vectors to generate when the option distF is specified. The default is 10000.

Details

If there are no confounders (length(logOR) = 1), then either probXeq1 or data must be specified, where probXeq1 takes precedence. If there are confounders (length(logOR) > 1), then either data or distF must be specified, where data takes precedence.
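
The precedence rules above can be restated as a small helper. input_used is a hypothetical function written for this illustration and is not part of the package.

```r
# Hypothetical helper: reports which input the rules above would select.
input_used <- function(logOR, probXeq1 = NULL, distF = NULL, data = NULL) {
  if (length(logOR) == 1) {
    # No confounders: probXeq1 takes precedence over data.
    if (!is.null(probXeq1)) return("probXeq1")
    if (!is.null(data))     return("data")
    stop("no confounders: specify probXeq1 or data")
  }
  # Confounders present: data takes precedence over distF.
  if (!is.null(data))  return("data")
  if (!is.null(distF)) return("distF")
  stop("confounders present: specify data or distF")
}

input_used(0.3, probXeq1 = 0.2, data = matrix(0, 5, 1))          # "probXeq1"
input_used(c(0.1, 0.2), data = matrix(0, 5, 2), distF = rnorm)   # "data"
```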

Value

A list containing four sample sizes, where two of them are for a Wald test and two for a score test. The two sample sizes for each test correspond to the equations for n_{1} and n_{2}.
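
Since the element names of the returned list are not given here, a cautious way to explore the result is to inspect its structure rather than assume names. The sketch below reuses the no-confounder call from the Examples and only runs if the package is installed.

```r
res <- NULL
if (requireNamespace("samplesizelogisticcasecontrol", quietly = TRUE)) {
  res <- samplesizelogisticcasecontrol::sampleSize_binary(
    prev = 0.01, logOR = 0.3, probXeq1 = 0.2
  )
  # Show the element names and values instead of guessing them.
  str(res)
}
```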

See Also

sampleSize_continuous, sampleSize_ordinal, sampleSize_data

Examples

  library(samplesizelogisticcasecontrol)
  set.seed(123)  # make the simulated data below reproducible

  prev  <- 0.01
  logOR <- 0.3

  # No confounders, Prob(X=1)=0.2
  sampleSize_binary(prev, logOR, probXeq1=0.2) 

  # Generate data for a N(0,1) confounder and binary exposure
  data <- cbind(rnorm(1000), rbinom(1000, 1, 0.4))
  beta <- c(0.1, 0.2)
  sampleSize_binary(prev, beta, data=data) 

  # Define a function to generate random vectors for two confounders and the binary exposure
  f <- function(n) {cbind(rnorm(n), rbinom(n, 3, 0.5), rbinom(n, 1, 0.3))}
  logOR <- c(0.2, 0.3, 0.25)
  sampleSize_binary(prev, logOR, distF=f) 


samplesizelogisticcasecontrol documentation built on Aug. 21, 2023, 5:07 p.m.