# Sample size calculations using epiR (Tools for the Analysis of Epidemiological Data)


The EpiTools app for iPhone and Android devices provides access to many of the sample size functions in `epiR` using a smartphone.

### Prevalence estimation

A review of sample size calculations in (veterinary) epidemiological research is provided by @stevenson:2021.

The expected seroprevalence of brucellosis in a population of cattle is thought to be in the order of 15%. How many cattle need to be sampled and tested to be 95% certain that our seroprevalence estimate is within 20% of the true population value? That is, from 0.15 - (0.20 $\times$ 0.15) to 0.15 + (0.20 $\times$ 0.15), where 0.20 $\times$ 0.15 = 0.03, i.e., from 12% to 18%. Assume the test you will use has perfect sensitivity and specificity. The population size is unknown so we set `N = NA`:

    library(epiR)
    epi.sssimpleestb(N = NA, Py = 0.15, epsilon = 0.20, error = "relative",
       se = 1, sp = 1, nfractional = FALSE, conf.level = 0.95)

A total of 545 cows need to be sampled to meet the requirements of the study.

Let's say we have a reasonable estimate of the size of the cattle population at risk, 4000. Re-run `epi.sssimpleestb`, specifying `N`:

    epi.sssimpleestb(N = 4000, Py = 0.15, epsilon = 0.20, error = "relative",
       se = 1, sp = 1, nfractional = FALSE, conf.level = 0.95)

If the size of the population at risk is 4000, 480 cows need to be sampled to meet the requirements of the study. When a value for `N` is provided, `epi.sssimpleestb` automatically applies a finite population correction factor. In this example, assuming an infinite (i.e., very large) population gave a required sample size of 545. With a population of (only) 4000, the required sample size fell to 480.

### Prospective cohort study

A prospective cohort study of dry food diets and feline lower urinary tract disease (FLUTD) in mature male cats is planned. A sample of cats will be selected at random from the population of cats in a given area and owners who agree to participate in the study will be asked to complete a questionnaire at the time of enrollment. Cats enrolled into the study will be followed for at least 5 years to identify incident cases of FLUTD. The investigators would like 0.80 power to detect a risk ratio of FLUTD of 1.4 for cats habitually fed a dry food diet, using a two-sided test at the 0.05 significance level. Previous evidence suggests that the incidence risk of FLUTD in cats not on a dry food (i.e., 'other') diet is around 50 per 1000, so a risk ratio of 1.4 implies an incidence risk of 1.4 $\times$ 50 = 70 per 1000 in cats on a dry food diet. Assuming equal numbers of cats on dry food and other diets are sampled, how many cats should be sampled to meet the requirements of the study?

    epi.sscohortt(irexp1 = 70/1000, irexp0 = 50/1000, FT = 5, n = NA, power = 0.80, r = 1,
       design = 1, sided.test = 2, nfractional = FALSE, conf.level = 0.95)$n.total


A total of 2080 subjects are required (1040 exposed and 1040 unexposed).
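The figure of 1040 cats per group can be checked by hand. The sketch below is a minimal base R version, assuming a normal-approximation formula for comparing two incidence rates when event times are exponentially distributed and every subject is followed until the event or until time `FT`. The text above does not say which formula `epi.sscohortt` uses internally, so this is an illustration, not the package's implementation. The helper names `f_lambda` and `cohort_n` are hypothetical.

```r
# Assumed variance function for an incidence rate when each subject is
# followed until the event occurs or until time FT (exponential event times).
f_lambda <- function(lambda, FT) {
  (lambda^3 * FT) / (lambda * FT - 1 + exp(-lambda * FT))
}

# Hypothetical helper: subjects required per group, equal group sizes.
cohort_n <- function(irexp1, irexp0, FT, power = 0.80, conf.level = 0.95) {
  z_alpha <- qnorm(1 - (1 - conf.level) / 2)  # two-sided test
  z_beta  <- qnorm(power)
  ir_bar  <- (irexp1 + irexp0) / 2            # average rate under the null
  num <- (z_alpha * sqrt(2 * f_lambda(ir_bar, FT)) +
          z_beta * sqrt(f_lambda(irexp1, FT) + f_lambda(irexp0, FT)))^2
  ceiling(num / (irexp1 - irexp0)^2)
}

n_group <- cohort_n(irexp1 = 70/1000, irexp0 = 50/1000, FT = 5)
n_total <- 2 * n_group
```

Under these assumptions `n_group` is 1040 and `n_total` is 2080, in agreement with the result above.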

Remember that `epi.sscohortt` (like the other sample size functions in `epiR`) returns the study power if a value for `n` is provided. Continuing the example above, imagine that only 1500 cats were enrolled into the study. What is the expected study power? Here we set `n = 1500` and `power = NA` in `epi.sscohortt`:

    epi.sscohortt(irexp1 = 70/1000, irexp0 = 50/1000, FT = 5, n = 1500, power = NA, r = 1,
       design = 1, sided.test = 2, nfractional = FALSE, conf.level = 0.95)$power

If only 1500 cats are enrolled into the study the expected study power is 0.66.

### Case-control study

A case-control study of the relationship between white pigmentation around the eyes and ocular squamous cell carcinoma in Hereford cattle is planned. A sample of cattle with newly diagnosed squamous cell carcinoma will be compared for white pigmentation around the eyes with a sample of controls. Assuming an equal number of cases and controls, how many study subjects are required to detect an odds ratio of 2.0 with 0.80 power using a two-sided 0.05 test? Previous surveys have shown that around 0.30 of Hereford cattle without squamous cell carcinoma have white pigmentation around the eyes.

    epi.sscc(OR = 2.0, p1 = NA, p0 = 0.30, n = NA, power = 0.80, r = 1,
       phi.coef = 0, design = 1, sided.test = 2, conf.level = 0.95,
       method = "unmatched", nfractional = FALSE, fleiss = FALSE)$n.total


If the true odds ratio for squamous cell carcinoma in exposed subjects relative to unexposed subjects is 2.0, we will need to enroll 141 cases and 141 controls (282 cattle in total) to reject the null hypothesis that the odds ratio equals one with probability (power) 0.80. The Type I error probability associated with the test of this null hypothesis is 0.05.
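As a rough check on this result, the calculation can be sketched in base R. The version below assumes the usual normal-approximation formula for an unmatched case-control study without continuity correction. The text does not state the formula used by `epi.sscc`, so this is an assumption, and the helper name `cc_n_cases` is hypothetical.

```r
# Hypothetical helper: cases required for an unmatched case-control study
# with r controls per case (no continuity correction).
cc_n_cases <- function(OR, p0, power = 0.80, r = 1, conf.level = 0.95) {
  z_alpha <- qnorm(1 - (1 - conf.level) / 2)  # two-sided test
  z_beta  <- qnorm(power)
  # Exposure prevalence among cases implied by the odds ratio
  p1 <- (OR * p0) / (1 + p0 * (OR - 1))
  # Exposure prevalence pooled across cases and controls
  p_bar <- (p1 + r * p0) / (r + 1)
  num <- (z_alpha * sqrt((r + 1) * p_bar * (1 - p_bar)) +
          z_beta * sqrt(r * p1 * (1 - p1) + p0 * (1 - p0)))^2
  ceiling(num / (r * (p1 - p0)^2))
}

n_cases    <- cc_n_cases(OR = 2.0, p0 = 0.30)
n_total_cc <- 2 * n_cases   # equal numbers of cases and controls
```

With an odds ratio of 2.0 and 0.30 of controls exposed, the implied exposure prevalence among cases is about 0.46, and the sketch gives 141 cases (282 subjects in total).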

### Non-inferiority trial

Suppose a pharmaceutical company would like to conduct a clinical trial to compare the efficacy of two antimicrobial agents when administered orally to patients with skin infections. Assume the true mean cure rate of the treatment is 0.85 and the true mean cure rate of the control is 0.65. We consider a difference of less than 0.10 in cure rate to be of no clinical importance (i.e., delta = 0.10). Assuming a one-sided test size of 5% and a power of 80%, how many subjects should be included in the trial?
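The arithmetic behind this question can be sketched by hand. The base R version below assumes a normal-approximation formula for two proportions in which the null hypothesis is that the treatment cure rate falls short of the control cure rate by `delta` or more, with equal group sizes. Whether `epi.ssninfb` uses exactly this expression is an assumption, and the helper name `ninf_n_group` is hypothetical.

```r
# Hypothetical helper: subjects per group for a non-inferiority trial
# comparing two proportions, equal allocation.
ninf_n_group <- function(treat, control, delta, power = 0.80, alpha = 0.05) {
  z_alpha <- qnorm(1 - alpha)  # one-sided test
  z_beta  <- qnorm(power)
  # Distance between the assumed true difference and the margin (-delta)
  effect <- (treat - control) + delta
  # Sum of the binomial variances in the two groups
  v <- treat * (1 - treat) + control * (1 - control)
  ceiling((z_alpha + z_beta)^2 * v / effect^2)
}

n_group_ninf <- ninf_n_group(treat = 0.85, control = 0.65, delta = 0.10)
n_total_ninf <- 2 * n_group_ninf
```

Under these assumptions the sketch gives 25 subjects per group, 50 in total.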

    epi.ssninfb(treat = 0.85, control = 0.65, delta = 0.10, n = NA,
       r = 1, power = 0.80, nfractional = FALSE, alpha = 0.05)$n.total

A total of 50 subjects need to be enrolled in the trial, 25 in the treatment group and 25 in the control group.

### One-stage cluster sampling

An aid project has distributed cook stoves in a single province in a resource-poor country. At the end of three years, the donors would like to know what proportion of households are still using their donated stove. A cross-sectional study is planned where villages in a province will be sampled and then all households (approximately 75 per village) will be visited to determine if their donated stove is still in use. A pilot study of the prevalence of stove usage in five villages showed that 0.46 of householders were still using their stove and the intracluster correlation coefficient (ICC) for stove use within villages is in the order of 0.20. If the donor wanted to be 95% confident that the survey estimate of stove usage was within 10% of the true population value, how many villages (primary sampling units) need to be sampled?

    epi.ssclus1estb(b = 75, Py = 0.46, epsilon = 0.10, error = "relative",
       rho = 0.20, conf.level = 0.95)$n.psu


A total of 96 villages need to be sampled to meet the requirements of the study.
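To see where this number comes from, the calculation can be sketched in base R. It assumes the sample size for a simple random sample is inflated by the design effect 1 + (b - 1) $\times$ rho and then divided by the cluster size. The text does not spell out the internals of `epi.ssclus1estb`, so this is an assumption, and the helper name `clus1_n_psu` is hypothetical.

```r
# Hypothetical helper: clusters required for one-stage cluster sampling
# with clusters of equal size b and a relative error of epsilon.
clus1_n_psu <- function(b, Py, epsilon, rho, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  eps_abs <- epsilon * Py                   # relative error as an absolute error
  n_srs <- z^2 * Py * (1 - Py) / eps_abs^2  # households under simple random sampling
  deff <- 1 + (b - 1) * rho                 # design effect for equal cluster sizes
  ceiling(n_srs * deff / b)
}

n_villages <- clus1_n_psu(b = 75, Py = 0.46, epsilon = 0.10, rho = 0.20)
```

Here the design effect is 15.8, so roughly 451 households under simple random sampling become about 7125 households, or 96 villages of 75 households each.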

### One-stage cluster sampling (continued)

Continuing the example above, we are now told that the number of households per village varies. The average number of households per village is 75, with a 0.025 quantile of 40 households and a 0.975 quantile of 180. Assuming the number of households per village follows a normal distribution, the interval between these two quantiles spans roughly four standard deviations, so the expected standard deviation of the number of households per village is in the order of (180 - 40) $\div$ 4 = 35. How many villages (primary sampling units) need to be sampled?
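The text does not show the answer to this question. One common adjustment, assumed here and not confirmed by the text, inflates the design effect using the coefficient of variation (cv) of cluster size: 1 + ((cv^2 + 1) $\times$ b - 1) $\times$ rho, where b is the mean cluster size. The base R sketch below applies it, and the helper name `clus1_n_psu_var` is hypothetical.

```r
# Hypothetical helper: clusters required when cluster size varies,
# using an assumed design effect based on the coefficient of variation.
clus1_n_psu_var <- function(b_mean, b_sd, Py, epsilon, rho, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  eps_abs <- epsilon * Py                   # relative error as an absolute error
  n_srs <- z^2 * Py * (1 - Py) / eps_abs^2  # households under simple random sampling
  cv <- b_sd / b_mean                       # coefficient of variation of cluster size
  deff <- 1 + ((cv^2 + 1) * b_mean - 1) * rho
  ceiling(n_srs * deff / b_mean)
}

n_villages_var <- clus1_n_psu_var(b_mean = 75, b_sd = 35, Py = 0.46,
                                  epsilon = 0.10, rho = 0.20)
```

Under this assumed formula the sketch returns 115 villages. The value returned by `epi.ssclus1estb` may differ if it handles variable cluster size differently. Setting `b_sd = 0` recovers the equal cluster size calculation.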


epiR documentation built on June 22, 2024, 10:57 a.m.