PoolPrev: Estimation of prevalence based on presence/absence tests on pooled samples

View source: R/PoolPrev.R

PoolPrev    R Documentation

Estimation of prevalence based on presence/absence tests on pooled samples




  prior.alpha = NULL,
  prior.beta = NULL,
  prior.absent = 0,
  level = 0.95,
  reproduce.poolscreen = FALSE,
  verbose = FALSE,
  cores = NULL,
  iter = 2000,
  warmup = iter/2,
  chains = 4,
  control = list(adapt_delta = 0.9)



data

A data.frame with one row for each pooled sample and columns for the size of the pool (i.e. the number of specimens / isolates / insects pooled to make that particular pool) and the result of the test of the pool. It may also contain columns with additional information (e.g. the location where the pool was taken), which can optionally be used to stratify the data into smaller groups and calculate prevalence by group (e.g. prevalence for each location).


result

The name of the column with the result of the test on each pooled sample. The result must be stored with 1 indicating a positive test result and 0 indicating a negative test result.


poolSize

The name of the column with the number of specimens/isolates/insects in each pool.


...

Optional name(s) of columns with variables to stratify the data by. If omitted, the complete dataset is used to estimate a single prevalence. If included, prevalence is estimated separately for each group defined by these columns.
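As an illustration of the expected input, here is a minimal sketch of a data.frame in the shape described above. The column names (Site, NumInPool, Result) and all values are made up for this example. The call to PoolPrev assumes that column names are passed unquoted, and it is switched off by default because fitting requires PoolTestR and Stan and can be slow.

```r
# Hypothetical pooled-test data: one row per pool (values are invented).
pools <- data.frame(
  Site      = rep(c("A", "B"), each = 4),     # optional grouping column
  NumInPool = c(10, 10, 5, 5, 10, 10, 5, 5),  # specimens per pool
  Result    = c(0, 1, 0, 0, 1, 1, 0, 1)       # 1 = positive, 0 = negative
)

# Sanity checks matching the argument descriptions.
stopifnot(all(pools$Result %in% c(0, 1)), all(pools$NumInPool > 0))

# Set to TRUE to actually fit the model (requires PoolTestR and Stan).
run_model <- FALSE

if (run_model && requireNamespace("PoolTestR", quietly = TRUE)) {
  # Assumption: column names are given unquoted.
  overall <- PoolTestR::PoolPrev(pools, Result, NumInPool)        # whole dataset
  by_site <- PoolTestR::PoolPrev(pools, Result, NumInPool, Site)  # one row per Site
}
```

With no grouping column the result should have a single row; with Site supplied, one row per site.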

prior.alpha, prior.beta, prior.absent

The default prior for the prevalence is the uninformative Jeffreys prior. Alternatively, you can specify a custom prior: a beta distribution (with parameters prior.alpha and prior.beta) modified to have a point mass at zero, i.e. allowing for some prior probability (prior.absent) that the true prevalence is exactly zero. Another popular uninformative choice is prior.alpha = 1, prior.beta = 1, prior.absent = 0, i.e. a uniform prior.
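To make the custom prior concrete, the following base R sketch draws from a beta distribution with a point mass at zero. The helpers r_prior and prior_mean are hypothetical names introduced here for illustration; they are not part of PoolTestR and do not show how the package implements the prior internally.

```r
# Draw n values from a beta prior modified to have a point mass at zero.
# r_prior() is a hypothetical helper, not a PoolTestR function.
r_prior <- function(n, prior.alpha, prior.beta, prior.absent = 0) {
  stopifnot(prior.alpha > 0, prior.beta > 0,
            prior.absent >= 0, prior.absent <= 1)
  absent <- runif(n) < prior.absent  # TRUE: prevalence is exactly 0
  ifelse(absent, 0, rbeta(n, prior.alpha, prior.beta))
}

# Mean of the mixture: (1 - prior.absent) * alpha / (alpha + beta).
prior_mean <- function(prior.alpha, prior.beta, prior.absent = 0) {
  (1 - prior.absent) * prior.alpha / (prior.alpha + prior.beta)
}

set.seed(1)
draws <- r_prior(1e5, prior.alpha = 1, prior.beta = 1, prior.absent = 0.2)
mean(draws == 0)  # share of exact zeros, close to prior.absent
mean(draws)       # close to prior_mean(1, 1, 0.2)
```

Setting prior.absent = 0 recovers an ordinary beta prior, and prior.alpha = prior.beta = 1 then gives the uniform prior mentioned above.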


level

The confidence level used for both the confidence intervals and the credible intervals. Defaults to 0.95 (i.e. 95% intervals).


reproduce.poolscreen

Logical; defaults to FALSE. If TRUE, likelihood ratio confidence intervals are computed in a way that makes them somewhat wider and a closer match to those returned by Poolscreen. We recommend using the default (FALSE). However, setting this to TRUE can help when comparing results from PoolPrev and Poolscreen.


verbose

Logical indicating whether to print progress to screen. Defaults to FALSE (no printing to screen).


cores

The number of CPU cores to be used. By default, one core is used.

iter, warmup, chains

MCMC options passed on to the sampling routine. See stan for details.


control

A named list of parameters to control the sampler's behaviour. The defaults are those defined in stan, except for adapt_delta, which is set to the more conservative value of 0.9. See stan for details.


A data.frame with columns:

  • PrevMLE: the maximum likelihood estimate of prevalence

  • CILow and CIHigh: lower and upper bounds of the confidence interval, computed using the likelihood ratio method

  • PrevBayes: the (Bayesian) posterior expectation of prevalence

  • CrILow and CrIHigh: lower and upper bounds of the credible interval

  • ProbAbsent: the posterior probability that prevalence is exactly 0 (i.e. the disease marker is absent). NA if using the default Jeffreys prior or if prior.absent = 0.

  • NumberOfPools: the number of pools

  • NumberPositive: the number of positive pools

If grouping variables are provided in ..., the output has an additional column for each grouping variable and a separate row for each group. When no grouping variables are supplied, the output has a single row with the prevalence estimates for the whole dataset.
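The PrevMLE, CILow and CIHigh columns can be illustrated with a small base R sketch. It assumes a perfect test (no false positives or negatives), so that a pool of size s is positive with probability 1 - (1 - p)^s, and it needs at least one positive and one negative pool. The functions pool_loglik and pool_mle_ci are hypothetical helpers written for this example; PoolPrev's own computation may differ in detail.

```r
# Log-likelihood of prevalence p for pooled presence/absence results,
# assuming a perfect test and independent specimens.
pool_loglik <- function(p, result, poolSize) {
  sum(ifelse(result == 1,
             log1p(-(1 - p)^poolSize),  # positive pool: log(1 - (1 - p)^s)
             poolSize * log1p(-p)))     # negative pool: s * log(1 - p)
}

# Maximum likelihood estimate and likelihood ratio confidence interval.
# Hypothetical helper; requires both positive and negative pools.
pool_mle_ci <- function(result, poolSize, level = 0.95) {
  eps <- 1e-10
  fit <- optimize(pool_loglik, interval = c(eps, 1 - eps),
                  result = result, poolSize = poolSize,
                  maximum = TRUE, tol = 1e-10)
  # The interval contains every p whose log-likelihood is within
  # qchisq(level, 1) / 2 of the maximum.
  cutoff <- fit$objective - qchisq(level, df = 1) / 2
  f <- function(p) pool_loglik(p, result, poolSize) - cutoff
  lower <- uniroot(f, c(eps, fit$maximum), tol = 1e-10)$root
  upper <- uniroot(f, c(fit$maximum, 1 - eps), tol = 1e-10)$root
  c(PrevMLE = fit$maximum, CILow = lower, CIHigh = upper)
}

# Invented example: 10 pools of 5 specimens each, 3 of them positive.
result   <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
poolSize <- rep(5, 10)
est <- pool_mle_ci(result, poolSize)
est
```

When all pools have the same size s and k of n pools are positive, the estimate can be checked against the closed form 1 - (1 - k/n)^(1/s).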

PoolTestR documentation built on July 1, 2022, 9:06 a.m.