PoolPrev | R Documentation |
Estimation of prevalence based on presence/absence tests on pooled samples
PoolPrev(
data,
result,
poolSize,
...,
bayesian = TRUE,
prior = NULL,
robust = TRUE,
level = 0.95,
all.negative.pools = "zero",
reproduce.poolscreen = FALSE,
verbose = FALSE,
cores = NULL,
iter = 2000,
warmup = iter/2,
chains = 4,
control = list(adapt_delta = 0.98)
)
data |
A |
result |
The name of the column with the result of each test on each pooled sample. The result must be stored with 1 indicating a positive test result and 0 indicating a negative test result. |
poolSize |
The name of the column with the number of specimens/isolates/insects in each pool |
... |
Optional name(s) of columns with variables to stratify the data by. If omitted, the complete dataset is used to estimate a single prevalence. If included, prevalence is estimated separately for each group defined by these columns. |
bayesian |
Logical indicating whether Bayesian estimates should be calculated. If TRUE (the default), both frequentist and Bayesian estimates of prevalence are calculated; otherwise only frequentist estimates (MLE and likelihood ratio confidence intervals) are calculated. |
prior |
Prior for prevalence, ignored if |
robust |
Logical. If |
level |
Defines the confidence level to be used for the confidence and credible intervals. Defaults to 0.95 (i.e. 95% intervals) |
all.negative.pools |
The kind of point estimate and interval to use when
all pools are negative (Bayesian estimates only). If |
reproduce.poolscreen |
Defaults to FALSE. If TRUE, this changes the way that likelihood ratio confidence intervals are computed so that they are somewhat wider and more closely match those returned by Poolscreen. We recommend using the default (FALSE). However, setting it to TRUE can help when comparing PoolPrev with Poolscreen. |
verbose |
Logical indicating whether to print progress to screen.
Defaults to FALSE (no printing to screen). Ignored if |
cores |
The number of CPU cores to be used. By default one core is used.
Ignored if |
iter , warmup , chains |
MCMC options for passing onto the sampling routine.
See stan for details. Ignored if |
control |
A named list of parameters to control the sampler's behaviour.
Defaults to the values defined in stan, except for
|
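As an illustration of the input layout described by the result and poolSize arguments, the following minimal sketch builds a small data frame with one row per pool. The data frame name `pools`, the column names and the values are all hypothetical, chosen only for illustration; they are not part of the package.

```r
# Hypothetical pooled-test data: one row per pool (values invented for illustration)
pools <- data.frame(
  Site      = c("A", "A", "A", "B", "B", "B"),  # optional stratifying variable
  NumInPool = c(10, 10, 5, 10, 5, 1),           # number of specimens in each pool
  Result    = c(1, 0, 0, 0, 0, 1)               # 1 = positive pool, 0 = negative pool
)

# Check the coding described above: results are 0/1, pool sizes are whole numbers >= 1
stopifnot(all(pools$Result %in% c(0, 1)))
stopifnot(all(pools$NumInPool >= 1),
          all(pools$NumInPool == round(pools$NumInPool)))

# With PoolTestR attached, a call of the same form as the examples on this page
# would be (left commented out so that this sketch runs in base R):
# PoolPrev(pools, Result, NumInPool, Site, bayesian = FALSE)
```

The checks are optional, but they catch results coded as TRUE/FALSE strings or "pos"/"neg" before the data are passed on.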
An object of class PoolPrevOutput, which inherits from class tbl.
The output includes the following columns:
PrevMLE – the Maximum Likelihood Estimate of prevalence
CILow and CIHigh – lower and upper bounds of the confidence interval, computed using the likelihood ratio method
PrevBayes – the (Bayesian) posterior expectation. Omitted if bayesian == FALSE.
CrILow and CrIHigh – lower and upper bounds of the credible interval. Omitted if bayesian == FALSE.
ProbAbsent – the posterior probability that prevalence is exactly 0 (i.e. disease marker is absent). NA if using the default Jeffreys prior or if prior$absent == 0. Omitted if bayesian == FALSE.
NumberOfPools – the number of pools
NumberPositive – the number of positive pools
If grouping variables are provided in ... there will be an additional column for each grouping variable. When there are no grouping variables (supplied in ...) then the output has only one row with the prevalence estimates for the whole dataset. When grouping variables are supplied, then there is a separate row for each group.
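To make the PrevMLE, CILow and CIHigh columns more concrete, here is a minimal base-R sketch of a maximum likelihood estimate and a likelihood ratio confidence interval for pooled data. It is not the package's internal code. It assumes a perfect test and independent specimens, so that a pool of size s tests positive with probability 1 - (1 - p)^s. The data and all object names (`pool_size`, `result`, `loglik`, `excess`) are hypothetical.

```r
# Hypothetical data: pool sizes and 0/1 test results (not taken from the package)
pool_size <- c(10, 10, 10, 5, 5, 5, 1, 1)
result    <- c(1, 0, 0, 1, 0, 0, 0, 0)

# Log-likelihood under the stated assumptions (perfect test, independent specimens):
# positive pool contributes log(1 - (1 - p)^s), negative pool contributes s * log(1 - p)
loglik <- function(p) {
  sum(ifelse(result == 1,
             log1p(-(1 - p)^pool_size),
             pool_size * log1p(-p)))
}

# Maximise the log-likelihood over the open interval (0, 1)
eps <- 1e-8
fit <- optimize(loglik, interval = c(eps, 1 - eps), maximum = TRUE, tol = 1e-10)
prev_mle <- fit$maximum

# Likelihood ratio interval: all p with 2 * (loglik(MLE) - loglik(p)) <= chi-square quantile
level  <- 0.95
cutoff <- qchisq(level, df = 1) / 2
excess <- function(p) fit$objective - loglik(p) - cutoff

# The interval bounds are the roots of `excess` on either side of the MLE
ci_low  <- uniroot(excess, lower = eps, upper = prev_mle, tol = 1e-10)$root
ci_high <- uniroot(excess, lower = prev_mle, upper = 1 - eps, tol = 1e-10)$root

c(PrevMLE = prev_mle, CILow = ci_low, CIHigh = ci_high)
```

The root-finding step relies on at least one pool being positive and at least one being negative; when all pools share the same result the MLE sits on the boundary and this sketch would need special handling.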
The custom print method summarises the output data frame by representing the prevalence and its confidence or credible interval as a single column in the form "Prev (CLow - CHigh)", where Prev is the prevalence, CLow is the lower bound of the confidence/credible interval and CHigh is the upper bound. In the print method, prevalence is represented as a percentage (i.e., per 100 units).
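The "Prev (CLow - CHigh)" layout can be sketched in base R as follows. This is only an illustration of the format, not the package's print method. The helper `format_interval`, its `digits` argument and the input values are hypothetical, and the number of decimal places the real print method uses is an assumption here.

```r
# Hypothetical helper: combine an estimate and its interval into one string,
# converting proportions to percentages (per 100 units)
format_interval <- function(prev, low, high, digits = 2) {
  num <- paste0("%.", digits, "f")            # e.g. "%.2f"
  fmt <- paste0(num, " (", num, " - ", num, ")")
  sprintf(fmt, 100 * prev, 100 * low, 100 * high)
}

# Invented values on the proportion scale
format_interval(0.0123, 0.0045, 0.0267)
```

Because `sprintf` is vectorised, the same helper can be applied to whole columns of estimates at once.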
HierPoolPrev, getPrevalence
#Try out on a synthetic dataset consisting of pools (sizes 1, 5, or 10) taken
#from 4 different regions and 3 different years. Within each region specimens
#are collected at 4 different villages, and within each village specimens are
#collected at 8 different sites.
#Start by calculating frequentist estimates only (much faster)
#Prevalence across the whole (synthetic) dataset
PoolPrev(SimpleExampleData, Result, NumInPool, bayesian = FALSE)
#Prevalence in each Region
PoolPrev(SimpleExampleData, Result, NumInPool, Region, bayesian = FALSE)
#Prevalence for each year
PoolPrev(SimpleExampleData, Result, NumInPool, Year, bayesian = FALSE)
#Prevalence for each combination of region and year
PoolPrev(SimpleExampleData, Result, NumInPool, Region, Year, bayesian = FALSE)
#Prevalence across the whole (synthetic) dataset, also including Bayesian Estimates - slower
PoolPrev(SimpleExampleData, Result, NumInPool)