```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
set.seed(000)
```
The sequential.pops package is designed to simplify the development, evaluation and analysis of sequential designs for testing hypotheses about population density. Sequential analyses are particularly useful for decision-making in situations where estimation is not the goal, but where critical actions should be triggered if population sizes reach, exceed, or fall below predefined levels. These situations include deciding whether a pest control action should be applied, whether an area should be declared free from an invasive species, or whether early capture numbers of a pest are consistent with an outbreak. Sequential analyses can also be useful in fields outside ecology and pest management, such as quality control, fraud detection and adaptive clinical trials, but those applications are beyond the scope of this package.

Sequential data arrive in sampling bouts over time and are processed as they come, in order to evaluate hypotheses on which practical decisions are made once sufficient evidence has been collected. Sequential designs are typically more cost-efficient than conventional fixed-sample-size approaches, and can be purely sequential (one sample at a time) or group sequential (n > 1 per sampling bout). The analysis of biological population densities is often carried out by collecting count data from traps or field observations, or binomial records from structured clusters of sampling units. The sequential.pops package focuses on count and binomial data, with or without overdispersion, to test hypotheses about non-negative population sizes.
There are other R packages that deal with sequential designs. Most existing packages, such as GroupSeq, Grpseq, gsbDesign and seqmon, focus on adaptive clinical trials for normally distributed variables. Others, such as gscounts, gsDesign, ASSISTant, Sequential and BayesOrdDesign, can take non-normal variables but are also focused on clinical trials and can hardly be adapted to process count or binomial data from ecological or agricultural settings and evaluate hypotheses about population densities. Other packages with a more general scope, such as MSPRT and sprtt, include a limited selection of probability distributions and lack functions to evaluate tests' operating characteristics.
The sequential.pops package is the first R package devoted to testing hypotheses about biological population densities. Two approaches are included: the Sequential Test of Bayesian Posterior Probabilities (STBP) (Rincon et al. 2025), and the Sequential Probability Ratio Test (SPRT) (Wald 1945). Both the STBP and the SPRT have their own pros and cons, but they are the most popular and state-of-the-art approaches for sequential analyses in ecology and pest management. This tutorial walks you through the development, evaluation and analysis of sequential designs with STBP and SPRT using the sequential.pops package.
This test is specifically designed to test hypotheses about population densities, although it may have applications in other fields. It works by sequentially updating the conditional probability of the hypothesis being tested as new data are processed. The two main benefits of the STBP are that it is distribution-agnostic (it works with virtually any probability distribution), and that it is one of the few sequential analyses that is "likelihood ratio-free", meaning that it can test a single hypothesis without having to specify a non-complementary alternative. As we'll see later, most sequential tests, such as the SPRT (section 2), require two hypotheses, rather than one, because they are based on likelihood ratios. Another advantage of the STBP is that it can process purely sequential (one-at-a-time) or group sequential (batches of samples) data, balanced or unbalanced designs, and static (single-value) or dynamic (population model) hypotheses without major modifications.
The main disadvantage of the STBP is that it is so new that it has not yet been tested as thoroughly as its counterparts on real datasets. The STBP is also quite sensitive to sample size within sampling bouts. In theory, it outperforms other sequential analyses as long as several samples are collected within each sampling bout, but the type I error rate (accepting H when it is false) deteriorates in purely sequential designs.
Let's start with a simple case in which one wants to test the hypothesis that a certain species is absent from a sampled area. As Carl Sagan said, "the absence of evidence is not evidence of absence". In other words, we can never be sure about the absence of a species from an area, but we can calculate the probability of the species being absent given the collected data.
Most small population sizes, like those of recently introduced species, are well described by Poisson distributions. So, in this case, we want to test H: mu = 0, assuming random sampling and counts following a Poisson distribution. If we are lucky and can collect, say, 30 random samples each time, and they all result in absences (zeros), we should be able to track the (posterior) probability of H being true after each sampling bout of 30.
Data structure in sequential.pops for counts is based on matrices, where columns represent time (sampling bouts) and rows represent samples within bouts. So, for example, if we have collected 4 sampling bouts, each with 30 samples, the data should be arranged in a 30 x 4 matrix. In this case, we have all zeros:
```r
a30 <- matrix(rep(0, 120), 30, 4)
head(a30, 3) # only first three rows out of 30 are shown
```
To run the test, we use stbp_simple, which is specially designed to test hypotheses about species absence, so we do not need to specify the hypothesis. We do need to provide the matrix with the data, the distribution family (as a character string) in density_func, the initial prior probability (often 0.5), and upper_criterion and lower_criterion, the upper and lower criteria to stop sampling and make a decision. The argument upper_bnd is almost always Inf, since it represents the upper limit of the parameter space for mu.
```r
library(sequential.pops)

test30 <- stbp_simple(data = a30,
                      density_func = "poisson",
                      prior = 0.5,
                      upper_bnd = Inf,
                      lower_criterion = 0,
                      upper_criterion = 0.9999)
test30
```
Given that all samples are zero and assuming a completely efficient detection rate, we can say that after 3 sampling bouts of 30 samples there is a pretty high chance that H: mu = 0 is true. How far apart in time sampling bouts should be depends on the species being sampled and on what we consider a different sampling time. Notice that upper_criterion and lower_criterion are set close to 1 and 0, respectively. In reality, to test species absence, only upper_criterion is important to minimize type II error, since the posterior probability of H will become 0 as soon as we get the first detection (data > 0).
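To see the complementary behavior, here is a minimal sketch (our own illustration, reusing the stbp_simple interface from above) in which a single individual is detected in the last bout, so the posterior probability of H: mu = 0 collapses to 0:

```r
a_det <- matrix(rep(0, 120), 30, 4)
a_det[1, 4] <- 1 # one individual detected in the fourth bout

test_det <- stbp_simple(data = a_det,
                        density_func = "poisson",
                        prior = 0.5,
                        upper_bnd = Inf,
                        lower_criterion = 0,
                        upper_criterion = 0.9999)
test_det
```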
If sample size within sampling bouts is reduced, we should expect to lose power and require more bouts to collect enough evidence to achieve similar levels of certainty that the species is absent (in case we keep getting zeros, of course). So, if we can only collect 3 samples per sampling bout, and we get all zeros, the matrix with the data should be specified as:
```r
a3 <- matrix(rep(0, 30), 3, 10)
a3
```
for 10 sampling bouts. After running the test with:
```r
test3 <- stbp_simple(data = a3,
                     density_func = "poisson",
                     prior = 0.5,
                     upper_bnd = Inf,
                     lower_criterion = 0,
                     upper_criterion = 0.9999)
test3
```
we can see that we now require 9 sampling bouts to achieve similar levels of confidence to declare that our species is absent from the sampled area. The number of bouts with all zeros required to accept H keeps increasing as sample size within bouts is reduced. In fact, when sample size is one (purely sequential), the test loses power and the posterior remains unchanged regardless of the number of bouts. You can check the progression of the posterior probabilities for different sample sizes by calling plot. For the sequential design with 30 samples:
```r
plot(test30)
```
and for the sequential design with 3 samples:
```r
plot(test3)
```
The STBP can also assist in situations where one wants to test whether a given population is above or below a relevant threshold for management. This is useful to determine, for example, if a pest population has reached an economic threshold or if an endangered species is below the Allee threshold. For the remainder of this tutorial, we will focus on hypotheses about populations being above a threshold, but the procedure is similar for hypotheses in the opposite direction. In fact, the direction is specified through a single argument in the function stbp_composite, as illustrated below.
For this example, we will use the case study of the tomato leafminer in greenhouse tomatoes (Rincon et al. 2021). The tomato leafminer is a significant pest of tomatoes, but it mostly prefers to feed on leaves, where damage is minimal, and will only turn to fruits when population density increases. The economic threshold varies depending on whether tomato fruits are present or not, but for now let's assume the threshold is static (constant) at 9 larvae per plant and that we want to test through sequential sampling whether a population has exceeded that threshold.
When population densities are not so small, chances are that the Poisson distribution is no longer appropriate to describe counts and that a probability distribution that allows for overdispersion (or aggregation) is required. As populations increase, they very often aggregate in clusters, which increases the dispersion of the data to levels that are not accounted for by the Poisson. This is the case for the tomato leafminer, so the STBP should be defined with a negative binomial distribution and some specification for the dispersion, which is obtained from studies of the counts observed in the field. For more details about how to conduct these studies, the reader is referred to textbooks like Binns et al. (2000).
Let's start by generating some negative binomial data in a purely sequential design (n = 1 per sampling bout):
```r
counts1 <- rnbinom(20, mu = 5, size = 4.5)
counts1
```
These are 20 random counts from a population with a mean of 5 larvae per sampling unit (plant) that follows a negative binomial distribution with an overdispersion of 4.5. We can now build a test for the hypothesis H: mu > 9 larvae per plant with:
```r
test_counts1 <- stbp_composite(data = counts1,
                               greater_than = TRUE,
                               hypothesis = 9,
                               density_func = "negative binomial",
                               overdispersion = 4.5,
                               prior = 0.5,
                               lower_bnd = 0,
                               upper_bnd = Inf,
                               lower_criterion = 0.01,
                               upper_criterion = 0.99)
test_counts1
```
Notice that this time we use the function stbp_composite, which is designed to test threshold-style hypotheses. The direction of the hypothesis (whether it is "greater than" or "less than") is controlled by the argument greater_than, and the threshold is included in the argument hypothesis. lower_bnd and upper_bnd are the parameter space boundaries for the hypothesis, mu, and are almost always 0 and Inf, respectively. lower_criterion and upper_criterion are the posterior probabilities below or above which a decision is made.
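As mentioned above, testing in the opposite direction only requires flipping greater_than. A minimal sketch (our addition, reusing the same counts) for H: mu < 9 would be:

```r
test_below <- stbp_composite(data = counts1,
                             greater_than = FALSE,
                             hypothesis = 9,
                             density_func = "negative binomial",
                             overdispersion = 4.5,
                             prior = 0.5,
                             lower_bnd = 0,
                             upper_bnd = Inf,
                             lower_criterion = 0.01,
                             upper_criterion = 0.99)
test_below
```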
Many times, however, overdispersion is not this simple and should be specified as a function of the mean, rather than as a constant. This is the case for most animal and plant populations, because overdispersion is highly tied to the variance, and the variance of population counts tends to increase exponentially with the mean. There are several ways to empirically describe variance (and overdispersion) as a function of the mean, but the most popular one is Taylor's Power Law, which states that variance = a * mean^b. Since the negative binomial variance is mean + mean^2 / k, solving for k gives k = mean^2 / (a * mean^b - mean). In the sequential.pops package, you can use whatever function you want, as long as it is specified before running the test. For example, the Taylor's Power Law parameters that describe the variance-mean relationship of tomato leafminer counts are a = 1.83 and b = 1.21, so the overdispersion, k, is given by:
```r
estimate_k <- function(mean) {
  a = 1.83
  b = 1.21
  (mean^2) / ((a * mean^(b)) - mean)
}
```
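As a quick sanity check (our own illustration), the function can be evaluated at the hypothesized threshold of 9 larvae per plant; the resulting k of about 4.7 is also the value reused later to specify the SPRT:

```r
estimate_k(9) # overdispersion at the threshold density; approx. 4.7
```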
We can also incorporate more realistic values of overdispersion into the generated counts by including estimate_k in the count generation function:
```r
counts1a <- rnbinom(20, mu = 5, size = estimate_k(5))
counts1a
```
and the test can be specified as:
```r
test_counts1a <- stbp_composite(data = counts1a,
                                greater_than = TRUE,
                                hypothesis = 9,
                                density_func = "negative binomial",
                                overdispersion = "estimate_k",
                                prior = 0.5,
                                lower_bnd = 0,
                                upper_bnd = Inf,
                                lower_criterion = 0.01,
                                upper_criterion = 0.99)
test_counts1a
```
The overdispersion function is specified as a character string with the function name, and the function must be added to the R environment before running the test. Although it is not apparent for this instance of randomly generated numbers, there are noticeable gains in efficiency (a lower average number of sampling bouts to make a decision) and accuracy (fewer bad decisions) when an appropriate overdispersion function is included as part of a STBP (Rincon et al. 2025).
As with stbp_simple, you can call plot to visualize the change in posterior probabilities over sequential sampling bouts for stbp_composite tests. For the test with a constant overdispersion:
```r
plot(test_counts1)
```
and with overdispersion as a function of the mean:
```r
plot(test_counts1a)
```
Sometimes data on populations do not come as counts but as binary outcomes, when sampling units are evaluated as presence/absence or infested/uninfested, for example. For those cases, stbp_composite is equipped with binomial and beta-binomial options for the argument density_func. As with count data, the selection of one or the other depends on whether there is overdispersion in the data, with beta-binomial being the option for overdispersed binomial variables. For this kind of data, two pieces of information are required from each sample unit: the number of successes (e.g., present or infected) and the total number of samples from which observations were made. For this reason, data for binomial variables should be structured as a list() of n x 2 matrices, where each matrix corresponds to a sampling bout, and n is the number of sample units (clusters of samples examined). The number of successes must be included in column 1 and the number of samples in column 2 of each matrix.
For example, if a sequential sampling plan is designed to collect 3 clusters (sampling units), each made of 10 samples, and 7 sampling bouts have been collected, the data should be introduced like:
```r
library(emdbook) # for rbetabinom()

binom1 <- list()
for(i in 1:7) {
  binom1[[i]] <- matrix(c(emdbook::rbetabinom(3, prob = 0.2, size = 10, theta = 6.5),
                          rep(10, 3)), 3, 2)
}
head(binom1, 3) # only first three sampling bouts out of seven are shown
```
The data generated in the list binom1 are overdispersed, with a constant overdispersion parameter, theta = 6.5, and a mean incidence of prob = 0.2. All the required support to work with the beta-binomial distribution is in the emdbook package, so it must be loaded before running any sequential tests that involve this distribution. The STBP can be run on the data binom1 with:
```r
test_bin1 <- stbp_composite(data = binom1,
                            greater_than = TRUE,
                            hypothesis = 0.15,
                            density_func = "beta-binomial",
                            overdispersion = 6.5,
                            prior = 0.5,
                            lower_bnd = 0,
                            upper_bnd = 1,
                            lower_criterion = 0.001,
                            upper_criterion = 0.999)
test_bin1
```
where the STBP correctly infers that the data in binom1 are consistent with H: mu > 0.15. As with overdispersed count data, overdispersion for binomial data can also be included as a function of the mean proportion (sometimes called incidence). Again, the overdispersion model is derived from the variance-mean (variance-incidence) relationship, and the most popular form is the one provided by Madden et al. (2007), although the sequential.pops package allows you to pick whatever function you want. The function that describes overdispersion, theta, as a function of the mean proportion for the tomato leafminer is defined as (Madden et al. 2007):
```r
estimate_theta <- function(mean) {
  A = 780.72
  b = 1.61
  R = 10
  (1 / (R - 1)) * (((A * R^(1 - b)) / ((mean * (1 - mean))^(1 - b))) - 1)
}
```
where A and b are parameters from the variance-incidence model and R is the number of samples in each sampling unit (cluster). We can generate a new binomial dataset that includes variable overdispersion:
```r
binom2 <- list()
for(i in 1:7) {
  binom2[[i]] <- matrix(c(rbetabinom(3, prob = 0.2, size = 10,
                                     theta = estimate_theta(0.2)),
                          rep(10, 3)), 3, 2)
}
head(binom2, 3) # only first three sampling bouts out of seven are shown
```
and then we can specify and run a STBP as:
```r
test_bin2 <- stbp_composite(data = binom2,
                            greater_than = TRUE,
                            hypothesis = 0.15,
                            density_func = "beta-binomial",
                            overdispersion = "estimate_theta",
                            prior = 0.5,
                            lower_bnd = 0,
                            upper_bnd = 1,
                            lower_criterion = 0.001,
                            upper_criterion = 0.999)
test_bin2
```
Notice that the argument upper_bnd in the tests test_bin1 and test_bin2 is set to 1, rather than Inf as in the tests that processed counts, test_counts1 and test_counts1a. This is because the parameter space of population density, mu, for binomial data is [0, 1], but it is [0, Inf] for count data.
The sequential.pops package is equipped with tools to evaluate the accuracy and efficiency of specifications for the STBP. For a perfectly accurate sequential test, the acceptance rate of H: mu > psi is zero when the true population is < psi and 1 when it is > psi, with a minimum (efficient) average number of samples. Thus, the evaluation of a STBP involves the analysis of the proportion of times H is accepted across a range of true population densities and the corresponding number of sampling bouts required to make a decision. These analyses are sometimes called "operating characteristics".
All you need to run an evaluation of a STBP is a STBP object, which is created every time a test is run through stbp_composite; an evaluation range, which is a sequence of population densities over which the evaluation is to be performed; and the sample size within bouts. You can add additional levels of sophistication by playing with the initial prior probability or by adding stochasticity to the overdispersion parameter, in case you are using one.
To demonstrate the evaluation of STBP we will use the STBP objects created above. The simplest one is test_counts1, which is a purely sequential design that tests H: mu > 9 with a constant overdispersion of k = 4.5. For the evaluation, the only relevant information extracted from test_counts1 has to do with the test specification; the data are not really used. To run the evaluation of test_counts1, we call the function STBP.eval:
```r
eval1 <- STBP.eval(test_counts1,
                   eval.range = seq(3, 12),
                   n = 1,
                   prior = 0.5,
                   N = 20)
eval1
```
Notice that the evaluation range of true population densities (eval.range) includes psi, the threshold (hypothesis), because we are particularly interested in examining the behavior of the test when population size is close to the threshold. Note also that the purely sequential approach was kept for this analysis (n = 1), but we could also try other sample sizes. The argument N is the number of simulations run for each true population size. In this case it is 20, so a total of 200 simulations were run (20 * length(seq(3, 12))), which is pretty small for a formal analysis. You should run at least 1000 simulations per population size in a more formal test.
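A more formal run of the same evaluation would only change N (a sketch; with 1000 simulations per density, expect a much longer run time):

```r
eval1_formal <- STBP.eval(test_counts1,
                          eval.range = seq(3, 12),
                          n = 1,
                          prior = 0.5,
                          N = 1000)
```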
The output of STBP.eval is a list() with two vectors: $AvgSamples and $AcceptRate. Both have as many elements as length(eval.range); the first is the mean number of sampling bouts required to make a decision about H, and the second is the proportion of simulation runs that resulted in acceptance of H. Both can be visualized across the evaluated range of true population densities to examine the test performance around the threshold of 9:
```r
plot(seq(3, 12), eval1$AvgSamples, type = "o",
     xlab = "True population size", ylab = "Average number of bouts")
abline(v = 9, lty = 2)

plot(seq(3, 12), eval1$AcceptRate, type = "o",
     xlab = "True population size", ylab = "Acceptance rate of H")
abline(v = 9, lty = 2)
```
There is an argument in the STBP.eval function called overdispersion.sim that was omitted in eval1. Through overdispersion.sim, you can specify overdispersion functions or values different from those specified in the test. So, in eval1, the overdispersion used to run the simulations was the same as in the test test_counts1.
However, more realistic test evaluations can be performed if (1) varying overdispersion is included and (2) stochasticity about the variance-mean relationship is considered. The first can either be specified in the model (as in test_counts1a or test_bin2) or directly in STBP.eval through overdispersion.sim, in case it is not in the model.
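For instance, the second route can be sketched by evaluating test_counts1 (which was specified with a constant k = 4.5) while simulating the data with the variable overdispersion function from before (our illustration, not part of the original workflow):

```r
eval1k <- STBP.eval(test_counts1,
                    eval.range = seq(3, 12),
                    n = 1,
                    prior = 0.5,
                    overdispersion.sim = "estimate_k",
                    N = 20)
```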
To add stochasticity, however, you should declare a new overdispersion function with allowance for variability. This new function is similar to the one used for the models but includes a random normal variable with mean = 0 and standard deviation sd as a new factor. Conventionally, sd is the square root of the mean square error of the regression used to fit the variance-mean model: Taylor's Power Law for count data, and Madden's model for binomial data (Binns et al. 2000).
For example, if we want to add stochasticity to the evaluation of model test_counts1a, we should incorporate a normal random variable in the function estimate_k by:
```r
estimate_k_stoch <- function(mean) {
  a = 1.83
  b = 1.21
  RMSE = 0.32
  (mean^2) / ((a * mean^(b) *
                 exp(truncdist::rtrunc(1, "norm",
                                       a = log(1 / (a * mean^(b - 1))),
                                       b = Inf,
                                       mean = 0,
                                       sd = RMSE))) - mean)
}
```
Notice that estimate_k_stoch is similar to estimate_k, except that the former includes the exponential of a normal random variable as a factor in the denominator. Here we use a truncated normal distribution, from the truncdist package, to prevent zero denominators or negative values for k, but you can also use rnorm and a different alternative to stay away from undefined or negative values for the overdispersion parameter. To run the evaluation including stochasticity, we just specify estimate_k_stoch as a character string in the argument overdispersion.sim:
```r
eval1a <- STBP.eval(test_counts1,
                    eval.range = seq(3, 12),
                    n = 1,
                    prior = 0.5,
                    overdispersion.sim = "estimate_k_stoch",
                    N = 20)
eval1a
```
Test evaluation is a useful tool to compare different sequential analysis approaches, but it can also be used to calibrate sample sizes and decision criteria, or to assess the impact of (incorrect) initial priors on sequential designs. For example, you can compare both the accuracy and the efficiency of tests with different sample sizes within bouts.
```r
eval3a <- STBP.eval(test_counts1,
                    eval.range = seq(3, 12),
                    n = 3,
                    prior = 0.5,
                    overdispersion.sim = "estimate_k_stoch",
                    N = 20)

eval6a <- STBP.eval(test_counts1,
                    eval.range = seq(3, 12),
                    n = 6,
                    prior = 0.5,
                    overdispersion.sim = "estimate_k_stoch",
                    N = 20)
```
Here eval3a includes not 1 but 3 samples per sampling bout, and eval6a includes 6 samples. We can now visualize how sample size within bouts affects both accuracy and efficiency:
```r
plot(seq(3, 12), eval1a$AvgSamples, type = "o",
     xlab = "True population size", ylab = "Average number of bouts")
points(seq(3, 12), eval3a$AvgSamples, type = "o", pch = 19)
points(seq(3, 12), eval6a$AvgSamples, type = "o", pch = 0)
abline(v = 9, lty = 2)

plot(seq(3, 12), eval1a$AcceptRate, type = "o",
     xlab = "True population size", ylab = "Acceptance rate of H")
points(seq(3, 12), eval3a$AcceptRate, type = "o", pch = 19)
points(seq(3, 12), eval6a$AcceptRate, type = "o", pch = 0)
abline(v = 9, lty = 2)
```
This very limited analysis (only 20 simulations per population density) shows that sample sizes of 3 (filled circles) and 6 (empty squares) outperform a purely sequential design (empty circles) in both accuracy and efficiency. Some additional gains in accuracy can be achieved by increasing the sample size from 3 to 6.
Test evaluation is similar for binomial data, except that the evaluation range is on the [0, 1] interval. The analog of estimate_k_stoch for the binomial world, for the tomato leafminer example, is defined as (Binns et al. 2000):
```r
estimate_theta_stoch <- function(mean) {
  A = 780.72
  b = 1.61
  R = 10
  RMSE = 0.41
  ((1 / (R - 1)) * (((A * R^(1 - b) * exp(rnorm(1, mean = 0, sd = RMSE))) /
                       ((mean * (1 - mean))^(1 - b))) - 1))
}
```
Now, we can run a similar analysis to compare sample sizes for a binomial design. One important difference between binomial and count sampling is that binomial designs typically require more samples, although each sample is quicker and easier to examine. Simulations within STBP.eval can only consider binomial sampling with sample units or clusters made of a single observation (size = 1). So, within STBP.eval, n = 1 means a single presence/absence or infested/uninfested observation. To compensate, the argument n in STBP.eval should reflect the total number of observations across all the clusters. For example, if you plan to design a binomial sequential sampling with three clusters, each of 10 observations (samples), the sample size for the evaluation should be set to 30, n = 30. Here we test 10, 20 and 30 total observations.
```r
evalB10 <- STBP.eval(test_bin2,
                     eval.range = seq(0.01, 0.25, 0.02),
                     n = 10,
                     prior = 0.5,
                     overdispersion.sim = "estimate_theta_stoch",
                     N = 20)

evalB20 <- STBP.eval(test_bin2,
                     eval.range = seq(0.01, 0.25, 0.02),
                     n = 20,
                     prior = 0.5,
                     overdispersion.sim = "estimate_theta_stoch",
                     N = 20)

evalB30 <- STBP.eval(test_bin2,
                     eval.range = seq(0.01, 0.25, 0.02),
                     n = 30,
                     prior = 0.5,
                     overdispersion.sim = "estimate_theta_stoch",
                     N = 20)
```
Notice that the evaluation range is a sequence of proportions on [0.01, 0.25], excluding 0 to avoid warnings, and chosen considering that H was set to H: mu > 0.15 in test_bin2. We can also visualize the acceptance rate of H and the average number of bouts across true incidences:
```r
plot(seq(0.01, 0.25, 0.02), evalB10$AvgSamples, type = "o",
     xlab = "True incidence", ylab = "Average number of bouts")
points(seq(0.01, 0.25, 0.02), evalB20$AvgSamples, type = "o", pch = 19)
points(seq(0.01, 0.25, 0.02), evalB30$AvgSamples, type = "o", pch = 0)
abline(v = 0.15, lty = 2)

plot(seq(0.01, 0.25, 0.02), evalB10$AcceptRate, type = "o",
     xlab = "True incidence", ylab = "Acceptance rate of H")
points(seq(0.01, 0.25, 0.02), evalB20$AcceptRate, type = "o", pch = 19)
points(seq(0.01, 0.25, 0.02), evalB30$AcceptRate, type = "o", pch = 0)
abline(v = 0.15, lty = 2)
```
Again, this is a very limited analysis made of only 20 simulations per true incidence. However, it shows how the number of bouts required to make a decision decreases with sample size, with designs of 10 samples (empty circles) struggling to make a quick decision when incidence is far from the threshold of 0.15, compared with sample sizes of 20 (filled circles) and 30 (empty squares). Accuracy is also affected by sample size, with n = 20 and n = 30 producing fewer incorrect decisions, particularly when the true incidence is below the threshold.
Sometimes hypotheses about population densities are not single numbers, but entire population trajectories. This is the case when monitoring programs are designed to track a target population over time and decide as early as possible whether a management action is necessary to prevent an outbreak. Population models often assist in the prediction of densities over time based on current densities and growth rates, so the profile of population dynamics that results in problematic population sizes is often known.
An example with real data is presented by Rincon et al. (2025). Here we use a logistic population growth model, pop.outB, to make up an outbreak population trajectory of 20 weekly population densities that results in densities that cannot be tolerated. We will call this trajectory OB_pop <- pop.outB(t = seq(1, 20)), where t is given in weeks.
```r
pop.outB <- function(t) {
  N0 = 15
  K = 90
  r = 0.25
  K / (1 + (K / N0 - 1) * exp(-r * t))
}

OB_pop <- pop.outB(t = seq(1, 20))

plot(seq(1, 20), OB_pop, xlab = "Time (weeks)", ylab = "Population density",
     lwd = 2, type = "o", pch = 19)
```
The goal is to determine, with the fewest possible number of sampling bouts collected over time, whether a sampled population is consistent with one equivalent to or greater than pop.outB. One complication is that the threshold here is a moving target, and the differences between outbreak and non-outbreak population configurations during the first few weeks (when we want to make an informed decision) are often tiny, but tend to amplify over time. Thus, the test should be able to detect those early tiny differences and produce correct early warnings.
Every cycle/season the population may behave differently, but we can use the outbreak configuration as a reference to quantify how severe they are. For example, non-outbreaks are a fraction of the outbreak, and very bad seasons/cycles could even reach larger population sizes than the reference outbreak. We can produce several variants of population trajectories based on the outbreak:
```r
plot(seq(1, 20), OB_pop, ylim = c(0, 95), type = "o", lwd = 2,
     ylab = "Population size", xlab = "Time (weeks)", pch = 19)
points(seq(1, 20), OB_pop * 0.9, type = "o")
points(seq(1, 20), OB_pop * 0.7, type = "o", pch = 0)
points(seq(1, 20), OB_pop * 0.5, type = "o", pch = 2)
points(seq(1, 20), OB_pop * 1.1, type = "o", pch = 5)
```
The trajectory in bold is the outbreak population configuration, OB_pop, and the rest are outbreak and non-outbreak variants. The pattern with triangles is a configuration of 50% of an outbreak, the one with squares 70% of an outbreak, circles 90% of an outbreak, and diamonds represent one that surpasses the outbreak by 10%. Modeling variants using the outbreak as a reference (with proportions) is more realistic than varying single parameters in the pop.outB function, since multiple mechanisms simultaneously drive outbreak severity. As we'll see below, the evaluation range used to evaluate a STBP for dynamic thresholds in STBP.eval is provided as factors of the threshold trajectory.
As for static hypotheses, data for dynamic hypotheses come in sequential bouts of counts or proportions, with or without overdispersion, and the data are structured in the same way. The only difference is that the true density varies over time according to the population model. For example, data collected sequentially in bouts of 10 samples each, on a population that is 70% of an outbreak, can be generated as:
```r
pop70 <- OB_pop * 0.7

count70 <- matrix(NA, 10, 20)
for(i in 1:20) {
  count70[, i] <- rnbinom(10, mu = pop70[i], size = estimate_k(pop70[i]))
}
head(count70, 3) # only first three samples out of ten are shown
```
Notice that we are using the same negative binomial specification of example 2, with varying overdispersion described by the function estimate_k, but a constant overdispersion could also be used. We can visualize the sample means per sampling bout (means across columns of count70) along with the outbreak trajectory, OB_pop:
```r
plot(seq(1, 20), OB_pop, ylim = c(0, 95), type = "o", lwd = 2,
     ylab = "Population size", xlab = "Time (weeks)", pch = 19)
points(seq(1, 20), colMeans(count70), type = "o")
```
To run the test on count70, we only have to specify the outbreak trajectory (not the function) in the hypothesis argument of stbp_composite:
```r
test_dyn <- stbp_composite(data = count70,
                           greater_than = TRUE,
                           hypothesis = OB_pop,
                           density_func = "negative binomial",
                           overdispersion = "estimate_k",
                           prior = 0.5,
                           lower_bnd = 0,
                           upper_bnd = Inf,
                           lower_criterion = 0.001,
                           upper_criterion = 0.999)
test_dyn
```
The STBP rejects H: trajectory mu > trajectory OB_pop after only two bouts. The procedure to test dynamic hypotheses with binomial data is similar, except that trajectories should be provided as proportions (incidence) and data should be structured as a list() of matrices with successes in column 1 and samples in column 2, as sketched below.
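As a sketch of that binomial variant (our own illustration: the incidence trajectory OB_inc below is made up by rescaling OB_pop to the [0, 1] interval and is not part of the case study):

```r
OB_inc <- OB_pop / 100 # hypothetical outbreak trajectory expressed as incidence
inc70 <- OB_inc * 0.7  # a population at 70% of the outbreak

binom_dyn <- list()
for(i in 1:20) {
  binom_dyn[[i]] <- matrix(c(emdbook::rbetabinom(3, prob = inc70[i], size = 10,
                                                 theta = estimate_theta(inc70[i])),
                             rep(10, 3)), 3, 2)
}

test_dynB <- stbp_composite(data = binom_dyn,
                            greater_than = TRUE,
                            hypothesis = OB_inc,
                            density_func = "beta-binomial",
                            overdispersion = "estimate_theta",
                            prior = 0.5,
                            lower_bnd = 0,
                            upper_bnd = 1,
                            lower_criterion = 0.001,
                            upper_criterion = 0.999)
test_dynB
```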
STBP for dynamic hypotheses can be evaluated the same way as tests for static hypotheses. The only difference is that eval.range is provided as factors of the threshold trajectory. For example, for test_dyn, the evaluation is run on data generated for a range of multiples of OB_pop = pop.outB(seq(1, 20)), such as OB_pop * 0.5 for 50% of an outbreak, OB_pop * 0.7 for 70% of an outbreak, or OB_pop * 1.2 for 20% above the outbreak. We can refer to these factors as "outbreak intensities".
```r
eval_dyn <- STBP.eval(test_dyn,
                      eval.range = seq(0.2, 1.5, 0.2),
                      n = 10,
                      prior = 0.5,
                      N = 20)
```
We can visualize the average number of bouts required to make a decision and the acceptance rate of H across true outbreak intensities.
```r
plot(seq(0.2, 1.5, 0.2), eval_dyn$AvgSamples, type = "o",
     xlab = "True outbreak intensity", ylab = "Average number of bouts")
abline(v = 1, lty = 2)

plot(seq(0.2, 1.5, 0.2), eval_dyn$AcceptRate, type = "o",
     xlab = "True outbreak intensity", ylab = "Acceptance rate of H")
abline(v = 1, lty = 2)
```
The pattern is similar to the one shown for static hypotheses (example 2). The average number of sampling bouts peaks at the threshold (outbreak intensity = 1), where decision-making is toughest and more samples are required to distinguish outbreaks from non-outbreaks. The acceptance rate of H is 0 at low outbreak intensities and reaches 1 above the threshold (outbreak intensity = 1). Again, the value of this analysis stems from the ability to check the balance between sampling effort and accuracy that results from varying the decision criteria (arguments lower_criterion and upper_criterion) and the sample size within bouts.
The SPRT was invented in the context of quality control back in the 1940s, but it is widely applied in many fields, including population management and clinical trials. In fact, there are other R packages that implement the SPRT, particularly in the field of clinical trials (see the introduction for a list). The SPRT is popular among practitioners of Integrated Pest Management, who are often familiar with charts with two parallel lines on which cumulative pest counts are overlaid to decide whether sampling should continue or whether there is enough evidence to stop and decide in favor of or against the hypothesis being tested. The SPRT has been extensively tested and has been shown to be optimal under a restricted range of contexts. The SPRT can be easily implemented even without a calculator, just by comparing data with stop lines "on the fly".
The main difference between the STBP and the SPRT is philosophical. While the STBP provides recommendations about H based on Bayesian probabilities, the SPRT does so based on likelihood ratios. Bayesian probabilities are understood as the credibility of H being true given the data, whereas likelihood ratios rely on the frequency with which observations occur in the face of two competing models. For this reason, the decision criteria for the STBP (i.e., arguments lower_criterion and upper_criterion) are not directly associated with type I and type II error rates, and the stbp_simple and stbp_composite outputs include the final posterior probability (credibility) for H. In the SPRT, tolerable type I and type II error rates are explicitly set as part of the test specification, and the final likelihood ratio is not as relevant, since the "significance" of the result is stated from the beginning and holds provided that one of the two stop lines has been reached.
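For reference, the decision rule behind those stop lines is Wald's (1945) classical one, which makes the roles of alpha and beta explicit. In its textbook form (the exact parameterization used internally by sprt may differ), after each observation the cumulative likelihood ratio is compared against two fixed boundaries:

$$\frac{\beta}{1-\alpha} \;<\; \Lambda_n = \prod_{i=1}^{n}\frac{f(x_i;\,\mu_1)}{f(x_i;\,\mu_0)} \;<\; \frac{1-\beta}{\alpha}$$

Sampling continues while the ratio stays between the boundaries; crossing the upper boundary stops sampling in favor of H1, and crossing the lower boundary stops it in favor of H0. Taking logarithms turns these boundaries into the two parallel stop lines that the package plots against cumulative counts.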
One important drawback of the SPRT is that it requires the specification of two non-complementary hypotheses. You can't test a single hypothesis against its complement, as in the STBP, because the SPRT requires two simple hypotheses to produce a probability ratio each time new data come in. Also, the SPRT can't process group sequential data, unless you summarize each bout with its mean, median or mode, and in its conventional form it cannot be used to test dynamic hypotheses. If you want to use the SPRT for dynamic hypotheses, you must use a variation called the Time-Sequential Probability Ratio Test, which is not yet implemented in the sequential.pops package.
Like other sequential tests, the SPRT assists in situations where one wants to test whether a given population is above or below a threshold. The main difference from the STBP is that the SPRT is specified with two hypotheses instead of one. There are no strict rules on how those hypotheses should be stated when you only want to test a single threshold-style hypothesis. Conventionally, they are formulated equidistant from the threshold density, using confidence intervals for the threshold, but there are other approaches.
For this example, we will use once again the case study of the tomato leafminer in greenhouse tomatoes. Let's assume again that the economic threshold is static (constant) at 9 larvae per plant and that we want to test, through sequential sampling, whether a population has exceeded that threshold. In the sequential.pops package, the same probability distributions available for the STBP are also available for the SPRT, except for the beta-binomial, for which stop lines have not been derived yet.
Similar to the STBP, distributions without overdispersion (Poisson and binomial) are used to test and sample small population sizes. But when populations grow, they start to aggregate, and overdispersion becomes a nuisance that must be accounted for by using appropriate distributions (negative binomial). However, to specify a SPRT, the overdispersion parameter is only used at the threshold density to calculate the stop lines. If there is a function to obtain reliable values of the overdispersion parameter based on the mean, it can be used during test evaluation to generate more realistic data.
Unlike the STBP, you can specify a SPRT without data, in which case you'll get a chart with the corresponding stop lines as output. For example, if we want to build a test specification for H: mu > 9 larvae per plant, we have to provide two hypotheses that allow for the sequential calculation of probability ratios; here those would be H0: mu = 8 and H1: mu = 10. We also must provide a probability distribution, an overdispersion parameter value (in case the distribution is negative binomial), and tolerable type I (alpha) and type II (beta) error rates:
```r
testPR10 <- sprt(mu0 = 8,
                 mu1 = 10,
                 density_func = "negative binomial",
                 overdispersion = estimate_k(9),
                 alpha = 0.1,
                 beta = 0.1)
```
Notice that no data have to be provided, and that overdispersion is not a function but a value, computed through a function at the threshold density of 9 larvae per plant. A chart with the stop lines, ready to contrast against cumulative counts, is returned by calling plot:
```r
plot(testPR10)
```
and the coefficients for the stop lines by calling the object:
```r
testPR10
```
Now let's generate some data with the same specifications as in example 2, that is, from a negative binomial distribution, using the function estimate_k to obtain the overdispersion based on the mean:
```r
counts1a <- rnbinom(20, mu = 5, size = estimate_k(5))
counts1a
```
These are 20 random counts from a population with a mean of 5 larvae per sampling unit (plant). If we run the function sprt with data, a SPRT test is performed and a new "SPRT" object is created.
```r
testPR1 <- sprt(data = counts1a,
                mu0 = 8,
                mu1 = 10,
                density_func = "negative binomial",
                overdispersion = estimate_k(9),
                alpha = 0.1,
                beta = 0.1)
testPR1
```
The output is similar to the one generated by the functions stbp_simple and stbp_composite. In this case, H0 is accepted: the evidence is compatible with a sampled population below the threshold of 9 individuals per plant. In contrast to stbp_simple and stbp_composite, the output from sprt does not provide a probability, and the precision of the final decision relies on the selected values for alpha and beta.
Once again, note that the overdispersion is provided as a value computed through a function. There is also a plot method for "SPRT" objects, which overlays the cumulative counts on the stop lines:
```r
plot(testPR1)
```
In this case, the whole sequence of 20 counts is displayed, but sampling should have been terminated after the 9th bout, when the cumulative counts touched the area below the lower stop line. The procedure for binary variables is similar, but cluster sampling (i.e., the ability to collect clusters of samples, as in the STBP) is not allowed. Instead, only 1 and 0 values can be introduced, and decisions are based on the cumulative number of 1s across sampling bouts.
Binary data collected sequentially for a SPRT are structured in a vector that can only contain 1s and 0s. For example, we can generate some binary data for a population with a mean incidence of prob = 0.5, without cluster sampling (size = 1):
```r
binPR <- rbinom(40, prob = 0.5, size = 1)
binPR
```
Again, we could get the stop lines in a chart, and the corresponding coefficients, to set up a sampling plan for H: mu > 0.4 just by specifying the test as H0: mu = 0.35 vs. H1: mu = 0.45 and omitting the argument data in the function sprt:
```r
testPR20 <- sprt(mu0 = 0.35,
                 mu1 = 0.45,
                 density_func = "binomial",
                 alpha = 0.1,
                 beta = 0.1)
plot(testPR20)
```
and obtain the stop line coefficients with:
```r
testPR20
```
Now, we can run a SPRT for the data in binPR with:
```r
testPR2 <- sprt(data = binPR,
                mu0 = 0.35,
                mu1 = 0.45,
                density_func = "binomial",
                alpha = 0.1,
                beta = 0.1)
testPR2
```
In this case, 34 binomial sampling bouts, each made of a single observation, were required to make a decision between H0 and H1. The object testPR2 can also be used to produce a chart that compares the stop lines with the cumulative data:
```r
plot(testPR2)
```
As for the STBP, the goal of evaluating a SPRT is to check specifications and adjust them to achieve a good balance between sampling effort and precision. The procedure, requirements and outputs are similar. For test evaluation, the overdispersion for the negative binomial can be specified as a function, with or without stochasticity. The evaluation with a constant overdispersion (as in the model testPR1) is performed with:
```r
evalPR1 <- SPRT.eval(testPR1, eval.range = seq(4, 12), N = 20)
```
Notice that there is no need to specify the overdispersion again when it is a constant, because the value used for the model is used by default. If we wanted to add a function that describes overdispersion with added stochasticity, the script would be:
```r
evalPR1a <- SPRT.eval(testPR1,
                      eval.range = seq(4, 12),
                      overdispersion.sim = "estimate_k_stoch",
                      N = 20)
```
Now we can compare both evaluations, with and without a functional description of overdispersion with stochasticity:
```r
plot(seq(4, 12), evalPR1a$AvgSamples, xlab = "True population size",
     ylab = "Average number of bouts", type = "o")
points(seq(4, 12), evalPR1$AvgSamples, type = "o", pch = 2)
abline(v = 9, lty = 2)

plot(seq(4, 12), evalPR1a$AcceptRate, xlab = "True population size",
     ylab = "Acceptance rate of H1", type = "o")
points(seq(4, 12), evalPR1$AcceptRate, type = "o", pch = 2)
abline(v = 9, lty = 2)
```
Notice that the evaluation for the SPRT is performed on the acceptance rate for H1. As mentioned above, the addition of realistic descriptions of overdispersion across count or binomial data provides a more realistic expectation for the test when applied to real data. In the figures produced above, triangles denote the evaluation assuming constant overdispersion, and circles the one with varying overdispersion and added stochasticity. Although this is based on far too few simulations (N = 20), it shows that simulations that omit the varying nature of overdispersion underestimate the number of sampling bouts required to make a decision and the rate of incorrect decisions, especially when the true population size is above the threshold.
Evaluation of tests with binomial data is similar:
```r
evalPR2 <- SPRT.eval(testPR2, eval.range = seq(0.1, 0.8, 0.1), N = 20)

plot(seq(0.1, 0.8, 0.1), evalPR2$AvgSamples, xlab = "True incidence",
     ylab = "Average number of bouts", type = "o")
abline(v = 0.4, lty = 2)

plot(seq(0.1, 0.8, 0.1), evalPR2$AcceptRate, xlab = "True incidence",
     ylab = "Acceptance rate of H1", type = "o")
abline(v = 0.4, lty = 2)
```
Although binomial sampling is quicker and easier, it requires significantly more samples than count sampling, and precision is often compromised too. However, binomial sampling can be a good alternative when the threshold population density is small [see Binns et al. (2000) for a discussion].
Binns, M.R., Nyrop, J.P. & Werf, W.v.d. (2000) Sampling and monitoring in crop protection: the theoretical basis for developing practical decision guides. CABI Pub., Wallingford, Oxon, UK; New York, N.Y (USA). 284pp.
Madden, L. V., Hughes, G., & Bosch, F.v.d. (2007). The study of plant disease epidemics. American Phytopathological Society. St. Paul, MN (USA). 421pp.
Rincon, D.F., McCabe, I. & Crowder, D.W. (2025) Sequential testing of complementary hypotheses about population density. Methods in Ecology and Evolution. https://doi.org/10.1111/2041-210X.70053
Rincon, D.F., Rivera-Trujillo, H.F., Mojica-Ramos, L., & Borrero-Echeverry, F. (2021) Sampling plans promoting farmers' memory provide decision support in Tuta absoluta management. Agronomy for Sustainable Development, 41: 33. https://doi.org/10.1007/s13593-021-00693-0
Wald, A. (1945) Sequential Tests of Statistical Hypotheses. The Annals of Mathematical Statistics 16(2): 117-186.