View source: R/allocate_wave.R
allocate_wave | R Documentation |
Determines the adaptive optimum sampling allocation for a new sampling
wave based on results from previous waves. Using Neyman or
Wright (2014) allocation, allocate_wave calculates the
optimum allocation for the total number of samples
across waves, determines how many were allocated to each stratum
in previous waves, and allocates the remaining samples to make
up the difference.
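The arithmetic described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `neyman_sketch` is a hypothetical helper, the stratum standard deviations are made-up values, the rounding rule (largest remainders) is a choice made for this example, and the cap that a stratum cannot receive more samples than its population size is ignored.

```r
# Hypothetical helper (not part of optimall): Neyman allocation, where the
# sample size in stratum h is proportional to N_h * S_h. Fractional sizes are
# rounded by largest remainders so that the total equals n exactly.
neyman_sketch <- function(npop, sd, n) {
  raw <- n * npop * sd / sum(npop * sd)
  base <- floor(raw)
  leftover <- n - sum(base)
  if (leftover > 0) {
    idx <- order(raw - base, decreasing = TRUE)[seq_len(leftover)]
    base[idx] <- base[idx] + 1
  }
  base
}

npop    <- c(20, 20, 20)     # stratum population sizes
sd_h    <- c(0.5, 0.9, 0.9)  # assumed stratum standard deviations
n_prior <- c(5, 5, 5)        # units already sampled in each stratum
n_wave  <- 20                # desired size of the next wave

# Optimum allocation for the total sample across all waves ...
n_total   <- sum(n_prior) + n_wave
n_optimal <- neyman_sketch(npop, sd_h, n_total)

# ... minus what was already sampled gives the allocation for the next wave.
n_to_sample <- n_optimal - n_prior
```

In this toy setting no stratum was oversampled, so every entry of `n_to_sample` is non-negative and the entries sum to `n_wave`.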
allocate_wave(
data,
strata,
y,
already_sampled,
nsample,
allocation_method = c("WrightII", "WrightI", "Neyman"),
method = c("iterative", "simple"),
detailed = FALSE
)
data |
A data frame or matrix with one row for each
sampling unit, one column specifying each unit's stratum,
one column holding the value of the continuous variable for
which the variance should be minimized, and one column
containing a binary indicator of whether each unit has already been sampled. |
strata |
A character string or vector of character strings specifying the names of the columns that indicate the stratum that each unit belongs to. |
y |
A character string specifying the name of the continuous variable for which the variance should be minimized. |
already_sampled |
A character string specifying the name of a
column that contains a binary indicator of whether each unit has already been sampled. |
nsample |
The desired sample size of the next wave. |
allocation_method |
A character string specifying the method of
optimum sample allocation to use. For details see
|
method |
A character string specifying the method to be used if at least one group was oversampled. Must be one of "iterative" or "simple". |
detailed |
A logical value indicating whether the output
dataframe should include details about each stratum including
the true optimum allocation without the constraint of
previous waves of sampling
and stratum standard deviations. Defaults to FALSE, unless called within
|
If the optimum sample size in a stratum is smaller than the
number of samples it was allocated in previous waves, that stratum has been
oversampled. When oversampling occurs, allocate_wave
"closes" the oversampled strata and
re-allocates the remaining samples optimally among the open
strata. Under these circumstances, the total sampling
allocation is no longer optimal, but optimall will
output the best allocation still achievable for the next wave.
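The closing and re-allocation step can be illustrated with a small base R sketch. It is written under assumptions and does not claim to reproduce either the "iterative" or the "simple" method of allocate_wave: `neyman_sketch` and `reallocate_sketch` are hypothetical helpers, Neyman allocation with largest-remainder rounding stands in for the package's allocation routines, and the standard deviations are made-up values.

```r
# Hypothetical helper (not part of optimall): Neyman allocation with
# largest-remainder rounding so that the total equals n exactly.
neyman_sketch <- function(npop, sd, n) {
  raw <- n * npop * sd / sum(npop * sd)
  base <- floor(raw)
  leftover <- n - sum(base)
  if (leftover > 0) {
    idx <- order(raw - base, decreasing = TRUE)[seq_len(leftover)]
    base[idx] <- base[idx] + 1
  }
  base
}

# Hypothetical sketch: close every stratum whose optimum is below its prior
# sample, then re-allocate among the strata that are still open. Repeat until
# no open stratum is oversampled.
reallocate_sketch <- function(npop, sd, n_prior, n_wave) {
  open <- rep(TRUE, length(npop))
  repeat {
    # Total the open strata will hold once the next wave is added
    n_open_total <- sum(n_prior[open]) + n_wave
    target <- neyman_sketch(npop[open], sd[open], n_open_total)
    over <- target < n_prior[open]
    if (!any(over)) break
    open[which(open)[over]] <- FALSE  # "close" the oversampled strata
  }
  n_to_sample <- numeric(length(npop))  # closed strata receive 0
  n_to_sample[open] <- target - n_prior[open]
  n_to_sample
}

npop    <- c(20, 20, 20)
sd_h    <- c(0.5, 0.9, 0.9)  # assumed stratum standard deviations
n_prior <- c(12, 5, 5)       # stratum 1 was sampled heavily in earlier waves
n_next  <- reallocate_sketch(npop, sd_h, n_prior, n_wave = 10)
```

Here stratum 1 already holds more samples than its optimum share, so it is closed and receives nothing, and the 10 new samples are divided between the two open strata.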
Returns a data frame with one row for each stratum and
columns specifying the stratum name ("strata"), population stratum size
("npop"), cumulative sample in that stratum ("nsample_actual"),
prior number sampled in that stratum ("nsample_prior"), and the
optimally allocated number of units in each stratum for the next
wave ("n_to_sample").
McIsaac, M. A., & Cook, R. J. (2015). Adaptive sampling in two-phase designs: a biomarker study for progression in arthritis. Statistics in Medicine, 34(21), 2899-2912.
Reilly, M., & Pepe, M. S. (1995). A mean score method for missing and auxiliary covariate data in regression models. Biometrika, 82(2), 299-314.
Wright, T. (2014). A Simple Method of Exact Optimal Sample Allocation under Stratification with any Mixed Constraint Patterns, Research Report Series (Statistics #2014-07), Center for Statistical Research and Methodology, U.S. Bureau of the Census, Washington, D.C.
# Load the package that provides allocate_wave
library(optimall)

# Create a data frame with a column specifying strata, a variable of interest,
# and an indicator for whether each unit was already sampled
set.seed(234)
mydata <- data.frame(
  Strata = c(rep(1, times = 20),
             rep(2, times = 20),
             rep(3, times = 20)),
  Var = c(rnorm(20, 1, 0.5),
          rnorm(20, 1, 0.9),
          rnorm(20, 1.5, 0.9)),
  # The first 5 units of each stratum were sampled in a previous wave
  AlreadySampled = rep(c(rep(1, times = 5),
                         rep(0, times = 15)),
                       times = 3)
)

# Allocate 20 further samples for the next wave
x <- allocate_wave(
  data = mydata, strata = "Strata",
  y = "Var", already_sampled = "AlreadySampled",
  nsample = 20, method = "simple"
)