opt_1sided: Algorithms for Optimum Sample Allocation Under One-Sided...

opt_1sided    R Documentation

Algorithms for Optimum Sample Allocation Under One-Sided Bounds

Description

[Stable]

Functions that implement selected optimal allocation algorithms that compute a solution to the optimal allocation problem defined in the language of mathematical optimization as follows.

Minimize

f(x_1,\ldots,x_H) = \sum_{h=1}^H \frac{A^2_h}{x_h}

subject to

\sum_{h=1}^H c_h x_h = c

and either

x_h \leq M_h, \quad h = 1,\ldots,H

or

x_h \geq m_h, \quad h = 1,\ldots,H,

where c > 0,\, c_h > 0,\, A_h > 0,\, m_h > 0,\, M_h > 0,\, h = 1,\ldots,H, are given numbers. The minimization is on \mathbb R_+^H.

The inequality constraints are optional, and the user can choose whether and how to add them to the optimization problem. If one-sided lower bounds m_h,\, h = 1,\ldots,H, are imposed, it is required that c \geq \sum_{h=1}^H c_h m_h. If one-sided upper bounds M_h,\, h = 1,\ldots,H, are imposed, it is required that 0 < c \leq \sum_{h=1}^H c_h M_h. Lower bounds can be specified instead of upper bounds only for the LRNA algorithm; all other algorithms accept upper bounds only. For the sake of clarity, we emphasize that in the optimization problem considered here, the lower and upper bounds cannot be imposed jointly.
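The two feasibility conditions can be checked directly before calling any of the algorithms. The following is a minimal base-R sketch that reuses the values from the Examples section and assumes unit costs equal to 1; the variable names lower_ok and upper_ok are illustrative only.

```r
# Values taken from the Examples section; unit costs assumed equal to 1.
total_cost <- 190
unit_costs <- rep(1, 4)
m <- c(50, 40, 10, 30)   # lower bounds m_h
M <- c(100, 90, 70, 80)  # upper bounds M_h

# Lower bounds are feasible only if meeting them costs no more than the budget.
lower_ok <- total_cost >= sum(unit_costs * m)

# Upper bounds are feasible only if they can absorb the whole budget.
upper_ok <- total_cost > 0 && total_cost <= sum(unit_costs * M)
```

Both conditions hold for these values, since the lower bounds cost 130 and the upper bounds cost 340.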

The costs c_h,\, h = 1,\ldots,H, of surveying one element in a stratum can be specified by the user only for the RNA and LRNA algorithms. For the remaining algorithms, these costs are fixed at 1, i.e. c_h = 1,\, h = 1,\ldots,H.

The following is the list of all the algorithms available to use along with the name of the function that implements a given algorithm. See the description of a specific function to find out more about the corresponding algorithm.

  • RNA - rna()

  • LRNA - rna()

  • SGA - sga()

  • SGAPLUS - sgaplus()

  • COMA - coma()

Functions in this family should not be called directly by the user. Use opt() or optcost() instead.

Usage

rna(
  total_cost,
  A,
  bounds = NULL,
  unit_costs = 1,
  check_violations = .Primitive(">="),
  details = FALSE
)

sga(total_cost, A, M)

sgaplus(total_cost, A, M)

coma(total_cost, A, M)

Arguments

total_cost

(number)
total cost c of the survey. A strictly positive scalar.

A

(numeric)
population constants A_1,\ldots,A_H. Strictly positive numbers.

bounds

(numeric or NULL)
optional lower bounds m_1,\ldots,m_H, or upper bounds M_1,\ldots,M_H, or NULL to indicate that there are no inequality constraints in the optimization problem considered. If not NULL, bounds is treated either as:

  • lower bounds, if check_violations = .Primitive("<="). In this case, it is required that total_cost >= sum(unit_costs * bounds),
    or

  • upper bounds, if check_violations = .Primitive(">="). In this case, it is required that total_cost <= sum(unit_costs * bounds).

unit_costs

(numeric)
costs c_1,\ldots,c_H, of surveying one element in a stratum. Strictly positive numbers. Can also be of length 1 if the unit cost is the same for all strata. In this case, the value is recycled to the length of bounds.

check_violations

(function)
binary operator function (two arguments) that compares values in atomic vectors. It must be set to either .Primitive("<=") or .Primitive(">="). With the first choice, bounds are treated as lower bounds and rna() performs the LRNA algorithm. With the second choice, bounds are treated as upper bounds and rna() performs the RNA algorithm. This argument is ignored when bounds is set to NULL.

details

(flag)
should detailed information about strata assignments (either to take-Neyman or take-bound), values of set function s and number of iterations be added to the output?

M

(numeric or NULL)
upper bounds M_1,\ldots,M_H, optionally imposed on sample sizes in strata. Strictly positive numbers. If no upper bounds should be imposed, then M must be set to NULL. Otherwise, it is required that total_cost <= sum(M), since unit costs are fixed at 1 for the functions that take this argument.
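To illustrate the check_violations and unit_costs arguments described above, the following base-R sketch shows what the two admissible operators compute on plain vectors and how a length-1 unit cost is recycled. The vector x and the names upper_check and lower_check are hypothetical; how rna() applies the operator internally is not shown here.

```r
x <- c(40, 95, 60)        # hypothetical candidate allocation
bounds <- c(100, 90, 70)

upper_check <- .Primitive(">=")  # bounds read as upper bounds (RNA)
lower_check <- .Primitive("<=")  # bounds read as lower bounds (LRNA)

upper_check(x, bounds)  # FALSE  TRUE FALSE
lower_check(x, bounds)  #  TRUE FALSE  TRUE

# A length-1 unit cost recycled to the length of bounds.
unit_costs <- rep_len(1, length(bounds))
```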

Value

Numeric vector with optimal sample allocations in strata. For rna() only, it can also be a list with optimal sample allocations and strata assignments (either to take-Neyman or take-bound).

Functions

  • rna(): Recursive Neyman Algorithm (RNA) and its twin version, Lower Recursive Neyman Algorithm (LRNA) dedicated to the allocation problem with one-sided lower-bounds constraints. The RNA is described in Wesołowski et al. (2021), while LRNA is introduced in Wójciak (2023).

  • sga(): Stenger-Gabler type algorithm SGA, described in Wesołowski et al. (2021) and in Stenger and Gabler (2005). This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. c_h = 1,\, h = 1,\ldots,H.

  • sgaplus(): modified Stenger-Gabler type algorithm, described in Wójciak (2019) as Sequential Allocation (version 1) algorithm. This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. c_h = 1,\, h = 1,\ldots,H.

  • coma(): Change of Monotonicity Algorithm (COMA), described in Wesołowski et al. (2021). This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. c_h = 1,\, h = 1,\ldots,H.
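The recursive idea behind rna() can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: rna_sketch is a hypothetical name, it performs no input validation, and it returns no details. It repeatedly applies the Neyman allocation to the strata that are still unconstrained, fixes every stratum that violates its bound at that bound, and reduces the remaining budget accordingly.

```r
# Hypothetical helper, not part of stratallo: a base-R sketch of the
# recursive Neyman idea for one-sided bounds.
rna_sketch <- function(total_cost, A, bounds = NULL, unit_costs = 1,
                       check_violations = .Primitive(">=")) {
  H <- length(A)
  unit_costs <- rep_len(unit_costs, H)
  x <- numeric(H)
  active <- rep(TRUE, H)  # strata still allocated by the Neyman formula

  repeat {
    # Neyman allocation restricted to active strata and the remaining budget.
    s <- total_cost / sum(A[active] * sqrt(unit_costs[active]))
    x[active] <- s * A[active] / sqrt(unit_costs[active])
    if (is.null(bounds)) break

    # Active strata whose allocation violates the bound.
    violated <- active & check_violations(x, bounds)
    if (!any(violated)) break

    # Fix violating strata at their bounds and pay for them.
    x[violated] <- bounds[violated]
    total_cost <- total_cost - sum(unit_costs[violated] * bounds[violated])
    active <- active & !violated
    if (!any(active)) break
  }
  x
}

A <- c(3000, 4000, 5000, 2000)
m <- c(50, 40, 10, 30)   # lower bounds
M <- c(100, 90, 70, 80)  # upper bounds

upper_alloc <- rna_sketch(300, A, M)  # upper bounds bind in strata 2 and 3
lower_alloc <- rna_sketch(190, A, m, check_violations = .Primitive("<="))
```

With a total cost of 300, the sketch fixes stratum 3 and then stratum 2 at their upper bounds and allocates the rest by the Neyman formula, giving 84, 90, 70, 56. With lower bounds and a total cost of 190, strata 1 and 4 are fixed at 50 and 30.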

Note

If no inequality constraints are added, the optimal allocation is the Neyman allocation:

x_h = \frac{A_h}{\sqrt{c_h}} \frac{c}{\sum_{i=1}^H A_i \sqrt{c_i}}, \quad h = 1,\ldots,H.
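The closed form can be checked numerically in base R. The sketch below scales the allocation by the total cost c so that the cost constraint \sum_{h=1}^H c_h x_h = c holds exactly; the unit costs are hypothetical values chosen only for illustration.

```r
A <- c(3000, 4000, 5000, 2000)
unit_costs <- c(1, 4, 1, 9)  # hypothetical unit costs c_h
total_cost <- 190            # total cost c

# Neyman allocation without inequality constraints.
x <- (A / sqrt(unit_costs)) * total_cost / sum(A * sqrt(unit_costs))

# The allocation spends the budget exactly.
spent <- sum(unit_costs * x)
```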

For the stratified \pi estimator of the population total under stratified simple random sampling without replacement, the parameters of the objective function f are:

A_h = N_h S_h, \quad h = 1,\ldots,H,

where N_h is the size of stratum h and S_h denotes the standard deviation of a given study variable in stratum h.
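As a small illustration, the constants A_h can be built from stratum sizes and standard deviations as follows. The values of N and S are hypothetical, chosen so that A matches the vector used in the Examples section. Under this design the variance of the estimator equals f(x) minus \sum_{h=1}^H N_h S_h^2, a term that does not depend on x, which is why minimizing f minimizes the variance.

```r
N <- c(1000, 2000, 2500, 500)  # hypothetical stratum sizes N_h
S <- c(3, 2, 2, 4)             # hypothetical stratum standard deviations S_h
A <- N * S                     # 3000 4000 5000 2000

# Objective function f from the problem statement.
f <- function(x) sum(A^2 / x)

# Neyman allocation for a total sample size of 190 with unit costs equal to 1.
x_neyman <- 190 * A / sum(A)
# Equal allocation of the same total, for comparison.
x_equal <- rep(190 / 4, 4)

# Variance of the stratified estimator of the total at the Neyman allocation.
variance_neyman <- f(x_neyman) - sum(N * S^2)
```

The Neyman allocation gives a smaller value of f than the equal allocation of the same total sample size.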

References

Wójciak, W. (2023). Another Solution of Some Optimum Allocation Problem. Statistics in Transition new series, 24(5) (in press). https://arxiv.org/abs/2204.04035

Wesołowski, J., Wieczorkowski, R., Wójciak, W. (2021). Optimality of the Recursive Neyman Allocation. Journal of Survey Statistics and Methodology, 10(5), pp. 1263–1275. doi:10.1093/jssam/smab018, doi:10.48550/arXiv.2105.14486

Wójciak, W. (2019). Optimal Allocation in Stratified Sampling Schemes. MSc Thesis, Warsaw University of Technology, Warsaw, Poland. http://home.elka.pw.edu.pl/~wwojciak/msc_optimal_allocation.pdf

Stenger, H., Gabler, S. (2005). Combining random sampling and census strategies - Justification of inclusion probabilities equal to 1. Metrika, 61(2), pp. 137–156. doi:10.1007/s001840400328

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling, Springer, New York.

See Also

opt(), optcost(), rnabox().

Examples

library(stratallo)

A <- c(3000, 4000, 5000, 2000)
m <- c(50, 40, 10, 30) # lower bounds
M <- c(100, 90, 70, 80) # upper bounds

rna(total_cost = 190, A = A, bounds = M)
rna(total_cost = 190, A = A, bounds = m, check_violations = .Primitive("<="))
sga(total_cost = 190, A = A, M = M)
sgaplus(total_cost = 190, A = A, M = M)
coma(total_cost = 190, A = A, M = M)

wwojciech/stratallo documentation built on Dec. 24, 2024, 10:43 p.m.