knitr::opts_chunk$set(echo = TRUE)
library(Rmpfr)
library(allocation)

Overview

This package implements several exact optimization methods to allocate samples to strata. See the following references for details.

  1. Tommy Wright (2012). The Equivalence of Neyman Optimum Allocation for Sampling and Equal Proportions for Apportioning the U.S. House of Representatives. The American Statistician, 66, pp.217-224.
  2. Tommy Wright (2017), Exact optimal sample allocation: More efficient than Neyman, Statistics & Probability Letters, 129, pp.50-57.

Installation

To install from Github, obtain the devtools package and run the following in R. The Rmpfr package is used - both inside the allocation package and to set up the following examples - for high precision arithmetic.

install.packages("Rmpfr")
devtools::install_github("andrewraim/allocation")

Usage

library(Rmpfr)
library(allocation)

Algorithm III: Sampling with Target Sample Size

Run Algorithm III using an example in Wright (2017).

N_str = c(47, 61, 41)
S_str = sqrt(c(100, 36, 16))
lo_str = c(1,2,3)
hi_str = c(5,6,4)
n = 10

out1 = algIII(n, N_str, S_str, lo_str, hi_str)
print(out1)

To see details justifying each selection, run algIII with the option verbose = TRUE.

Compare the above results to Neyman allocation

out2 = neyman(n, N_str, S_str)
print(out2)

Internally, we work with high precision numbers via the Rmpfr package. We provided an alloc accessor function to extract the allocation as a numeric vector.

alloc(out1)
alloc(out2)

The numerical precision and number of decimal points printed, can be changed by setting a global option for the allocation package.

options(allocation.prec.bits = 256)
options(allocation.print.decimals = 4)

Algorithm IV: Sampling with Target Variance

Run Algorithm IV using an example in Wright (2017). Since our target variance v0 is a very large number, we pass it as an mpfr object to avoid loss of precision.

H = 10
v0 = mpfr(388910760, 256)^2
N_str = c(819, 672, 358, 196, 135, 83, 53, 40, 35, 13)
lo_str = c(3,3,3,3,3,3,3,3,3,13)
S_str = c(330000, 518000, 488000, 634000, 1126000, 2244000, 2468000, 5869000, 29334000, 1233311000)

out1 = algIV(v0, N_str, S_str, lo_str)
print(out1)

To see details justifying each selection, run algIV with the option verbose = TRUE.

Compare the above results to Neyman allocation. Here, we first need to compute a target sample size. This is done with a given cv and revenue data. See Wright (2017) for details. We also exclude the 10th stratum from the allocation procedure, as it is a certainty stratum; its allocation is considered fixed at 13.

cv = 0.042
rev = mpfr(9259780000, 256)
n = sum(N_str[-10] * S_str[-10])^2 / ((cv * rev)^2 + sum(N_str[-10] * S_str[-10]^2))
out2 = neyman(n, N_str[-10], S_str[-10])
print(out2)

Extract the final allocations.

alloc(out1)
alloc(out2)

```



andrewraim/tommysampling documentation built on Sept. 7, 2019, 5:26 a.m.