This is a fast implementation of the cube method. To obtain a fixed sample size, include the inclusion probabilities as a balancing variable in Xbal and make sure they sum to a positive integer. The landing phase is done by dropping balancing variables, starting from the rightmost column, so keep the inclusion probabilities in the first column to guarantee a fixed size.
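The two fixed-size conditions can be checked before sampling. The following is a minimal sketch in base R; `check_fixed_size` is a hypothetical helper that is not part of the package, and the numerical tolerance is an assumption.

```r
# Hypothetical helper, not part of the package: checks the two conditions
# for a fixed sample size before calling cube(prob, Xbal).
check_fixed_size <- function(prob, Xbal, tol = 1e-8) {
  n <- sum(prob)
  # The inclusion probabilities must sum to a positive integer.
  sums_to_integer <- n > 0.5 && abs(n - round(n)) < tol
  # prob must be the first column of Xbal, so that it is the last
  # balancing variable to be dropped during landing.
  first_col_is_prob <- isTRUE(all.equal(as.numeric(Xbal[, 1]), as.numeric(prob)))
  sums_to_integer && first_col_is_prob
}

set.seed(1)
N <- 1000
n <- 100
p <- rep(n / N, N)
check_fixed_size(p, cbind(p, runif(N)))  # both conditions met
check_fixed_size(p, cbind(runif(N), p))  # prob is not in the first column
```

The check only inspects the inputs; it does not draw a sample.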
cube(prob, Xbal)
prob: vector of length N with inclusion probabilities
Xbal: matrix of balancing auxiliary variables with N rows and r columns
Returns a vector of selected indexes in 1,2,...,N.
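The returned indexes can be used to subset the population and to check how closely the sample satisfies the balancing equations, that is, whether the Horvitz-Thompson estimates of the column totals of Xbal are close to the true totals. The sketch below is an illustration under assumptions: `ht_totals` is a hypothetical helper, and if `cube` is not available in the session the code falls back to a simple random sample, which is not balanced.

```r
# Hypothetical helper: Horvitz-Thompson estimates of the column totals of X
# from a sample s (vector of selected indexes) with inclusion probabilities p.
ht_totals <- function(s, p, X) {
  # Each sampled row is weighted by the inverse of its inclusion probability.
  colSums(X[s, , drop = FALSE] / p[s])
}

set.seed(12345)
N <- 1000
n <- 100
p <- rep(n / N, N)                 # equal inclusion probabilities
X <- cbind(p, runif(N), runif(N))  # inclusion probabilities in the first column

# Use cube() if it is available in the session; otherwise fall back to a
# simple random sample (NOT balanced) so that the sketch still runs.
s <- if (exists("cube", mode = "function")) cube(p, X) else sample(N, n)

length(s)                              # realized sample size
rbind(estimated = ht_totals(s, p, X),  # estimates from the sample
      true      = colSums(X))          # population totals
```

With the inclusion probabilities in the first column, the first estimated total equals the realized sample size, so comparing it with sum(p) is a quick check that the size is fixed.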
Deville, J. C. and Tillé, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91(4), 893-912.
Chauvet, G. and Tillé, Y. (2006). A fast algorithm for balanced sampling. Computational Statistics, 21(1), 53-62.
## Not run:
# Example 1
# Select sample
set.seed(12345);
N = 1000; # population size
n = 100; # sample size
p = rep(n/N,N); # inclusion probabilities
X = cbind(p,runif(N),runif(N)); # matrix of auxiliary variables
s = cube(p,X); # select sample
# Example 2
# Check inclusion probabilities
set.seed(12345);
p = c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9); # prescribed inclusion probabilities
N = length(p); # population size
ep = rep(0,N); # empirical inclusion probabilities
nrs = 10000; # repetitions
for(i in 1:nrs){
s = cube(p,cbind(p));
ep[s]=ep[s]+1;
}
print(ep/nrs);
# Example 3
# How fast is it?
# Let's check with N = 100 000 and 5 balancing variables
set.seed(12345);
N = 100000; # population size
n = 100; # sample size
p = rep(n/N,N); # inclusion probabilities
# matrix of 5 auxiliary variables
X = cbind(p,runif(N),runif(N),runif(N),runif(N));
system.time(cube(p,X));
## End(Not run)