lcube: The Local Cube method

lcube R Documentation

The Local Cube method

Description

Selects doubly balanced samples with prescribed inclusion probabilities from a finite population using the Local Cube method.

Usage

lcube(prob, Xspread, Xbal, type = "kdtree2", bucketSize = 50, eps = 1e-12)

lcubestratified(
  prob,
  Xspread,
  Xbal,
  integerStrata,
  type = "kdtree2",
  bucketSize = 50,
  eps = 1e-12
)

Arguments

prob

A vector of length N with inclusion probabilities.

Xspread

An N by p matrix of (standardized) auxiliary variables. Squared euclidean distance is used in the Xspread space.

Xbal

An N by q matrix of balancing auxiliary variables.

type

The method used to find nearest neighbours. Must be one of "kdtree0", "kdtree1", "kdtree2", or "notree".

bucketSize

The maximum size of the terminal nodes in the k-d-trees.

eps

A small value used to determine when an updated probability is close enough to 0.0 or 1.0.

integerStrata

An integer vector of length N with stratum numbers.
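
As a rough illustration of how the Xspread and integerStrata arguments might be prepared, the base R sketch below standardizes a matrix of raw auxiliary variables and maps stratum labels to integers. The data and the names raw, region and sq_dist are hypothetical; sq_dist only illustrates what a squared Euclidean distance is and is not the package's internal code.

```r
set.seed(1)
N <- 8

# Hypothetical raw auxiliary variables on very different scales.
raw <- cbind(easting = runif(N, 0, 1000), elevation = runif(N, 0, 10))

# Standardize each column (mean 0, sd 1) so no variable dominates the distance.
Xspread <- scale(raw)

# Squared Euclidean distance between units i and j (illustrative helper).
sq_dist <- function(X, i, j) sum((X[i, ] - X[j, ])^2)
sq_dist(Xspread, 1, 2)

# Hypothetical stratum labels mapped to integer stratum numbers.
# factor() sorts the labels, so east = 1, north = 2, south = 3.
region <- c("north", "south", "south", "east", "north", "east", "east", "south")
integerStrata <- as.integer(factor(region))
integerStrata
```

Whether standardization is appropriate depends on the auxiliary variables; scale() is only one common choice.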

Details

If prob sums to an integer n and prob is included as the first balancing variable, a fixed-size sample of size n is produced.
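
The fixed-size property can be checked with a short sketch like the one below. It assumes the BalancedSampling package is installed (the block does nothing otherwise) and uses made-up uniform data; only the call signature shown under Usage is relied upon.

```r
# Assumes the BalancedSampling package is installed; skipped otherwise.
if (requireNamespace("BalancedSampling", quietly = TRUE)) {
  set.seed(1)
  N <- 200
  n <- 20
  prob <- rep(n / N, N)                 # inclusion probabilities summing to n
  x <- matrix(runif(N * 2), ncol = 2)   # hypothetical auxiliary variables

  # prob is placed first in the balancing matrix.
  s <- BalancedSampling::lcube(prob, x, cbind(prob, x))

  # Per the statement above, this should equal n.
  length(s)
}
```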

Stratified lcube

For lcubestratified, prob is automatically inserted as a balancing variable.

The stratified version uses the fast flight Cube method and pooling of landing phases.
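
A minimal usage sketch, assuming the BalancedSampling package is installed: because prob is inserted automatically, Xbal is passed without a prob column. The data are made up, and the tabulation at the end is only a way to inspect how the sample is distributed over strata, not a documented guarantee.

```r
# Assumes the BalancedSampling package is installed; skipped otherwise.
if (requireNamespace("BalancedSampling", quietly = TRUE)) {
  set.seed(1)
  N <- 400
  prob <- rep(0.1, N)
  x <- matrix(runif(N * 2), ncol = 2)   # hypothetical auxiliary variables
  strata <- rep(1:4, each = 100)        # integer stratum numbers

  # Xbal = x only; prob is not added by hand here.
  s <- BalancedSampling::lcubestratified(prob, x, x, strata)

  # Compare the number of sampled units per stratum with the sum of prob.
  table(strata[s])
  tapply(prob, strata, sum)
}
```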

Value

A vector of selected indices in 1,2,...,N.
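
Since the return value is a vector of indices, it can be used directly for subsetting. The sketch below uses a hypothetical index vector s in place of an actual lcube() result, converts it to a 0/1 indicator, and computes a Horvitz-Thompson estimate of a total, which is one common use of such a sample; y is a made-up study variable.

```r
# Hypothetical population and a hypothetical index vector standing in for
# the return value of lcube().
N <- 10
prob <- rep(0.3, N)
y <- c(4, 8, 15, 16, 23, 42, 7, 1, 9, 12)   # made-up study variable
s <- c(2L, 5L, 9L)

# Indicator vector: 1 for selected units, 0 otherwise.
indicator <- integer(N)
indicator[s] <- 1L

# Horvitz-Thompson estimate of the population total of y:
# each sampled value is weighted by 1 / inclusion probability.
ht_total <- sum(y[s] / prob[s])
ht_total
```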

Functions

  • lcubestratified():

k-d-trees

The "kdtree" types create k-d-trees whose terminal nodes hold at most bucketSize units.

  • "kdtree0" creates a k-d-tree using a median split on alternating variables.

  • "kdtree1" creates a k-d-tree using a median split on the largest range.

  • "kdtree2" creates a k-d-tree using a sliding-midpoint split.

  • "notree" does a naive search for the nearest neighbour.
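
To make the "notree" option concrete, here is a base R sketch of a naive nearest-neighbour search: every other unit is scanned and the one with the smallest squared Euclidean distance is returned. The function nearest_naive is hypothetical and only illustrates the idea; the package's own implementation is not shown on this page and may differ.

```r
# Naive nearest-neighbour search: scan all units and keep the closest one.
# Illustrative only; not the package's implementation.
nearest_naive <- function(X, i) {
  # Squared Euclidean distance from unit i to every unit.
  d <- rowSums(sweep(X, 2, X[i, ])^2)
  d[i] <- Inf            # a unit is not its own neighbour
  which.min(d)
}

X <- rbind(c(0, 0), c(1, 0), c(0.1, 0.1), c(5, 5))
nearest_naive(X, 1)
```

Each such query costs time proportional to N, which is the cost the k-d-tree types aim to reduce.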

References

Deville, J. C. and Tillé, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91(4), 893-912.

Chauvet, G. and Tillé, Y. (2006). A fast algorithm for balanced sampling. Computational Statistics, 21(1), 53-62.

Chauvet, G. (2009). Stratified balanced sampling. Survey Methodology, 35, 115-119.

Grafström, A. and Tillé, Y. (2013). Doubly balanced spatial sampling with spreading and restitution of auxiliary totals. Environmetrics, 24(2), 120-131.

See Also

Other sampling: cube(), hlpm2(), lpm(), scps()

Examples

## Not run: 
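library(BalancedSampling)

# Equal-probability sample of size n, balanced on x and spread in xspr
set.seed(12345)
N <- 1000
n <- 100
prob <- rep(n / N, N)
x <- matrix(runif(N * 2), ncol = 2)     # balancing variables
xspr <- matrix(runif(N * 2), ncol = 2)  # spreading variables
s <- lcube(prob, xspr, cbind(prob, x))  # prob first gives a fixed sample size
plot(x[, 1], x[, 2])
points(x[s, 1], x[s, 2], pch = 19)      # selected units as filled points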

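# Stratified version: prob is added to the balancing variables automatically
set.seed(12345)
N <- 1000
n <- 100
prob <- rep(n / N, N)
x <- matrix(runif(N * 2), ncol = 2)     # balancing variables
xspr <- matrix(runif(N * 2), ncol = 2)  # spreading variables
# Four strata of sizes 100, 200, 300 and 400
strata <- c(rep(1L, 100), rep(2L, 200), rep(3L, 300), rep(4L, 400))
s <- lcubestratified(prob, xspr, x, strata)
plot(x[, 1], x[, 2])
points(x[s, 1], x[s, 2], pch = 19)      # selected units as filled points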

# Empirical check of unequal inclusion probabilities by repeated sampling
set.seed(12345)
prob <- c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9)
N <- length(prob)
x <- matrix(runif(N * 2), ncol = 2)
xspr <- matrix(runif(N * 2), ncol = 2)
ep <- rep(0L, N)                        # selection counts per unit
r <- 10000L                             # number of repetitions
for (i in seq_len(r)) {
  s <- lcube(prob, xspr, cbind(prob, x))
  ep[s] <- ep[s] + 1L
}
print(ep / r)                           # compare with prob

## End(Not run)


BalancedSampling documentation built on May 29, 2024, 10:25 a.m.