stratify: Cumulative square root frequency stratification.

stratify    R Documentation

Description

This function implements the "cumulative square root frequency method" (Dalenius & Hodges, 1959) for determining the approximately optimal stratification of elements for stratified random sampling with Neyman allocation.
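For readers unfamiliar with Neyman allocation, the following is a minimal sketch of the idea: the sample size assigned to each stratum is proportional to the stratum size times the within-stratum standard deviation. The function name `neyman_allocation` and its arguments are hypothetical and are not part of this package; the sketch assumes the target variable `y` is known for every element, which in practice is replaced by estimates or by the auxiliary variable.

```r
# Hypothetical helper, not part of trtools: Neyman allocation of a total
# sample size n across strata, n_h = n * N_h * S_h / sum(N_h * S_h).
neyman_allocation <- function(stratum, y, n) {
  N_h <- tapply(y, stratum, length)  # number of elements in each stratum
  S_h <- tapply(y, stratum, sd)      # within-stratum standard deviation
  n * N_h * S_h / sum(N_h * S_h)     # unrounded stratum sample sizes
}

# Toy data: two strata of unequal size and spread
stratum <- rep(1:2, c(100, 300))
y <- c(rep(c(0, 2), 50), rep(c(0, 6), 150))
alloc <- neyman_allocation(stratum, y, n = 40)
```

The result is unrounded; an actual design would round the allocations and may impose a minimum sample size per stratum.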

Usage

stratify(x, strata, breaks)

Arguments

x

An auxiliary variable to be used for stratification.

strata

Number of strata.

breaks

Breaks for the auxiliary variable expressed as a vector of cut points.

Details

See Dalenius and Hodges (1959) or Cochran (1977) for details. Ideally the auxiliary variable should be strongly correlated with the target variable.
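As an illustration of the rule, here is a minimal base R sketch of the cumulative square root frequency calculation. It is not the implementation used by stratify(); the name `cum_sqrt_f` and the names of the returned list elements are hypothetical, and the sketch assumes `breaks` is a vector of class boundaries covering the range of `x`.

```r
# Illustrative sketch only; cum_sqrt_f() is a hypothetical helper and
# not the trtools implementation of stratify().
cum_sqrt_f <- function(x, strata, breaks) {
  # Frequency of x within each class interval defined by `breaks`
  f <- as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
  # Cumulative sum of the square roots of the class frequencies
  csf <- cumsum(sqrt(f))
  # Equally spaced targets on the cumulative sqrt(f) scale
  targets <- csf[length(csf)] * seq_len(strata - 1) / strata
  # For each target, choose the class whose cumulative value is closest
  idx <- vapply(targets, function(t) which.min(abs(csf - t)), integer(1))
  # The upper boundaries of the chosen classes become the cut points
  cuts <- breaks[idx + 1]
  list(cutpoints = cuts,
       stratum = findInterval(x, cuts, left.open = TRUE) + 1L)
}

# Frequencies from the Cochran (1977) example used in the Examples section
x <- rep(seq(2.5, 97.5, by = 5), c(3464, 2516, 2157, 1581, 1142,
  746, 512, 376, 265, 207, 126, 107, 82, 50, 39, 25, 16, 19, 2, 3))
res <- cum_sqrt_f(x, strata = 5, breaks = seq(0, 100, by = 5))
res$cutpoints
```

For these frequencies the sketch selects cut points at 5, 15, 25, and 45. Choosing the nearest class boundary is one common convention; other tie-breaking or interpolation rules are possible and may give slightly different strata.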

Value

A list object including a data frame giving the strata assignment of the elements and the cut points that define the strata in terms of the auxiliary variable.

Source

Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: Wiley.

Dalenius, T. & Hodges, J. L. Jr. (1959). Minimum variance stratification. Journal of the American Statistical Association, 54, 88-101.

Examples

# replication of an example from Cochran (1977)
x <- rep(seq(2.5, 97.5, by = 5), c(3464, 2516, 2157, 1581, 1142, 
  746, 512, 376, 265, 207, 126, 107, 82, 50, 39, 25, 16, 19, 2, 3))
stratify(x, strata = 5, breaks = seq(0, 100, by = 5))
# artificial data with a normally-distributed auxiliary variable
set.seed(101)
x <- rnorm(10000, 20, 3)
stratify(x, strata = 4, breaks = 25)

trobinj/trtools documentation built on Jan. 3, 2025, 4:14 a.m.