stratClust: Estimation from a Stratified Survey Design


stratClust    R Documentation

Description

Estimate the population response from a stratified survey design in which a single-stage cluster sample was taken from each stratum and the size of each stratum is known.

Usage

stratClust(elementdf, stratum, cluster, response, sizedf,
  stratum2 = stratum, size)

Arguments

elementdf

A data frame with one row for each sample unit (element), containing variables identifying the stratum, cluster, and response value for each element.

stratum

A character scalar giving the name of the variable in elementdf that identifies the strata.

cluster

A character scalar giving the name of the variable in elementdf that identifies the clusters.

response

A character scalar giving the name of the variable in elementdf that contains the response values.

sizedf

A data frame with one row for each stratum. It should contain variables identifying the stratum and the size of each stratum.

stratum2

A character scalar giving the name of the variable in sizedf that identifies the strata. The default is stratum, i.e., the same name as in elementdf.

size

A character scalar giving the name of the variable in sizedf that contains the sizes of the strata.

Value

A list of three data frames (tibbles). Cluster has a row for each cluster, with four variables:

  • h = stratum (may be character, factor, or numeric)

  • i = cluster (may be character, factor, or numeric)

  • m_hi = number of elements in stratum h, cluster i (numeric)

  • y_hi = sum of the response for stratum h, cluster i (numeric)
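The cluster-level quantities above are simple group summaries. As a rough illustration only (not the package's own code), the sketch below computes m_hi and y_hi in base R from the example data used later on this page; the object names `m_hi`, `y_hi`, and `clust` are chosen here for illustration.

```r
# Example element data (same values as in the Examples section).
counts <- data.frame(
  Stratum = rep(c("A", "B", "C"), c(5, 8, 8)),
  Cluster = rep(1:8, c(3, 2, 3, 2, 3, 2, 3, 3)),
  Count = c(5:1, 6:21)
)

# Number of elements and sum of the response in each stratum-cluster cell.
# Both calls use the same grouping, so their rows come back in the same order.
m_hi <- aggregate(Count ~ Stratum + Cluster, data = counts, FUN = length)
y_hi <- aggregate(Count ~ Stratum + Cluster, data = counts, FUN = sum)

# Assemble a table with the column names described above (illustrative only).
clust <- data.frame(
  h = y_hi$Stratum,
  i = y_hi$Cluster,
  m_hi = m_hi$Count,
  y_hi = y_hi$Count
)
print(clust)
```

Grouping by both stratum and cluster means that cluster identifiers reused in different strata would still be kept apart.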

Stratum has a row for each stratum with eight variables:

  • h = stratum (may be character, factor, or numeric)

  • m_h = number of elements in stratum h (numeric)

  • y_h = sum of response in stratum h (numeric)

  • n_h = number of clusters in stratum h (numeric)

  • ybar_h = mean response for stratum h (numeric)

  • s_ybar_h = standard deviation of ybar_h (numeric)

  • A_h = size of stratum h (numeric)

  • W_h = relative size of stratum h (numeric)
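To make the stratum-level columns concrete, here is a minimal base R sketch that rolls a cluster table up to strata. It assumes that ybar_h is the ratio y_h / m_h and that W_h is A_h divided by the sum of the A_h; both readings follow from the descriptions above but are not confirmed against the package source. The standard deviation s_ybar_h is deliberately left out, because this page does not state its formula.

```r
# Cluster-level table for the example data (values typed in by hand).
clust <- data.frame(
  h = c("A", "A", "B", "B", "B", "C", "C", "C"),
  i = 1:8,
  m_hi = c(3, 2, 3, 2, 3, 2, 3, 3),
  y_hi = c(12, 3, 21, 19, 36, 29, 51, 60)
)
# Stratum sizes (hectares), as in the Examples section.
areas <- data.frame(h = c("A", "B", "C"), A_h = c(10, 20, 40))

# Sum over clusters within each stratum; tapply orders groups like sort(unique()).
strat <- data.frame(
  h = sort(unique(clust$h)),
  m_h = as.numeric(tapply(clust$m_hi, clust$h, sum)),
  y_h = as.numeric(tapply(clust$y_hi, clust$h, sum)),
  n_h = as.numeric(tapply(clust$i, clust$h, length))
)

# Assumption: mean response per element is the ratio of the two totals.
strat$ybar_h <- strat$y_h / strat$m_h

# Attach stratum sizes and compute relative sizes (assumed W_h = A_h / sum(A_h)).
strat <- merge(strat, areas, by = "h")
strat$W_h <- strat$A_h / sum(strat$A_h)
print(strat)
```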

Population has one row with five numeric variables:

  • A = total size of all strata in population

  • ybar_str = mean response for population per unit of size

  • s_ybar_str = standard deviation of ybar_str

  • ytot_str = total response for population

  • s_ytot_str = standard deviation of ytot_str
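The population row can be read as a weighted combination of the stratum rows. The sketch below uses the textbook stratified estimator (weights W_h, strata treated as independent, no finite population correction); whether stratClust uses exactly these expressions is an assumption. The s_ybar_h values are made up purely for illustration.

```r
# Stratum-level inputs: A_h and ybar_h match the example data;
# s_ybar_h values are hypothetical placeholders, not package output.
strat <- data.frame(
  h = c("A", "B", "C"),
  A_h = c(10, 20, 40),
  ybar_h = c(3, 9.5, 17.5),
  s_ybar_h = c(1, 0.5, 0.8)
)

A <- sum(strat$A_h)       # total size of all strata
W_h <- strat$A_h / A      # relative stratum sizes

# Weighted mean, and its standard deviation assuming independent strata.
ybar_str <- sum(W_h * strat$ybar_h)
s_ybar_str <- sqrt(sum(W_h^2 * strat$s_ybar_h^2))

# Scale the per-unit mean up to a population total.
pop <- data.frame(
  A = A,
  ybar_str = ybar_str,
  s_ybar_str = s_ybar_str,
  ytot_str = A * ybar_str,
  s_ytot_str = A * s_ybar_str
)
print(pop)
```

Under these assumptions, multiplying by A simply rescales the mean and its standard deviation, so the relative precision of ytot_str and ybar_str is the same.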

References

Cochran, W.G. 1977. Sampling Techniques. Wiley, New York.

Examples

# Example data from a stratified survey design in which
# single stage cluster sampling is used in each stratum.
# Strata are areal regions of a lake, and the responses are fish counts.
counts <- data.frame(
  Stratum = rep(c("A", "B", "C"), c(5, 8, 8)),
  Cluster = rep(1:8, c(3, 2, 3, 2, 3, 2, 3, 3)),
  Element = c(1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3),
  Count = c(5:1, 6:21)
)
# Surface area (in hectares) corresponding to each lake stratum.
areas <- data.frame(
  Stratum = c("A", "B", "C"),
  A_h = c(10, 20, 40)
)
stratClust(elementdf = counts, stratum = "Stratum", cluster = "Cluster",
  response = "Count", sizedf = areas, size = "A_h")
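A further sketch, showing how the stratum2 argument might be used when the size table names its stratum column differently from the element table. The table `areas2` and its column names `Region` and `Hectares` are hypothetical, and the call is guarded so that it runs only if the EchoNet2Fish package is installed; that stratClust is exported from that package is assumed here.

```r
# Element data, as in the example above.
counts <- data.frame(
  Stratum = rep(c("A", "B", "C"), c(5, 8, 8)),
  Cluster = rep(1:8, c(3, 2, 3, 2, 3, 2, 3, 3)),
  Count = c(5:1, 6:21)
)

# Hypothetical size table whose stratum column is called "Region".
areas2 <- data.frame(
  Region = c("A", "B", "C"),
  Hectares = c(10, 20, 40)
)

# stratum names the column in elementdf; stratum2 names the column in sizedf.
if (requireNamespace("EchoNet2Fish", quietly = TRUE)) {
  est <- EchoNet2Fish::stratClust(
    elementdf = counts, stratum = "Stratum", cluster = "Cluster",
    response = "Count", sizedf = areas2, stratum2 = "Region",
    size = "Hectares"
  )
  print(est)
}
```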


krphillips/EchoNet2Fish documentation built on March 19, 2022, 11:59 p.m.